Add :or branch support to the DBSP-standard engine #12

Merged: FiV0 merged 10 commits into main from dbsp-or-branch on May 23, 2026

FiV0 (Owner) commented May 21, 2026

Summary

  • Extends the DBSP-standard incremental engine (hooray.dbsp) to support (or …) clauses with triple branches and arbitrary nesting. Implements the spec in specs/dbsp-or.md per the plan in specs/dbsp-or-plan.md.
  • Set-union semantics: each :or node compiles to a PlusOp chain of its branch streams fed into a DistinctOp. Nested :or is handled recursively (one Distinct per :or node, mirroring the descriptor tree) rather than flattened.
  • Reuses existing operator primitives (PlusOp, DistinctOp) — no Kotlin changes, no changes to hooray.query / hooray.incremental / hooray.core.
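
The set-union semantics described above can be illustrated with a small model. This is only a sketch: it assumes a Z-set is a plain Clojure map from row to integer weight, and `zset-plus` and `zset-distinct` are hypothetical stand-ins for what PlusOp and DistinctOp compute, not the engine's actual operators.

```clojure
;; Hypothetical model: a Z-set is a map from row to integer weight.
(defn zset-plus
  "Adds two Z-sets pointwise, dropping rows whose weights sum to zero."
  [a b]
  (into {} (remove (comp zero? val)) (merge-with + a b)))

(defn zset-distinct
  "Clamps every positive weight to 1 and drops non-positive rows."
  [z]
  (into {} (keep (fn [[row w]] (when (pos? w) [row 1]))) z))

;; Two branches that both produce the row [:bob].
(def branch-a {[:alice] 1, [:bob] 1})
(def branch-b {[:bob] 1, [:carol] 1})

;; Plus alone would give [:bob] weight 2; distinct collapses it back to 1.
(zset-distinct (zset-plus branch-a branch-b))
;; => {[:alice] 1, [:bob] 1, [:carol] 1}
```

Without the distinct step, the overlapping row would carry weight 2 downstream, which is bag union rather than set union.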

Approach

Seven tasks, each in its own commit:

T1: Tag every triple descriptor and plan node with :kind :triple; route assemble-pattern through a :kind dispatcher. Silent refactor.
T2: plan->circuit returns a flat :leaves vector (one entry per leaf triple, carrying :order) parallel to :inputs; push-deltas! walks :leaves instead of :patterns. Silent refactor.
T3: compile-pattern recognises [:or branches], recursively compiles the branches (each :kind :triple or :kind :or), and rejects :and/:not/:predicate/:fn branches with err/unsupported-ex.
T4: pattern-plan dispatches on :kind; :or recursively plans each branch with the same target so every branch's :out-vars matches the :or block's.
T5: assemble-or wires a PlusOp chain plus one DistinctOp per :or node. A single-branch :or skips PlusOp (only Distinct); a k-branch :or produces k-1 plus operators and exactly one distinct.
T6: End-to-end tests for flat :or: single-branch equivalence with a bare triple, disjoint union, overlap-collapse via Distinct, join with an outer triple, 2-var :or in a chain, or-only query.
T7: End-to-end tests for nested :or: nested-equals-flat on per-tx delta multisets, nested :or joined with an outer triple, 3-level-deep nesting.

Out of scope

Branches inside :or other than triples (:and, :not, predicates, functions) are out of scope; they still throw unsupported-ex at plan time. The :wcoj engine does not support or yet either, so no cross-engine equivalence harness was extended. Instead, the static hooray.query/query engine and hand-computed deltas serve as the test oracle.

Test plan

  • ./gradlew test — full suite green
  • ./gradlew build — clean
  • No file under org.hooray.*, hooray.query, hooray.incremental, or hooray.core modified
  • Manual smoke under (binding [h/*dbsp-version* :standard] (h/q-inc node q)) for or-bearing queries
  • Review that the T1/T2 silent refactors don't reorder leaf inputs (a reordering would surface as wrong deltas)
  • Confirm Open Questions in the spec (left-to-right fold, unconditional Distinct, redundant Distinct in nested or) reflect the chosen design

🤖 Generated with Claude Code

FiV0 force-pushed the dbsp-or-branch branch from 20ec43f to 54c5e2c on May 23, 2026 09:58
FiV0 and others added 8 commits May 23, 2026 12:00
Drops in the spec (specs/dbsp-or.md) and implementation plan
(specs/dbsp-or-plan.md) for or-branch support in the standard DBSP
engine, then lands T1: tag every triple descriptor returned by
compile-pattern and every triple plan node returned by pattern-plan
with :kind :triple, and route assemble-pattern through a case-based
dispatcher. Silent refactor; existing dbsp_test.clj coverage stays
green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
assemble-triple now returns {:stream :handles :leaves} — flat
per-call vectors that an :or branch in a later phase will populate
with multiple entries. plan->circuit concatenates :handles into
:inputs and :leaves into a new top-level :leaves vector. push-deltas!
walks the leaves rather than (:patterns plan), so it no longer
assumes one input per top-level pattern. DbspQuery carries :leaves
alongside :inputs.

Silent refactor — no change to delta output. New assemble-leaves-test
pins the public shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
compile-pattern now recognises [:or branches] alongside [:triple ...].
Each branch is compiled recursively, so a branch can be either a
triple (:kind :triple) or a nested or (:kind :or) — nesting is
preserved, not flattened. :vars on an :or descriptor follows the
encounter order of its first branch (which matches the static engine's
contract that all branches share the same free-variable set).

Branches of any other clause-type (:and, :not, :predicate, :fn) throw
err/unsupported-ex naming the offending clause. The default case of
the top-level compile-pattern dispatch is unified with this same
unsupported-ex path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
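
The recursive compilation described in this commit can be sketched roughly as follows. The pattern shapes `[:triple [e a v]]` and `[:or [branch ...]]`, the helper `logic-var?`, and the name `compile-pattern-sketch` are assumptions made for illustration, and a plain `ex-info` stands in for err/unsupported-ex; the real compile-pattern may differ in all of these.

```clojure
(defn logic-var?
  "Hypothetical helper: treats symbols starting with ? as logic variables."
  [x]
  (and (symbol? x) (= \? (first (name x)))))

(defn compile-pattern-sketch
  "Compiles a tagged pattern into a descriptor, recursing through :or."
  [[tag body :as pattern]]
  (case tag
    :triple {:kind   :triple
             :triple body
             :vars   (vec (distinct (filter logic-var? body)))}
    :or     (let [branches (mapv compile-pattern-sketch body)]
              {:kind     :or
               :branches branches
               ;; :vars follows the encounter order of the first branch.
               :vars     (:vars (first branches))})
    ;; Any other clause type (:and, :not, :predicate, :fn, ...) is rejected.
    (throw (ex-info (str "Unsupported clause: " tag) {:clause pattern}))))

;; A nested or stays nested in the descriptor tree.
(compile-pattern-sketch
 '[:or [[:triple [?e :name ?n]]
        [:or [[:triple [?e :nick ?n]]
              [:triple [?e :alias ?n]]]]]])
```

Because the default arm of `case` throws, an unsupported branch nested at any depth fails as soon as the recursion reaches it.
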
pattern-plan now branches on the descriptor's :kind: :triple keeps the
existing planning logic; :or recursively plans each branch with the
same target so every branch's :out-vars matches the :or block's. A
nested :or branch produces a nested :kind :or plan node, mirroring the
descriptor tree.

The outer plan function is untouched — it only reads :vars from
descriptors and :out-vars from plans, both of which are populated
uniformly by :triple and :or pattern-plan calls. Assembly for :or
plans lands in T5; this commit covers the plan layer in isolation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
assemble-pattern now has an :or arm. assemble-or recursively wires
each branch, folds the branch streams left-to-right with PlusOp, and
applies DistinctOp on the union to enforce set-union semantics. The
returned :handles and :leaves are flat vectors concatenated across
all branches in plan order, so nested :or nodes contribute all their
leaves to the top-level plan->circuit result.

Single-branch :or skips the PlusOp fold (reduce over an empty `rest`
returns the initial stream untouched) and just wires Distinct on the
sole branch's projection. k-branch :or therefore produces exactly
(k - 1) plus operators and exactly one distinct per :or node — and a
nested (or A (or B C)) produces two of each, mirroring the descriptor
tree, as called out in the spec's open question on redundant distincts.

End-to-end query execution under :standard for or-bearing queries is
unlocked by this commit; phase-3 tests (T6/T7) cover the runtime
behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
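
The operator counts stated in this commit can be checked against a small model. This is a sketch under assumptions: instead of wiring real PlusOp and DistinctOp instances, the hypothetical `assemble-or-sketch` returns a nested vector describing the operators, and `count-ops` counts occurrences of an operator tag in that description.

```clojure
(defn assemble-or-sketch
  "Returns a nested data description of the operators for one plan node."
  [{:keys [kind branches] :as node}]
  (case kind
    :triple [:input (:name node)]
    :or     (let [[s & more] (map assemble-or-sketch branches)]
              ;; Left-to-right fold with plus; an empty `more` leaves `s` as is.
              [:distinct (reduce (fn [acc nxt] [:plus acc nxt]) s more)])))

(defn count-ops
  "Counts how often the tag `op` appears in an operator description."
  [tree op]
  (count (filter #{op} (flatten tree))))

(defn leaf [n] {:kind :triple :name n})
(defn or-node [& bs] {:kind :or :branches (vec bs)})

(def flat   (or-node (leaf "A") (leaf "B") (leaf "C")))
(def nested (or-node (leaf "A") (or-node (leaf "B") (leaf "C"))))

(assemble-or-sketch nested)
;; => [:distinct [:plus [:input "A"] [:distinct [:plus [:input "B"] [:input "C"]]]]]

[(count-ops (assemble-or-sketch flat) :plus)
 (count-ops (assemble-or-sketch flat) :distinct)]
;; => [2 1]
```

The nested form yields two plus and two distinct operators, one distinct per :or node, which is the redundancy the spec's open question refers to.
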
Six new deftests exercising :or through q-inc/transact/consume-delta!:

- single-branch (or B) delta matches the bare B delta
- two-branch :or returns the union of disjoint branches
- overlapping branches collapse via DistinctOp; partial retract emits
  no delta, full retract emits -1
- :or joined with an outer triple — adds and retracts
- 2-var :or joined into a chain (city + or [name | last-name])
- :or as the only :where pattern

All pass on the first run after T5; the assemble-or wiring composes the
existing operators correctly without further fix-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
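
The overlap-collapse case (partial retract emits no delta, full retract emits -1) can be modelled as below. This is a sketch only: `distinct-step` and `run-deltas` are hypothetical names, Z-sets are assumed to be maps from row to weight, and the real DistinctOp may keep its state differently.

```clojure
(defn distinct-step
  "Given the accumulated input weights `state` and an input delta,
   returns [new-state output-delta] for an incremental distinct."
  [state delta]
  (let [new-state (merge-with + state delta)
        present   (fn [m row] (if (pos? (get m row 0)) 1 0))
        out       (into {}
                        (keep (fn [row]
                                (let [before (present state row)
                                      after  (present new-state row)]
                                  ;; Emit only when membership flips.
                                  (when (not= before after)
                                    [row (- after before)]))))
                        (keys delta))]
    [new-state out]))

(defn run-deltas
  "Feeds a sequence of input deltas through distinct-step,
   returning the vector of output deltas."
  [deltas]
  (second
   (reduce (fn [[state outs] delta]
             (let [[state2 out] (distinct-step state delta)]
               [state2 (conj outs out)]))
           [{} []]
           deltas)))

;; Both branches match [:bob] (weight 2), then each is retracted in turn.
(run-deltas [{[:bob] 2} {[:bob] -1} {[:bob] -1}])
;; => [{[:bob] 1} {} {[:bob] -1}]
```

The middle transaction produces an empty delta because the row is still supported by the other branch.
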
Three new deftests covering recursive :or assembly at runtime:

- (or A (or B C)) produces the same per-transaction delta multiset as
  (or A B C), verified by replaying the same tx sequence against both
- nested :or joined with an outer triple returns the expected rows
- 3-level-deep nesting (or A (or B (or C D))) matches the flat form,
  exercising the recursive descent through assemble-or

All pass first run; no implementation change was needed for nested or
because compile-pattern, pattern-plan, and assemble-pattern all
dispatch on :kind and recurse through :or branches uniformly.

Completes the dbsp-or task list (T1–T7); the :standard engine now
supports `or` clauses with triple branches and arbitrary nesting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FiV0 force-pushed the dbsp-or-branch branch from 54c5e2c to 52a4485 on May 23, 2026 10:00
FiV0 force-pushed the dbsp-or-branch branch from fffa05e to 76183eb on May 23, 2026 10:24
FiV0 merged commit ab1b66e into main on May 23, 2026
1 check passed