
feat(storage): copy-on-write Overlay backend for sound, preview-gated effects #138

Open
ivarvong wants to merge 2 commits into main from storage-overlay



@ivarvong

Owner

What

Pyex.Storage.Overlay is a staging backend that reads through to an inner backend while accumulating writes and deletes in an overlay that isn't committed until the caller chooses. It's the storage half of a dry-run.

overlay = Pyex.Storage.Overlay.new(real_backend)
{:ok, _v, ctx} = Pyex.run(agent_code, storage: overlay, seed: 1)

Pyex.Storage.Overlay.pending(ctx.storage)   # the writes/deletes it WOULD do
Pyex.Turn.render(ctx)                        # the ledger of every store op

# gate on the above, then:
{:ok, committed} = Pyex.Storage.Overlay.commit(ctx.storage)   # apply for real
# ...or drop ctx.storage to discard the run entirely.

Run untrusted (agent-generated) code against an overlay and you get back two things the program cannot forge: the capability ledger of what it intended to do, and the staged effects it would apply. A human or policy engine gates them, then commit/1 applies them.
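To make the staging semantics concrete, here is a minimal sketch of a copy-on-write overlay. It is not the code in this PR: the module name OverlaySketch, its function signatures, and the use of a plain map as the inner backend are all assumptions made for illustration.

```elixir
defmodule OverlaySketch do
  # Hypothetical stand-in for a staging backend. The "inner backend" here is
  # just a map; staged deletes are kept as a map-as-set (key => true).
  defstruct inner: %{}, writes: %{}, deletes: %{}

  def new(inner) when is_map(inner), do: %__MODULE__{inner: inner}

  # Reads check staged deletes, then staged writes, then fall through to inner.
  def get(%__MODULE__{} = o, key) do
    cond do
      Map.has_key?(o.deletes, key) -> :error
      Map.has_key?(o.writes, key) -> Map.fetch(o.writes, key)
      true -> Map.fetch(o.inner, key)
    end
  end

  # A write stages the value and cancels any staged delete of the same key.
  def put(%__MODULE__{} = o, key, value) do
    %{o | writes: Map.put(o.writes, key, value), deletes: Map.delete(o.deletes, key)}
  end

  # A delete stages a tombstone and drops any staged write of the same key.
  def delete(%__MODULE__{} = o, key) do
    %{o | writes: Map.delete(o.writes, key), deletes: Map.put(o.deletes, key, true)}
  end

  # Listing reflects the merged view: inner plus staged writes minus staged deletes.
  def list(%__MODULE__{} = o, prefix) do
    o.inner
    |> Map.merge(o.writes)
    |> Map.drop(Map.keys(o.deletes))
    |> Map.keys()
    |> Enum.filter(&String.starts_with?(&1, prefix))
    |> Enum.sort()
  end

  # What the run WOULD do, without touching inner.
  def pending(%__MODULE__{} = o) do
    %{writes: o.writes, deletes: o.deletes |> Map.keys() |> Enum.sort()}
  end

  # Apply staged effects, returning the new inner state.
  def commit(%__MODULE__{} = o) do
    o.inner |> Map.merge(o.writes) |> Map.drop(Map.keys(o.deletes))
  end
end

real = %{"cfg/a" => 1, "cfg/b" => 2}

overlay =
  real
  |> OverlaySketch.new()
  |> OverlaySketch.put("cfg/a", 10)
  |> OverlaySketch.delete("cfg/b")

IO.inspect(OverlaySketch.pending(overlay), label: "pending")
IO.inspect(OverlaySketch.commit(overlay), label: "after commit")
```

Because the struct is immutable, discarding a run in this sketch is simply a matter of not calling commit/1; the map bound to real is never modified.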

Why it matters — soundness, not best-effort

Preview-then-commit ("show me what it'll do, I approve, then run it") is normally unsound: in a nondeterministic runtime the program can branch on the clock/RNG, so what you approved in the preview isn't guaranteed to be what runs at commit. That is a time-of-check/time-of-use (TOCTOU) hole at the heart of every "agent asks permission" system.

The overlay gives read-your-writes exactly as a real backend does, so the program executes identically whether its writes are staged or applied. Combined with deterministic execution (seed:), the run you previewed is byte-for-byte the run that commits. No gap between what you approved and what happens.

This is what only a processless, deterministic, capability-ledgered interpreter can offer — it turns pyex from "a safer sandbox" into a verifiable effect system for agent compute.

The proof

test/pyex/storage/overlay_test.exs asserts, for a program that reads, writes (including a read-your-writes update over a staged value), deletes, and lists a prefix:

  • runtime_spans(dry-run) == runtime_spans(committed-run) — the previewed ledger is the committed ledger.
  • the dry-run touches the real backend only on commit/1, and committing produces the same final state as running for real.
  • reads pass through; writes/deletes/listing reflect the overlay; the inner backend is untouched until commit.
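The properties above can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not the contents of overlay_test.exs: LedgerSketch, its op tuples, and the two backend functions are hypothetical, and the "ledger" here is just a list of {op, result} pairs rather than pyex's runtime spans.

```elixir
defmodule LedgerSketch do
  # Runs a list of store ops against a backend function and records every
  # {op, result} pair. Returns {ledger, final_state}.
  def run(ops, state, backend) do
    {ledger, final} =
      Enum.reduce(ops, {[], state}, fn op, {ledger, s} ->
        {result, s2} = backend.(op, s)
        {[{op, result} | ledger], s2}
      end)

    {Enum.reverse(ledger), final}
  end

  # Direct backend: effects hit the map immediately.
  def direct({:get, k}, m), do: {Map.get(m, k), m}
  def direct({:put, k, v}, m), do: {:ok, Map.put(m, k, v)}
  def direct({:delete, k}, m), do: {:ok, Map.delete(m, k)}

  # Staged backend: state is {inner, staged}; inner is never modified.
  def staged({:get, k}, {inner, st} = s) do
    case st do
      %{^k => {:put, v}} -> {v, s}
      %{^k => :deleted} -> {nil, s}
      _ -> {Map.get(inner, k), s}
    end
  end

  def staged({:put, k, v}, {inner, st}), do: {:ok, {inner, Map.put(st, k, {:put, v})}}
  def staged({:delete, k}, {inner, st}), do: {:ok, {inner, Map.put(st, k, :deleted)}}

  # Apply the staged effects to inner.
  def commit({inner, st}) do
    Enum.reduce(st, inner, fn
      {k, {:put, v}}, acc -> Map.put(acc, k, v)
      {k, :deleted}, acc -> Map.delete(acc, k)
    end)
  end
end

# A read, a write, a read over the staged write, a delete, and a read of the deleted key.
ops = [{:get, "n"}, {:put, "n", 2}, {:get, "n"}, {:delete, "old"}, {:get, "old"}]
real = %{"n" => 1, "old" => 0}

{direct_ledger, direct_final} = LedgerSketch.run(ops, real, &LedgerSketch.direct/2)
{staged_ledger, staged_state} = LedgerSketch.run(ops, {real, %{}}, &LedgerSketch.staged/2)

IO.inspect(direct_ledger == staged_ledger, label: "ledgers equal")
IO.inspect(LedgerSketch.commit(staged_state) == direct_final, label: "final states equal")
```

The ledger records results as well as ops, so the comparison only succeeds if the staged backend gives read-your-writes the same way the direct one does.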

Scope

A library primitive — callers compose plan → gate → commit; pyex owns the value, not the workflow.

mix format ✅ · mix compile --warnings-as-errors ✅ · mix test (6186) ✅ · mix dialyzer

🤖 Generated with Claude Code

https://claude.ai/code/session_019NokzcR7BiAigPgC78zpk9

ivarvong and others added 2 commits June 29, 2026 23:13
… effects

Adds Pyex.Storage.Overlay — a staging backend that read-throughs to an inner
backend while accumulating writes/deletes in an overlay that is NOT committed
until the caller chooses. It's the storage half of a dry-run.

Run agent-generated code against an overlay and you get two things the program
cannot forge: the capability ledger of what it intended to do (on ctx /
%Pyex.Error{}), and the staged effects it would apply (Overlay.pending/1). A
human or policy engine gates those, then Overlay.commit/1 applies them — or
you drop ctx.storage to discard the run entirely.

The point is soundness, not best-effort: the overlay gives read-your-writes
exactly as a real backend does, so the program executes identically whether
its writes are staged or applied. With deterministic execution (seed:), the
run you previewed is byte-for-byte the run that commits — there is no
time-of-check/time-of-use gap between what you approved and what happens, the
hole every nondeterministic "ask permission then act" system has.

The test proves it: runtime_spans(dry-run) == runtime_spans(committed-run) for
a program that reads, writes (incl. read-your-writes over staged values),
deletes, and lists — and that the dry-run touches the real backend only on
commit, producing the identical final state.

This is a library primitive: callers compose plan → gate → commit. Full suite
+ Dialyzer green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019NokzcR7BiAigPgC78zpk9
… friction)

CI's Elixir 1.19 Dialyzer treats MapSet.t() as opaque, so Overlay.new/1's
@spec tripped contract_with_opaque. A map-as-set is non-opaque and equivalent
for the staged-deletes set.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019NokzcR7BiAigPgC78zpk9
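As an illustration of the map-as-set approach described in the second commit, the following sketch shows a set of staged-deleted keys whose type is an ordinary map type, so a @spec can mention it without referring to MapSet.t(). The module name DeleteSetSketch, the string key type, and the function names are assumptions, not the PR's actual definitions.

```elixir
defmodule DeleteSetSketch do
  # Hypothetical type: a plain map whose keys are the staged-deleted keys.
  # The values carry no information; membership is key presence.
  @type t :: %{optional(String.t()) => true}

  @spec new() :: t()
  def new, do: %{}

  # Adding the same key twice is idempotent, as with a set.
  @spec add(t(), String.t()) :: t()
  def add(set, key), do: Map.put(set, key, true)

  @spec remove(t(), String.t()) :: t()
  def remove(set, key), do: Map.delete(set, key)

  @spec member?(t(), String.t()) :: boolean()
  def member?(set, key), do: Map.has_key?(set, key)

  @spec to_list(t()) :: [String.t()]
  def to_list(set), do: Map.keys(set)
end

deletes =
  DeleteSetSketch.new()
  |> DeleteSetSketch.add("cfg/b")
  |> DeleteSetSketch.add("cfg/b")

IO.inspect(DeleteSetSketch.to_list(deletes), label: "staged deletes")
```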