feat(storage): copy-on-write Overlay backend for sound, preview-gated effects#138
Open
ivarvong wants to merge 2 commits into
Open
feat(storage): copy-on-write Overlay backend for sound, preview-gated effects#138ivarvong wants to merge 2 commits into
ivarvong wants to merge 2 commits into
Conversation
… effects
Adds Pyex.Storage.Overlay — a staging backend that read-throughs to an inner
backend while accumulating writes/deletes in an overlay that is NOT committed
until the caller chooses. It's the storage half of a dry-run.
Run agent-generated code against an overlay and you get two things the program
cannot forge: the capability ledger of what it intended to do (on ctx /
%Pyex.Error{}), and the staged effects it would apply (Overlay.pending/1). A
human or policy engine gates those, then Overlay.commit/1 applies them — or
you drop ctx.storage to discard the run entirely.
The point is soundness, not best-effort: the overlay gives read-your-writes
exactly as a real backend does, so the program executes identically whether
its writes are staged or applied. With deterministic execution (seed:), the
run you previewed is byte-for-byte the run that commits — there is no
time-of-check/time-of-use gap between what you approved and what happens, the
hole every nondeterministic "ask permission then act" system has.
The test proves it: runtime_spans(dry-run) == runtime_spans(committed-run) for
a program that reads, writes (incl. read-your-writes over staged values),
deletes, and lists — and that the dry-run touches the real backend only on
commit, producing the identical final state.
This is a library primitive: callers compose plan → gate → commit. Full suite
+ Dialyzer green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019NokzcR7BiAigPgC78zpk9
… friction) CI's Elixir 1.19 Dialyzer treats MapSet.t() as opaque, so Overlay.new/1's @SPEC tripped contract_with_opaque. A map-as-set is non-opaque and equivalent for the staged-deletes set. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019NokzcR7BiAigPgC78zpk9
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Pyex.Storage.Overlay— a staging backend that read-throughs to an inner backend while accumulating writes/deletes in an overlay that isn't committed until the caller chooses. It's the storage half of a dry-run.Run untrusted (agent-generated) code against an overlay and you get back two things the program cannot forge: the capability ledger of what it intended to do, and the staged effects it would apply. A human or policy engine gates them, then
commit/1applies them.Why it matters — soundness, not best-effort
Preview-then-commit ("show me what it'll do, I approve, then run it") is normally unsound: in a nondeterministic runtime the program can branch on the clock/RNG, so what you approved in the preview isn't guaranteed to be what runs at commit — a TOCTOU hole at the heart of every "agent asks permission" system.
The overlay gives read-your-writes exactly as a real backend does, so the program executes identically whether its writes are staged or applied. Combined with deterministic execution (
seed:), the run you previewed is byte-for-byte the run that commits. No gap between what you approved and what happens.This is what only a processless, deterministic, capability-ledgered interpreter can offer — it turns pyex from "a safer sandbox" into a verifiable effect system for agent compute.
The proof
test/pyex/storage/overlay_test.exsasserts, for a program that reads, writes (including a read-your-writes update over a staged value), deletes, and lists a prefix:runtime_spans(dry-run) == runtime_spans(committed-run)— the previewed ledger is the committed ledger.commit/1, and committing produces the identical final state as having run for real.Scope
A library primitive — callers compose plan → gate → commit; pyex owns the value, not the workflow.
mix format✅ ·mix compile --warnings-as-errors✅ ·mix test(6186) ✅ ·mix dialyzer✅🤖 Generated with Claude Code
https://claude.ai/code/session_019NokzcR7BiAigPgC78zpk9