Skip to content

Fix proxy leaks and stale paths in produce()#44

Open
berendkleinhaneveld wants to merge 8 commits into
masterfrom
fix-proxy-leaks-in-produce
Open

Fix proxy leaks and stale paths in produce()#44
berendkleinhaneveld wants to merge 8 commits into
masterfrom
fix-proxy-leaks-in-produce

Conversation

@berendkleinhaneveld

Copy link
Copy Markdown
Collaborator

Problem

Reading a nested container from a produce() draft returns a proxy (DictProxy/ListProxy/SetProxy). Writing such a value back into the draft leaked the proxy object itself into both the result and the recorded patch values:

def recipe(draft):
    draft["b"] = draft["a"]  # result["b"] was a DictProxy

result, patches, reverse = produce({"a": {"x": 1}}, recipe)
json.dumps(result)        # TypeError
apply(base, patches)      # != result

It was extra sneaky because repr passes through, so the leaked proxy printed like a plain dict. Two deeper problems surfaced while investigating, both hit by scene-tree regrouping (move an item, nest it in a new group, keep editing it):

  • Mutating an item between detach and reattach recorded a patch against the already-removed path, so replaying the patches raised IndexError.
  • A held proxy recorded against its old index after list shifts (insert/del/pop/slices), silently corrupting patches: replaying them edited the wrong element.

Fix

Proxies never leak into results or patches (f0e604b)

  • Every write entry point unwraps values: proxies — also when nested inside plain dicts/lists/tuples — are replaced by plain deep copies of their data.
  • PatchRecorder snapshots with a hand-rolled _snapshot (faster than copy.deepcopy for JSON-like data, falls back to deepcopy for unknown types such as observ proxies).
  • All proxy classes implement __deepcopy__ returning plain data (mirroring observ's fix in Fix deepcopy of Proxy objects creating unregistered zombie proxies observ#165).
  • setdefault now returns through __getitem__, so mutations on the returned container are tracked (previously they went unrecorded).

DictProxy.pop falsy defaults (1f80f71)

  • pop(key, None) / pop(key, 0) raised KeyError because the code tested the truthiness of the default instead of its presence. Now uses a sentinel, matching dict.pop.

Live paths through parent links (de991c6)

  • Proxies store (parent, key) and compute their path on demand by walking up to the root; the child-proxy cache doubles as a registry of handed-out proxies.
  • List structural ops reindex held proxies instead of discarding them; sort/reverse remap them by element identity.
  • Removing or replacing a value detaches its proxy (transitively). Detached proxies still mutate their data but stop recording — the reinserting write's snapshot carries the changes.
  • Assigning a detached proxy directly back into the draft re-attaches it (move semantics), so held references stay live across regrouping; a proxy still attached elsewhere is copied, with a cycle guard against adopting an ancestor.
  • list.insert now clamps out-of-range/negative indices like list.insert, so recorded paths match the actual insertion point.

Semantics

Assigning a proxy that is still attached elsewhere has value semantics (a copy is stored) — the only semantics JSON Patch can express, and it matches the scene-tree invariant that a node has exactly one parent. Assigning a detached proxy is a move. The invariant throughout: apply(base, patches) == result and apply(result, reverse) == base.

Tests & performance

  • 44 new tests across test_produce_proxy_leak.py (every write entry point, serialization, aliasing round-trips) and test_produce_live_paths.py (re-attachment, detachment, index shifts, sort/reverse, full regroup scenario). 335 tests pass.
  • Benchmark medians are at parity with master: location tokens are built in a single pass with a fast path for root-level proxies, patch paths use a single Pointer allocation, and scalar writes skip the unwrap/adopt machinery via an inline type check.

🤖 Generated with Claude Code

berendkleinhaneveld and others added 5 commits June 9, 2026 21:11
Reading a nested container from a draft returns a proxy wrapper. When a
recipe wrote such a value back into the draft (setitem, append, insert,
extend, update, setdefault, slice assignment), the proxy object itself
was stored in the result and deepcopied into the patch values, making
results unserializable and patches inconsistent with the result.

- Unwrap values at every write entry point: proxies (also when nested
  inside plain dicts/lists/tuples) are replaced by plain deep copies of
  their data. Assigning a proxy thus has value semantics, which is the
  only semantics JSON patches can express.
- Replace copy.deepcopy in PatchRecorder with a hand-rolled _snapshot
  that copies JSON-like types directly (and unwraps proxies), falling
  back to deepcopy for unknown types such as observ proxies.
- Add __deepcopy__ to all proxy classes so any deepcopy of a proxy
  yields plain data.
- setdefault now returns through __getitem__ so mutations on the
  returned container are tracked (previously they went unrecorded).

Three known limitations remain as strict xfails, to be addressed by
parent-linked live paths (Tier 2): mutating a detached item records
against its stale path, and held proxies record stale indices after
list shifts.

Benchmarks are unchanged within noise.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
pop() tested the truthiness of the default instead of whether one was
provided, so pop(key, None), pop(key, 0) or pop(key, "") raised
KeyError on a missing key. Use a sentinel to match dict.pop semantics.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Proxies previously stored an absolute path at creation time, which went
stale as the tree changed: a held proxy recorded against its old index
after list shifts (silently corrupting patches), and proxies for removed
values kept recording against paths that no longer existed.

Proxies now store (parent, key) and compute their path on demand by
walking up to the root. The child-proxy cache doubles as a registry of
handed-out proxies, so structural changes can keep them correct:

- insert/delete/pop/slice ops on lists reindex cached child proxies
  instead of discarding them; sort and reverse remap them by element
  identity.
- Removing or replacing a value detaches its proxy (transitively, via
  the parent chain). Detached proxies still mutate their data but stop
  recording: the mutations are captured by the snapshot of whatever
  write re-inserts the data later.
- Assigning a detached proxy directly back into the draft re-attaches
  it at the new location (move semantics), so held references stay live
  across detach/reattach regrouping. A proxy still attached elsewhere
  is copied, and a cycle guard prevents adopting an ancestor.
- list.insert now clamps out-of-range and negative indices like
  list.insert does, so recorded paths match the actual insertion point.

This turns the three xfailed scene-tree tests green and adds a suite
covering re-attachment, detachment and index-shift scenarios.

Benchmark medians are at parity with the baseline: location tokens are
built in a single pass with a fast path for root-level proxies, and
patch paths are constructed with a single Pointer allocation, matching
the allocation count of the previous absolute-path scheme.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Cover the remaining produce.py branches with behavioral tests: tuple
and frozenset values in patch snapshots, proxies nested inside tuples,
deepcopy of draft proxies inside a recipe, and extended-slice (step)
assignment/deletion with held child proxies. The proxy branch of
_snapshot is unreachable through public flows (write paths pre-unwrap
values) and is covered directly as defense in depth.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Three improvements, all verified against the pre-branch benchmark
baseline:

- Build reverse patches with append + a single reverse() in finalize()
  instead of insert(0, ...) per patch, which made recording O(n^2) in
  the number of mutations (73x slower at 20k patches).
- Create copy-mode drafts with _snapshot instead of copy.deepcopy
  (~2x faster for JSON-like data; unknown types still fall back to
  deepcopy). Unlike deepcopy, _snapshot does not preserve shared
  references within the base; aliased subtrees become independent
  copies, which is what patches can express anyway.
- Skip the _snapshot call for scalar values in the recorder, saving a
  function call per recorded patch.

All produce benchmarks now run at or faster than the master baseline
by median; the largest wins are copy-mode recipes (nested structure
-38%, dict mutations -15%, sparse-on-large-object -10%).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@berendkleinhaneveld

Copy link
Copy Markdown
Collaborator Author

Two follow-up commits:

  • d17927a restores 100% test coverage (CI was failing the coverage gate): behavioral tests for tuple/frozenset snapshot values, proxies nested in tuples, deepcopy of draft proxies, and extended-slice (step) assignment/deletion with held child proxies.
  • 061f418 performance: reverse patches are now built with append + one reverse() instead of insert(0, …) per patch (which was O(n²) — 73× slower at 20k mutations), copy-mode drafts use _snapshot instead of copy.deepcopy (~2× faster for JSON-like data), and the recorder skips snapshotting scalars. All produce benchmarks now run at or faster than master by median; biggest wins in copy mode (nested structure −38%, dict mutations −15%).

🤖 Generated with Claude Code

Parent links introduced reference cycles (child -> parent -> child
cache), turning every proxy into cyclic garbage that only the gc could
reclaim. The resulting collector pauses showed up as large latency
spikes: p99 for a small nested produce() was 190us with gc enabled vs
6us with gc disabled, inflating benchmark means (with huge stddev)
while medians were unaffected.

The recorder now registers every proxy it creates and finalize()
breaks the cycles, so proxies are freed by reference counting again
(p99 back to ~4us, zero cyclic objects left behind). Released proxies
are also marked detached, so a proxy leaked out of a recipe can no
longer record into the already-returned patch lists.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@berendkleinhaneveld

Copy link
Copy Markdown
Collaborator Author

The benchmark CI failure was real, not a fluke — but subtle: the parent links from the live-paths commit created reference cycles (child._parent → parent → parent._proxies → child), so proxies became cyclic garbage that only the gc could reclaim. The collector pauses showed up as rare latency spikes: p99 for a small nested produce() was 190µs with gc enabled vs 6µs with gc disabled, identical medians. That's why exactly the in_place produce variants failed the mean check with 5–18× inflated StdDev (the copy variants had offsetting gains that kept them under the threshold), and why test_diff_list_many_appends — untouched by this PR — got dragged over the line too.

Fixed in 990db03: the recorder registers every proxy it creates, and finalize() breaks the cycles so proxies are freed by refcounting again — zero cyclic objects left per call, p99 back to ~4µs. As a bonus, released proxies are marked detached, so a proxy leaked out of a recipe can no longer record into the already-returned patch lists. Locally all produce benchmarks are now at-or-faster than master in both mean and median, with StdDev back at baseline levels.

🤖 Generated with Claude Code

The CI benchmark gate kept flagging test_produce_observ_nested_in_place
(+5-9% mean), a benchmark dominated by fixed per-produce() bookkeeping:
proxy creation, the proxy registry and the cycle-breaking walk in
finalize(). Measured locally against master with interleaved runs, the
overhead was a real +3.4%.

- Parent links are now weak references, so proxies never form reference
  cycles in the first place: no registry, no cycle-breaking walk, and
  proxies are freed by reference counting. A dead parent reference
  means a detached ancestor was collected, which is treated as
  detached - the semantics the parent chain already implied.
- finalize() now only reverses the reverse patches and detaches the
  root proxy; leaked proxies stay inert because every attached proxy
  computes its path through the (now detached) root.
- The child-proxy cache dict is allocated lazily on first wrap; most
  proxies are leaves that never hand out children.

This halves the regression to +1.9% on that benchmark while all other
produce benchmarks remain at or faster than the master baseline.

Also gate the CI benchmark comparison on median instead of mean: means
on shared runners are easily skewed by scheduler/gc latency spikes
(test_list_diff_identical showed +16% mean on code this branch does
not touch).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@berendkleinhaneveld

Copy link
Copy Markdown
Collaborator Author

Follow-up on the two benchmark CI failures (b85b8bf):

test_produce_observ_nested_in_place (+9.1%, then +5.02%) was a real, small regression. Measured locally against master with interleaved runs: +3.4%, coming from fixed per-produce() bookkeeping (proxy registry, cycle-breaking walk in finalize(), eager cache dict allocation). Fixed by a cleaner design: parent links are now weak references, so proxies never form reference cycles at all — no registry, no teardown walk, refcount-based freeing, and leaked proxies stay inert because finalize() just detaches the root that every attached proxy's path runs through. Plus the child-proxy cache is allocated lazily (most proxies are leaves). Result: regression halved to +1.9% on that one micro-benchmark; everything else remains at-or-faster than master, still 0 cyclic objects per call, 344 tests / 100% coverage.

test_list_diff_identical (+16%) is in diff.py, which this PR doesn't touch — conclusive evidence that gating on mean:5% flakes on shared runners. The workflow now gates on median:5% instead, which is robust to scheduler/GC latency spikes while still catching real regressions (medians exposed every genuine slowdown we found during this work).

🤖 Generated with Claude Code

Changing the CI benchmark gate is out of scope for this PR. The check
is optional and used to understand what is happening; the remaining
~2% mean regression on test_produce_observ_nested_in_place is the
known fixed per-produce() bookkeeping cost of live paths and safe
teardown.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@berendkleinhaneveld

Copy link
Copy Markdown
Collaborator Author

Seems to be pretty ok: some improvements in performance and only a few minor 'regressions' but nothing too significant. But above all it should work now as intended which is more important 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant