feat: regenerate expected heredocs via CANON_REGENERATE_EXPECTED: https://github.com/lutaml/canon/issues/146 #147
Open
opoudjis wants to merge 1 commit into
Add a new opt-in mode that rewrites the source heredoc backing a
failing `be_*_equivalent_to` assertion with the prettyprinted received
value, so fixture rebaselining is `bundle exec rspec` + `git diff` +
commit instead of hundreds of manual copy-paste cycles.
Surface: env var `CANON_REGENERATE_EXPECTED=true`. Default OFF. When
set, `SerializationMatcher#matches?` on failure:
1. Captures `caller_locations` (only when env var is set, so passing
assertions pay no overhead).
2. Parses the caller spec file with Prism, locates the matcher call
at the failing line, identifies the `expected` argument and the
enclosing `it` block.
3. Resolves the `expected` to a heredoc: inline literal, or local var
tracked back to its most-recent assignment within the same block.
4. Pretty-prints the received value with the appropriate
`Canon::PrettyPrinter` (XML or HTML; both are public APIs already
used by `CANON_<FORMAT>_DIFF_SHOW_PRETTYPRINT_RECEIVED`).
5. Atomically rewrites the heredoc body in place.
6. Returns the assertion as passing so CI does not fail mid-rebaseline.
7. Logs `[canon:rebaseline] rewritten <file>:<line>` (or
`skipped_<reason>` for cases that can't be safely rewritten).
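As a rough illustration of steps 1 and 6, together with the separate `does_not_match?` path, the hook could be shaped like the sketch below. `SketchMatcher`, its `rewrites` list, and the string comparison are hypothetical stand-ins rather than canon's actual code. The sketch takes the immediate caller, whereas under RSpec the spec frame would have to be picked out of a deeper stack.

```ruby
# Hypothetical stand-in for the matcher; names and structure are illustrative only.
class SketchMatcher
  attr_reader :rewrites

  def initialize(expected, env: ENV)
    @expected = expected
    @env = env
    @rewrites = [] # stands in for the real rewrite step
  end

  # Positive path: on failure, optionally hand off to the rebaseliner.
  def matches?(actual)
    return true if equivalent?(actual)
    return false unless regenerate?

    # Captured only on failure with the env var set, so passing
    # assertions never pay for the stack walk.
    site = caller_locations(1, 1).first
    @rewrites << [site.path, site.lineno, actual]
    true # report as passing so the run continues
  end

  # Negated path (`.not_to`): never consults the rebaseliner.
  def does_not_match?(actual)
    !equivalent?(actual)
  end

  private

  def regenerate?
    @env["CANON_REGENERATE_EXPECTED"] == "true"
  end

  # Placeholder comparison standing in for canon's equivalence check.
  def equivalent?(actual)
    actual.strip == @expected.strip
  end
end
```

Injecting `env:` keeps the sketch testable without mutating the process environment.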
Negated matchers (`.not_to`) are never rewritten (via a separate
`does_not_match?` that bypasses the rebaseliner hook). Multiple
rewrites within the same process correctly handle line-number shifts
between the in-memory source (what Ruby reports via `caller_locations`)
and the on-disk file (which has changed under us); a per-file
cumulative line-shift tracker translates each subsequent caller line
into its current file position.
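A minimal sketch of such a per-file tracker, assuming each rewrite is recorded by its original (in-memory) line and the net number of lines it added or removed. `LineShiftTracker` and its method names are hypothetical.

```ruby
# Hypothetical sketch of a per-file cumulative line-shift tracker.
class LineShiftTracker
  def initialize
    # path => array of [original_line, delta] pairs
    @shifts = Hash.new { |hash, key| hash[key] = [] }
  end

  # Record that a rewrite starting at `original_line` (in-memory numbering)
  # changed the file's length by `delta` lines (negative if it shrank).
  def record(path, original_line, delta)
    @shifts[path] << [original_line, delta]
  end

  # Translate a line number reported by Ruby into its current on-disk position
  # by adding the deltas of every rewrite that happened above it.
  def translate(path, original_line)
    offset = @shifts[path].sum { |line, delta| line < original_line ? delta : 0 }
    original_line + offset
  end
end
```

For example, after `record("a_spec.rb", 5, 3)`, a caller line of 10 in `a_spec.rb` translates to 13, while line 4 is unaffected.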
v1 supports:
- `<<~XML` / `<<-XML` / `<<XML` heredoc assigned to a local variable
in the same `it` block, including the metanorma-iso pattern of
multiple sequential assignments to the same variable.
- Inline heredoc passed directly to the matcher.
- Substitution chains on the actual side (the matcher receives the
post-substitution value, so idempotency holds next run).
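The three heredoc styles above need different handling when the body is replaced: `<<~` strips common leading whitespace, so its body can be indented to match the surrounding code, whereas `<<-` and `<<` keep every leading space, so their bodies have to be written verbatim. A hedged sketch follows. `format_heredoc_body` and its parameters are hypothetical, and it assumes the pretty-printed text starts at column 0.

```ruby
# Hypothetical helper: format a replacement body for a heredoc.
# style is :squiggly (<<~), :dash (<<-) or :plain (<<).
def format_heredoc_body(text, style:, indent: 0)
  lines = text.lines(chomp: true)

  # <<- and << preserve leading whitespace, so adding indentation
  # would change the string's value; emit the text as-is.
  return lines.map { |line| "#{line}\n" }.join unless style == :squiggly

  pad = " " * indent
  # Blank lines stay empty so no trailing whitespace is introduced.
  lines.map { |line| line.empty? ? "\n" : "#{pad}#{line}\n" }.join
end
```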
v1 skips with a warning:
- Heredoc with `#{}` interpolation (v2 plans token-preserving merge).
- Expected value from a method call.
- Expected value from `let`/`shared_context` in a different file.
- Inline string literal expected (no heredoc).
Architecture (lib/canon/rebaseliner/):
- AtomicWriter — tempfile + rename, cross-device safe.
- Logger — single-line stderr writes with `[canon:rebaseline]` prefix.
- HeredocSpec — struct describing a heredoc's byte range and style.
- HeredocRewriter — re-indents `<<~` bodies, writes verbatim for
`<<-`/`<<`.
- HeredocLocator — Prism-based AST walk, "most-recent assignment"
semantics for local-var references.
- CallSiteResolver — Prism parse of the spec file, finds the matcher
call and its enclosing `it` block.
- Rebaseliner (orchestrator) — `enabled?`, `rewrite!`, line-shift
tracker.
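One common way to get the "tempfile + rename, cross-device safe" property described for AtomicWriter is to create the temp file in the target's own directory, so the rename never crosses a filesystem boundary. The sketch below assumes that approach. `atomic_write` is a hypothetical name and not necessarily how canon implements it.

```ruby
require "tempfile"

# Hypothetical sketch: write to a sibling temp file, then rename over the target.
def atomic_write(path, content)
  dir = File.dirname(path)
  # Preserve the permission bits of an existing target, if any.
  mode = File.exist?(path) ? (File.stat(path).mode & 0o7777) : nil

  # Creating the temp file next to the target keeps both on one filesystem,
  # so the rename below never crosses a device boundary.
  tmp = Tempfile.create([".#{File.basename(path)}", ".tmp"], dir)
  begin
    tmp.write(content)
    tmp.flush
    tmp.fsync
    tmp.close
    File.chmod(mode, tmp.path) if mode
    File.rename(tmp.path, path)
  rescue StandardError
    # Clean up the temp file on any failure, then re-raise.
    tmp.close unless tmp.closed?
    File.unlink(tmp.path) if File.exist?(tmp.path)
    raise
  end
end
```

Because the rename replaces the target in one step, a reader sees either the old file or the new one, never a partially written spec.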
Matcher hook in `lib/canon/rspec_matchers.rb`. Env-var schema
registered in `lib/canon/config/env_schema.rb` for `--env-help`
discoverability. `prism` added as a runtime dependency (in Ruby 3.3+
stdlib; gem on 2.7-3.2, matching canon's gemspec floor).
Tests: 10 fixture cases covering squiggly/dash/inline heredocs,
multi-assignment within `it` block, interpolation skip, inline-string
skip, method-call skip, negated-matcher no-op, and post-rewrite
idempotency (re-run with env var OFF still passes). Full suite green:
2222 examples, 0 failures.
Documentation: `docs/features/regenerate-expected.adoc` (internal,
architecture + supported forms + roadmap); README link.
v2 (separate PR per the discussion on #146): interpolation support
via token preservation, JSON/YAML prettyprinter wiring, `canon
regenerate SPEC_GLOB` Thor subcommand, parallel-rspec safety,
optional `rubocop -A` post-rewrite formatter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #146.
Summary
Adds opt-in mode `CANON_REGENERATE_EXPECTED=true` that rewrites the source heredoc backing a failing `be_*_equivalent_to` assertion with the prettyprinted received value. Default OFF; passing assertions are never touched; negated matchers (`.not_to`) are never rewritten.
Workflow: run `bundle exec rspec` with `CANON_REGENERATE_EXPECTED=true` set, review `git diff`, commit.
What's new
v1 supports:
- `<<~XML` / `<<-XML` / `<<XML` heredoc assigned to a local variable in the same `it` block.
- Multiple sequential assignments to the same variable (e.g. `output`) between format-specific expects.
- Inline heredoc passed directly to the matcher.
- Substitution chains on the actual side (e.g. `expect(strip_guid(actual).gsub(...))`).
v1 skips with `[canon:rebaseline] skipped_<reason>` warning:
- Heredoc with `#{}` interpolation (v2 plans token-preserving merge).
- Expected value from a method call (e.g. `load_fixture(...)`).
- Expected value from `let`/`shared_context` in a different file.
- Inline string literal expected (no heredoc).
Line-shift handling. Multiple rewrites within one process correctly handle line-number shifts between the in-memory source (which Ruby's `caller_locations` reports against) and the on-disk file (which the rebaseliner has changed). A per-file cumulative shift tracker translates each subsequent caller line into its current file position. Tested via the multi-assignment fixture.
Implementation
New module `Canon::Rebaseliner` under `lib/canon/rebaseliner/`:
- `AtomicWriter`: tempfile + rename, cross-device safe.
- `Logger`: single-line stderr writes with `[canon:rebaseline]` prefix.
- `HeredocSpec`: struct describing a heredoc's byte range and style.
- `HeredocRewriter`: re-indents `<<~` bodies, writes verbatim for `<<-`/`<<`.
- `HeredocLocator`: Prism-based AST walk; "most-recent assignment" semantics for local-var references.
- `CallSiteResolver`: Prism parse, finds the matcher call and its enclosing `it` block.
- `Rebaseliner` (orchestrator): `enabled?`, `rewrite!`, line-shift tracker.
Matcher hook in `lib/canon/rspec_matchers.rb`. Env-var schema registered in `lib/canon/config/env_schema.rb` for `--env-help` discoverability. `caller_locations` is only captured when the env var is set, so passing assertions pay no overhead. `prism` added as a runtime dependency (stdlib in Ruby 3.3+; thin gem on 2.7-3.2).
Tests
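10 new fixture cases under `spec/fixtures/rebaseliner/`, covering squiggly/dash/inline heredocs, multi-assignment `it`-block rewrite, interpolation skip, inline-string skip, method-call skip, negated-matcher no-op, and post-rewrite idempotency.
Plus unit tests on `Rebaseliner.enabled?` env-var truthiness.
Full canon suite: 2222 examples, 0 failures, 1 pending (no regressions).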
Internal documentation
`docs/features/regenerate-expected.adoc` covers architecture, supported forms, line-shift handling, limitations, and v2 roadmap. README updated to link.
v2 (separate PR)
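Per discussion on #146:
- Interpolation support via token preservation (`#{var}` fragments in the prettyprinted actual).
- JSON/YAML prettyprinter wiring.
- `canon regenerate SPEC_GLOB` Thor subcommand.
- `parallel_rspec` safety.
- Optional `rubocop -A` post-rewrite formatter.
Test plan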
🤖 Generated with Claude Code