feat(diff): FingerprintTree for directory-mode module loading #23
Merged
FingerprintSource loads a single file and cannot resolve symbols
defined in sibling files. Pre-module trees whose functions call
sibling helpers (e.g. envconfig's Process() calling lookupEnv from
env_syscall.go) fail to type-check under FingerprintSource and the
fingerprint is unusable.

FingerprintTree loads the tree via packages.Load with full sibling
resolution. When the tree has no go.mod, the loader synthesizes one
through packages.Config.Overlay so resolution proceeds through a
canonical module path rather than falling back to
"command-line-arguments". LoadMeta exposes HadGoMod,
SynthesizedGoMod, ModulePath, and LoadErrors so callers can
distinguish the three load regimes.

The synthetic module path is a stable constant
("synthetic.local/anonymous"), not basename-derived. The first draft of this fix
derived the synthetic path from filepath.Base(rootDir), which
reintroduced qualifier asymmetry in a new form: pairwise
comparisons load each side from its own temp directory, so basename-
derived paths differ across sides, deflating types.Type.String()-
based similarity on any signature containing user-defined types.
exec_v2 head-to-head showed -0.0968 deflation under basename-
derived paths; the identical-basename diagnostic recovered the
baseline bit-identically (0.8046 vs 0.8046); a three-way
comparison against real-go.mod confirmed agreement across all
three load regimes on the synthetic corpus (exec 0.8046, net
0.5946, syscall 0.5978 — nine measurements all bit-identical).
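
The qualifier asymmetry can be reproduced in isolation with go/types. The sketch below is illustrative only: signatureString and the Config type are hypothetical, and the temp-directory names are made up. It builds the same signature under different package paths and compares the types.Type.String() output, which qualifies user-defined types with the package path.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
	"path/filepath"
)

// signatureString builds the signature func(c Config), where Config is a
// named struct type declared in a package whose import path is pkgPath,
// and returns its types.Type.String() rendering.
func signatureString(pkgPath string) string {
	pkg := types.NewPackage(pkgPath, "p")
	obj := types.NewTypeName(token.NoPos, pkg, "Config", nil)
	named := types.NewNamed(obj, types.NewStruct(nil, nil), nil)
	param := types.NewVar(token.NoPos, pkg, "c", named)
	sig := types.NewSignatureType(nil, nil, nil, types.NewTuple(param), nil, false)
	return sig.String()
}

func main() {
	// Basename-derived paths: each side of a pairwise comparison lives in
	// its own temp directory, so the qualifiers differ.
	left := signatureString(filepath.Base("/tmp/cmp-left-1234"))
	right := signatureString(filepath.Base("/tmp/cmp-right-5678"))
	fmt.Println(left == right) // false

	// Stable constant path: both sides render identically.
	const stable = "synthetic.local/anonymous"
	fmt.Println(signatureString(stable) == signatureString(stable)) // true
}
```

Any similarity measure computed over these strings sees two different signatures in the basename-derived case even though the source is identical, which is consistent with the deflation reported above.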

Also bump the go directive 1.24.0 -> 1.26.3 to match the installed
toolchain (the directive is a floor, not a target; the toolchain
was already 1.26.3 functionally, so this only raises the self-declared
minimum). FingerprintSource fingerprints remain bit-identical on
the synthetic corpus before and after the bump, verified under
GOWORK=off against v4.0.0.

Add /semantic_firewall to .gitignore so the stray built binary at
repo root cannot ride along into future commits.

…tion

The committed FingerprintTree synthesizes a go.mod with the stable constant module path "synthetic.local/anonymous" when no real go.mod is found. That works for self-contained single-package trees (the synthetic-corpus shape the fix was first validated against) but does not resolve same-module sub-package imports in real multi-package modules, because the synthetic module identity does not match the real module path the source code imports.

Real-corpus triage of the 3 genuine same-package-sibling commits in the pilot (go-cmp 8ebdfab3, x/text c8872a1a, x/text db455d00) showed each one's failing sub-package directory exists on disk at the path the real import declares, so a synthesized go.mod declaring the REAL module name at the worktree root would resolve the imports. The fix shape (moduleNameHint parameter + load from tree root) is mechanism-verified but implementation-deferred: the 3-commit payoff did not justify the engine-API + bench-runner refactor at this stage.

This commit only documents the limitation in the FingerprintTree and syntheticModulePath doc comments so the boundary lives in the code, not just in conversation. No behavior change.
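
The mismatch can be stated as a prefix check: an import path belongs to a module only if it equals the module path or sits beneath it. The sketch below is a simplified illustration with a hypothetical inModule helper, and the x/text import path is used only as an example. It is not the deferred moduleNameHint implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// inModule reports whether importPath names a package inside the module
// declared as modulePath (the module root itself or any sub-package).
func inModule(modulePath, importPath string) bool {
	return importPath == modulePath || strings.HasPrefix(importPath, modulePath+"/")
}

func main() {
	// Example import as it might appear in source from a real multi-package module.
	const imp = "golang.org/x/text/internal/tag"

	// Under the synthetic identity the import is not a same-module package.
	fmt.Println(inModule("synthetic.local/anonymous", imp)) // false

	// With the real module name declared at the tree root, it is.
	fmt.Println(inModule("golang.org/x/text", imp)) // true
}
```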
Summary
- Add FingerprintTree / FingerprintTreeAdvanced to pkg/diff/fingerprinter.go
- Load via golang.org/x/tools/go/packages (multi-file package resolution), eliminating the sibling-symbol-missing failure class
- Synthesize a "module synthetic.local/anonymous" go.mod via packages.Config.Overlay (zero disk writes) when no real go.mod is found; fixes asymmetric type-qualifier inflation on pre-module commits
- Loader environment: GOPROXY=off, CGO_ENABLED=0, GOFLAGS=-mod=readonly

Coverage impact (from bench harness pilot)
- sibling-symbol-missing
- broken-dependency/other-package-load-error

Test plan
- go test ./pkg/diff/... passes
- TestFingerprintTree_SyntheticGoMod: no real go.mod present, synthetic injected
- TestFingerprintTree_RealGoMod: real go.mod used, metadata flags correct
- TestFingerprintTree_SiblingFiles: multi-file package, all functions discovered

🤖 Generated with Claude Code