fix: batch resolve #824–#956 (129 issues)#959
Merged
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #959 +/- ##
==========================================
+ Coverage 85.78% 85.81% +0.02%
==========================================
Files 106 107 +1
Lines 65768 65783 +15
Branches 65374 65389 +15
==========================================
+ Hits 56419 56450 +31
+ Misses 8918 8903 -15
+ Partials 431 430 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
551bf2e to
4595dd7
Compare
The const doc scoped the bump rule to "the weighted composite formula" while the module note required a bump for any formula change. They disagreed: in percentile mode finalize stamps the same risk_score_version and apply_percentile overwrites risk_score, yet the const doc implied the percentile blend was unversioned. A maintainer following the const doc literally could change the blend without bumping the version, silently corrupting cache reuse and cross-run comparisons. Reword the const doc to state a single RISK_SCORE_VERSION versions both formulas, list what a bump covers for each (weighted term set/weights/bumps; percentile extractor set/mid-rank/normalization), align the module rule, add the contract to apply_percentile's doc, and update the book and the version_is_two test comment to match. Doc-only; no behavior change. Fixes #824
The emit_section_md (Markdown) and emit_html_section (HTML) doc comments justified their escaping behaviour by citing "the `>` in the many-parameters heading", but #677 moved the `>3` floor out of every hotspot title into the legend/Args column, so no current concept carries a `>`. Rewrite both comments to state the real reasons (Markdown headings are not table cells; HTML escaping is defensive) and link HotspotTitle::render, which already documents that no concept carries a `>`. Fixes #847
The clippy::too_many_arguments allow-comment on `analyze` claimed "Eight keyword-only PyO3 args", but the `#[pyo3(signature = ...)]` list has only six args after the `*` marker (exclude_tests, allow_lossy_path, skip_generated, metrics, vcs, vcs_per_function). `path` is positional-only and `py` is the injected GIL token. Correct the count to "Six keyword-only PyO3 args (plus the positional-only `path`)", matching the sibling `analyze_batch` comment. Comment-only; the allow itself stays (8 Rust params exceed clippy's threshold of 7). Fixes #853
The module docstring documented compute-metrics and compute-ci-metrics with a -l grammar flag, but argparse registers -g/--grammar for all three subcommands. Copying the documented invocations verbatim failed with "error: unrecognized arguments: -l". Align the two stale examples with the actual flag. Fixes #871
generate_json passed escape=true to get_token_names, selecting the double-backslash escape form intended for values that survive a second source-string interpretation layer. JSON string literals are parsed exactly once, so every token containing ", \, tab, newline, or carriage return was over-escaped and decoded back to the wrong text (a " token decoded to \", a tab token to a literal backslash-t). Switch to escape=false via a named JSON_TOKEN_ESCAPE constant, matching the Rust and Go generators. Add round-trip regression tests that render JsonTemplate and assert the emitted JSON decodes back to the original token; they fail under escape=true and pass under escape=false. Fixes #862
Under a `--cors <allow-list>` policy, `Vary: Origin` was set only on responses to *matched* origins (inside `apply_cors_headers`'s `Echo` arm). Responses to unlisted origins and same-origin requests carried no `Vary`, so a shared cache could serve one origin's response to another — the cross-origin cache contamination the code's own RFC 9110 §12.5.5 comment cites. Move the `Vary` responsibility out of the matched-only `Echo` arm and into the middleware, keyed on a new `CorsPolicy::varies_on_origin` predicate that is true only for `AllowList` (Wildcard answers `*` uniformly and Disabled emits nothing, so neither depends on Origin). Append rather than insert so any inner-handler `Vary` is preserved and the matched path is not duplicated. Fixes #859
The inline comment in `trend::timestamps` claimed `points` is "validated `<= MAX_TREND_POINTS`", but the function is `pub`, never calls `validate_points`, and enforces no such bound. Replace it with an accurate description: `points` is an unvalidated caller-supplied count whose totality comes solely from the `try_from(...).unwrap_or(i64::MAX)` fallback, and `points - 1` cannot underflow because of the `points <= 1` early return. Also note on the doc comment that the precondition is not re-checked here, so a direct caller passing a large `points` allocates a correspondingly large `Vec` (the bound lives in `validate_points`). Comment-accuracy only; behavior unchanged, existing timestamps_* tests cover the function. Fixes #826
The module-level doc in tests/common/fixtures.rs listed both sarif_test.rs and checkstyle_test.rs as consumers of the shared fixture builders, but sarif_test.rs imports only common::validators and constructs OffenderRecord inline. Only checkstyle_test.rs uses `rec`; empty_space has no current consumer. List only the real consumer so the "Used by" note no longer overstates adoption. Fixes #939
The skip-rationale comment cited only the now-closed #83 and framed the skip as pending "until the issue is resolved," implying a live blocker under a number that no longer tracks anything. Update it to name #83 as the origin and the open #86 as the active tracker for the five remaining excluded files, matching tests/README.md. Fixes #946
The `BaselineEntry::value` doc described only the non-finite skip, but both the read (`from_str`) and write (`from_violations`) filters also drop negative values via `is_sign_negative()` (catching `-0.0`, which `< 0.0` misses). A maintainer reading the field doc could have assumed negatives round-trip and loosened the filter, reintroducing the double-signed `[regr +-N%]` tag bug the check guards against. Pure documentation change; existing tests already cover the filter behaviour. Fixes #830
The hard-tier comment in manifest_soft_threshold_subtable_applies_only_at_soft_tier said `classify` is cyclomatic 5, contradicting the authoritative BRANCHY_RUST constant doc (== 4) and the sibling manifest_headroom_scales_thresholds_at_soft_tier test (== 4). Verified via `bca metrics` on the BRANCHY_RUST fixture: the `classify` function space reports cyclomatic sum 4; the value 5 is the file/unit-level roll-up (+1 for the unit space), not the function's complexity. Comment now agrees with both siblings. Comment-only; no behavior change. Fixes #911
The run() docstring in metric_selection.py called bca.METRIC_NAMES a tuple[str, ...] (the native _native.METRIC_NAMES type), but the re-exported public constant is tuple[MetricName, ...] — a StrEnum, hence str-comparable, which is why "halstead" in bca.METRIC_NAMES works. It also framed the membership guard as testing the ["loc", "cyclomatic"] selection; the guard is an ABI/catalog presence smoke check for an unrelated metric. Corrected both while keeping the file's line count stable so metrics.md's :17:47 include window still resolves. Fixes #888
The vcs.rs module doc and an inline comment named a vcs_metrics() Python entry point that never shipped; #612 renamed the surface to the big_code_analysis.vcs.rank / trend / commit / score_diff facade. Map each Rust bridge to its live vcs.* entry point and enumerate the inject/batch-helper bridges the doc previously omitted. Fixes #857
The MetricsFormat::Text doc comment said the format "ignores --output", but #661 made an explicit `--format text` combined with --output/--output-dir a hard error rather than a silent no-op (normalize_text_format collapses text to None, then resolve_structured_output dies on a destination without a structured format). Update the comment to describe the hard-error behavior. Pure doc fix; no behavior change. Fixes #841
The FileMarkers doc claimed JSON output preserves the raw on-disk path, but render_json applies strip_path_prefix to every marker and baseline path exactly as the text and Markdown renderers do (pinned by strip_prefix_trims_displayed_paths_in_every_format). Rewrite the comment to state that path is the raw walked path and every renderer applies strip_prefix at render time, so --strip-prefix trims JSON output too. Also drop the garbled "trimmed at collection time is not done here" clause. Documentation-only; no behavior change. Fixes #842
The scrub_ci_env doc comment claimed the big-code-analysis-web bin() builder delegates to this helper. It does not and cannot: Cargo does not share test modules across workspace members, web's bin() does no env scrubbing, and bca-web reads no GITHUB_* vars (only BCA_MAX_ORPHANED_TASKS) so it has nothing to scrub. Rewrite the final paragraph to describe only the CLI-crate cli() call sites that genuinely delegate (via bca_command / cli_in) and note why the web crate intentionally needs no scrubbing. Fixes #898
The control_characters_in_message_replaced test injects \u{0001}
(SOH, U+0001) but the comment claimed the metric name carries a NUL
(U+0000). The two are distinct C0 controls; the comment now names
the actual code point. Comment-only — the production substitution
arm treats every C0 control identically, so behaviour is unchanged.
Fixes #928
The retry_with_single_attempt_calls_once comment cited a saturating_sub guard that does not exist in retry_transient. The real mechanism is the empty `1..attempts` range, which never runs the loop body for attempts 0 or 1, so the closure fires exactly once with no counter to underflow. Reword the comment to match the production comment in blame.rs and drop the false reference. Comment-only; the existing calls == 1 assertions already pin the invariant. Fixes #931
The comment called the flattened keys "case-distinct" and presented that as why they map cleanly onto SQLite columns. SQLite identifiers are case-insensitive, so case-only-distinct names collide. The real safety property is case-insensitive uniqueness (the N1/n1 Halstead collision was removed in #511). Reword to state that guarantee. Fixes #884
The sarif_output example labeled len(sarif) as "bytes", but len() on a
str counts Unicode code points, not bytes. For any non-ASCII SARIF
document the UTF-8 file written by write_text(..., encoding="utf-8") is
larger than len(sarif) reports. Encode to UTF-8 before measuring so the
printed figure matches the on-disk file size.
The edit stays on the same line, preserving the book's
{{#include ...:17:33}} range in python/sarif.md.
Fixes #886
`test_lang_values_exactly_match_supported_languages` asserted `[str(m) for m in members] == [m.value for m in members]`, which is tautological for a StrEnum (`str(member)` is `member.value` by construction) and can never fail. The "in declaration order" comment described an ordering contract no assertion enforced. Replace the no-op line with an independent-source check against the raw `_native.supported_languages()` slug table, pinning both each member's value and its order to the source of truth. Verified non-vacuous: under a simulated value mismap or order drift the old line still passes while the new one fails. Fixes #919
The #919 strengthening imported the private `_native`, which pyright (strict) rejects as reportPrivateUsage, and compared `supported_languages()` members against `_native.supported_languages()` — a tautology, since those members are built from that same native list via `Lang(name)`. Pin the enum's declaration order (`list(Lang)`) against the native-sourced slug list obtained through the public `supported_languages()` facade instead, so the test catches a slug mismap or a declaration-order drift while touching only public API.
The diff-baseline and diff commands gained an opt-in --exit-code flag (#692) that exits 2 (EXIT_GATE_BREACH) when the filtered diff is non-empty, but several doc comments still asserted an unconditional "always exits 0 on success". Align the function doc, the clap help (propagated to the man pages), the DiffArgs struct doc, and the DiffError doc with the real contract: exit 0 by default, 2 under --exit-code on a non-empty diff, 1 on a tool error. Exemptions is read-only with no --exit-code path, so its "always exits 0" claim is correct and left unchanged. Regenerated man pages via cargo xtask. Fixes #829
The comment on materialize_tree_rejects_non_empty_dest claimed the empty-dest guard "precedes any git invocation" and that the test runs "without a git repo", both false and internally contradictory. materialize_tree probes git_repo_root() first and only then checks the dest. The test reaches the "is not empty" error precisely because it runs inside this repo checkout, where git_repo_root() succeeds. Rewrite the comment to state that order; no behaviour change. Fixes #833
The doc on `assert_checkstyle_well_formed_and_structural` in the CLI test-helper copy claimed it matched the lib-crate copy "byte-for-byte modulo whitespace". That is false: the CLI copy carries an HTML well-formedness section the lib copy lacks, and the lib copy defines `sarif_schema_metadata` the CLI copy lacks. Only the shared SARIF + Checkstyle core is kept in lock-step. Reword the doc to state the real contract (shared core stays identical; each crate carries crate-local helpers) so a reviewer is not lulled into waving through a real divergence in the shared core. Doc-only change; no behavior change, no test added. Fixes #899
The codegen `language_slugs()` doc comment claimed it mirrored the runtime `supported_languages()` set "so the generated Lang enum cannot drift", but the two apply different filters: `language_slugs()` only filters by `is_public_language`, while `supported_languages()` also applies the feature-dependent `is_enabled` filter. The pinning test `lang_slugs_match_supported_languages` only passed because the bindings hard-code default features (every grammar on); a downstream `--no-default-features` build would fail it for the wrong reason. The generated `_enums.py` is a checked-in, build-feature-independent artifact and must list every public language, so omitting `is_enabled` is correct. Make `public_languages()` pub(crate) and have `language_slugs()` call it directly (one real source), correct the comments to name the feature-independent predicate, and rename/retarget the test to compare against `public_languages()` instead of the feature-dependent `supported_languages()`. Fixes #852
analyze() returns None for any file the CLI walker skips: empty (<=3 bytes), binary (non-UTF-8 leading window), or generated. The quick_start example and its book page both attributed every None to "looks generated", misdiagnosing the empty and binary cases. List all three reasons in the message and the prose so they match the documented contract in _native.pyi. Fixes #885
The four bus-factor specs (VcsAggregateDict, BusFactorDict, GroupBusFactorDict, DirectoryBusFactorDict) modeled src/vcs/bus_factor.rs but were never passed to assert_keys_match, so a field add/remove/rename in bus_factor.rs would silently desync the generated _types.py from the wire shape -- the exact failure mode #623's TypedDicts exist to prevent. The module docstring nonetheless claimed every VCS spec was pinned. Extend vcs_and_jit_specs_match_wire_json_keys to serialize a populated VcsAggregate and assert each of the four specs against the live JSON keys, stripping the NotRequired key_author_ids and confirming both group specs list it as optional. Correct the docstring to name the bus-factor family explicitly. Verified by revert: renaming the wire key for files turns the new assertion red. Fixes #856
The cognitive check asserted only `cognitive_sum() > 0`, which masks any regression in the C cognitive arm that still yields a non-zero but wrong value (e.g. a dropped `&&` boolean-sequence increment). Replace it with an exact `assert_eq!` against the derivable value 4 (if + && + for + switch, each at function-body top level), matching the anchored cyclomatic assertion alongside it. Fixes #936
Both tests asserted `cyclomatic_sum() >= 3` where the value is exactly derivable (5 for the MOZ_-annotated class, 3 for the QM_TRY function). The loose bound let a dropped `if` decision point in the class fixture (sum 4) pass green. Replace with exact `assert_eq!` so both an undercount and an overcount fail; `cyclomatic_sum()` is u64, so equality is safe. Fixes #944
The non-triviality sanity guard spot-checked only abc.assignments and abc.branches, omitting abc.conditions — the dimension the fixture's load-bearing <=>/try/catch constructs land in. Add an exact `abc.conditions == 6` assertion (spaceship + its result compare + the inner if + try + catch + ternary) so a regression that drops those condition arms fails even though the cpp-vs-mozcpp equality stays symmetric. Fold the three find-by-key lookups into one helper closure. Fixes #942
The checkstyle walker claimed structural conformance to the XSD but tracked only a scalar depth counter, so it validated each element's own name/attributes and tag-balance while accepting arbitrary nesting: <error> inside <error>, <file> inside <file>, or a <checkstyle> that is not the document root all passed. Replace the depth counter and saw_root flag with a parent stack and an allowed_child table encoding the XSD containment hierarchy (checkstyle > file > error/exception; error/exception also directly under checkstyle; error/exception are leaves; checkstyle only at the root). check_element now returns the element name for the nesting check. Applied to both the lib and CLI copies for parity. Fixes #943
`git_at` forwards a verbatim arg array (merge --no-ff, commit --amend) and so cannot inject --no-verify per call the way commit_inner does, leaving merge-commit fixtures non-hermetic for contributors with a global core.hooksPath. Point core.hooksPath at an empty dir inside the temp repo in Repo::init so every commit-producing command is hermetic regardless of subcommand; the per-call --no-verify flags become belt-and-suspenders. Verified via test-via-revert: a failing global hook breaks the merge fixture without this change and passes with it. Fixes #941
empty_space had no consumers anywhere in the test suite (confirmed via rg); it survived only because pub mod fixtures is #[allow(dead_code)]. Remove the function, its now-unused FuncSpace/CodeMetrics/SpaceKind/ SuppressionScope imports, and the FuncSpace mentions in the module doc and tests/README.md so the surviving surface matches. Fixes #940
resolve_corpus_files returns an empty Vec for a missing or empty corpus root (walk errors dropped by filter_map(Result::ok)), and the runner treats zero files as Ok(()) without ever invoking act_on_file, so every snapshot assertion is silently skipped and the corpus comparison tests pass having verified nothing — the classic "passes for the wrong reason" failure mode when integration submodules are uninitialized. Assert the resolved file list is non-empty before handing it to the runner, with an actionable message pointing at the likely cause (uninitialized submodule). Fixes #938
The generated `From<u16> for Tcl` out-of-range fallback resolved to `Tcl::Error` — the Tcl `error` command keyword (display "error") — instead of the tree-sitter ERROR sentinel, which the generator renames to `Tcl::Error2` (display "ERROR") because the grammar already owns an `error` keyword. Tcl is the only grammar with such a clash, so it was the lone language module whose fallback diverged from the ERROR-sentinel contract every sibling upholds. The `enums/` Rust template hardcoded `unwrap_or(Self::Error)` and never consumed the computed sentinel name. Thread the sentinel (the last entry of `get_token_names`, whose `ts_name` is "ERROR") into the template via a new `RustTemplate::error_sentinel` field and render it in the fallback. For grammars without the clash the sentinel stays `Error`, so every other `language_*.rs` is byte-identical; only `language_tcl.rs` changes (`Self::Error` -> `Self::Error2`). Add generator-level tests (Tcl renders `Error2`, Rust renders plain `Error`) and a runtime regression test that `Tcl::from(u16::MAX)` maps to the ERROR sentinel; both verified via revert. Fixes #954
`mac.py` regenerated its data file with an append-only contract (read the file, union in only `macros - old`, write the union back), so the artifact could only grow: a name removed from the `macs` template was never pruned and `c_macros.txt` silently drifted away from its own generator. Rewrite the file to exactly `sorted(macros)` each run, dropping the read-existing / diff / union step. The output is now a pure function of the template, so removals take effect on the next run. The checked-in `c_macros.txt` is byte-identical (it already matched the template), so no downstream `c_macros.rs` regen is needed. Add prune-behaviour tests to the enums-codegen-drift self-test suite: a seeded bogus entry is removed, and the output is independent of the prior file contents. Demonstrated against the old logic via revert (the bogus line survives append-only, is pruned by the rewrite). Fixes #892
Feed whitespace-only, surrounding-whitespace, and trailing-newline env values and assert the trimmed result, so removing `.trim()` from `non_empty_env` (reintroducing the `origin/ ` bad-revision footgun) regresses the suite. Fixes #831
The metric_set_tests suite never called `with_metric_set`; it only exercised the `MetricSet::with` primitive, so the builder's #743 dependency-closure contract was unguarded at this level. Add a test that calls `MetricsOptions::with_metric_set` and asserts the caller's set gains Mi's Loc/Cyclomatic/Halstead deps; dropping `.resolved()` from the builder now flips an assertion. Fixes #927
Every existing suppression test fed a marker whose only metric was unknown, where "void whole marker" and "skip the unknown token" are indistinguishable. Add a lib unit test that a marker mixing a valid metric with an unknown one errors on the unknown token, voiding the entire list; swapping the `?` in `parse_metric_list` for `continue` now fails it. Fixes #948
The schema-validation suite only used clean ASCII paths, so `path_to_uri_reference` was a no-op for every fixture and the encoding branch was never certified against the Draft-07 `uri-reference` format. Add a space-bearing path, assert the emitted uri is percent-encoded (%20), and validate it against the schema; neutering the percent-encoding arm now regresses. Fixes #949
Two book examples in `library/ast-traversal.md` had drifted from the guardian test (`tests/book_ast_traversal_examples.rs`) and the current crate API: - The `Ast::dump` (`no_run`) example omitted the required `language` field that `AstCfg` gained in #654, so a reader copying it hit a `missing field` compile error. - The metrics-plus-symbol-table (`ignore`) example compared `functions_sum()` against `3.0`, a stale float left from the metric f64-to-u64 type flip; the accessor now returns `u64`. Both now match the guardian test's assertions; the fixed `no_run` fence compiles against the current crate. Fixes #937
The book documented the REST API surface (endpoint reference plus curl recipes) but had no home for running and operating the daemon itself. That material lived only in the web crate's README, invisible to book readers. Add `commands/web-server.md` covering build/run (including the #252 subset-language caveat), the full flag set, the `BCA_MAX_ORPHANED_TASKS` and `RUST_LOG` environment variables, the 4 MiB body limit and orphaned-task back-pressure, and the security posture: loopback default, CORS-off default, and the VCS `repo_path` read / `cache_dir` write trust boundaries. Every claim is grounded in the web crate source; the endpoint and CORS detail is cross-linked to `commands/rest.md`, not duplicated. Wire the page into SUMMARY.md, cross-link it from `commands/rest.md` and `recipes/rest-api.md`, and point the web crate README at it (also correcting the README's stale `-v`/`--version` short flag to `-V`). Fixes #870
The float and int baseline-coercion tests asserted a substring (`0.25` / `25`) that is also present in the fixture's Cargo-side version `0.25.0`, so the "shows drift" claim passed from the always-present Cargo token rather than the coerced baseline scalar. A regression dropping `_coerce_baseline_value`'s (int, float) branch into the non-scalar `return None` path stayed green. Choose baseline scalars whose string form is not a substring of `0.25.0` (`0.99` / `99`), assert the full baseline-side drift fragment (`baseline '0.99'` / `baseline '99'`), and pin the scalar-coercion warning wording so the "Treating as missing" non-scalar path no longer satisfies the generic warning check. Verified by reverting the (int, float) branch: both tests now fail for the right reason. Fixes #874
The `make py-stubtest` gate (#673) ran `mypy.stubtest` against `big_code_analysis._native` only; the `vcs` submodule was allowlisted out wholesale, so PyO3 signature/default drift in `rank` / `trend` / `commit` / `score_diff` / the `Options` constructor escaped the gate — reopening the #583 stub-drift failure mode for the change-history surface. Add a second `big_code_analysis.vcs` module argument to the stubtest run so the facade (which re-exports the native submodule) is diffed against `vcs.pyi`. Rewrite `vcs.pyi` to the PyO3-introspected shapes the runtime exposes: a `@final` `Options` with a `__new__` constructor (PyO3 `#[new]` is `__new__`, not `__init__`) and `commit`'s computed-default rendered as `...` (PyO3 cannot render `"HEAD".to_owned()` in `__text_signature__`). Narrow the allowlist comment to record that the lone `_native.vcs` entry now suppresses only the submodule-attribute-existence check, not its signatures. Verified: clean stubs pass `make py-stubtest` ("no issues found in 2 modules"); a planted default mismatch (`points` 12 vs 13) fails the gate. Usage typecheck (mypy --strict / pyright) over the constructor and functions stays clean. Fixes #854
simplify-rust pass over the batch branch: - baseline::classify now derives Covered from the shared direction-aware `breaches_limit` predicate instead of a parallel inline closure, so the ratchet and the gate cannot drift on metric direction (the NaN arm still runs first). - Correct materialize_tree's doc to say `git cat-file blob <oid>` (the intentional per-blob form), not the `--batch` it never used.
The new helper_scrubs_artifact_link_env_to_local_fallback mutated GITHUB_REPOSITORY/GITHUB_RUN_ID process-wide under a guard-private lock, racing the pre-existing cli_helper_does_not_leak_to_github_ step_summary, which mutates GITHUB_STEP_SUMMARY under no lock. Child spawn snapshots the whole environment, so a concurrent set_var from one tore the other's snapshot and leaked the CI artifact vars into the step-summary child, failing that assertion ~50% of runs. Add a binary-wide common::process_env_lock() and have both env-mutating-and-spawning tests hold it across their mutate/spawn/ restore window, making those critical sections mutually exclusive regardless of which variable each touches. Found via audit-tests.
#838 replaced the git archive/tar diff materialization with git ls-tree + cat-file, removing the last tar user; prune the now-orphaned workspace.dependencies entry.
code-review pass over the batch branch: - diff materialize_tree skips symlink (mode 120000) ls-tree blobs: their content is the link target string, not source, so materializing them as regular files made the metrics walker parse a bogus file and diverge from the after-side working-tree walk (#838 follow-up). - walk_paths de-duplicates missing seeds against `seen` like valid files, so a repeated nonexistent seed yields a single AnalysisFailure (#858 follow-up).
The #920 conftest-helper tests created `debug/bca` and `release/bca`, but `_locate_workspace_binary` looks for `bca.exe` on Windows, so the locator found nothing and the tests failed on windows-latest. Mirror the locator's `.exe` suffix (no-op on Unix) so the fixtures match on both.
…tion) - #77: issue refs in a `///` doc comment leak into `bca <cmd> --help` and the generated man pages; keep them in `//` and regenerate man/ (#841). - #78: a metric's lower-is-worse direction belongs in one shared predicate every gate/report calls, or surfaces drift on mi.* (#825/#827/#837). - Extend lesson #40 with the dual failure mode: a fixture baking in one platform's spelling (e.g. a binary name without the .exe suffix) fails spuriously only on that platform's CI (#920).
0a4ae15 to
f90076d
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Batch fix of the #824–#956 issue range: 129 issues resolved on
one integration branch. The range was almost entirely test-quality defects
(tests passing for the wrong reason), documentation/comment drift, and a set
of genuine CLI/py/web/enums bugs surfaced by the #710 review sweep and
stragglers.
Work was scheduled in waves: quick-win doc/comment fixes first, then
file-area chunks, then by crate, with worktree-isolated agents clustered so
no two in a wave shared a file. Every wave was merged and validated with the
full
make pre-commitgate.Notable bug fixes (not just tests/docs)
mi.*(Maintainability Index) islower-is-worse; an MI drop was bucketed/classified as an improvement. The
three baseline surfaces now share one direction-aware
breaches_limitpredicate so the gate, the ratchet, and
diff-baselinecannot drift(cli(diff-baseline): mi.* direction inverted in worsened/improved buckets #825 cli(baseline): classify tests never set lower_is_worse, masking mi.* regression suppression #827 cli(check): classify_check_outcome hard-breach ignores lower_is_worse, masked by commands_tests #837).
diff --sincematerialized the before-side withgit archive, which honouredexport-ignoreand dropped files, showingthem as spuriously added. Now uses
git ls-tree+git cat-file blob(and skips symlink blobs) (cli(diff): git archive export-ignore drops before-side files, spurious added diffs #838).
--fail-aboveaccepted NaN/inf/negative, silentlydisabling the gate; now rejected at parse time via the shared validator (cli(vcs/jit): --fail-above accepts NaN/inf/negative, silently disabling the gate #850).
container spaces; now emitted only at leaf spaces, matching the CLI (output(sarif): py binding over-emits skip_at_unit metrics at interior container spaces #855).
per-file, python: batch VCS-injection failure aborts whole batch instead of degrading per-file #851);
analyze_pathssilently dropped a nonexistent seed (nowsurfaced as
AnalysisFailure, python(walk): analyze_paths silently drops a nonexistent path seed (breaks CLI #596 parity) #858);pipeline_dbcrashed on db reusewith a wider metric set (now reconciles the schema, bug(py): pipeline_db _persist crashes on db reuse when metric column set changes (CREATE TABLE IF NOT EXISTS keeps old schema) #890).
Vary: Originwas missing on unmatched allow-list responses,breaking shared-cache correctness (web/cors: Vary: Origin missing on unmatched allow-list responses #859).
FromStringreferenced an undefinedStringtype(enums(go): generated FromString uses undefined Go type
String(won't compile) #863); JSON codegen over-escaped tokens (enums(json): generate_json over-escapes token strings (escape=true) #862); Tcl'sFrom<u16>fallbackpointed at the
errorkeyword instead of the ERROR sentinel (enums(rust): Tcl From<u16> fallback points to 'error' keyword, not ERROR sentinel #954).dropped (cli(manifest): typos in [check]/[report]/[vcs] sub-table keys are silently dropped (no unknown-key warning) #843); the jit→commit deprecation warning was lost behind global
flags / after
--(cli(deprecations): jit->commit warning dropped when a global flag precedes the subcommand #834 cli(deprecations): jit subcommand scan ignores -- marker, warns on path values #836); report escaping doubled backslashes / leakedmarkdown metachars / mismatched advisory cutoffs (cli(report): nargs suppressed-breakdown ignores advisory cutoff, diverging from Many-parameters table #844 cli(report): advisory cutoff labels round while counts use the exact value #845 cli(report): escape_name doubles backslashes inside Markdown code span #846 cli(report): markdown provenance footer does not escape seed paths in italic span #848 cli(report): bus-factor per-directory counts skip thousands separators (vcs report) #849).
Test-quality fixes
~50 tests that passed for the wrong reason were strengthened to fail on a real
regression — vacuous assertions, tautologies (
str(m) == m.value),> 0/>= Nloosened metric checks tightened to exact derived values, exit-code-onlyassertions, substring-on-structured-data, and missing-coverage gaps across the
cli/lib/py suites and the vcs/grammar/parity tests. Each was verified by
test-via-revert.
Post-batch review pass
Ran simplify-rust, rust-optimize, audit-tests, /review, and /code-review over
the integrated branch. Findings fixed: reuse the shared
breaches_limitin thebaseline ratchet; correct a stale
diff.rsdoc; skip symlink blobs in the diffbefore-side; de-duplicate missing seeds; drop the now-unreferenced
tarworkspace dependency; and a real flaky env-race introduced by #900's test
(serialized via a shared lock, 12/12 clean stress runs).
Skipped / follow-ups
initabort guard is unreachable oncurrent
main); left open, recommend close-as-not-reproducible.requiring design decisions (key management, cache-schema interaction); left
for deliberate design.
own-value breaches; not fixable in the binding alone (wire shape lacks the
per-space own value).
Validation
make pre-commitpasses on the branch tip (fmt, clippy ×2, full workspacetests, doc, udeps, markdown/TOML/shell/Makefile/Actions lint, man-page drift,
self-scan + headroom gates, and the Python ruff/mypy/pyright/pytest/stubtest
stages). No
big-code-analysis-outputsubmodule drift.Closes the 129 issues referenced by per-commit
Fixes #Ntrailers.