Skip to content

fix: batch resolve #824–#956 (129 issues)#959

Merged
dekobon merged 141 commits into
mainfrom
fix/batch-2026-06-15
Jun 16, 2026
Merged

fix: batch resolve #824–#956 (129 issues)#959
dekobon merged 141 commits into
mainfrom
fix/batch-2026-06-15

Conversation

@dekobon

@dekobon dekobon commented Jun 16, 2026

Copy link
Copy Markdown
Owner

Summary

Batch fix of the #824#956 issue range: 129 issues resolved on
one integration branch. The range was almost entirely test-quality defects
(tests passing for the wrong reason), documentation/comment drift, and a set
of genuine CLI/py/web/enums bugs surfaced by the #710 review sweep and
stragglers.

Work was scheduled in waves: quick-win doc/comment fixes first, then
file-area chunks, then by crate, with worktree-isolated agents clustered so
no two in a wave shared a file. Every wave was merged and validated with the
full make pre-commit gate.

Notable bug fixes (not just tests/docs)

Test-quality fixes

~50 tests that passed for the wrong reason were strengthened to fail on a real
regression — vacuous assertions, tautologies (str(m) == m.value), > 0 /
>= N loosened metric checks tightened to exact derived values, exit-code-only
assertions, substring-on-structured-data, and missing-coverage gaps across the
cli/lib/py suites and the vcs/grammar/parity tests. Each was verified by
test-via-revert.

Post-batch review pass

Ran simplify-rust, rust-optimize, audit-tests, /review, and /code-review over
the integrated branch. Findings fixed: reuse the shared breaches_limit in the
baseline ratchet; correct a stale diff.rs doc; skip symlink blobs in the diff
before-side; de-duplicate missing seeds; drop the now-unreferenced tar
workspace dependency; and a real flaky env-race introduced by #900's test
(serialized via a shared lock, 12/12 clean stress runs).

Skipped / follow-ups

Validation

make pre-commit passes on the branch tip (fmt, clippy ×2, full workspace
tests, doc, udeps, markdown/TOML/shell/Makefile/Actions lint, man-page drift,
self-scan + headroom gates, and the Python ruff/mypy/pyright/pytest/stubtest
stages). No big-code-analysis-output submodule drift.

Closes the 129 issues referenced by per-commit Fixes #N trailers.

@codecov

codecov Bot commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.81%. Comparing base (c301a05) to head (f90076d).
⚠️ Report is 141 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #959      +/-   ##
==========================================
+ Coverage   85.78%   85.81%   +0.02%     
==========================================
  Files         106      107       +1     
  Lines       65768    65783      +15     
  Branches    65374    65389      +15     
==========================================
+ Hits        56419    56450      +31     
+ Misses       8918     8903      -15     
+ Partials      431      430       -1     
Flag Coverage Δ
python 100.00% <ø> (ø)
rust 85.72% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/languages/language_tcl.rs 10.52% <100.00%> (+3.75%) ⬆️
src/languages/mod.rs 100.00% <100.00%> (ø)
src/metric_catalog.rs 100.00% <ø> (ø)
src/suppression.rs 96.17% <100.00%> (+0.04%) ⬆️
src/vcs/score.rs 100.00% <ø> (ø)
src/vcs/trend.rs 98.36% <ø> (ø)

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dekobon dekobon force-pushed the fix/batch-2026-06-15 branch 2 times, most recently from 551bf2e to 4595dd7 Compare June 16, 2026 00:17
dekobon added 27 commits June 15, 2026 17:21
The const doc scoped the bump rule to "the weighted composite
formula" while the module note required a bump for any formula
change. They disagreed: in percentile mode finalize stamps the
same risk_score_version and apply_percentile overwrites risk_score,
yet the const doc implied the percentile blend was unversioned. A
maintainer following the const doc literally could change the blend
without bumping the version, silently corrupting cache reuse and
cross-run comparisons.

Reword the const doc to state a single RISK_SCORE_VERSION versions
both formulas, list what a bump covers for each (weighted term
set/weights/bumps; percentile extractor set/mid-rank/normalization),
align the module rule, add the contract to apply_percentile's doc,
and update the book and the version_is_two test comment to match.

Doc-only; no behavior change.

Fixes #824
The emit_section_md (Markdown) and emit_html_section (HTML) doc
comments justified their escaping behaviour by citing "the `>` in
the many-parameters heading", but #677 moved the `>3` floor out of
every hotspot title into the legend/Args column, so no current
concept carries a `>`. Rewrite both comments to state the real
reasons (Markdown headings are not table cells; HTML escaping is
defensive) and link HotspotTitle::render, which already documents
that no concept carries a `>`.

Fixes #847
The clippy::too_many_arguments allow-comment on `analyze` claimed
"Eight keyword-only PyO3 args", but the `#[pyo3(signature = ...)]`
list has only six args after the `*` marker (exclude_tests,
allow_lossy_path, skip_generated, metrics, vcs, vcs_per_function).
`path` is positional-only and `py` is the injected GIL token.

Correct the count to "Six keyword-only PyO3 args (plus the
positional-only `path`)", matching the sibling `analyze_batch`
comment. Comment-only; the allow itself stays (8 Rust params
exceed clippy's threshold of 7).

Fixes #853
The module docstring documented compute-metrics and compute-ci-metrics
with a -l grammar flag, but argparse registers -g/--grammar for all
three subcommands. Copying the documented invocations verbatim failed
with "error: unrecognized arguments: -l". Align the two stale examples
with the actual flag.

Fixes #871
generate_json passed escape=true to get_token_names, selecting the
double-backslash escape form intended for values that survive a second
source-string interpretation layer. JSON string literals are parsed
exactly once, so every token containing ", \, tab, newline, or carriage
return was over-escaped and decoded back to the wrong text (a " token
decoded to \", a tab token to a literal backslash-t).

Switch to escape=false via a named JSON_TOKEN_ESCAPE constant, matching
the Rust and Go generators. Add round-trip regression tests that render
JsonTemplate and assert the emitted JSON decodes back to the original
token; they fail under escape=true and pass under escape=false.

Fixes #862
Under a `--cors <allow-list>` policy, `Vary: Origin` was set only on
responses to *matched* origins (inside `apply_cors_headers`'s `Echo`
arm). Responses to unlisted origins and same-origin requests carried no
`Vary`, so a shared cache could serve one origin's response to another
— the cross-origin cache contamination the code's own RFC 9110 §12.5.5
comment cites.

Move the `Vary` responsibility out of the matched-only `Echo` arm and
into the middleware, keyed on a new `CorsPolicy::varies_on_origin`
predicate that is true only for `AllowList` (Wildcard answers `*`
uniformly and Disabled emits nothing, so neither depends on Origin).
Append rather than insert so any inner-handler `Vary` is preserved and
the matched path is not duplicated.

Fixes #859
The inline comment in `trend::timestamps` claimed `points` is
"validated `<= MAX_TREND_POINTS`", but the function is `pub`, never
calls `validate_points`, and enforces no such bound. Replace it with an
accurate description: `points` is an unvalidated caller-supplied count
whose totality comes solely from the `try_from(...).unwrap_or(i64::MAX)`
fallback, and `points - 1` cannot underflow because of the `points <= 1`
early return.

Also note on the doc comment that the precondition is not re-checked
here, so a direct caller passing a large `points` allocates a
correspondingly large `Vec` (the bound lives in `validate_points`).

Comment-accuracy only; behavior unchanged, existing timestamps_* tests
cover the function.

Fixes #826
The module-level doc in tests/common/fixtures.rs listed both
sarif_test.rs and checkstyle_test.rs as consumers of the shared
fixture builders, but sarif_test.rs imports only common::validators
and constructs OffenderRecord inline. Only checkstyle_test.rs uses
`rec`; empty_space has no current consumer. List only the real
consumer so the "Used by" note no longer overstates adoption.

Fixes #939
The skip-rationale comment cited only the now-closed #83 and framed
the skip as pending "until the issue is resolved," implying a live
blocker under a number that no longer tracks anything. Update it to
name #83 as the origin and the open #86 as the active tracker for the
five remaining excluded files, matching tests/README.md.

Fixes #946
The `BaselineEntry::value` doc described only the non-finite skip,
but both the read (`from_str`) and write (`from_violations`) filters
also drop negative values via `is_sign_negative()` (catching `-0.0`,
which `< 0.0` misses). A maintainer reading the field doc could have
assumed negatives round-trip and loosened the filter, reintroducing
the double-signed `[regr +-N%]` tag bug the check guards against.

Pure documentation change; existing tests already cover the filter
behaviour.

Fixes #830
The hard-tier comment in
manifest_soft_threshold_subtable_applies_only_at_soft_tier said
`classify` is cyclomatic 5, contradicting the authoritative
BRANCHY_RUST constant doc (== 4) and the sibling
manifest_headroom_scales_thresholds_at_soft_tier test (== 4).

Verified via `bca metrics` on the BRANCHY_RUST fixture: the
`classify` function space reports cyclomatic sum 4; the value 5
is the file/unit-level roll-up (+1 for the unit space), not the
function's complexity. Comment now agrees with both siblings.

Comment-only; no behavior change.

Fixes #911
The run() docstring in metric_selection.py called bca.METRIC_NAMES a
tuple[str, ...] (the native _native.METRIC_NAMES type), but the
re-exported public constant is tuple[MetricName, ...] — a StrEnum,
hence str-comparable, which is why "halstead" in bca.METRIC_NAMES
works. It also framed the membership guard as testing the
["loc", "cyclomatic"] selection; the guard is an ABI/catalog
presence smoke check for an unrelated metric. Corrected both while
keeping the file's line count stable so metrics.md's :17:47 include
window still resolves.

Fixes #888
The vcs.rs module doc and an inline comment named a vcs_metrics()
Python entry point that never shipped; #612 renamed the surface to
the big_code_analysis.vcs.rank / trend / commit / score_diff facade.
Map each Rust bridge to its live vcs.* entry point and enumerate the
inject/batch-helper bridges the doc previously omitted.

Fixes #857
The MetricsFormat::Text doc comment said the format "ignores
--output", but #661 made an explicit `--format text` combined with
--output/--output-dir a hard error rather than a silent no-op
(normalize_text_format collapses text to None, then
resolve_structured_output dies on a destination without a structured
format). Update the comment to describe the hard-error behavior.

Pure doc fix; no behavior change.

Fixes #841
The FileMarkers doc claimed JSON output preserves the raw on-disk
path, but render_json applies strip_path_prefix to every marker and
baseline path exactly as the text and Markdown renderers do (pinned
by strip_prefix_trims_displayed_paths_in_every_format). Rewrite the
comment to state that path is the raw walked path and every renderer
applies strip_prefix at render time, so --strip-prefix trims JSON
output too. Also drop the garbled "trimmed at collection time is not
done here" clause. Documentation-only; no behavior change.

Fixes #842
The scrub_ci_env doc comment claimed the big-code-analysis-web bin()
builder delegates to this helper. It does not and cannot: Cargo does
not share test modules across workspace members, web's bin() does no
env scrubbing, and bca-web reads no GITHUB_* vars (only
BCA_MAX_ORPHANED_TASKS) so it has nothing to scrub.

Rewrite the final paragraph to describe only the CLI-crate cli()
call sites that genuinely delegate (via bca_command / cli_in) and
note why the web crate intentionally needs no scrubbing.

Fixes #898
The control_characters_in_message_replaced test injects \u{0001}
(SOH, U+0001) but the comment claimed the metric name carries a NUL
(U+0000). The two are distinct C0 controls; the comment now names
the actual code point. Comment-only — the production substitution
arm treats every C0 control identically, so behaviour is unchanged.

Fixes #928
The retry_with_single_attempt_calls_once comment cited a
saturating_sub guard that does not exist in retry_transient. The
real mechanism is the empty `1..attempts` range, which never runs
the loop body for attempts 0 or 1, so the closure fires exactly
once with no counter to underflow. Reword the comment to match the
production comment in blame.rs and drop the false reference.

Comment-only; the existing calls == 1 assertions already pin the
invariant.

Fixes #931
The comment called the flattened keys "case-distinct" and presented
that as why they map cleanly onto SQLite columns. SQLite identifiers
are case-insensitive, so case-only-distinct names collide. The real
safety property is case-insensitive uniqueness (the N1/n1 Halstead
collision was removed in #511). Reword to state that guarantee.

Fixes #884
The sarif_output example labeled len(sarif) as "bytes", but len() on a
str counts Unicode code points, not bytes. For any non-ASCII SARIF
document the UTF-8 file written by write_text(..., encoding="utf-8") is
larger than len(sarif) reports. Encode to UTF-8 before measuring so the
printed figure matches the on-disk file size.

The edit stays on the same line, preserving the book's
{{#include ...:17:33}} range in python/sarif.md.

Fixes #886
`test_lang_values_exactly_match_supported_languages` asserted
`[str(m) for m in members] == [m.value for m in members]`, which is
tautological for a StrEnum (`str(member)` is `member.value` by
construction) and can never fail. The "in declaration order" comment
described an ordering contract no assertion enforced.

Replace the no-op line with an independent-source check against the raw
`_native.supported_languages()` slug table, pinning both each member's
value and its order to the source of truth. Verified non-vacuous:
under a simulated value mismap or order drift the old line still
passes while the new one fails.

Fixes #919
The #919 strengthening imported the private `_native`, which pyright
(strict) rejects as reportPrivateUsage, and compared
`supported_languages()` members against `_native.supported_languages()`
— a tautology, since those members are built from that same native
list via `Lang(name)`. Pin the enum's declaration order (`list(Lang)`)
against the native-sourced slug list obtained through the public
`supported_languages()` facade instead, so the test catches a slug
mismap or a declaration-order drift while touching only public API.
The diff-baseline and diff commands gained an opt-in --exit-code flag
(#692) that exits 2 (EXIT_GATE_BREACH) when the filtered diff is
non-empty, but several doc comments still asserted an unconditional
"always exits 0 on success". Align the function doc, the clap help
(propagated to the man pages), the DiffArgs struct doc, and the
DiffError doc with the real contract: exit 0 by default, 2 under
--exit-code on a non-empty diff, 1 on a tool error.

Exemptions is read-only with no --exit-code path, so its "always
exits 0" claim is correct and left unchanged.

Regenerated man pages via cargo xtask.

Fixes #829
The comment on materialize_tree_rejects_non_empty_dest claimed the
empty-dest guard "precedes any git invocation" and that the test runs
"without a git repo", both false and internally contradictory.

materialize_tree probes git_repo_root() first and only then checks the
dest. The test reaches the "is not empty" error precisely because it
runs inside this repo checkout, where git_repo_root() succeeds. Rewrite
the comment to state that order; no behaviour change.

Fixes #833
The doc on `assert_checkstyle_well_formed_and_structural` in the CLI
test-helper copy claimed it matched the lib-crate copy "byte-for-byte
modulo whitespace". That is false: the CLI copy carries an HTML
well-formedness section the lib copy lacks, and the lib copy defines
`sarif_schema_metadata` the CLI copy lacks. Only the shared SARIF +
Checkstyle core is kept in lock-step.

Reword the doc to state the real contract (shared core stays
identical; each crate carries crate-local helpers) so a reviewer is
not lulled into waving through a real divergence in the shared core.

Doc-only change; no behavior change, no test added.

Fixes #899
The codegen `language_slugs()` doc comment claimed it mirrored the
runtime `supported_languages()` set "so the generated Lang enum cannot
drift", but the two apply different filters: `language_slugs()` only
filters by `is_public_language`, while `supported_languages()` also
applies the feature-dependent `is_enabled` filter. The pinning test
`lang_slugs_match_supported_languages` only passed because the bindings
hard-code default features (every grammar on); a downstream
`--no-default-features` build would fail it for the wrong reason.

The generated `_enums.py` is a checked-in, build-feature-independent
artifact and must list every public language, so omitting `is_enabled`
is correct. Make `public_languages()` pub(crate) and have
`language_slugs()` call it directly (one real source), correct the
comments to name the feature-independent predicate, and rename/retarget
the test to compare against `public_languages()` instead of the
feature-dependent `supported_languages()`.

Fixes #852
analyze() returns None for any file the CLI walker skips: empty
(<=3 bytes), binary (non-UTF-8 leading window), or generated. The
quick_start example and its book page both attributed every None to
"looks generated", misdiagnosing the empty and binary cases. List
all three reasons in the message and the prose so they match the
documented contract in _native.pyi.

Fixes #885
dekobon added 26 commits June 15, 2026 18:36
The four bus-factor specs (VcsAggregateDict, BusFactorDict,
GroupBusFactorDict, DirectoryBusFactorDict) modeled src/vcs/bus_factor.rs
but were never passed to assert_keys_match, so a field add/remove/rename
in bus_factor.rs would silently desync the generated _types.py from the
wire shape -- the exact failure mode #623's TypedDicts exist to prevent.
The module docstring nonetheless claimed every VCS spec was pinned.

Extend vcs_and_jit_specs_match_wire_json_keys to serialize a populated
VcsAggregate and assert each of the four specs against the live JSON
keys, stripping the NotRequired key_author_ids and confirming both group
specs list it as optional. Correct the docstring to name the bus-factor
family explicitly. Verified by revert: renaming the wire key for files
turns the new assertion red.

Fixes #856
The cognitive check asserted only `cognitive_sum() > 0`, which masks any
regression in the C cognitive arm that still yields a non-zero but wrong
value (e.g. a dropped `&&` boolean-sequence increment). Replace it with
an exact `assert_eq!` against the derivable value 4 (if + && + for +
switch, each at function-body top level), matching the anchored
cyclomatic assertion alongside it.

Fixes #936
Both tests asserted `cyclomatic_sum() >= 3` where the value is exactly
derivable (5 for the MOZ_-annotated class, 3 for the QM_TRY function).
The loose bound let a dropped `if` decision point in the class fixture
(sum 4) pass green. Replace with exact `assert_eq!` so both an undercount
and an overcount fail; `cyclomatic_sum()` is u64, so equality is safe.

Fixes #944
The non-triviality sanity guard spot-checked only abc.assignments and
abc.branches, omitting abc.conditions — the dimension the fixture's
load-bearing <=>/try/catch constructs land in. Add an exact
`abc.conditions == 6` assertion (spaceship + its result compare + the
inner if + try + catch + ternary) so a regression that drops those
condition arms fails even though the cpp-vs-mozcpp equality stays
symmetric. Fold the three find-by-key lookups into one helper closure.

Fixes #942
The checkstyle walker claimed structural conformance to the XSD but
tracked only a scalar depth counter, so it validated each element's own
name/attributes and tag-balance while accepting arbitrary nesting:
<error> inside <error>, <file> inside <file>, or a <checkstyle> that is
not the document root all passed.

Replace the depth counter and saw_root flag with a parent stack and an
allowed_child table encoding the XSD containment hierarchy (checkstyle >
file > error/exception; error/exception also directly under checkstyle;
error/exception are leaves; checkstyle only at the root). check_element
now returns the element name for the nesting check. Applied to both the
lib and CLI copies for parity.

Fixes #943
`git_at` forwards a verbatim arg array (merge --no-ff, commit --amend)
and so cannot inject --no-verify per call the way commit_inner does,
leaving merge-commit fixtures non-hermetic for contributors with a
global core.hooksPath. Point core.hooksPath at an empty dir inside the
temp repo in Repo::init so every commit-producing command is hermetic
regardless of subcommand; the per-call --no-verify flags become
belt-and-suspenders.

Verified via test-via-revert: a failing global hook breaks the merge
fixture without this change and passes with it.

Fixes #941
empty_space had no consumers anywhere in the test suite (confirmed via
rg); it survived only because pub mod fixtures is #[allow(dead_code)].
Remove the function, its now-unused FuncSpace/CodeMetrics/SpaceKind/
SuppressionScope imports, and the FuncSpace mentions in the module doc
and tests/README.md so the surviving surface matches.

Fixes #940
resolve_corpus_files returns an empty Vec for a missing or empty corpus
root (walk errors dropped by filter_map(Result::ok)), and the runner
treats zero files as Ok(()) without ever invoking act_on_file, so every
snapshot assertion is silently skipped and the corpus comparison tests
pass having verified nothing — the classic "passes for the wrong reason"
failure mode when integration submodules are uninitialized.

Assert the resolved file list is non-empty before handing it to the
runner, with an actionable message pointing at the likely cause
(uninitialized submodule).

Fixes #938
The generated `From<u16> for Tcl` out-of-range fallback resolved to
`Tcl::Error` — the Tcl `error` command keyword (display "error") —
instead of the tree-sitter ERROR sentinel, which the generator renames
to `Tcl::Error2` (display "ERROR") because the grammar already owns an
`error` keyword. Tcl is the only grammar with such a clash, so it was
the lone language module whose fallback diverged from the ERROR-sentinel
contract every sibling upholds.

The `enums/` Rust template hardcoded `unwrap_or(Self::Error)` and never
consumed the computed sentinel name. Thread the sentinel (the last entry
of `get_token_names`, whose `ts_name` is "ERROR") into the template via a
new `RustTemplate::error_sentinel` field and render it in the fallback.
For grammars without the clash the sentinel stays `Error`, so every other
`language_*.rs` is byte-identical; only `language_tcl.rs` changes
(`Self::Error` -> `Self::Error2`).

Add generator-level tests (Tcl renders `Error2`, Rust renders plain
`Error`) and a runtime regression test that `Tcl::from(u16::MAX)` maps to
the ERROR sentinel; both verified via revert.

Fixes #954
`mac.py` regenerated its data file with an append-only contract
(read the file, union in only `macros - old`, write the union back),
so the artifact could only grow: a name removed from the `macs`
template was never pruned and `c_macros.txt` silently drifted away
from its own generator.

Rewrite the file to exactly `sorted(macros)` each run, dropping the
read-existing / diff / union step. The output is now a pure function
of the template, so removals take effect on the next run. The
checked-in `c_macros.txt` is byte-identical (it already matched the
template), so no downstream `c_macros.rs` regen is needed.

Add prune-behaviour tests to the enums-codegen-drift self-test suite:
a seeded bogus entry is removed, and the output is independent of the
prior file contents. Demonstrated against the old logic via revert
(the bogus line survives append-only, is pruned by the rewrite).

Fixes #892
Feed whitespace-only, surrounding-whitespace, and trailing-newline
env values and assert the trimmed result, so removing `.trim()`
from `non_empty_env` (reintroducing the `origin/ ` bad-revision
footgun) regresses the suite.

Fixes #831
The metric_set_tests suite never called `with_metric_set`; it only
exercised the `MetricSet::with` primitive, so the builder's #743
dependency-closure contract was unguarded at this level. Add a test
that calls `MetricsOptions::with_metric_set` and asserts the
caller's set gains Mi's Loc/Cyclomatic/Halstead deps; dropping
`.resolved()` from the builder now flips an assertion.

Fixes #927
Every existing suppression test fed a marker whose only metric was
unknown, where "void whole marker" and "skip the unknown token"
are indistinguishable. Add a lib unit test that a marker mixing a
valid metric with an unknown one errors on the unknown token,
voiding the entire list; swapping the `?` in `parse_metric_list`
for `continue` now fails it.

Fixes #948
The schema-validation suite only used clean ASCII paths, so
`path_to_uri_reference` was a no-op for every fixture and the
encoding branch was never certified against the Draft-07
`uri-reference` format. Add a space-bearing path, assert the emitted
uri is percent-encoded (%20), and validate it against the schema;
neutering the percent-encoding arm now regresses.

Fixes #949
Two book examples in `library/ast-traversal.md` had drifted from the
guardian test (`tests/book_ast_traversal_examples.rs`) and the current
crate API:

- The `Ast::dump` (`no_run`) example omitted the required `language`
  field that `AstCfg` gained in #654, so a reader copying it hit a
  `missing field` compile error.
- The metrics-plus-symbol-table (`ignore`) example compared
  `functions_sum()` against `3.0`, a stale float left from the
  metric f64-to-u64 type flip; the accessor now returns `u64`.

Both now match the guardian test's assertions; the fixed `no_run`
fence compiles against the current crate.

Fixes #937
The book documented the REST API surface (endpoint reference plus curl
recipes) but had no home for running and operating the daemon itself.
That material lived only in the web crate's README, invisible to book
readers.

Add `commands/web-server.md` covering build/run (including the #252
subset-language caveat), the full flag set, the `BCA_MAX_ORPHANED_TASKS`
and `RUST_LOG` environment variables, the 4 MiB body limit and
orphaned-task back-pressure, and the security posture: loopback default,
CORS-off default, and the VCS `repo_path` read / `cache_dir` write trust
boundaries. Every claim is grounded in the web crate source; the
endpoint and CORS detail is cross-linked to `commands/rest.md`, not
duplicated.

Wire the page into SUMMARY.md, cross-link it from `commands/rest.md` and
`recipes/rest-api.md`, and point the web crate README at it (also
correcting the README's stale `-v`/`--version` short flag to `-V`).

Fixes #870
The float and int baseline-coercion tests asserted a substring
(`0.25` / `25`) that is also present in the fixture's Cargo-side
version `0.25.0`, so the "shows drift" claim passed from the
always-present Cargo token rather than the coerced baseline scalar.
A regression dropping `_coerce_baseline_value`'s (int, float)
branch into the non-scalar `return None` path stayed green.

Choose baseline scalars whose string form is not a substring of
`0.25.0` (`0.99` / `99`), assert the full baseline-side drift
fragment (`baseline '0.99'` / `baseline '99'`), and pin the
scalar-coercion warning wording so the "Treating as missing"
non-scalar path no longer satisfies the generic warning check.
Verified by reverting the (int, float) branch: both tests now
fail for the right reason.

Fixes #874
The `make py-stubtest` gate (#673) ran `mypy.stubtest` against
`big_code_analysis._native` only; the `vcs` submodule was allowlisted
out wholesale, so PyO3 signature/default drift in `rank` / `trend` /
`commit` / `score_diff` / the `Options` constructor escaped the gate —
reopening the #583 stub-drift failure mode for the change-history
surface.

Add a second `big_code_analysis.vcs` module argument to the stubtest
run so the facade (which re-exports the native submodule) is diffed
against `vcs.pyi`. Rewrite `vcs.pyi` to the PyO3-introspected shapes
the runtime exposes: a `@final` `Options` with a `__new__` constructor
(PyO3 `#[new]` is `__new__`, not `__init__`) and `commit`'s
computed-default rendered as `...` (PyO3 cannot render
`"HEAD".to_owned()` in `__text_signature__`). Narrow the allowlist
comment to record that the lone `_native.vcs` entry now suppresses
only the submodule-attribute-existence check, not its signatures.

Verified: clean stubs pass `make py-stubtest` ("no issues found in 2
modules"); a planted default mismatch (`points` 12 vs 13) fails the
gate. Usage typecheck (mypy --strict / pyright) over the constructor
and functions stays clean.

Fixes #854
simplify-rust pass over the batch branch:
- baseline::classify now derives Covered from the shared direction-aware
  `breaches_limit` predicate instead of a parallel inline closure, so the
  ratchet and the gate cannot drift on metric direction (the NaN arm
  still runs first).
- Correct materialize_tree's doc to say `git cat-file blob <oid>` (the
  intentional per-blob form), not the `--batch` it never used.
The new helper_scrubs_artifact_link_env_to_local_fallback mutated
GITHUB_REPOSITORY/GITHUB_RUN_ID process-wide under a guard-private
lock, racing the pre-existing cli_helper_does_not_leak_to_github_
step_summary, which mutates GITHUB_STEP_SUMMARY under no lock. Child
spawn snapshots the whole environment, so a concurrent set_var from
one tore the other's snapshot and leaked the CI artifact vars into
the step-summary child, failing that assertion ~50% of runs.

Add a binary-wide common::process_env_lock() and have both
env-mutating-and-spawning tests hold it across their mutate/spawn/
restore window, making those critical sections mutually exclusive
regardless of which variable each touches. Found via audit-tests.
#838 replaced the git archive/tar diff materialization with git
ls-tree + cat-file, removing the last tar user; prune the now-orphaned
workspace.dependencies entry.
code-review pass over the batch branch:
- diff materialize_tree skips symlink (mode 120000) ls-tree blobs: their
  content is the link target string, not source, so materializing them as
  regular files made the metrics walker parse a bogus file and diverge
  from the after-side working-tree walk (#838 follow-up).
- walk_paths de-duplicates missing seeds against `seen` like valid files,
  so a repeated nonexistent seed yields a single AnalysisFailure (#858
  follow-up).
The #920 conftest-helper tests created `debug/bca` and `release/bca`,
but `_locate_workspace_binary` looks for `bca.exe` on Windows, so the
locator found nothing and the tests failed on windows-latest. Mirror the
locator's `.exe` suffix (no-op on Unix) so the fixtures match on both.
…tion)

- #77: issue refs in a `///` doc comment leak into `bca <cmd> --help` and
  the generated man pages; keep them in `//` and regenerate man/ (#841).
- #78: a metric's lower-is-worse direction belongs in one shared predicate
  every gate/report calls, or surfaces drift on mi.* (#825/#827/#837).
- Extend lesson #40 with the dual failure mode: a fixture baking in one
  platform's spelling (e.g. a binary name without the .exe suffix) fails
  spuriously only on that platform's CI (#920).
@dekobon dekobon force-pushed the fix/batch-2026-06-15 branch from 0a4ae15 to f90076d Compare June 16, 2026 01:36
@dekobon dekobon merged commit f90076d into main Jun 16, 2026
54 checks passed
@dekobon dekobon deleted the fix/batch-2026-06-15 branch June 16, 2026 02:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant