2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -175,7 +175,7 @@ Status legend: **ACTIVE** (current spec/contract, AI agents should read first) /
| `docs/exit_codes.md` | ACTIVE | Stable CLI exit code policy (0 / 1 / 2 / 3 / 4) for CI integration, including `--strict-repair`, `severity: info` Advisor channel, and per-subcommand notes |
| `docs/json_schema.md` | ACTIVE | CLI JSON envelopes — verdict / compile at `schema_version="6"`, compile-repair at independent `schema_version="1"`, validate-plan at independent `schema_version="2"`. Includes compatibility policy and v2→v3 through v5→v6 diffs (plus validate-plan v1→v2) |
| `docs/cli_test_inventory.md` | ACTIVE | CLI test coverage inventory, runtime notes, and conservative reduction candidates |
| `docs/target_yaml_guide.md` | ACTIVE | `target.yaml` authoring guide. Practical companion to `design.md §4` / `cli_usage.md`. Centralises authoring hazards D1 (`--package-root` scope vs `tests/` visibility), D3 (template / user constraint duplication), D4 (vacuous PASS on config-only PRs), D6 (nested-function complexity blind spot); derived from 2026-05-07 Session 4 / 2026-05-28 real-PR dogfooding |
| `docs/target_yaml_guide.md` | ACTIVE | `target.yaml` authoring guide. Practical companion to `design.md §4` / `cli_usage.md`. Centralises authoring hazards D1 (`--package-root` scope vs `tests/` visibility), D3 (template / user constraint duplication), D4 (vacuous PASS on config-only PRs), D6 (nested-function complexity blind spot), D7 (extract-method × cyclomatic micro-increase); derived from 2026-05-07 Session 4 / 2026-05-28 real-PR dogfooding |
| `docs/target_authoring_surface.md` | ACTIVE | Authoring surface design contract (Brief 8 / CSCI-41). target.yaml need not be hand-written / three generation paths (recipe + sources / catalog reference / hand-written) / the LLM path is split out into Brief 8b / every path is fixed as declared intent before the verdict / the Authoring, Advisor, and Provenance surfaces must not be referenced by the evaluator / `candidate_code_used: false` is fixed. Implementation-side catch-up to §23.3.1 |
| `docs/ssp_protocol.md` | ACTIVE | SSP v0.1 normative spec: SensorOutput / Finding / SSPDelta / SSPVerdict definitions, 5-element SAST + 3-element SCA fingerprint, Python profile AST normalization, delta computation, verdict precedence (`unknown > fail > pass`), JSON Schema artifact, Sensor Provenance Invariant (§23.1 mirror), determinism requirements, core isolation contract |
| `docs/ssp_usage_guide.md` | ACTIVE | SSP practical usage guide: quick start (Semgrep SAST / pip-audit SCA / fixture mode), output formats (JSON / human / SARIF), CI integration (GitHub Actions workflow / exit code routing / fixture-based CI), hand-built SensorOutput examples, delta computation overview, relationship to core Semantic CI |
5 changes: 3 additions & 2 deletions docs/cli_usage.md
@@ -738,7 +738,7 @@ semantic-ci target-doctor [--target <yaml>] [--package-root <dir>]
[--format {human,json}] [--output <file>]
```

Audits a `target.yaml` for eight authoring hazards and renders them as
Audits a `target.yaml` for nine authoring hazards and renders them as
advisories. Advisor surface — advisory presence does not change the
verdict and does not change the exit code (`docs/exit_codes.md`).

@@ -748,12 +748,13 @@ verdict and does not change the exit code (`docs/exit_codes.md`).
| `ADVISORY-D3` | A user constraint duplicates a template-expanded constraint (same kind/target/operator/expected). |
| `ADVISORY-D4` | The target is lock-only and the candidate diff (`--baseline-rev` ↔ `--candidate-rev`) touches no Python files; the verdict would be a vacuous PASS. Skipped silently when neither rev is given and git is unavailable. |
| `ADVISORY-D6` | The target declares a `complexity_delta` constraint and the candidate diff (`--baseline-rev` ↔ `--candidate-rev`) grows the nested-function count in an in-scope Python file. The complexity extractor does not descend into nested defs, so a reported cyclomatic/cognitive decrease may be displacement, not simplification. Skipped silently when neither rev is given and git is unavailable. |
| `ADVISORY-D7` | The candidate diff adds N extractor-visible function definitions net across the in-scope diff (the extract-method shape), and a `primary_kind: refactor` target declares a `complexity_delta.cyclomatic` constraint that rejects a `+N` delta (declared `tolerance` honoured — a budget covering the helper count stays silent). The summed cyclomatic is mathematically guaranteed to micro-increase (+1 base per extracted helper), so the lock can FAIL on a faithful refactor — prefer `complexity_delta.cognitive`. Skipped silently when neither rev is given and git is unavailable. |
| `ADVISORY-I1` | `intent` is the empty string. Repair adapters and `validate-plan` produce better guidance when intent describes the change purpose; use `init --intent` or edit `target.yaml`. |
| `ADVISORY-P1` | `primary_kind: feature` has no positive addition constraint. |
| `ADVISORY-P2` | `primary_kind: bugfix` has no `test_surface_delta.new_cases` expectation. |
| `ADVISORY-S1` | A user constraint has `severity: info` paired with `unknown_policy in {fail, repair}`. After Brief D1-4 the warning scope narrows to extraction-cause / open_runtime UNKNOWN. |

`--format json` emits the `advisory-1` envelope
`--format json` emits the `advisory-2` envelope
([`docs/json_schema.md`](./json_schema.md)). There is no `--strict-advice`
flag — CI that wants to gate on advisory presence should consume the JSON
output and apply a workflow-level policy.
6 changes: 3 additions & 3 deletions docs/dogfooding_findings_tracker.md
@@ -23,7 +23,7 @@ Status taxonomy:
| D4 | 2026-05-07 Session 4 dogfood | vacuous PASS (out-of-scope diff) | config-only PR produces empty Python `CodeStateDelta`; lock-only template constraints pass without exercising the actual change | **Resolved** | `docs/target_yaml_guide.md` Hazard 3 + `ADVISORY-D4` detector |
| D5 | 2026-05-07 Session 5 dogfood (FINDING-1) | set operator partial-match semantics | partial-dict `expected` items canonicalised as different elements from full extractor records: false positive on `includes_*` / `subset_of` / `superset_of`, **false negative (CI bypass) on `excludes_all`** | **Resolved** | PR #65 (CSCI-35c): Match Schema partial-record matching + flat-projection aliases + `evidence.matched`; schema_version v4→v5 |
| D6 | 2026-05-28 real-PR complexity dogfood (FINDING-F1) | vacuous PASS (extractor coverage gap); **duplicate/related = sibling of D4** | nested function bodies are excluded from `ComplexityEntry` by `python_complexity_extractor` spec (`api_surface` parity); a refactor that nests the outer-function body into nested helpers reports a large CC drop while real complexity is unchanged | **Resolved** | `docs/target_yaml_guide.md` Hazard 4 + `ADVISORY-D6` detector (`authoring/hazards.py::detect_d6`, candidate path (a), mirrors D4's diff-aware contract). Fires when a verdict-participating `complexity_delta` constraint meets nested-def growth in the in-scope candidate diff (`--baseline-rev` ↔ `--candidate-rev`); growth signal computed per file via `authoring/nested_defs.py`, syntax-error sides skipped fail-silent. Path (b) (extractor emits nested-function entries) remains a long-term, schema-impacting option, not pursued. Reproduction: langgraph PR #3700 (8/1 vacuous PASS in real-PR pass) |
| D7 | 2026-05-28 real-PR complexity dogfood (FINDING-F2) | authoring mismatch (operator / constraint pairing) | `extract-method` refactor is mathematically guaranteed to **micro-increase cyclomatic** (each extracted function adds base 1), even with `_` prefix discipline and api_surface preserved. Cognitive is the metric that drops. Authors declaring `complexity_delta.cyclomatic ≤ 0` for extract-method refactors hit a structural false-FAIL | **Unresolved** | Candidate paths: (a) authoring guide section "Choosing complexity metric per refactor pattern" recommending `cognitive_delta` for extract-method; (b) future `ADVISORY-D7` detector emitted when a `change.primary_kind=refactor` target uses `cyclomatic_delta ≤ 0` and the diff matches extract-method shape. Low priority: this is authoring advice, not a CI integrity hazard |
| D7 | 2026-05-28 real-PR complexity dogfood (FINDING-F2) | authoring mismatch (operator / constraint pairing) | `extract-method` refactor is mathematically guaranteed to **micro-increase cyclomatic** (each extracted function adds base 1), even with `_` prefix discipline and api_surface preserved. Cognitive is the metric that drops. Authors declaring `complexity_delta.cyclomatic ≤ 0` for extract-method refactors hit a structural false-FAIL | **Resolved** | `docs/target_yaml_guide.md` Hazard 5 (incl. "Choosing a complexity metric per refactor pattern" table, candidate path (a)) + `ADVISORY-D7` detector (`authoring/hazards.py::detect_d7`, candidate path (b)). Fires when a `primary_kind=refactor` target locks `complexity_delta.cyclomatic` against any increase (`≤ 0` / `< 1` / `equals ≤ 0` / `within_range [·, ≤ 0]`) and the in-scope candidate diff adds extractor-visible defs (extract-method shape; counted via `authoring/nested_defs.py::count_visible_defs`, which delegates to the real extractor so parity cannot drift). Same diff-aware contract as D4 / D6 (silent skip without revs) |
| D8 | 2026-06-07 scale + security dogfood (SCA gap) | SCA sensor dependency-source discovery gap | SSP SCA auto-discovery (`_requirements_file` in `src/semantic_ci_code/cli/commands/ssp.py`) only found `requirements.txt` at repo root; the `--locked` fallback only accepted `pylock.toml` / requirements lockfiles. PEP 621 pyproject-only projects (litellm) and `pdm.lock` projects (pdm) declared deps in unrecognised formats → `pip-audit --locked .` errors "no lockfiles found" → empty JSON → adapter degraded to `unknown` (exit 3). Correct graceful degradation (no silent false PASS, honours `unknown > fail > pass`) but a real usability gap that blocked SCA on most modern Python projects | **Resolved** | CSCI-55 / PR #151: dependency source discovery now recognises `requirements.txt`, `pylock.toml` / `pylock.*.toml`, `uv.lock`, `pdm.lock`, `poetry.lock`, and static PEP 621 `[project].dependencies`; lock sources are converted deterministically to pinned temp requirements, optional/non-default-group/marker-inactive packages are filtered, and malformed recognized sources fail closed to SSP `unknown` |

## Reading order
@@ -37,8 +37,8 @@ Status taxonomy:
## Classification at a glance

- **Duplicate/related pairs**: D4 ↔ D6 (both are "vacuous PASS" via extractor coverage gap, distinct mechanism: D4 is "diff outside Python scope", D6 is "diff inside scope but inside nested function")
- **Resolved (7 of 8)**: D1, D2, D3, D4, D5, D6, D8
- **Unresolved (1 of 8)**: D7 (authoring advice, low priority)
- **Resolved (8 of 8)**: D1, D2, D3, D4, D5, D6, D7, D8; **D-class closure complete** (ROADMAP v0.1.0 exit criterion "all D resolved/waived" satisfied)
- **Unresolved (0 of 8)**: none
- **observation-only (not a D#)**: F6 (pattern-SAST logic-vuln blindspot) — **UNTESTED HYPOTHESIS, not a demonstrated observation in the 2026-06-07 pass**: the Semgrep registry rulesets returned HTTP 403, so Semgrep ran with 0 loaded rules over 0 paths and produced no valid SAST measurement. F6 records the *a-priori* expectation that deterministic SAST misses semantic / business-logic vulns, cross-linked to Phase H (`docs/llm_sensor_adapter_planning.md`) as **motivation** — it is **not** empirically validated by this pass. Recorded in `docs/dogfooding_scale_and_security.md` (which now carries a validity warning + repro note for redoing the SAST sub-pass under a network policy allowing `semgrep.dev`). Distinct from the demonstrated observations of the same pass: real vulns merged-then-fixed (git evidence) and SCA clean-on-litellm (pip-audit positive-controlled with `jinja2==2.11.2` → 5 CVEs)

## Source pass index
4 changes: 2 additions & 2 deletions docs/exit_codes.md
@@ -78,8 +78,8 @@ git errors (`CompileError` on `target.yaml`, git revision resolution
failure when `--baseline-rev` / `--candidate-rev` is given, git
unavailable when explicitly required) exit 3. Internal bugs exit 4. When
neither `--baseline-rev` nor `--candidate-rev` is given and git is
unavailable or no baseline can be resolved, ADVISORY-D4 and ADVISORY-D6
are silently skipped rather than failing. There is no `--strict-advice` flag; CI that
unavailable or no baseline can be resolved, the diff-aware advisories
(ADVISORY-D4 / D6 / D7) are silently skipped rather than failing. There is no `--strict-advice` flag; CI that
wants to gate on advisory presence should consume `--format json` and
apply a workflow-level policy. Silent success on bad input is forbidden —
the advisor surface only suppresses the verdict step, not the input
9 changes: 5 additions & 4 deletions docs/json_schema.md
@@ -301,7 +301,7 @@ compile-repair, or validate-plan envelopes. The shape is pinned by

```jsonc
{
"schema_version": "advisory-1",
"schema_version": "advisory-2",
"subcommand": "target-doctor",
"advisories": [
{
@@ -316,14 +316,14 @@ compile-repair, or validate-plan envelopes. The shape is pinned by

| Field | Meaning |
|---|---|
| `schema_version` | Always `"advisory-1"`. |
| `schema_version` | Always `"advisory-2"`. |
| `subcommand` | Always `"target-doctor"`. |
| `advisories[].code` | One of `ADVISORY-D1`, `ADVISORY-D3`, `ADVISORY-D4`, `ADVISORY-D6`, `ADVISORY-I1`, `ADVISORY-P1`, `ADVISORY-P2`, `ADVISORY-S1`. |
| `advisories[].code` | One of `ADVISORY-D1`, `ADVISORY-D3`, `ADVISORY-D4`, `ADVISORY-D6`, `ADVISORY-D7`, `ADVISORY-I1`, `ADVISORY-P1`, `ADVISORY-P2`, `ADVISORY-S1`. |
| `advisories[].severity` | Always `"info"` — the Advisor surface never participates in the verdict (`docs/code_semantic_ci_design.md §23.3.1`). |
| `advisories[].message` | Human-readable explanation of the hazard. |
| `advisories[].evidence` | Per-advisory diagnostic fields (e.g. `constraint_id`, `target`, `package_root`, `files_touched_count`). |
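
CI that wants to gate on advisory presence can do so from these fields alone. The following is a minimal sketch, assuming the `advisory-2` envelope shape documented above; the helper name `gate_on_advisories` and the idea of a `blocking_codes` set are hypothetical workflow-level policy, not part of `semantic-ci`.

```python
import json


def gate_on_advisories(envelope_text, blocking_codes):
    """Return the sorted advisory codes that a CI policy treats as blocking.

    Hypothetical helper: semantic-ci itself never gates on advisories.
    """
    envelope = json.loads(envelope_text)
    version = envelope.get("schema_version")
    if version != "advisory-2":
        # Fail loudly on an unexpected envelope instead of guessing its shape.
        raise ValueError(f"unexpected schema_version: {version!r}")
    present = {advisory["code"] for advisory in envelope.get("advisories", [])}
    return sorted(present & set(blocking_codes))


# Illustrative envelope; messages and evidence are placeholders.
SAMPLE = json.dumps({
    "schema_version": "advisory-2",
    "subcommand": "target-doctor",
    "advisories": [
        {"code": "ADVISORY-D7", "severity": "info", "message": "...", "evidence": {}},
        {"code": "ADVISORY-I1", "severity": "info", "message": "...", "evidence": {}},
    ],
})

BLOCKING = {"ADVISORY-D4", "ADVISORY-D6", "ADVISORY-D7"}
print(gate_on_advisories(SAMPLE, BLOCKING))
```

A workflow step could fail the job when the returned list is non-empty. Rejecting an unknown `schema_version` keeps the gate from silently passing after a future envelope bump.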

Advisories are emitted in canonical order (D1 → D3 → D4 → D6 → I1 → P1 → P2 → S1)
Advisories are emitted in canonical order (D1 → D3 → D4 → D6 → D7 → I1 → P1 → P2 → S1)
with `constraint_id` as the within-code tiebreak so output is byte-identical
across runs. Advisory presence does not change the exit code — see
`docs/exit_codes.md`.
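
The ordering rule can be sketched as a two-part sort key. This is an illustration only: it assumes `constraint_id` is read from `evidence` and that advisories without one sort first within their code, neither of which is confirmed by this document. The name `canonical_order` is hypothetical.

```python
# Canonical code order as listed in this section (advisory-2).
CANONICAL_CODES = [
    "ADVISORY-D1", "ADVISORY-D3", "ADVISORY-D4", "ADVISORY-D6", "ADVISORY-D7",
    "ADVISORY-I1", "ADVISORY-P1", "ADVISORY-P2", "ADVISORY-S1",
]
_RANK = {code: index for index, code in enumerate(CANONICAL_CODES)}


def canonical_order(advisories):
    """Sort advisories by code rank, then by constraint_id (assumed tiebreak source)."""
    def key(advisory):
        # Assumption: a missing constraint_id is treated as the empty string.
        constraint_id = advisory.get("evidence", {}).get("constraint_id") or ""
        return (_RANK[advisory["code"]], constraint_id)
    return sorted(advisories, key=key)


UNSORTED = [
    {"code": "ADVISORY-S1", "evidence": {"constraint_id": "c1"}},
    {"code": "ADVISORY-D7", "evidence": {"constraint_id": "c2"}},
    {"code": "ADVISORY-D7", "evidence": {"constraint_id": "c1"}},
    {"code": "ADVISORY-D1", "evidence": {"package_root": "src"}},
]
ORDERED = canonical_order(UNSORTED)
print([(a["code"], a["evidence"].get("constraint_id")) for a in ORDERED])
```

Because the key is total over `(code, constraint_id)` pairs, the result does not depend on detector execution order, which is what makes byte-identical output achievable.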
@@ -399,6 +399,7 @@ bump the envelope version.
| `1` | validate-plan | Initial Brief 5 pre-generation validation envelope with `risk_summary`. |
| `2` | validate-plan | Brief D3: added `risk_summary.authoring_errors` as a sibling list (positioned first). Adapter rendering surfaces a two-step "fix authoring first, then implement" instruction. |
| `advisory-1` | target-doctor | Brief 8 / CSCI-43: initial advisory envelope. Independent schema; not tied to verdict / compile / compile-repair / validate-plan versions. |
| `advisory-2` | target-doctor | D-class closure: added `ADVISORY-D6` (nested-function complexity displacement) and `ADVISORY-D7` (extract-method cyclomatic false-FAIL) to the `advisories[].code` enum. Enum extension on an in-use envelope requires a bump per the compatibility policy (D6 briefly shipped inside `advisory-1` on unreleased main; folded into this bump). |
| `catalog-1` | target-catalog | Brief 8 / CSCI-44: initial catalog envelope. Independent schema; mirrors runtime registries via INV-5 parity. |

## v2 to v3 Diff
2 changes: 1 addition & 1 deletion docs/target_authoring_surface.md
@@ -161,4 +161,4 @@ sentinel, not a switch.
- `docs/exit_codes.md` — `target-doctor` exit code policy
(advisory presence does not change the verdict; usage / engine errors
still use the global 2 / 3 / 4 policy)
- `docs/json_schema.md` — `advisory-1` and `catalog-1` envelopes
- `docs/json_schema.md` — `advisory-2` and `catalog-1` envelopes
39 changes: 39 additions & 0 deletions docs/target_yaml_guide.md
@@ -316,6 +316,45 @@ candidate diff grows the nested-def count in an in-scope Python file.
Growth is a heuristic displacement signal, not proof — review the diff
rather than treating the advisory as a verdict.
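
To make the growth signal concrete, here is a minimal sketch of counting nested defs with the standard `ast` module. It is not the code in `authoring/nested_defs.py`, whose exact rules are not shown here; the function name `count_nested_defs` and the choice to count only `def` / `async def` inside another function are assumptions.

```python
import ast

_DEFS = (ast.FunctionDef, ast.AsyncFunctionDef)


def count_nested_defs(source):
    """Count function definitions that sit inside another function body."""
    tree = ast.parse(source)
    count = 0

    def visit(node, inside_function):
        nonlocal count
        for child in ast.iter_child_nodes(node):
            is_def = isinstance(child, _DEFS)
            if is_def and inside_function:
                count += 1
            visit(child, inside_function or is_def)

    visit(tree, False)
    return count


BASELINE = '''
def handle(x):
    if x > 0:
        return 1
    return 0
'''

CANDIDATE = '''
def handle(x):
    def positive(v):
        return v > 0
    return 1 if positive(x) else 0
'''

growth = count_nested_defs(CANDIDATE) - count_nested_defs(BASELINE)
print(growth)  # positive growth is the displacement signal described above
```

In this pair the branch moved into `positive`, a nested helper, so a complexity drop reported for `handle` would be displacement rather than simplification.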

## Hazard 5 — Extract-method micro-increases cyclomatic (D7)

The mirror image of Hazard 4 (observed on a real PR in
`docs/dogfooding_real_pr_complexity.md` FINDING-F2). Where D6 hides
complexity by nesting helpers *inside* a function, D7 bites when you do
the **right** thing — extracting helpers to module level, where they are
extracted and counted.

`complexity_delta.cyclomatic` is the **sum** of per-function cyclomatic
over extractor-visible functions, and every function starts at base 1.
An extract-method refactor that preserves every branch therefore raises
the sum by exactly +1 per extracted helper — it is mathematically
impossible for a faithful extraction to keep `cyclomatic ≤ 0`. A
`primary_kind: refactor` target declaring that lock will FAIL on exactly
the refactor it means to endorse (a structural false-FAIL, not an engine
bug). Cognitive complexity is the metric that *drops* under extraction:
the nesting penalty disappears and `BoolOp` / branch costs redistribute.
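
The +1 per helper can be checked with a deliberately simplified cyclomatic counter. This sketch assumes a reduced rule set (base 1 per module-level function, +1 per `if` / `for` / `while` / conditional expression / `except`, +1 per extra `BoolOp` operand); the real extractor's rules may differ, and `summed_cyclomatic` is a hypothetical name.

```python
import ast

_DEFS = (ast.FunctionDef, ast.AsyncFunctionDef)
_BRANCHES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def _decisions(node):
    """Count decision points under node, skipping nested function bodies."""
    total = 0
    for child in ast.iter_child_nodes(node):
        if isinstance(child, _DEFS):
            continue  # nested defs are not charged to the enclosing function
        if isinstance(child, _BRANCHES):
            total += 1
        elif isinstance(child, ast.BoolOp):
            total += len(child.values) - 1
        total += _decisions(child)
    return total


def summed_cyclomatic(source):
    """Sum (1 + decisions) over module-level functions only (sketch scope)."""
    tree = ast.parse(source)
    return sum(1 + _decisions(node) for node in tree.body if isinstance(node, _DEFS))


BEFORE = '''
def price(order):
    total = 0
    for item in order:
        if item.taxable and item.cost > 0:
            total += item.cost * 1.2
        else:
            total += item.cost
    return total
'''

AFTER = '''
def _item_price(item):
    if item.taxable and item.cost > 0:
        return item.cost * 1.2
    return item.cost

def price(order):
    total = 0
    for item in order:
        total += _item_price(item)
    return total
'''

delta = summed_cyclomatic(AFTER) - summed_cyclomatic(BEFORE)
print(delta)  # one extracted helper, every branch preserved
```

Under these rules `BEFORE` sums to 4 and `AFTER` to 5: the three decision points are all still present, and the only change is the new helper's base 1.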

**Choosing a complexity metric per refactor pattern:**

| Refactor pattern | Recommended constraint |
|---|---|
| extract-method / extract-function | `complexity_delta.cognitive less_than_or_equal 0` |
| inline-method (merging helpers back) | `complexity_delta.cyclomatic less_than_or_equal 0` (cognitive may rise from re-nesting) |
| pure simplification, no function count change | either metric; `cyclomatic` is stricter |
| large decomposition (N extracted helpers) | `cyclomatic less_than_or_equal N` if you must use cyclomatic |

`semantic-ci target-doctor` detects this hazard as `ADVISORY-D7` when
`--baseline-rev` / `--candidate-rev` are given: it fires when the
candidate diff adds N extractor-visible function definitions **net**
across the in-scope diff (pure relocation between files cancels out and
does not warn) and a refactor target declares a
`complexity_delta.cyclomatic` constraint that rejects that `+N` delta —
a declared `tolerance` or allowance that covers the helper count is
honoured and stays silent. Hazards 4
and 5 are complementary: D6 warns that a green cyclomatic verdict may be
hollow, D7 warns that a red one may be noise — both resolve by pairing
the right metric with the refactor shape.
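
The firing condition can be read as "would this constraint reject a delta of `+N`?". The sketch below is an assumption-heavy illustration, not `detect_d7`: only `less_than_or_equal` appears as an operator name in this guide, so `less_than`, `equals`, and `within_range` are assumed spellings, and treating `tolerance` as additive slack on the bound is also an assumption.

```python
def rejects_delta(operator, expected, delta, tolerance=0):
    """Return True when the constraint would FAIL for this cyclomatic delta.

    Operator names other than less_than_or_equal and the additive
    tolerance semantics are assumptions made for this sketch.
    """
    if operator == "less_than_or_equal":
        return delta > expected + tolerance
    if operator == "less_than":
        return delta >= expected + tolerance
    if operator == "equals":
        return abs(delta - expected) > tolerance
    if operator == "within_range":
        low, high = expected
        return not (low - tolerance <= delta <= high + tolerance)
    raise ValueError(f"unsupported operator: {operator!r}")


def should_warn_d7(primary_kind, operator, expected, net_new_defs, tolerance=0):
    """Warn only for refactor targets whose lock rejects +N for N net new defs."""
    if primary_kind != "refactor" or net_new_defs <= 0:
        return False
    return rejects_delta(operator, expected, net_new_defs, tolerance)


# Two helpers extracted, lock at "cyclomatic <= 0": warns.
print(should_warn_d7("refactor", "less_than_or_equal", 0, net_new_defs=2))
# Same diff, budget of 2 declared: stays silent.
print(should_warn_d7("refactor", "less_than_or_equal", 2, net_new_defs=2))
```

Using the net count of new defs as the probe delta matches the reasoning above: a faithful extraction raises the sum by exactly that amount, so a constraint that accepts `+N` cannot produce the false-FAIL.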

## Constraint Authoring Tips

### Pick `kind` deliberately
1 change: 1 addition & 0 deletions src/semantic_ci_code/authoring/advisory.py
@@ -8,6 +8,7 @@
"ADVISORY-D3",
"ADVISORY-D4",
"ADVISORY-D6",
"ADVISORY-D7",
"ADVISORY-I1",
"ADVISORY-P1",
"ADVISORY-P2",