1 change: 1 addition & 0 deletions .well-known/agents-shipgate.json
@@ -46,6 +46,7 @@
"2": "config_error",
"3": "input_parse_error",
"4": "other_error",
"6": "baseline_integrity_failure",
"20": "strict_gate_failure"
},
"agent_mode_env_var": "AGENTS_SHIPGATE_AGENT_MODE",
81 changes: 81 additions & 0 deletions STABILITY.md
@@ -25,6 +25,7 @@ These commands and flags are stable across all `0.x.y` releases. They will only
| `agents-shipgate bootstrap` | `--workspace`, `--confidence`, `--no-ci`, `--no-apply`, `--json` |
| `agents-shipgate list-checks` | `--json`, `--no-plugins` |
| `agents-shipgate baseline save` | `-c`, `--config`, `--out` |
| `agents-shipgate baseline verify` (v0.11+) | `--baseline`, `--audit-log`, `--strict`, `--json`, `--verbose` |
| `agents-shipgate fixture list` | `--json` |
| `agents-shipgate fixture run` | `<name>`, `--ci-mode`, `--out` |
| `agents-shipgate fixture copy` | `<name>`, `--to` |
@@ -39,6 +40,7 @@ These commands and flags are stable across all `0.x.y` releases. They will only
| `2` | Manifest config error (missing/typo/invalid) |
| `3` | Input parse error (malformed YAML/JSON, file too large, path traversal blocked) |
| `4` | Other Agents Shipgate error |
| `6` | Baseline integrity failure (v0.11+) — `agents-shipgate baseline verify --strict` detected `SHIP-BASELINE-INTEGRITY-MISMATCH`. Only the standalone `baseline verify` command emits this code; `scan` continues to use `20` for gate failure regardless of integrity-mode. |
| `20` | Strict-mode gate failure (≥ 1 unsuppressed finding hit `fail_on`, or ≥ 1 active unbaselined finding sets `blocks_release`) |
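
For CI wrappers that branch on the exit status, the codes in this hunk can be mapped to the names published in `.well-known/agents-shipgate.json`. The following is a minimal sketch, not part of the CLI: `describe_exit_code` and `is_blocking_failure` are hypothetical helper names, and codes that do not appear in this diff are deliberately left out of the table.

```python
# Exit codes visible in this diff; any other code is treated as undocumented here.
EXIT_CODES = {
    2: "config_error",
    3: "input_parse_error",
    4: "other_error",
    6: "baseline_integrity_failure",
    20: "strict_gate_failure",
}


def describe_exit_code(code: int) -> str:
    """Return the published name for an exit code, or a generic label."""
    return EXIT_CODES.get(code, f"undocumented_exit_code_{code}")


def is_blocking_failure(code: int) -> bool:
    """Hypothetical grouping: 6 (baseline verify) and 20 (scan) signal a
    failed check rather than a tool or input error."""
    return code in (6, 20)
```

Keeping `6` distinct from `20` lets a pipeline tell a tampered or unstamped baseline apart from an ordinary strict-mode gate failure.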

### Runtime contract JSON
@@ -171,6 +173,85 @@ reject unknown top-level fields instead of silently ignoring release policy.
Manifests that use `action_surface:` require a CLI whose
`agents-shipgate contract --json` reports `report_schema_version >= 0.16`.

### Baseline Integrity (v0.5)

Baseline schema bumps to `0.5`. The wire shape adds an optional
`findings[].provenance` block per entry recording when and by which scanner
the entry was added:

```json
{
"fingerprint": "fp_…",
"check_id": "SHIP-…",
"tool_name": "…",
"severity": "high",
"title": "…",
"provenance": {
"scanner_version": "0.11.0",
"run_id": "agents_shipgate_…",
"recorded_at": "2026-05-15T14:23:00Z",
"reason": null,
"expires": null
}
}
```

`provenance` is optional on the wire so older v0.2/v0.3/v0.4 baselines still
load. The integrity check flags entries that lack provenance as
`SHIP-BASELINE-INTEGRITY-MISMATCH` until they are re-stamped by re-running
`agents-shipgate baseline save`. Reviewers set `provenance.reason` (free-form
text) and `provenance.expires` (an ISO-8601 date).
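
To preview which entries would be flagged as legacy before re-stamping, a script can look for findings without a `provenance` block. This is a rough sketch under the assumption that the baseline file is a JSON object with a top-level `findings` array, as the `findings[].provenance` path suggests; `entries_missing_provenance` is a hypothetical helper, not a CLI feature.

```python
from typing import Any, Dict, List


def entries_missing_provenance(baseline: Dict[str, Any]) -> List[str]:
    """Return fingerprints of baseline entries that carry no provenance block."""
    missing = []
    for entry in baseline.get("findings", []):
        # A missing key and an explicit null are both treated as "no provenance".
        if not entry.get("provenance"):
            missing.append(entry.get("fingerprint", "<no fingerprint>"))
    return missing
```

An empty result means every entry has been stamped by a scanner that writes provenance.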

Each `agents-shipgate baseline save` appends one JSON line to
`<baseline-dir>/baseline-audit.log`. The log row is **stable**:

- `audit_schema_version: "0.1"`
- `timestamp` — ISO-8601 UTC
- `run_id` — scan's run_id (matches `BaselineProvenance.run_id` for any
fingerprints added in this save)
- `scanner_version` — Agents Shipgate version that wrote the row
- `baseline_path` — string path saved at the time of the row
- `hash_before` — `"sha256:…"` of the prior baseline file content, or `null`
when this was the first save
- `hash_after` — `"sha256:…"` of the new baseline file content
- `added_fingerprints[]`, `removed_fingerprints[]` — sorted deltas

The audit log is append-only and intentionally co-located with the baseline so
a single `.agents-shipgate/` directory carries both. Commit both files
together; reviewers can `git log .agents-shipgate/baseline-audit.log` to see
when fingerprints joined the baseline.
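
The hash comparison described above can be approximated outside the CLI. The sketch below assumes the hash is a plain SHA-256 over the raw bytes of the baseline file, rendered as `sha256:<hex>`; the real tool may canonicalize the JSON before hashing, so treat a mismatch from this script as a prompt to run `agents-shipgate baseline verify` rather than as proof of tampering. All function names are illustrative.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def sha256_tag(data: bytes) -> str:
    """Format a digest the way the audit rows show it: "sha256:<hex>"."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def last_audit_row(audit_log: Path) -> Optional[dict]:
    """Return the last non-blank JSON line, or None if the log is missing or empty."""
    if not audit_log.exists():
        return None
    lines = [ln for ln in audit_log.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return json.loads(lines[-1]) if lines else None


def baseline_matches_audit_log(baseline: Path, audit_log: Path) -> bool:
    """True when the newest row's hash_after equals the current file hash."""
    row = last_audit_row(audit_log)
    if row is None:
        return False
    return row.get("hash_after") == sha256_tag(baseline.read_bytes())
```

Because only the newest row is consulted, an edit to the baseline made without a matching `baseline save` shows up as a mismatch.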

`manifest.baseline.integrity_mode` controls behavior when `scan --baseline X`
detects an integrity issue. Stable values:

- `off` — no integrity checks. Back-compat escape hatch for repos that have
not migrated to v0.5 baselines yet.
- `warn` (default in v0.11) — integrity findings emitted but
`blocks_release: false`; release decision is unaffected.
- `strict` — `SHIP-BASELINE-INTEGRITY-MISMATCH` carries
`blocks_release: true` and `agents-shipgate baseline verify` exits `6` on
the same condition.
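
The three modes reduce to a small decision table for `SHIP-BASELINE-INTEGRITY-MISMATCH`. The sketch below only restates the list above in code; `mismatch_blocks_release` is a hypothetical name, and returning `None` for `off` is this example's way of saying that no finding is produced at all.

```python
from typing import Optional

VALID_MODES = ("off", "warn", "strict")


def mismatch_blocks_release(integrity_mode: str = "warn") -> Optional[bool]:
    """Map integrity_mode to the blocks_release value of a mismatch finding."""
    if integrity_mode not in VALID_MODES:
        raise ValueError(f"unknown integrity_mode: {integrity_mode!r}")
    if integrity_mode == "off":
        return None  # integrity checks do not run, so there is no finding
    return integrity_mode == "strict"
```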

New stable check IDs (v0.11+):

- `SHIP-BASELINE-INTEGRITY-MISMATCH` (critical) — file hash mismatch, missing
audit log, audit log empty, entry references unknown `run_id`, or entry
loaded from a legacy schema without provenance.
- `SHIP-BASELINE-ENTRY-EXPIRED` (high) — `provenance.expires` < today.
- `SHIP-BASELINE-ENTRY-STALE` (low) — deprecated check ID in the entry, or
the entry matched no active finding (scan-aware; resolved-not-pruned).
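
The expiry rule for `SHIP-BASELINE-ENTRY-EXPIRED` is a date comparison against today's UTC date. A minimal sketch, assuming `provenance.expires` is either `null` or an ISO-8601 string whose first ten characters are `YYYY-MM-DD`; `days_overdue` borrows its name from the check's evidence field but is otherwise hypothetical.

```python
from datetime import date, datetime, timezone
from typing import Optional


def days_overdue(expires: Optional[str], today: Optional[date] = None) -> int:
    """Return how many days past `expires` we are; 0 if unset or not yet expired."""
    if expires is None:
        return 0
    if today is None:
        today = datetime.now(timezone.utc).date()
    # Only the date portion is compared, so a trailing time component is ignored.
    expiry = date.fromisoformat(expires[:10])
    return max((today - expiry).days, 0)
```

An entry that expires today is not yet overdue under this reading, since the rule is `expires` strictly before today.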

Integrity findings bypass `checks.ignore` (suppression) and
`checks.severity_overrides`, because silencing tamper detection would defeat
the trust property the audit log defends. Otherwise they flow through the
regular report pipeline (fingerprinting, baseline-status assignment,
remediation annotation).

The audit log is **tamper-evident, not tamper-proof**: a well-resourced
adversary who atomically rewrites both the baseline JSON and the audit log
defeats `verify`. The goal is to make casual or accidental edits observably
wrong in code review.

### Tool-Surface Diff

`agents-shipgate scan --diff-from <path>` accepts a prior `report.json` or a
64 changes: 64 additions & 0 deletions docs/checks.json
@@ -539,6 +539,70 @@
"requires_human_review": true,
"suggested_patch_kind": "manual"
},
{
"autofix_safe": false,
"category": "baseline",
"default_severity": "high",
"description": "Baseline entry's review window has expired.",
"docs_url": "https://github.com/ThreeMoonsLab/agents-shipgate/blob/main/docs/checks.md#ship-baseline-entry-expired",
"evidence_fields": [
"fingerprint",
"check_id",
"tool_name",
"expires",
"days_overdue",
"reason"
],
"fires_when": "A baseline entry's `provenance.expires` is before today's UTC date.",
"id": "SHIP-BASELINE-ENTRY-EXPIRED",
"rationale": "Reviewer-set `provenance.expires` is the renewable consent for accepting technical debt. Past that date the entry needs a fresh review, not a silent extension.",
"recommendation": "Re-review the accepted debt and either remove the entry, fix the underlying finding, or extend `provenance.expires` with a new reason.",
"requires_human_review": true,
"suggested_patch_kind": "manual"
},
{
"autofix_safe": false,
"category": "baseline",
"default_severity": "low",
"description": "Baseline entry no longer corresponds to an active finding or check ID.",
"docs_url": "https://github.com/ThreeMoonsLab/agents-shipgate/blob/main/docs/checks.md#ship-baseline-entry-stale",
"evidence_fields": [
"fingerprint",
"check_id",
"tool_name",
"kind",
"replacement_check_ids"
],
"fires_when": "A baseline entry references a deprecated check ID (an alias in LEGACY_CHECK_ID_ALIASES) or did not match any active scan finding (resolved_count contribution).",
"id": "SHIP-BASELINE-ENTRY-STALE",
"rationale": "Stale baseline entries hide intent \u2014 reviewers cannot tell whether the accepted debt was resolved or whether the check was renamed. Pruning keeps the baseline aligned with reality.",
"recommendation": "Remove resolved entries via `agents-shipgate baseline save`, or update deprecated check IDs to their canonical replacements.",
"requires_human_review": true,
"suggested_patch_kind": "manual"
},
{
"autofix_safe": false,
"category": "baseline",
"default_severity": "critical",
"description": "Baseline file integrity check failed.",
"docs_url": "https://github.com/ThreeMoonsLab/agents-shipgate/blob/main/docs/checks.md#ship-baseline-integrity-mismatch",
"evidence_fields": [
"fingerprint",
"check_id",
"tool_name",
"kind",
"expected_hash",
"computed_hash",
"audit_log_path",
"latest_audit_run_id"
],
"fires_when": "The baseline file's SHA-256 differs from the latest audit log entry's hash_after, the audit log is missing or empty, an entry's provenance.run_id is not in the audit log, or an entry pre-dates v0.5 and lacks provenance entirely.",
"id": "SHIP-BASELINE-INTEGRITY-MISMATCH",
"rationale": "The baseline JSON has been edited outside `agents-shipgate baseline save`, lacks an audit log row, or references a run_id not present in the audit log. A release gate that accepts silent baseline edits cannot claim to govern technical debt.",
"recommendation": "Re-run `agents-shipgate baseline save` to refresh the baseline and audit log, or `agents-shipgate baseline verify` for the full report. Investigate the diff before accepting.",
"requires_human_review": true,
"suggested_patch_kind": "manual"
},
{
"autofix_safe": false,
"category": "codex_plugin",
31 changes: 31 additions & 0 deletions docs/checks.md
@@ -537,6 +537,37 @@ who is accountable for remediation.
unused scopes or add tool metadata showing why the permission is needed. Broad
unused write/admin scopes are `high`; other unused scopes are `medium`.

### SHIP-BASELINE-INTEGRITY-MISMATCH

Baseline file integrity check failed. Emitted when the baseline JSON has been
edited outside `agents-shipgate baseline save` (hash mismatch against the
audit log), when the audit log is missing or empty for a non-empty baseline,
when an entry's `provenance.run_id` is not present in the audit log, or when
an entry pre-dates the v0.5 provenance contract. In
`baseline.integrity_mode: strict` the finding carries `blocks_release=true`
and `agents-shipgate baseline verify --strict` exits with code 6.
Re-run `agents-shipgate baseline save` to refresh the baseline alongside its
audit row; investigate the diff before accepting.

### SHIP-BASELINE-ENTRY-EXPIRED

A baseline entry's reviewer-set `provenance.expires` date is before today's UTC date.
Renewable consent is a deliberate choice: accepted technical debt should
need re-review on a schedule, not a silent extension. Re-review the entry
and either remove it, fix the underlying finding, or extend
`provenance.expires` with a new `reason`.

### SHIP-BASELINE-ENTRY-STALE

A baseline entry no longer corresponds to an active finding or check ID.
Two sub-kinds, both `low` severity:

- `deprecated_check_id` — entry references an alias in `LEGACY_CHECK_ID_ALIASES`.
Update the entry to the canonical replacement check IDs (re-running
`baseline save` does not rewrite check IDs).
- `resolved_not_pruned` — entry matched no active scan finding. Re-run
`agents-shipgate baseline save` to drop the entry from the baseline.

## Risk Tags

Risk tags are hints, not findings by themselves. Checks consume tags with confidence thresholds.
33 changes: 33 additions & 0 deletions docs/manifest-v0.1.json
@@ -604,6 +604,36 @@
"title": "ArtifactPathConfig",
"type": "object"
},
"BaselineConfig": {
"additionalProperties": false,
"description": "Manifest knob governing v0.5 baseline integrity checks.\n\n``integrity_mode`` decides what happens when ``scan`` (with\n``--baseline``) detects an integrity issue:\n\n- ``off``: no integrity checks run (back-compat escape hatch for\n repos that have not migrated to v0.5 baselines yet).\n- ``warn`` (default in v0.17): integrity findings are emitted but\n ``blocks_release`` is false; release decision is unaffected.\n- ``strict``: ``SHIP-BASELINE-INTEGRITY-MISMATCH`` findings get\n ``blocks_release=true`` and ``agents-shipgate baseline verify``\n exits with code 6 on the same condition. Recommended target for\n v0.18.\n\n``audit_log`` overrides the default audit log path (relative to\nthe baseline file's directory). Usually left at its default.",
"properties": {
"audit_log": {
"anyOf": [
{
"type": "string"
},
{
"type": "null"
}
],
"default": null,
"title": "Audit Log"
},
"integrity_mode": {
"default": "warn",
"enum": [
"off",
"warn",
"strict"
],
"title": "Integrity Mode",
"type": "string"
}
},
"title": "BaselineConfig",
"type": "object"
},
"ChecksConfig": {
"additionalProperties": false,
"properties": {
@@ -1507,6 +1537,9 @@
],
"default": null
},
"baseline": {
"$ref": "#/$defs/BaselineConfig"
},
"checks": {
"$ref": "#/$defs/ChecksConfig"
},
31 changes: 31 additions & 0 deletions llms-full.txt
@@ -1481,6 +1481,37 @@ who is accountable for remediation.
unused scopes or add tool metadata showing why the permission is needed. Broad
unused write/admin scopes are `high`; other unused scopes are `medium`.

### SHIP-BASELINE-INTEGRITY-MISMATCH

Baseline file integrity check failed. Emitted when the baseline JSON has been
edited outside `agents-shipgate baseline save` (hash mismatch against the
audit log), when the audit log is missing or empty for a non-empty baseline,
when an entry's `provenance.run_id` is not present in the audit log, or when
an entry pre-dates the v0.5 provenance contract. In
`baseline.integrity_mode: strict` the finding carries `blocks_release=true`
and `agents-shipgate baseline verify --strict` exits with code 6.
Re-run `agents-shipgate baseline save` to refresh the baseline alongside its
audit row; investigate the diff before accepting.

### SHIP-BASELINE-ENTRY-EXPIRED

A baseline entry's reviewer-set `provenance.expires` date is before today's UTC date.
Renewable consent is a deliberate choice: accepted technical debt should
need re-review on a schedule, not a silent extension. Re-review the entry
and either remove it, fix the underlying finding, or extend
`provenance.expires` with a new `reason`.

### SHIP-BASELINE-ENTRY-STALE

A baseline entry no longer corresponds to an active finding or check ID.
Two sub-kinds, both `low` severity:

- `deprecated_check_id` — entry references an alias in `LEGACY_CHECK_ID_ALIASES`.
Update the entry to the canonical replacement check IDs (re-running
`baseline save` does not rewrite check IDs).
- `resolved_not_pruned` — entry matched no active scan finding. Re-run
`agents-shipgate baseline save` to drop the entry from the baseline.

## Risk Tags

Risk tags are hints, not findings by themselves. Checks consume tags with confidence thresholds.