Concepts
Vocabulary used throughout the rest of the wiki. Every term has a stable meaning in the JSON report and CLI output.
A manifest is a shipgate.yaml file: the single source of truth for one agent release. Every scan reads exactly one manifest and produces exactly one report. The schema is versioned (version: "0.1") and strictly validated; a typo fails the scan with a suggested fix. The full grammar is in Manifest Reference.
A manifest declares:
- project / agent identity — names used in run IDs, fingerprints, finding evidence
- declared purpose — short prose used to detect scope contradictions (e.g. read-only purpose + DELETE tool → finding)
- environment.target — local | staging | production_like | production. Production targets fire stricter inventory checks
- tool_sources — pointers to MCP exports, OpenAPI specs, or OpenAI Agents SDK Python files
- permissions / policies / risk_overrides — declared expectations against which checks fire
- ci — advisory/strict mode and which severities fail
- checks.ignore — explicit suppressions with required reasons
The tool inventory is the complete set of tools an agent can call, as established by a scan. It is built by:
- Loading every tool_sources[] entry and any openai_api artifacts
- Flattening into a single list keyed by tool name
- Resolving duplicates by source priority (openai_api > openapi > mcp > sdk_function); losers emit a Duplicate tool name warning
- Enriching each tool with risk hints
The flattened list is what the check catalog operates on. It's also surfaced verbatim in report.json under tool_inventory.
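The duplicate-resolution step can be sketched as follows. This is a minimal illustration, not Shipgate's implementation: the `Tool` class, the `flatten_inventory` function, and the exact warning wording are hypothetical; only the priority order comes from the text above.

```python
from dataclasses import dataclass

# Source priority as stated above: earlier entries win a name collision.
SOURCE_PRIORITY = ["openai_api", "openapi", "mcp", "sdk_function"]


@dataclass(frozen=True)
class Tool:  # hypothetical shape, for illustration only
    name: str
    source: str


def flatten_inventory(tools: list[Tool]) -> tuple[list[Tool], list[str]]:
    """Keep one tool per name, preferring the higher-priority source."""
    rank = {source: i for i, source in enumerate(SOURCE_PRIORITY)}
    winners: dict[str, Tool] = {}
    warnings: list[str] = []
    for tool in tools:
        current = winners.get(tool.name)
        if current is None:
            winners[tool.name] = tool
            continue
        # A lower rank index means a higher priority; sorted() is stable,
        # so the first-seen tool wins a tie between equal sources.
        winner, loser = sorted((current, tool), key=lambda t: rank[t.source])
        winners[tool.name] = winner
        warnings.append(
            f"Duplicate tool name: {tool.name!r} "
            f"({loser.source} dropped in favor of {winner.source})"
        )
    return list(winners.values()), warnings
```

Keying the dictionary by tool name means later stages see each name exactly once, which is what lets checks treat the inventory as a flat list.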
A risk hint is a (tag, source, confidence, evidence) record attached to a tool. Tags include read_only, write, destructive, external_write, financial_action, customer_communication, sensitive_data_access, infrastructure_change, code_execution. Sources include openapi_method, mcp_annotation, sdk_keyword, manual (from risk_overrides). Confidence is low | medium | high.
Hints are inputs to checks, not findings themselves. Most checks demand min_confidence="medium" to fire. You can promote, demote, or remove hints per tool via risk_overrides.tools.
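To illustrate how a confidence threshold gates a check, here is a small sketch. The `RiskHint` class and the `hints_at_or_above` helper are hypothetical names; the sketch assumes the three confidence levels are ordered low < medium < high, as the names suggest.

```python
from dataclasses import dataclass

# Assumed ordering of the three documented confidence levels.
CONFIDENCE_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class RiskHint:  # hypothetical record mirroring (tag, source, confidence, evidence)
    tag: str
    source: str
    confidence: str
    evidence: str


def hints_at_or_above(hints: list[RiskHint], min_confidence: str = "medium") -> list[RiskHint]:
    """Return only the hints that are strong enough for a check to act on."""
    threshold = CONFIDENCE_ORDER[min_confidence]
    return [h for h in hints if CONFIDENCE_ORDER[h.confidence] >= threshold]
```

Under this model, a low-confidence keyword guess stays visible in the inventory but does not trigger a check whose threshold is medium.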
Internally, hints are produced by core/risk_hints.py:_add_automatic_hints plus your manual overrides. As of v0.2.0 the keyword classifier is fully tokenized: "deploy" matches the standalone token deploy but not the substring inside deployments.
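The difference between token matching and substring matching can be shown with a short sketch. This is not the code in core/risk_hints.py; the `tokenize` and `keyword_matches` helpers are hypothetical, and splitting on non-alphanumeric characters is an assumption about how tool names break into tokens.

```python
import re


def tokenize(name: str) -> set[str]:
    """Split a tool name into lowercase tokens on any non-alphanumeric run."""
    return {token for token in re.split(r"[^a-z0-9]+", name.lower()) if token}


def keyword_matches(keyword: str, tool_name: str) -> bool:
    """True only when the keyword is a whole token, never a mere substring."""
    return keyword.lower() in tokenize(tool_name)
```

With this rule, `deploy_service` matches the keyword `deploy`, while `list_deployments` does not, even though the letters appear inside `deployments`.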
A check is a pure function (ScanContext) -> list[Finding]. The 28 built-in checks are listed in the Check Catalog and live under src/agents_shipgate/checks/. Each has:
- a stable check ID (e.g. SHIP-POLICY-APPROVAL-MISSING)
- a default severity (critical | high | medium | low | info)
- a category (policy, auth, schema, documentation, inventory, manifest, scope, side_effects, api)
You can override the severity via checks.severity_overrides and add custom checks via Plugin Authoring.
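As a rough picture of the check shape, the sketch below implements a toy check as a pure function from a context to a list of findings. Everything here is illustrative: the `ScanContext` and `Finding` fields are simplified stand-ins, the check ID is made up, and the real plugin interface is described in Plugin Authoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:  # hypothetical subset of the real finding fields
    check_id: str
    severity: str
    category: str
    tool_name: str
    title: str


@dataclass(frozen=True)
class ScanContext:  # hypothetical shape; the real ScanContext is not shown here
    declared_purpose: str
    tool_tags: dict[str, frozenset[str]]  # tool name -> risk hint tags


def check_read_only_scope(ctx: ScanContext) -> list[Finding]:
    """Flag destructive tools when the declared purpose says read-only."""
    if "read-only" not in ctx.declared_purpose.lower():
        return []
    return [
        Finding(
            check_id="EXAMPLE-SCOPE-DESTRUCTIVE-TOOL",  # made-up ID, not a built-in
            severity="high",
            category="scope",
            tool_name=name,
            title=f"Read-only agent exposes destructive tool {name}",
        )
        for name, tags in sorted(ctx.tool_tags.items())
        if "destructive" in tags
    ]
```

Because the function reads only its argument and returns new values, calling it twice on the same context yields the same findings, which is the property that makes fingerprints reproducible.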
A finding is a single scan output. Every finding has:
- id — the fingerprint plus a content-derived discriminator on collision
- fingerprint — a stable sha256(check_id | tool_name | canonical evidence)[:16], prefixed fp_
- check_id, severity, category, title
- tool_name / tool_id (or agent_id for agent-level findings)
- evidence — structured payload describing why the check fired
- recommendation — actionable next step
- suppressed, suppression_reason
- baseline_status — new, matched, or resolved when a baseline is applied
Fingerprints are content-addressed and stable across runs. They are the identity primitive used by suppressions and baselines.
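The fingerprint formula can be approximated as below. This sketch is an assumption-laden illustration, not the v1 algorithm itself: the "|" join, the use of sorted-key JSON as the canonical evidence form, and the `fingerprint` function name are all guesses. The exclusion of the default_severity evidence key follows the statement in the baseline description.

```python
import hashlib
import json

# Evidence keys left out of the hash; default_severity is named in this page,
# the rest of the real exclusion logic is not reproduced here.
EXCLUDED_EVIDENCE_KEYS = {"default_severity"}


def fingerprint(check_id: str, tool_name: str, evidence: dict) -> str:
    """Hash the finding identity into a short, stable, prefixed ID."""
    stable = {k: v for k, v in evidence.items() if k not in EXCLUDED_EVIDENCE_KEYS}
    # Sorted keys and fixed separators make the JSON text order-independent.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    payload = "|".join([check_id, tool_name, canonical])
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return "fp_" + digest[:16]
```

Canonicalizing the evidence before hashing is what keeps the fingerprint identical when the same facts are serialized in a different key order.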
A suppression is a manifest entry under checks.ignore that marks matching findings with suppressed: true and a required reason. Suppressed findings still appear in the JSON report (audit trail) but do not count toward severity totals or trigger CI failure. Stale suppressions (referencing missing checks or tools) emit SHIP-MANIFEST-STALE-SUPPRESSION.
A baseline is a snapshot of currently-active findings stored at .agents-shipgate/baseline.json. After saving a baseline, future scans tag each finding as matched (already in baseline), new, or resolved. Strict CI with --baseline fails only on new findings. See Baseline Workflow for the full pattern.
The fingerprint algorithm is v1 and intentionally excludes severity overrides, baseline status, source paths, timestamps, and the default_severity audit-evidence key — so a baseline survives manifest tweaks that don't change actual finding identity.
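Baseline tagging reduces to set comparison on fingerprints. The sketch below assumes both the baseline and the current scan can be represented as sets of fingerprint strings; the `tag_against_baseline` function is a hypothetical helper, not part of the CLI.

```python
def tag_against_baseline(current: set[str], baseline: set[str]) -> dict[str, str]:
    """Map each fingerprint to its baseline_status."""
    status: dict[str, str] = {}
    for fp in sorted(current):
        # Present now and in the baseline: matched. Present only now: new.
        status[fp] = "matched" if fp in baseline else "new"
    for fp in sorted(baseline - current):
        # In the baseline but no longer reported by the scan.
        status[fp] = "resolved"
    return status
```

Only the fingerprints tagged new would fail a strict run with a baseline applied, which is why fingerprint stability across manifest tweaks matters.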
CI mode is advisory (default) or strict. Strict mode exits with code 20 on any unsuppressed finding whose severity is in ci.fail_on (default [critical]; configurable). Advisory mode never fails. See CI Recipes for usage patterns.
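The gate decision described above can be sketched in a few lines. This is a simplified model under stated assumptions: findings are plain dictionaries with severity and suppressed keys, the `gate_exit_code` name is hypothetical, and baseline filtering is left out.

```python
def gate_exit_code(mode: str, findings: list[dict], fail_on: tuple[str, ...] = ("critical",)) -> int:
    """Return 20 when strict mode should fail the build, else 0."""
    if mode != "strict":
        return 0  # advisory mode never fails
    blocking = [
        f for f in findings
        if not f["suppressed"] and f["severity"] in fail_on
    ]
    return 20 if blocking else 0
```

Note that a suppressed critical finding does not block in this model, matching the rule that suppressed findings do not trigger CI failure.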
Shipgate runs as a static analyzer. By default it does not import user code, run agents, call tools, invoke LLMs, connect to MCP servers, make network calls, or collect telemetry. The only exception to this guarantee is third-party check plugins, which are opt-in: they are enabled with AGENTS_SHIPGATE_ENABLE_PLUGINS=1 and can be turned off per scan with --no-plugins. See Trust Model for details.
| Code | Meaning |
|---|---|
| 0 | Scan completed; advisory mode or strict-mode pass |
| 2 | Manifest config error (typo, missing field) |
| 3 | Input parse error (malformed YAML/JSON, path traversal blocked, file too large) |
| 4 | Other Agents Shipgate error |
| 20 | Strict-mode gate failure (findings exist at ci.fail_on severity) |
A nonzero exit is always either a real finding (20) or a real error (2/3/4). Check the stderr message to disambiguate. See Troubleshooting.
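A CI wrapper might branch on these codes roughly as follows. The `classify_exit` helper and its return labels are hypothetical; only the code values come from the table above, and any code outside the table is treated as unknown rather than guessed at.

```python
def classify_exit(code: int) -> str:
    """Group a Shipgate exit code into pass, gate_failure, error, or unknown."""
    if code == 0:
        return "pass"
    if code == 20:
        return "gate_failure"  # a real finding at a ci.fail_on severity
    if code in (2, 3, 4):
        return "error"  # manifest, input, or other tool error
    return "unknown"
```

Separating gate failures from errors lets a pipeline report "fix your agent" and "fix your configuration" as different outcomes.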
Agents Shipgate · Apache-2.0 · maintained by Three Moons Lab · Report a false positive