Concepts

Pengfei Hu edited this page Apr 26, 2026 · 1 revision

Vocabulary used throughout the rest of the wiki. Every term has a stable meaning in the JSON report and CLI output.


Manifest

A shipgate.yaml file. The single source of truth for one agent release. Every scan reads exactly one manifest and produces exactly one report. Schema is versioned (version: "0.1") and validated strictly — typos fail the scan with a suggested fix. Full grammar in Manifest Reference.

A manifest declares:

  • project / agent identity — names used in run IDs, fingerprints, finding evidence
  • declared purpose — short prose used to detect scope contradictions (e.g. read-only purpose + DELETE tool → finding)
  • environment.target: local | staging | production_like | production. Production targets fire stricter inventory checks
  • tool_sources — pointers to MCP exports, OpenAPI specs, or OpenAI Agents SDK Python files
  • permissions / policies / risk_overrides — declared expectations against which checks fire
  • ci — advisory/strict mode and which severities fail
  • checks.ignore — explicit suppressions with required reasons

Tool surface

The complete set of tools an agent can call, as resolved by a scan. Built by:

  1. Loading every tool_sources[] entry and any openai_api artifacts
  2. Flattening into a single list keyed by tool name
  3. Resolving duplicates by source priority (openai_api > openapi > mcp > sdk_function); losers emit a Duplicate tool name warning
  4. Enriching each tool with risk hints

The flattened list is what the check catalog operates on. It's also surfaced verbatim in report.json under tool_inventory.
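The build steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Shipgate's actual code: the `Tool` record, the `build_tool_surface` function, the tie-breaking rule, and the exact warning text are hypothetical. Only the priority order and the warning name come from the description above.

```python
from dataclasses import dataclass

# Source priority from the text: openai_api > openapi > mcp > sdk_function.
SOURCE_PRIORITY = {"openai_api": 0, "openapi": 1, "mcp": 2, "sdk_function": 3}


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool record; the real inventory carries more fields."""
    name: str
    source: str


def build_tool_surface(tools):
    """Flatten tools into one list keyed by name, resolving duplicates by priority."""
    winners = {}
    warnings = []
    for tool in tools:
        current = winners.get(tool.name)
        if current is None:
            winners[tool.name] = tool
            continue
        # Lower rank wins; on a tie the earlier entry is kept (an assumption).
        if SOURCE_PRIORITY[tool.source] < SOURCE_PRIORITY[current.source]:
            winner, loser = tool, current
        else:
            winner, loser = current, tool
        winners[tool.name] = winner
        warnings.append(f"Duplicate tool name: {loser.name} ({loser.source})")
    return list(winners.values()), warnings


surface, warnings = build_tool_surface([
    Tool("refund_order", "mcp"),
    Tool("refund_order", "openapi"),
    Tool("list_orders", "sdk_function"),
])
```

In this sketch the openapi definition of `refund_order` wins over the mcp one, and the losing entry produces a single warning.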

Risk hint

A (tag, source, confidence, evidence) record attached to a tool. Tags include read_only, write, destructive, external_write, financial_action, customer_communication, sensitive_data_access, infrastructure_change, code_execution. Sources include openapi_method, mcp_annotation, sdk_keyword, manual (from risk_overrides). Confidence is low | medium | high.

Hints are inputs to checks, not findings themselves. Most checks demand min_confidence="medium" to fire. You can promote, demote, or remove hints per tool via risk_overrides.tools.
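To make the confidence threshold concrete, here is a minimal sketch of a hint record and a threshold test. The `RiskHint` class, the `has_hint` helper, and the use of a plain string for evidence are assumptions made for illustration; the tag, source, and confidence values are taken from the lists above.

```python
from dataclasses import dataclass

# Ordering used to compare confidence levels: low < medium < high.
CONFIDENCE_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class RiskHint:
    """Hypothetical (tag, source, confidence, evidence) record."""
    tag: str
    source: str
    confidence: str
    evidence: str


def has_hint(hints, tag, min_confidence="medium"):
    """Return True if any hint carries `tag` at or above `min_confidence`."""
    floor = CONFIDENCE_ORDER[min_confidence]
    return any(
        h.tag == tag and CONFIDENCE_ORDER[h.confidence] >= floor for h in hints
    )


hints = [
    RiskHint("destructive", "openapi_method", "high", "HTTP method DELETE"),
    RiskHint("financial_action", "sdk_keyword", "low", "name contains 'refund'"),
]
```

With the default threshold, a check asking for `destructive` would see the first hint, while the low-confidence `financial_action` hint would be ignored unless the threshold is lowered or the hint is promoted through risk_overrides.tools.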

Internally hints are produced by core/risk_hints.py:_add_automatic_hints plus your manual overrides. As of v0.2.0 the keyword classifier is fully tokenized — "deploy" matches the standalone token deploy but not the substring inside deployments.
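The difference between token matching and substring matching can be shown with a small sketch. The tokenization rules here (split on non-alphanumeric characters and on lower-to-upper case changes) are an assumption; the real classifier in core/risk_hints.py may tokenize differently.

```python
import re


def tokenize(name):
    """Split an identifier into lowercase tokens (hypothetical rules)."""
    # Insert a space at camelCase boundaries, e.g. deployService -> deploy Service.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def matches_keyword(name, keyword):
    """True only when `keyword` appears as a whole token in `name`."""
    return keyword in tokenize(name)
```

Under these rules `deploy_service` and `deployService` both contain the token deploy, while `list_deployments` does not, even though the substring is present.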

Check

A pure function (ScanContext) -> list[Finding]. The 28 built-in checks are listed in the Check Catalog and live under src/agents_shipgate/checks/. Each has:

  • a stable check ID (e.g. SHIP-POLICY-APPROVAL-MISSING)
  • a default severity (critical | high | medium | low | info)
  • a category (policy, auth, schema, documentation, inventory, manifest, scope, side_effects, api)

You can override the severity via checks.severity_overrides and add custom checks via Plugin Authoring.
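As a rough illustration of the (ScanContext) -> list[Finding] shape, the sketch below flags destructive tools when the declared purpose claims read-only access. Everything in it is hypothetical: the simplified `ScanContext`, `Tool`, and `Finding` classes, the purpose test, and the check ID `EXAMPLE-SCOPE-READ-ONLY-VIOLATION` are stand-ins, not the types or IDs that Shipgate defines.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tool:
    name: str
    tags: tuple = ()


@dataclass(frozen=True)
class ScanContext:
    declared_purpose: str
    tools: tuple = ()


@dataclass(frozen=True)
class Finding:
    check_id: str
    severity: str
    category: str
    tool_name: str
    evidence: dict = field(default_factory=dict)


def check_read_only_scope(ctx):
    """Pure check: reads the context, returns findings, mutates nothing."""
    if "read-only" not in ctx.declared_purpose.lower():
        return []
    return [
        Finding(
            check_id="EXAMPLE-SCOPE-READ-ONLY-VIOLATION",  # hypothetical ID
            severity="high",
            category="scope",
            tool_name=tool.name,
            evidence={"tag": "destructive"},
        )
        for tool in ctx.tools
        if "destructive" in tool.tags
    ]


ctx = ScanContext(
    declared_purpose="Read-only order lookup",
    tools=(Tool("get_order", ("read_only",)), Tool("delete_order", ("destructive",))),
)
findings = check_read_only_scope(ctx)
```

Because the function depends only on its input, calling it twice on the same context yields the same findings, which is what keeps fingerprints stable across runs.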

Finding

A single scan output. Every finding has:

  • id — the fingerprint plus a content-derived discriminator on collision
  • fingerprint — a stable sha256(check_id | tool_name | canonical evidence)[:16], prefixed fp_
  • check_id, severity, category, title
  • tool_name / tool_id (or agent_id for agent-level findings)
  • evidence — structured payload describing why the check fired
  • recommendation — actionable next step
  • suppressed, suppression_reason
  • baseline_status: new, matched, or resolved when a baseline is applied

Fingerprints are content-addressed and stable across runs. They are the identity primitive used by suppressions and baselines.
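The fingerprint formula can be sketched as below. Several details are assumptions rather than documented behavior: that "canonical evidence" means JSON with sorted keys, that the parts are joined with a literal `|`, and that the 16 characters are taken from the hex digest. The `fingerprint` function name is also hypothetical.

```python
import hashlib
import json


def fingerprint(check_id, tool_name, evidence):
    """Content-addressed ID: sha256(check_id | tool_name | canonical evidence)[:16]."""
    # Assumption: canonical form is compact JSON with sorted keys.
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    payload = "|".join([check_id, tool_name, canonical])
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return "fp_" + digest[:16]


fp_a = fingerprint("SHIP-POLICY-APPROVAL-MISSING", "refund_order", {"a": 1, "b": 2})
fp_b = fingerprint("SHIP-POLICY-APPROVAL-MISSING", "refund_order", {"b": 2, "a": 1})
fp_c = fingerprint("SHIP-POLICY-APPROVAL-MISSING", "delete_user", {"a": 1, "b": 2})
```

Canonicalizing the evidence before hashing is what makes the result independent of key order, so `fp_a` and `fp_b` are identical while `fp_c` differs because the tool name differs.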

Suppression

A manifest entry under checks.ignore that marks matching findings with suppressed: true and a required reason. Suppressed findings still appear in the JSON report (audit trail) but do not count toward severity totals or trigger CI failure. Stale suppressions (referencing missing checks or tools) emit SHIP-MANIFEST-STALE-SUPPRESSION.
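A minimal sketch of how suppressions might be applied follows. The entry keys (`check_id`, `tool`, `reason`), the dictionary form of a finding, and the `apply_suppressions` helper are hypothetical; see Manifest Reference for the real checks.ignore grammar.

```python
def apply_suppressions(findings, ignore_entries, known_checks, known_tools):
    """Mark matching findings as suppressed and return stale ignore entries."""
    stale = []
    for entry in ignore_entries:
        if not entry.get("reason"):
            raise ValueError("every suppression needs a reason")
        tool = entry.get("tool")  # None means "any tool" in this sketch
        unknown_check = entry["check_id"] not in known_checks
        unknown_tool = tool is not None and tool not in known_tools
        if unknown_check or unknown_tool:
            stale.append(entry)
            continue
        for finding in findings:
            same_check = finding["check_id"] == entry["check_id"]
            same_tool = tool is None or tool == finding["tool_name"]
            if same_check and same_tool:
                finding["suppressed"] = True
                finding["suppression_reason"] = entry["reason"]
    return stale


findings = [
    {"check_id": "SHIP-POLICY-APPROVAL-MISSING", "tool_name": "refund_order",
     "suppressed": False, "suppression_reason": None},
    {"check_id": "SHIP-POLICY-APPROVAL-MISSING", "tool_name": "delete_user",
     "suppressed": False, "suppression_reason": None},
]
ignore = [
    {"check_id": "SHIP-POLICY-APPROVAL-MISSING", "tool": "refund_order",
     "reason": "Approval handled upstream"},
    {"check_id": "SHIP-POLICY-APPROVAL-MISSING", "tool": "removed_tool",
     "reason": "Legacy entry"},
]
stale = apply_suppressions(
    findings, ignore,
    known_checks={"SHIP-POLICY-APPROVAL-MISSING"},
    known_tools={"refund_order", "delete_user"},
)
```

Here the first entry suppresses one finding, and the second entry is returned as stale because it references a tool that no longer exists; that is the situation in which SHIP-MANIFEST-STALE-SUPPRESSION would be emitted.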

Baseline

A snapshot of currently-active findings stored at .agents-shipgate/baseline.json. After saving a baseline, future scans tag each finding as matched (already in baseline), new, or resolved. Strict CI with --baseline fails only on new findings. See Baseline Workflow for the full pattern.

The fingerprint algorithm is v1 and intentionally excludes severity overrides, baseline status, source paths, timestamps, and the default_severity audit-evidence key — so a baseline survives manifest tweaks that don't change actual finding identity.
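The three baseline statuses amount to set operations over fingerprints. The sketch below assumes the baseline can be reduced to a collection of fingerprint strings; the `classify_against_baseline` helper and the sample fingerprints are hypothetical, and the real baseline.json layout is not shown here.

```python
def classify_against_baseline(current_fingerprints, baseline_fingerprints):
    """Group fingerprints into new, matched, and resolved."""
    current = set(current_fingerprints)
    baseline = set(baseline_fingerprints)
    return {
        "new": sorted(current - baseline),        # in this scan only
        "matched": sorted(current & baseline),    # in both
        "resolved": sorted(baseline - current),   # in the baseline only
    }


status = classify_against_baseline(
    current_fingerprints=["fp_aaa", "fp_bbb"],
    baseline_fingerprints=["fp_bbb", "fp_ccc"],
)
```

Only the entries under "new" would count against a strict CI run that uses --baseline.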

CI mode

advisory (default) or strict. Strict mode exits with code 20 on any unsuppressed finding whose severity is in ci.fail_on (default [critical]; configurable). Advisory mode reports findings but never fails the gate. See CI Recipes for usage patterns.
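The gate decision described above can be sketched as a small function. The `gate_exit_code` helper and the dictionary form of a finding are hypothetical, and the sketch ignores baselines for brevity.

```python
EXIT_OK = 0
EXIT_GATE_FAILURE = 20


def gate_exit_code(findings, mode="advisory", fail_on=("critical",)):
    """Return 20 only in strict mode with an unsuppressed finding in fail_on."""
    if mode != "strict":
        return EXIT_OK
    blocking = [
        f for f in findings
        if not f["suppressed"] and f["severity"] in fail_on
    ]
    return EXIT_GATE_FAILURE if blocking else EXIT_OK


findings = [
    {"severity": "critical", "suppressed": True},
    {"severity": "high", "suppressed": False},
]
```

With these sample findings, strict mode passes under the default fail_on because the only critical finding is suppressed, and fails once high is added to fail_on.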

Trust model

Shipgate runs as a static analyzer. By default it does not import user code, run agents, call tools, invoke LLMs, connect to MCP servers, make network calls, or collect telemetry. The only exception to this guarantee is third-party check plugins, which are opt-in: they load only when AGENTS_SHIPGATE_ENABLE_PLUGINS=1 is set and can be disabled per-scan with --no-plugins. See Trust Model for details.
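The plugin gate can be pictured as a two-step decision. The `plugins_enabled` helper is hypothetical and only mirrors the rule stated above; how Shipgate actually reads the variable and the flag is not documented here.

```python
import os


def plugins_enabled(environ=None, no_plugins=False):
    """Plugins load only if the env var is exactly "1" and --no-plugins is absent."""
    env = os.environ if environ is None else environ
    if no_plugins:
        # The per-scan flag wins over the environment variable.
        return False
    return env.get("AGENTS_SHIPGATE_ENABLE_PLUGINS") == "1"
```

Passing the environment as an argument keeps the function easy to test without touching the real process environment.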

Exit codes

| Code | Meaning |
|------|---------|
| 0 | Scan completed; advisory mode or strict-mode pass |
| 2 | Manifest config error (typo, missing field) |
| 3 | Input parse error (malformed YAML/JSON, path traversal blocked, file too large) |
| 4 | Other Agents Shipgate error |
| 20 | Strict-mode gate failure (findings exist at ci.fail_on severity) |

A nonzero exit is always either a real finding (20) or a real error (2/3/4). Check the stderr message to disambiguate. See Troubleshooting.
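A CI wrapper might classify the exit code along these lines. The `ExitCode` enum and `describe_exit` helper are hypothetical names; the numeric values are the ones listed above.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0
    MANIFEST_ERROR = 2
    INPUT_PARSE_ERROR = 3
    OTHER_ERROR = 4
    GATE_FAILURE = 20


def describe_exit(code):
    """Classify an exit code as "pass", "finding", "error", or "unknown"."""
    try:
        exit_code = ExitCode(code)
    except ValueError:
        return "unknown"
    if exit_code is ExitCode.OK:
        return "pass"
    if exit_code is ExitCode.GATE_FAILURE:
        return "finding"
    return "error"
```

Treating "finding" and "error" separately lets a pipeline block a release on a gate failure while routing configuration or parse errors to whoever maintains the manifest.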
