ai-surface

Find the AI attack surface your code is about to ship. Locally, offline, before the PR merges.

ai-surface maps the AI attack surface in your codebase: LLM calls, agents, MCP servers, RAG/vector stores, model gateways, self-hosted runtimes, provider keys, and the HTTP APIs that expose them. Run it locally or in CI to see what AI surfaces a PR introduces, generate an AI-BOM, and gate new high-risk findings before merge.

It runs as a local static analysis pass that executes no code, makes no network calls, sends no telemetry, and requires no credentials, so your source never leaves the host.

Try it without installing:

uvx --from apisec-ai-surface ai-surface scan .

Findings map to the OWASP LLM Top 10, the EU AI Act, NIST AI RMF, and ISO/IEC 42001, so the AI-BOM doubles as governance evidence (see Compliance and governance). Runtime exploit validation is out of scope for this OSS scanner.

The optional --ui map shows detected AI surfaces as nodes grouped by category. It is served on loopback and runs locally.

Who is this for?

Use ai-surface if you are:

adding agents, MCP servers, RAG, model gateways, or LLM calls to an application
reviewing AI-related pull requests before they merge
adding an AI risk gate to CI/CD
building an AI-BOM or AI-governance inventory from source code
trying to understand where AI risk enters your codebase

Built for DevOps, DevSecOps, platform engineering, AppSec, and security-minded engineering teams.

Table of Contents

Who is this for?
Quick start
What the output looks like
First run on a mature repo
GitHub Action and CI gating
Open the local UI
What it detects
Proven on real code
Output formats
CLI reference
Compliance and governance
How it works
Comparison with adjacent tools
What it does not do
Roadmap
Runtime validation
Development
Project
License

Quick start

Install once, then run ai-surface anywhere:

pipx install apisec-ai-surface
ai-surface scan .

# or run once with no install
uvx --from apisec-ai-surface ai-surface scan .

# or in a project venv
pip install apisec-ai-surface && ai-surface scan .

# explore the results visually
ai-surface scan . --ui

Requires Python 3.9+. The CLI runs locally; --ui serves on loopback only.

What the output looks like

To reproduce a full multi-category report yourself, run ai-surface scan examples/demo-app against the bundled demo app (add --ui for the interactive map). The report below is from a representative AI app.

AI Attack Surface Report
────────────────────────────────────────────────────────────────
Project:    lumora
Repository: apisec-inc/lumora
19 production AI surfaces · 25 risk indicators · across 8 detector(s)
Severity: 6 high · 2 medium

AGENT FRAMEWORKS
  • LangChain Agent: agent (in backend/app/ai/support_agent.py) [HIGH]
      Tools/perms: process_refund, lookup_order, send_email, update_address, search_knowledge
      ⚠ financial action exposed   ⚠ messaging action exposed   ⚠ high blast-radius combination
      ⚑ [HIGH] financial-action
        Agent can invoke financial tools (process_refund)
        OWASP: LLM06
        Governance: EU AI Act Art. 9
        Fix: Gate financial tools behind human approval; least-privilege the agent.
      ⚑ [HIGH] no-human-oversight
        High-risk action runs with no human approval / in-the-loop gate detected
        OWASP: LLM06, LLM09
        Governance: EU AI Act Art. 14
      ⚑ [MEDIUM] pii-to-llm
        Personal data (PII) is interpolated into a prompt template sent to the model
        OWASP: LLM02
        Governance: EU AI Act Art. 10, ISO 42001 A.7
      → validate at runtime: agent validation in APIsec
  • Mastra Agent: inventory (in assistant/src/inventory-agent.ts) [HIGH]
      Tools/perms: checkStock, reorder, deleteSku
      ⚑ [HIGH] destructive-action
        Agent can invoke destructive tools (deleteSku)
        OWASP: LLM06    Governance: EU AI Act Art. 9

MCP SERVERS
  • MCP Server: payments-mcp [HIGH]   Trust: verified (90/100)
      ⚑ [HIGH] secrets-in-env
        Environment variables in the config appear to hold sensitive credentials
        OWASP: LLM02, LLM07    Governance: EU AI Act Art. 15
      ⚑ [HIGH] financial-action
        MCP exposes financial tools (refund, charge, payout) to the model
        OWASP: LLM06    Governance: EU AI Act Art. 9
      ⚑ [MEDIUM] unverified-source
        MCP is not from a known/verified publisher
        OWASP: LLM03    Governance: ISO 42001 A.10
  • MCP Server: db-mcp [HIGH]
      ⚑ [HIGH] database-access      MCP can query or modify database contents      OWASP: LLM06
  • MCP Server: filesystem-mcp [HIGH]
      ⚑ [HIGH] filesystem-access    MCP can read/write files on the host           OWASP: LLM06

VECTOR-STORE
  • Vector store: pgvector  ·  RAG pipeline: LangChain   (backend/app/ai/knowledge.py)
      ⚠ retrieved content reaches the model (retrieval-augmented generation)
      ⚠ ingests external content (RAG poisoning surface)

LLM SDK CALL SITES
  • OpenAI SDK · gpt-4o · backend/app/ai/llm.py        ⚠ non-literal data flows into LLM call
  • AWS Bedrock · us.anthropic.claude-sonnet-4 · backend/app/ai/llm.py

API ENDPOINTS
  • GET   /customers/{customer_id}     ⚠ object-id in path (BOLA candidate)
  • PATCH /customers/{customer_id}     ⚠ object-id in path (BOLA candidate)
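
The "object-id in path" marker in the API ENDPOINTS block can be approximated with a simple check on path templates. The sketch below is illustrative only and is not ai-surface's detector: the function name `object_id_params` and the heuristic (a templated segment whose name ends in `id`) are assumptions.

```python
import re

# Hypothetical heuristic: a templated segment such as {customer_id} whose
# name ends in "id" is treated as an object identifier worth a BOLA review.
_PATH_PARAM = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def object_id_params(path: str) -> list[str]:
    """Return the path parameters that look like object identifiers."""
    return [name for name in _PATH_PARAM.findall(path) if name.lower().endswith("id")]


for route in ("/customers/{customer_id}", "/health"):
    params = object_id_params(route)
    note = "object-id in path (BOLA candidate)" if params else "no object-id segment"
    print(f"{route}: {note}")
```

A match only marks the route for review. Whether the handler actually checks ownership of the object cannot be read from the path alone.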

First run on a mature repo

The first run maps the AI surfaces already present in the codebase. The pattern that scales is to baseline existing surfaces, then gate only new high-risk findings in pull requests:

ai-surface scan . --update-baseline           # 1. snapshot today's inventory
ai-surface scan . --baseline                  # 2. show only what changed
ai-surface scan . --baseline --fail-on high   # 3. in CI, fail only on NEW high+ risk

--baseline --fail-on high is the recommended PR gate: low-noise, non-blocking on pre-existing debt, and actionable.
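
The idea behind the baseline gate can be sketched in a few lines. This is a minimal model under stated assumptions, not the tool's implementation: the `Finding` shape, the identity key (surface plus flag), and the severity ranking are all hypothetical, and only the severities named in this README are included.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ordering of the severities this README mentions.
SEVERITY_RANK = {"medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Finding:
    surface: str             # e.g. "payments-mcp"
    flag: str                # e.g. "financial-action"
    severity: Optional[str]  # None for inventory-only findings


def new_findings(baseline: list, current: list) -> list:
    """Return findings in `current` that the baseline has not seen."""
    seen = {(f.surface, f.flag) for f in baseline}
    return [f for f in current if (f.surface, f.flag) not in seen]


def gate(baseline: list, current: list, fail_on: str = "high") -> int:
    """Return 1 if any NEW finding meets the threshold, else 0."""
    threshold = SEVERITY_RANK[fail_on]
    blocking = [
        f for f in new_findings(baseline, current)
        if f.severity is not None and SEVERITY_RANK[f.severity] >= threshold
    ]
    return 1 if blocking else 0
```

In this model, pre-existing findings and inventory-only findings never block, which is the behavior the recommended PR gate is meant to give.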

GitHub Action and CI gating

Drop this into .github/workflows/ai-surface.yml:

name: AI Surface Check
on: [pull_request]

permissions:
  contents: read
  pull-requests: write   # required when comment-on-pr is true

jobs:
  ai-surface:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # required for base-vs-head diff
      - uses: apisec-inc/AI-Surface@v1
        with:
          path: '.'
          comment-on-pr: 'true'
          fail-on: 'high'  # fail only on NEW high-or-critical findings

Every PR gets a sticky comment showing what changed in the PR, not just the current repo state. fail-on gates on assessed severity, so inventory-only findings do not fail the build. With fail-on: high, the build fails only when the PR introduces a new high-or-critical finding.

No API keys are required. The action uses the built-in GITHUB_TOKEN to post or update the PR comment.

For non-GitHub CI, the gate is just an exit code:

ai-surface scan . --fail-on high
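
For pipelines driven from Python, the same gate can be wrapped in a small helper. This is a sketch that assumes the usual convention of exit code 0 for pass and non-zero for fail; the helper name `run_gate` is hypothetical, and the specific non-zero values the CLI returns are not documented here.

```python
import subprocess


def run_gate(command: list[str]) -> bool:
    """Run a gate command and report whether it passed (exit code 0)."""
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0


# Example call, using the command from the text above (needs ai-surface on PATH):
#   passed = run_gate(["ai-surface", "scan", ".", "--fail-on", "high"])
```

Passing `check=False` keeps a failed gate from raising, so the caller decides how to report it.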

See docs/CI_INTEGRATION.md for permissions, fork PR behavior, baseline options, SARIF upload, policy files, and multi-repo rollups.

Open the local UI

After installing ai-surface, you can open the interactive AI attack-surface map from any repo:

ai-surface scan . --ui

The UI shows detected AI surfaces as nodes grouped by category, with risk indicators, governance badges, evidence, and AI-BOM download where available.

The UI is served locally over 127.0.0.1 from a temporary directory.

The analysis never runs in the browser, your source code never leaves your machine, and no telemetry is collected.

To stop the local UI server, press Ctrl-C in the terminal.

What it detects

ai-surface looks for eight categories of AI surface. Configuration files, provider keys, manifests, and specs are detected across stacks; deeper code-level detection is strongest today for Python and TypeScript/JavaScript. See docs/LANGUAGE_SUPPORT.md for the full matrix.

Category	Coverage	What it finds
Agent frameworks	10 Python + 6 JS/TS frameworks	LangChain, LangGraph, CrewAI, LlamaIndex, AutoGen, Haystack, Semantic Kernel, Pydantic AI, AWS Strands, LangChain.js, LangGraph.js, Vercel AI SDK, Mastra, OpenAI Agents, and LlamaIndex.ts. Extracts agent tool inventories and flags financial, destructive, and high-blast-radius authority.
MCP servers	Discovery + deep-dive audit	Configured MCP servers such as `.mcp.json` entries and in-house source servers. Audited findings include risk flags, remediation, detected secrets by name/type only, and registry/trust signals.
Vector stores / RAG	13 stores + 2 frameworks	Pinecone, Weaviate, Chroma, Qdrant, Milvus, FAISS, LanceDB, pgvector, Elasticsearch/OpenSearch/Vespa/Redis in vector mode, plus LangChain and LlamaIndex retrieval pipelines. Flags managed-store egress, RAG data flow, embeddings, and external ingestion.
LLM SDK call sites	13 providers	Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, Google Generative AI, Vertex AI, Together, Mistral, Cohere, Replicate, Groq, LiteLLM, and Vercel AI SDK. Extracts models where visible and flags non-literal prompt/message flow.
API endpoints	HTTP/REST + OpenAPI	OpenAPI/Swagger specs and framework routes including FastAPI, Starlette, Flask, Express, Spring, and Django. Captures method, path, framework, auth style, and object-id path segments that may need BOLA review.
Model gateways	Configs + source	LiteLLM proxy, Portkey, Helicone, Cloudflare AI Gateway, and OpenRouter. Captures routed-model inventories where visible.
AI infrastructure	Manifests + IaC	Kubernetes, Helm, Docker Compose, Dockerfiles, and Terraform for AI runtimes and managed AI services such as Ollama, vLLM, TGI, SGLang, Triton, llama.cpp, Bedrock, SageMaker, and Vertex endpoints.
AI provider keys	Names only	Common provider key names such as `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `AZURE_OPENAI_*` across environment files and config. Values are never read or printed.

Inventory categories do not get severity by default. Severity is assigned only when the deep-dive audit has enough evidence, currently in MCP, agent, and RAG findings. See docs/DETECTORS.md for every pattern matched.
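
To illustrate the names-only approach in the AI provider keys row, here is a minimal sketch that reports matching variable names from dotenv-style text and discards everything to the right of the `=`. The pattern list and the function name `provider_key_names` are assumptions modeled on the examples in the table, not the scanner's actual rules.

```python
import re

# Hypothetical name patterns, based on the key names listed in the table.
KEY_NAME_PATTERNS = [
    re.compile(r"^OPENAI_API_KEY$"),
    re.compile(r"^ANTHROPIC_API_KEY$"),
    re.compile(r"^AZURE_OPENAI_\w+$"),
]


def provider_key_names(env_text: str) -> list[str]:
    """Return matching variable names; values are never stored or returned."""
    names = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Keep only the left-hand side; the value is dropped here.
        name = line.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        if any(pattern.match(name) for pattern in KEY_NAME_PATTERNS):
            names.append(name)
    return names
```

Because only the name survives the split, nothing downstream of this function can print or log a secret value.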

Proven on real code

We tested ai-surface against 19 popular open-source AI projects on GitHub, representing about 941k combined stars at the time of analysis. The set included AutoGPT, Dify, RAGFlow, AutoGen, CrewAI, LlamaIndex, Continue, Danswer, and others.

This was a static, offline analysis only: each repo was shallow-cloned, analyzed locally, and deleted. No application was run. No code left the host.

The set split into 12 applications and 7 framework/library repos. Frameworks and libraries are reported separately because they often include many integrations as code paths, so their raw component counts are not comparable to applications.

Across the 12 application repos, ai-surface found:

Signal	Apps
Ship AI agents	83%
Have a vector store / RAG layer	83%
Expose API endpoints	83%
Have API endpoints that may need BOLA review	67%
Expose MCP servers	42%
Run an agent/MCP surface with no observability wired	33%
Interpolate PII into prompts	17%
Trip at least one risk and one governance rule	100%

These are category-presence signals, not exploitability claims. The numbers are a floor, not a ceiling: tool resolution is regex/AST-light today, so some agent and tool risks can be under-reported on larger platforms. Raw per-component counts are useful for investigation but can vary by framework, coding style, and detector coverage.
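
With only 12 application repos, each percentage in the table corresponds to a small whole number of apps. The short calculation below recovers those counts, assuming each figure is a share of 12 rounded to the nearest whole percent; the counts are inferred from the published percentages and are not taken from the report itself.

```python
APPS = 12  # application repos in the study

# Percentages as listed in the table above.
SIGNALS = {
    "Ship AI agents": 83,
    "Have a vector store / RAG layer": 83,
    "Expose API endpoints": 83,
    "Have API endpoints that may need BOLA review": 67,
    "Expose MCP servers": 42,
    "Run an agent/MCP surface with no observability wired": 33,
    "Interpolate PII into prompts": 17,
    "Trip at least one risk and one governance rule": 100,
}


def counts_for(percent: int, total: int = APPS) -> list[int]:
    """Return every count k whose share k/total rounds to `percent`."""
    return [k for k in range(total + 1) if round(100 * k / total) == percent]


for signal, pct in SIGNALS.items():
    print(f"{signal}: {pct}% -> {counts_for(pct)} of {APPS}")
```

Each percentage maps to exactly one count under this assumption, so a one-app change moves any row by roughly eight points.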

Full methodology, per-app appendix, framework/library appendix, and caveats are in the State of AI Surface report.

Output formats

ai-surface scan .                      # terminal report
ai-surface scan . --ui                 # interactive map in a local browser
ai-surface scan . --output json        # machine-readable JSON (schema 1.0)
ai-surface scan . --output markdown    # human-readable Markdown report
ai-surface scan . --output cyclonedx   # CycloneDX AI-BOM
ai-surface scan . --output sarif       # SARIF 2.1.0 for GitHub code scanning
ai-surface scan . --write-inventory    # write .ai-inventory.md to the project root
ai-surface scan . --quiet              # one-line CI summary

CycloneDX output is the AI-BOM: an inventory artifact generated in CI the way teams already generate SBOMs, with AI governance mappings attached.

SARIF output can be uploaded to GitHub code scanning for Security tab visibility and inline PR annotations.
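
Because SARIF 2.1.0 is a standard JSON layout, a report can also be summarized with a few lines of Python before upload. This sketch relies only on the generic SARIF structure (`runs[].results[].level`); which rule IDs and levels ai-surface emits is not shown in this README, so the function makes no assumption about them.

```python
import json
from collections import Counter


def sarif_levels(sarif_text: str) -> Counter:
    """Count results per SARIF level across all runs in a report."""
    doc = json.loads(sarif_text)
    counts: Counter = Counter()
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            # Simplification: fall back to "warning" when no level is given.
            # The spec also consults the rule's default configuration.
            counts[result.get("level", "warning")] += 1
    return counts
```

To use it, write the SARIF output to a file, read the file as text, and pass that text to `sarif_levels`.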

The --ui viewer serves over 127.0.0.1 from a throwaway temp directory; the analysis never runs in the browser, nothing is sent off your machine, and there is no telemetry.

CLI reference

# Map the current project
ai-surface scan .

# Open the local interactive map
ai-surface scan . --ui

# Produce machine-readable or human-readable output
ai-surface scan . --output json
ai-surface scan . --output markdown
ai-surface scan . --output cyclonedx
ai-surface scan . --output sarif

# Filter to specific categories
# aliases: mcp, agents, llm, gateway, infra, keys, api, vector
ai-surface scan . --categories mcp,agents
ai-surface scan . --categories vector

# Gate CI by assessed severity
ai-surface scan . --fail-on high       # fail on high or critical findings
ai-surface scan . --fail-on critical   # fail only on critical findings

# Aggressive gate: fail on any risk indicator
ai-surface scan . --fail-on-risk

# Baseline existing AI surfaces, then show only what changed
ai-surface scan . --update-baseline
ai-surface scan . --baseline
ai-surface scan . --baseline --fail-on high

# Compare two reports
ai-surface compare base.json head.json

Compliance and governance

ai-surface maps audited findings to the OWASP LLM Top 10 and to evidence-relevant clauses in the EU AI Act, NIST AI RMF, and ISO/IEC 42001.

The UI shows these mappings as badges. JSON output carries them as structured standards fields. CycloneDX output carries them as component properties, making it your AI-BOM artifact.

ai-surface produces evidence; it does not certify, attest, or assert compliance. A framework requirement is reported only when the analysis produced that kind of evidence.
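
Since the CycloneDX output carries mappings as component properties, an auditor can pull them out with the generic CycloneDX JSON layout (`components[].properties[]` with `name` and `value`). The sketch below assumes only that layout. The property names ai-surface writes are not listed in this README, so none are hard-coded.

```python
import json


def component_properties(bom_text: str) -> dict[str, dict[str, str]]:
    """Map each component name to its properties as a name -> value dict."""
    doc = json.loads(bom_text)
    result: dict[str, dict[str, str]] = {}
    for component in doc.get("components", []):
        # CycloneDX allows repeated property names; this sketch keeps the last.
        props = {
            prop["name"]: prop.get("value", "")
            for prop in component.get("properties", [])
        }
        result[component.get("name", "<unnamed>")] = props
    return result
```

The result is a plain dictionary, which makes it straightforward to filter components by whichever property an audit cares about.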

What the mappings cover

Evidence kind	Examples	Framework use
Inventory	agents, MCP servers, RAG/vector stores, LLM calls, gateways, AI infra	AI-BOM, system documentation, governance inventory
Risk	financial tools, destructive tools, high-blast-radius agents, secret-bearing MCP config	risk review and assessment evidence
Human oversight	high-risk action with no detected approval gate	EU AI Act Art. 14 review signal
Logging / monitoring	agent or MCP execution surface with no detected tracing	EU AI Act Art. 12, NIST MEASURE 3, ISO A.6.2.6 evidence
Data governance	RAG/vector layer, PII interpolated into prompts	EU AI Act Art. 10, ISO A.7 evidence

Example risk-flag mappings

Risk flag	OWASP	EU AI Act	NIST AI RMF	ISO/IEC 42001
`secrets-in-env`	LLM02	Art. 15	-	-
`financial-action` / `destructive-action` / `high-blast-radius`	LLM06	Art. 9	-	-
`no-human-oversight`	LLM06 / LLM09	Art. 14	-	-
`no-observability`	-	Art. 12	MEASURE 3	A.6.2.6
`pii-to-llm`	LLM02	Art. 10	-	A.7
`unverified-source` / `remote-mcp` / `local-binary`	LLM03	-	-	A.10
vector store / RAG present	LLM08	Art. 10	data	A.7
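
For teams that post-process findings, the risk-flag rows of the table above can be held as a simple lookup. This sketch transcribes those rows as printed; the `Mapping` type and `evidence_for` helper are hypothetical names, and the "vector store / RAG present" row is left out because it describes a surface rather than a risk flag.

```python
from typing import NamedTuple, Optional, Tuple


class Mapping(NamedTuple):
    owasp: Tuple[str, ...]
    eu_ai_act: Optional[str]
    nist_ai_rmf: Optional[str]
    iso_42001: Optional[str]


_AUTHORITY = Mapping(("LLM06",), "Art. 9", None, None)
_SUPPLY_CHAIN = Mapping(("LLM03",), None, None, "A.10")

RISK_FLAG_MAPPINGS = {
    "secrets-in-env": Mapping(("LLM02",), "Art. 15", None, None),
    "financial-action": _AUTHORITY,
    "destructive-action": _AUTHORITY,
    "high-blast-radius": _AUTHORITY,
    "no-human-oversight": Mapping(("LLM06", "LLM09"), "Art. 14", None, None),
    "no-observability": Mapping((), "Art. 12", "MEASURE 3", "A.6.2.6"),
    "pii-to-llm": Mapping(("LLM02",), "Art. 10", None, "A.7"),
    "unverified-source": _SUPPLY_CHAIN,
    "remote-mcp": _SUPPLY_CHAIN,
    "local-binary": _SUPPLY_CHAIN,
}


def evidence_for(flag: str) -> list[str]:
    """Return printable framework references for one risk flag."""
    m = RISK_FLAG_MAPPINGS[flag]
    labels = []
    if m.owasp:
        labels.append("OWASP: " + ", ".join(m.owasp))
    if m.eu_ai_act:
        labels.append("EU AI Act " + m.eu_ai_act)
    if m.nist_ai_rmf:
        labels.append("NIST AI RMF " + m.nist_ai_rmf)
    if m.iso_42001:
        labels.append("ISO/IEC 42001 " + m.iso_42001)
    return labels


print(evidence_for("no-human-oversight"))
```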

The reported footprint is a floor, not a ceiling: static analysis can miss risks in code that builds tools dynamically or through factory functions.

Full mapping details, caveats, and AI-BOM generation guidance are in docs/COMPLIANCE.md.

How it works

ai-surface maps AI surfaces from source code and configuration. It reads files, matches known AI-surface signatures, runs deep-dive audits where enough evidence exists, attaches governance mappings, and produces reports for developers, CI systems, and security teams.

flowchart TD
    A[Developer writes AI code] --> B[Opens PR]
    B --> C[CI runs ai-surface]
    C --> D[Read repo files locally]
    D --> E[Detect AI surfaces<br/>agents · MCP · RAG · LLMs · APIs]
    E --> F[Audit risk indicators<br/>+ map governance evidence]
    F --> G[Compare base vs head<br/>+ attach AI-BOM]
    G --> H{DevOps / security reviewer}
    H --> I[Request changes]
    H --> J[Merge]

    style E fill:#00a99d,stroke:#0b6b63,stroke-width:2px,color:#ffffff
    style F fill:#7c3aed,stroke:#4c1d95,stroke-width:2px,color:#ffffff
    style H fill:#d97706,stroke:#7c4a14,stroke-width:2px,color:#ffffff
    style J fill:#16a34a,stroke:#0f7a37,stroke-width:2px,color:#ffffff

The CLI executes no code, uses no credentials, sends no telemetry, and makes no network calls.

When PR comments are enabled, the GitHub Action uses the repository's GITHUB_TOKEN to post or update a comment through the GitHub API. It does not send source code, findings, or metadata to APIsec.

Deep dive: docs/ARCHITECTURE.md.

Comparison with adjacent tools

Tool	What it tells you	When it sees AI
SAST (Semgrep, CodeQL)	Code-pattern vulnerabilities	After commit; usually does not build an AI surface inventory
DAST (Burp, ZAP)	Reachable web vulnerabilities	After deploy; sees HTTP behavior, not LLM/agent internals
SCA (Snyk, Dependabot)	Vulnerable dependencies	After commit; sees packages, not how AI components are used
Observability (Helicone, LangSmith, Arize, Phoenix)	What LLM calls happened at runtime	After deploy; requires runtime traffic
AI-SPM / AI governance tools	Cloud/runtime AI inventory and posture	Often runtime/cloud-first; not usually a local PR-time gate
`ai-surface`	What AI attack surface is about to ship, mapped to governance evidence	At PR time, before merge, offline
APIsec platform	Which AI/API surfaces are actually exploitable	Runtime validation with replayable evidence

ai-surface does not replace these tools. It focuses on the local, PR-time AI attack-surface gap that most adjacent tools do not cover directly.

What it does not do

Runtime telemetry or behavior monitoring. Use tools like Helicone, LangSmith, Arize, or Phoenix.
Runtime exploit validation. ai-surface maps and audits statically; it does not prove exploitability against a running app (see Runtime validation).
Prompt injection, jailbreak, bias, or accuracy testing. Out of scope by design. ai-surface is a structural analyzer, not a model evaluator.
Full cross-file dataflow for tool resolution. Regex/AST-light today; agent tools built through factory functions may not be fully resolved. Treat the map as a strong floor, not proof of complete coverage. AST/dataflow is the top roadmap item.
Secret-value reads or PII classification. Secrets are reported by name and type only, with values redacted. Use a dedicated secret scanner for value-level coverage.

Roadmap

Version	Status	What's in it
v1.0	Shipped	8-category mapping, MCP + agent + RAG audits, OWASP + EU/NIST/ISO governance mapping, AI-BOM + SARIF, interactive `--ui` map, frozen schema 1.0, GitHub Action with PR diff comments, `--baseline` and `--fail-on` gates.
Fast-follow	Planned	AST / cross-file dataflow for tool resolution, `.ai-surface.yml` policy file, GitLab CI component.
Later	Planned	kubectl plugin, live cluster discovery, continuous mode + drift alerts, multi-repo rollup, plugin SDK.

Runtime validation

ai-surface maps AI attack surfaces from source code and configuration. It identifies where agents, MCP servers, RAG paths, LLM calls, gateways, infrastructure, provider keys, and AI-exposed APIs exist, and flags risk indicators where static evidence is present.

It does not prove exploitability against a running application.

For runtime validation with replayable evidence, see APIsec:

Source surface	Runtime validation path
AI / agent surfaces	agent validation
MCP servers	MCP runtime validation
Discovered APIs	API outside-in runtime testing

The boundary is intentional: free local discovery here, runtime exploit validation in APIsec. Bridges are an upgrade path, not a data-sharing integration. No finding data leaves your machine; the bridge is a deep link.

Development

git clone https://github.com/apisec-inc/AI-Surface
cd AI-Surface
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest                       # tests
ruff check src/ tests/       # lint
mypy src/                    # types

To add a detector, implement the Detector protocol in types.py, register it in default_detectors(), and add fixtures and tests under tests/.
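
As a rough picture of that workflow, here is a self-contained sketch of a protocol-based detector and its registration. Everything in it is hypothetical: the real `Detector` protocol in types.py may have a different signature, and `Surface`, `OllamaComposeDetector`, and `scan_text` exist only for this example. Read the source before writing a real detector.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Surface:
    """Hypothetical result type for one detected AI surface."""
    category: str
    name: str
    path: str


class Detector(Protocol):
    """Hypothetical shape; the project's real protocol lives in types.py."""
    category: str

    def detect(self, path: str, text: str) -> Iterable[Surface]: ...


class OllamaComposeDetector:
    """Example detector: flags Compose files that reference the Ollama image."""
    category = "infra"

    def detect(self, path: str, text: str) -> Iterable[Surface]:
        is_compose = path.endswith(("docker-compose.yml", "compose.yaml"))
        if is_compose and "ollama/ollama" in text:
            yield Surface(self.category, "Ollama", path)


def default_detectors() -> list[Detector]:
    # Registration step: new detectors are appended to this list.
    return [OllamaComposeDetector()]


def scan_text(path: str, text: str) -> list[Surface]:
    """Run every registered detector over one file's contents."""
    return [s for d in default_detectors() for s in d.detect(path, text)]
```

Using `typing.Protocol` means a detector needs no base class, only the right attributes, which keeps fixtures and tests for each detector small.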

The report shape is frozen in docs/SCHEMA_v1.md. See CONTRIBUTING.md for contributor guidance.

Project

Resource	Link
Detectors	docs/DETECTORS.md
Compliance mapping	docs/COMPLIANCE.md
Language support	docs/LANGUAGE_SUPPORT.md
Architecture	docs/ARCHITECTURE.md
CI integration	docs/CI_INTEGRATION.md
Report schema	docs/SCHEMA_v1.md
State of AI Surface	docs/STATE_OF_AI_SURFACE.md
Privacy	docs/PRIVACY.md
Changelog	CHANGELOG.md

License

MIT. See LICENSE.

Maintained by APIsec. Part of the APIsec Labs OSS family.
