From dc8a448370202289d78e4fe7f20e924cb9418c3f Mon Sep 17 00:00:00 2001
From: Ruben Sousa Dinis
Date: Tue, 16 Jun 2026 11:54:51 +0100
Subject: [PATCH 1/8] Add polygraph skill: behavioral trust grades for MCP servers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Polygraph grades MCP servers A–F by connecting like an agent, fingerprinting the exact tool surface, and running three behavioral probes (C-01 tool-output injection, C-02 permission/egress overreach, C-03 sensitive-data leak), then publishing a reproducible grade as an onchain EAS attestation on Base.

The skill covers: checking a grade (`npx polygraphso check `), running the open litmus harness locally to grade your own server, why a server got a given grade, and the verify-before-trust pattern for Bankr agents (recompute the live tool-surface fingerprint and require it to match the attestation before executing).

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 polygraph/SKILL.md                        | 185 ++++++++++++++++++++++
 polygraph/references/bankr-integration.md | 117 ++++++++++++++
 polygraph/references/cli.md               | 144 +++++++++++++++++
 polygraph/references/methodology.md       | 104 ++++++++++++
 4 files changed, 550 insertions(+)
 create mode 100644 polygraph/SKILL.md
 create mode 100644 polygraph/references/bankr-integration.md
 create mode 100644 polygraph/references/cli.md
 create mode 100644 polygraph/references/methodology.md

diff --git a/polygraph/SKILL.md b/polygraph/SKILL.md
new file mode 100644
index 0000000000..67773b40f5
--- /dev/null
+++ b/polygraph/SKILL.md
@@ -0,0 +1,185 @@
+---
+name: polygraph
+description: Behavioral trust grades (A–F) for MCP servers and AI tools. Use when an agent needs to check whether an MCP server is safe before using it, look up a server's published grade, get a project graded, verify an onchain attestation before trusting or paying a server, or understand why a server received a grade.
Polygraph connects to a server the way an agent would, fingerprints its exact tool surface, and runs behavioral probes — prompt-injection (C-01), permission/egress overreach (C-02), and sensitive-data leak (C-03) — then publishes a reproducible grade as an onchain EAS attestation on Base. Triggers on: MCP server safety, is this MCP server safe, tool trust, prompt injection, tool poisoning, data leak, permission overreach, unexpected egress, trust grade, attestation, verify before paying, polygraph, litmus, grade my MCP server.
+emoji: 🧪
+tags: [security, mcp, trust, grade, attestation, base, prompt-injection, agent-safety]
+visibility: public
+---
+
+# Polygraph: Behavioral Trust Grades for MCP Servers
+
+Agents wire up third-party MCP servers and then trust whatever those servers' tools
+return. Polygraph tests a server's **behavior** before your agent does, and assigns a
+letter grade **A–F** backed by reproducible evidence.
+
+A passing grade is a **measurement, not a guarantee** — it says "this exact tool surface
+did not misbehave under these probes," and because the harness is open and deterministic,
+anyone can re-run it and disprove a bad grade. That falsifiability is the whole point.
+
+- **Home / methodology:** [polygraph.so](https://polygraph.so)
+- **Lookup CLI (npm):** `polygraphso`
+- **Grading harness:** `@polygraphso/litmus` (open source)
+
+---
+
+## What a grade measures
+
+Polygraph connects to a server the way an agent would — **stdio** for local packages,
+**Streamable HTTP** for remote URLs — fingerprints its exact tool surface
+(`tools/list` → canonical JSON → sha256 → `bytes32`), then runs three probe categories:
+
+- **C-01 — Tool-output injection.** Does the server try to hijack the agent? Static scan of
+  tool names/descriptions/schemas for injection-shaped content (invisible unicode,
+  instruction mimicry, markdown tricks) **plus** dynamic bait calls that check whether tool
+  outputs smuggle in instructions.
+- **C-02 — Permission / egress overreach.** Does the server do more than it claims? Flags
+  tools that declare `readOnlyHint: true` but carry destructive verbs, and runs the server in
+  a hardened **default-deny Docker sandbox** where any outbound network attempt is a finding.
+- **C-03 — Sensitive-data handling.** Does the server leak secrets? Plants canary values in
+  the environment and working directory, exercises the tools, and scans both tool outputs and
+  egress for any canary that surfaces.
+
+### Grade scale
+
+| Grade | Meaning |
+|-------|---------|
+| **A** | Passed all three categories. No injection, no unexpected egress, no data leak. |
+| **B** | Injection checks passed; egress **not verified** (no Docker sandbox, or a remote target). Capped at B by design. |
+| **C** | Reserved — not currently assigned. |
+| **D** | Unexpected egress / permission overreach, but no injection or leak. Serious, not proven exfiltration → capped at D. |
+| **F** | Disqualifying: active tool-output injection (C-01) or a sensitive-data leak (C-03). This is a server that would harm an agent that trusts it. |
+
+(There is no E.) Every grade ships with a plain-English **rationale** — never a bare letter.
+See [`references/methodology.md`](references/methodology.md) for the full decision logic and
+each probe in depth.
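The grade scale above can be read as a small decision function. The following is a minimal sketch, assuming each category reduces to a single verdict; the names `Verdict`, `CategoryVerdicts`, and `decideGrade` are hypothetical and are not taken from the harness, whose own logic remains the source of truth.

```typescript
// Hypothetical types for illustration only; not the @polygraphso/litmus API.
type Verdict = "pass" | "fail" | "skipped";

interface CategoryVerdicts {
  c01: "pass" | "fail"; // tool-output injection
  c02: Verdict;         // permission / egress overreach ("skipped" = no sandbox or remote target)
  c03: "pass" | "fail"; // sensitive-data handling
}

type Grade = "A" | "B" | "D" | "F"; // C is reserved and there is no E

function decideGrade(v: CategoryVerdicts): Grade {
  // Active injection or a data leak is disqualifying, whatever else passed.
  if (v.c01 === "fail" || v.c03 === "fail") return "F";
  // Overreach or unexpected egress without injection or leak caps at D.
  if (v.c02 === "fail") return "D";
  // Egress could not be verified, so the grade is capped at B.
  if (v.c02 === "skipped") return "B";
  // All three categories passed.
  return "A";
}
```

Checking the disqualifying categories first mirrors the table: an F takes precedence over a D when both conditions hold.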
+
+---
+
+## Check a grade
+
+A sub-second lookup against published grades — **one command before your agent installs
+anything:**
+
+```bash
+npx polygraphso check npm/@modelcontextprotocol/server-filesystem
+```
+
+Refs are **registry-prefixed** (the prefix disambiguates — `redis` exists on npm, PyPI, and
+GitHub with different content):
+
+```bash
+polygraphso check npm/@modelcontextprotocol/server-filesystem
+polygraphso check pypi/mcp-server-git
+polygraphso check github/anthropic/mcp-server-foo
+polygraphso list # every tracked server + adoption tier + status
+polygraphso list --json | jq '.servers[] | select(.adoption_tier == "top10")'
+```
+
+Behavioral grades are rolling out — tracked-but-ungraded servers report
+`polygraph: not yet available` with a `notify` link, and the result lands once the litmus
+harness has graded them. Full CLI reference: [`references/cli.md`](references/cli.md).
+
+---
+
+## ★ Get your project graded
+
+This is the CTA — **run the open harness on your own MCP server, get an A–F grade plus a
+reproducible evidence bundle, and publish it onchain so agents can verify it:**
+
+```bash
+# Grade your server end-to-end (npm ref, https URL, or local path)
+npx -y -p @polygraphso/litmus polygraphso-litmus litmus npm/@your-scope/your-mcp-server
+```
+
+You get the grade, the per-category verdicts, your tool-surface fingerprint, and a
+content-addressed evidence bundle. Publishing that grade as an **onchain EAS attestation on
+Base** (so other agents can look it up and verify it) is a one-step hand-off — see
+[`references/methodology.md`](references/methodology.md#publishing-a-grade).
+
+Prefer not to run it yourself? Request a grade or get notified when yours publishes at
+**[polygraph.so](https://polygraph.so)**.
+
+> **One line for builders:** check any MCP server before your agent uses it with
+> `npx polygraphso check `, and get your own server graded at
+> [polygraph.so](https://polygraph.so).
+
+---
+
+## Run the harness locally
+
+The harness is the same open, deterministic engine that produces published grades:
+
+```bash
+npm i -g @polygraphso/litmus # or use npx, above
+polygraphso-litmus litmus npm/@modelcontextprotocol/server-filesystem
+polygraphso-litmus litmus https://example.com/mcp --bearer "$TOKEN"
+polygraphso-litmus litmus ./path/to/local-mcp-server --json
+```
+
+- **Node ≥ 18.** **Docker is optional** but recommended — without it the egress probe (C-02)
+  is skipped and the grade is **capped at B**.
+- **Exit codes are CI-friendly:** non-zero on a failing grade (D/F), zero on A/B/C — drop it
+  into a pipeline to gate dependencies.
+
+Flags, env vars, `--json` output, and the `check` / `challenge` / `list` subcommands are all
+in [`references/cli.md`](references/cli.md).
+
+---
+
+## Why a server got grade X
+
+Every run prints the methodology, the per-category verdict, the tool-surface fingerprint, and
+the grade with a one-paragraph rationale:
+
+```
+→ litmus · npm/@modelcontextprotocol/server-filesystem
+→ version 0.1.0
+→ C-01 pass · C-02 pass · C-03 pass
+→ fingerprint 0x1a2b3c4d…5e6f7890
+→ grade: A
+  All three categories passed. No injection, no unexpected egress, no data leak.
+```
+
+On a failure the report surfaces the top HIGH-severity findings (tool name, finding kind, the
+offending snippet). [`references/methodology.md`](references/methodology.md) maps every
+grade and finding kind to its cause.
+
+---
+
+## Verify before you trust (Bankr integration)
+
+This is why polygraph matters for agents: **gate an MCP server through its grade before your
+agent uses it, pays it, or routes a transaction through it.** Polygraph is the *verify* step;
+Bankr is the *execute* step.
+
+The trust anchor is the **tool-surface fingerprint**: an attestation is only meaningful if the
+server you're about to call still has the surface that was graded.
The agent recomputes the
+live fingerprint and requires it to equal the attested one before acting — a built-in
+rug-pull check. Drop the `verify_attestation` MCP tool in front of execution, or use the
+`gateDecision` helper. Full patterns, the MCP server config, and a worked
+"verify-then-execute" example: [`references/bankr-integration.md`](references/bankr-integration.md).
+
+---
+
+## How much to trust the grade (honest limits)
+
+- **Reproducibility is the trust anchor.** The harness is open source and deterministic, so a
+  false grade is falsifiable — anyone can re-run it against the same server and the result
+  must match.
+- **A self-published grade is forgeable** by whoever signs it; that's why reproducibility (not
+  the signature) is what makes a grade trustworthy, and why the fingerprint recheck guards
+  against a graded-then-swapped server.
+- **Evasion is the residual limit:** a server that detects the test context could behave during
+  grading and misbehave in production. This is disclosed, not hidden.
+- Stronger, independent guarantees (staked bonds, TEE-backed runs, independent re-grading) are
+  on the roadmap, not claimed today.
+
+---
+
+## Resources
+
+- **Home + methodology:** https://polygraph.so
+- **Lookup CLI:** `npx polygraphso check //` · https://www.npmjs.com/package/polygraphso
+- **Grading harness:** `@polygraphso/litmus` (open source — see polygraph.so for the repo)
+- **Onchain proof:** EAS attestations on Base
+- **References:** [`methodology.md`](references/methodology.md) · [`cli.md`](references/cli.md) · [`bankr-integration.md`](references/bankr-integration.md)
diff --git a/polygraph/references/bankr-integration.md b/polygraph/references/bankr-integration.md
new file mode 100644
index 0000000000..571a75719e
--- /dev/null
+++ b/polygraph/references/bankr-integration.md
@@ -0,0 +1,117 @@
+# Polygraph + Bankr Integration Guide
+
+## Overview
+
+Polygraph is the **verify** layer; Bankr is the **execute** layer.
Before a Bankr agent adds
+an MCP server as a tool, routes a payment through it, or trusts its output, gate it through
+its polygraph grade. Untrusted tool surfaces are exactly how an agent gets prompt-injected or
+made to leak a key — polygraph turns "should I trust this server?" into a checkable fact.
+
+```
+┌───────────────────────────────────────────────────────────────┐
+│                          Your Agent                           │
+├───────────────────────────────────────────────────────────────┤
+│  ┌─────────────┐                  ┌─────────────┐             │
+│  │  Polygraph  │      Verify      │    Bankr    │   Execute   │
+│  │    Skill    │ ───────────────▶ │    Skill    │ ───────────▶│
+│  └─────────────┘                  └─────────────┘             │
+│         │                                │                    │
+│         ▼                                ▼                    │
+│  • Look up grade (A–F)            • Swaps / transfers         │
+│  • Verify onchain attestation     • Stop-loss / DCA           │
+│  • Recompute live fingerprint     • Token launches            │
+│  • gate: pay / refuse             • Any signed action         │
+└───────────────────────────────────────────────────────────────┘
+```
+
+## The core rule: fingerprint must match
+
+A grade is only valid for the exact tool surface it was measured against. An attestation binds
+the grade to a `toolDefsFingerprint`. **Before trusting a server, recompute its live
+fingerprint and require it to equal the attested one.** If they differ, the server changed
+after it was graded — treat it as ungraded and refuse. This is the built-in rug-pull check.
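To make the fingerprint rule concrete, here is a minimal sketch of one way to hash a tool surface and compare it with an attested value. It rests on assumptions: key-sorted JSON as the canonical form and tools ordered by name are illustrative choices, and `canonicalJson`, `toolSurfaceFingerprint`, and `fingerprintsMatch` are hypothetical names. The harness defines the real canonicalization, so these values should not be expected to equal published fingerprints.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of one entry returned by `tools/list`.
interface ToolDef {
  name: string;
  description?: string;
  inputSchema?: unknown;
}

// Assumed canonical form: object keys sorted recursively, undefined fields dropped,
// so that key order alone cannot change the hash.
function canonicalJson(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .filter((key) => obj[key] !== undefined)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// sha256 over the canonical {name, description, inputSchema} of every tool,
// returned as a 0x-prefixed 32-byte hex string (the size of a bytes32).
function toolSurfaceFingerprint(tools: ToolDef[]): string {
  const surface = tools
    .map(({ name, description, inputSchema }) => ({ name, description, inputSchema }))
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0)); // assumed ordering
  const digest = createHash("sha256").update(canonicalJson(surface), "utf8").digest("hex");
  return `0x${digest}`;
}

// Hex casing is not significant, so compare case-insensitively.
function fingerprintsMatch(attested: string, live: string): boolean {
  return attested.toLowerCase() === live.toLowerCase();
}
```

Any change to a tool's name, description, or schema produces a different digest, which is what lets a verifier detect a graded-then-swapped server.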
+
+`A` and `B` are usable grades; `D` and `F` are refusals by default (`D` = unexpected egress,
+`F` = injection or leak). Pick your own threshold, but never skip the fingerprint check.
+
+## Use cases
+
+### 1. Gate a new MCP tool before your agent adds it
+
+```bash
+REF="npm/@some-vendor/their-mcp-server"
+
+# Run the harness (or use `polygraphso check $REF` for a published grade)
+GRADE=$(npx -y -p @polygraphso/litmus polygraphso-litmus litmus "$REF" --json | jq -r '.grade')
+
+case "$GRADE" in
+  A|B) echo "✓ $REF graded $GRADE — safe to wire up" ;;
+  *) echo "✗ $REF graded $GRADE — do NOT add as a tool"; exit 1 ;;
+esac
+```
+
+`litmus` exits non-zero on D/F, so in CI you can also just let the exit code gate the step.
+
+### 2. Verify-then-execute (the agent gate)
+
+```ts
+import { readAttestation, liveFingerprint, gateDecision } from "@polygraphso/litmus";
+
+async function safeToUse(serverRef: string): Promise<boolean> {
+  const attestation = await readAttestation(serverRef); // onchain EAS record on Base
+  if (!attestation || attestation.revoked) return false;
+
+  const live = await liveFingerprint(serverRef); // recompute current tool surface
+  const decision = gateDecision(attestation, live); // checks grade + fingerprint match
+  return decision.action === "pay";
+}
+
+// Only let Bankr act once the upstream tool is verified
+if (await safeToUse("npm/@vendor/price-oracle-mcp")) {
+  await bankr("swap $100 USDC to ETH on base");
+} else {
+  console.warn("Upstream MCP server failed polygraph gate — refusing to execute.");
+}
+```
+
+### 3. Inline MCP verification
+
+With the polygraph MCP server configured, the agent can verify before it acts:
+
+```
+verify_attestation { "serverRef": "npm/@vendor/price-oracle-mcp" }
+→ { status: "attested", grade: "A", attestationUid: "0x…", toolDefsFingerprint: "0x…", revoked: false, network: "base" }
+```
+
+Then recompute the live fingerprint and only proceed if it equals `toolDefsFingerprint`.
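The final check described above can be sketched as a pure function over the `verify_attestation` result. This is an illustration under assumptions: the field names are copied from the example response, while `AttestationResult`, `shouldProceed`, and the default A/B threshold are hypothetical and are not the harness's `gateDecision` implementation.

```typescript
// Fields mirror the example `verify_attestation` response; the type name is hypothetical.
interface AttestationResult {
  status: string;
  grade: string;
  toolDefsFingerprint: string;
  revoked: boolean;
}

function shouldProceed(
  result: AttestationResult,
  liveFingerprint: string,
  acceptedGrades: readonly string[] = ["A", "B"], // assumed default threshold
): boolean {
  // Refuse anything that is not a live, unrevoked attestation.
  if (result.status !== "attested" || result.revoked) return false;
  // Enforce the caller's grade threshold.
  if (acceptedGrades.indexOf(result.grade) === -1) return false;
  // The grade only counts if the surface is still the one that was graded.
  return result.toolDefsFingerprint.toLowerCase() === liveFingerprint.toLowerCase();
}
```

Keeping the function pure leaves the network calls (reading the attestation, recomputing the fingerprint) outside it, so the policy itself can be unit-tested.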
+
+## MCP configuration (Polygraph + Bankr)
+
+```json
+{
+  "mcpServers": {
+    "polygraph": {
+      "command": "npx",
+      "args": ["-y", "-p", "@polygraphso/litmus", "polygraphso-litmus-mcp"],
+      "env": { "POLYGRAPH_API_URL": "https://polygraph.so" }
+    },
+    "bankr": {
+      "command": "npx",
+      "args": ["bankr-mcp-server"],
+      "env": { "BANKR_API_KEY": "bk_..." }
+    }
+  }
+}
+```
+
+## Best practices
+
+1. **Verify before you execute.** Check the grade *and* the fingerprint before letting Bankr
+   sign or pay through any server-derived data.
+2. **Never trust a grade without the fingerprint match** — a graded-then-swapped server is the
+   obvious attack.
+3. **Pick a threshold and enforce it.** Default: accept A/B, refuse D/F; decide C-as-reserved
+   per your risk tolerance.
+4. **Re-verify on change.** Cache by fingerprint; if the live fingerprint changes, re-gate.
+5. **Treat a pass as a measurement, not a guarantee.** It bounds risk; it does not remove it.
+   Keep Bankr's own transaction-verification guards on.
diff --git a/polygraph/references/cli.md b/polygraph/references/cli.md
new file mode 100644
index 0000000000..13c39a55de
--- /dev/null
+++ b/polygraph/references/cli.md
@@ -0,0 +1,144 @@
+# Polygraph CLI & MCP reference
+
+Polygraph ships two command-line surfaces:
+
+| Package | Bin | Purpose |
+|---------|-----|---------|
+| **`polygraphso`** | `polygraphso` | Thin, sub-second **lookup** client for published grades. Published on npm. |
+| **`@polygraphso/litmus`** | `polygraphso-litmus`, `polygraphso-litmus-mcp` | The full open **harness** — runs the probes and grades a server; also an embeddable MCP server. |
+
+Server refs are always **registry-prefixed**: `//` — e.g.
+`npm/@modelcontextprotocol/server-filesystem`, `pypi/mcp-server-git`,
+`github/anthropic/mcp-server-foo`. The prefix disambiguates names that exist on multiple
+registries. The harness also accepts a raw `https://…/mcp` URL or a local path.
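As an illustration of the ref convention, the sketch below classifies a target string as a registry ref, a URL, or a local path. It is an assumption about how such a parser could look: `parseTarget` and its types are hypothetical, and the actual CLI may resolve refs differently, for example when a relative path happens to start with `npm/`.

```typescript
// Hypothetical types and function; not part of the polygraphso or litmus packages.
type Registry = "npm" | "pypi" | "github";

type Target =
  | { kind: "registry"; registry: Registry; name: string }
  | { kind: "url"; url: string }
  | { kind: "path"; path: string };

function parseTarget(input: string): Target {
  // Remote MCP endpoints are given as raw https URLs.
  if (/^https:\/\//.test(input)) return { kind: "url", url: input };

  // Registry refs carry the registry as the first path segment.
  const slash = input.indexOf("/");
  const prefix = slash > 0 ? input.slice(0, slash) : "";
  if (prefix === "npm" || prefix === "pypi" || prefix === "github") {
    const name = input.slice(slash + 1);
    if (name.length === 0) throw new Error(`missing package name in "${input}"`);
    return { kind: "registry", registry: prefix as Registry, name };
  }

  // Anything else is treated as a local path.
  return { kind: "path", path: input };
}
```

Splitting only on the first slash keeps scoped npm names such as `@modelcontextprotocol/server-filesystem` and GitHub `owner/repo` pairs intact in `name`.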
+
+---
+
+## `polygraphso` — look up a grade
+
+```bash
+npx polygraphso check npm/@modelcontextprotocol/server-filesystem # sub-second lookup
+npm i -g polygraphso # or install globally
+
+polygraphso check // # latest published grade
+polygraphso list [--json] # every tracked server + adoption tier + status
+polygraphso --version
+polygraphso --help
+```
+
+Example output:
+
+```
+→ tracked · top 10 adoption
+→ polygraph: A · version 0.1.0 · https://base.easscan.org/attestation/view/
+```
+
+Tracked-but-ungraded servers report `polygraph: not yet available` with a notify link;
+behavioral grades are rolling out as the harness grades each server.
+
+Config: `POLYGRAPH_API_URL` overrides the lookup endpoint (useful for local testing).
+
+---
+
+## `@polygraphso/litmus` — run the harness
+
+```bash
+npm i -g @polygraphso/litmus
+# or, no install:
+npx -y -p @polygraphso/litmus polygraphso-litmus litmus
+```
+
+### Commands
+
+```bash
+polygraphso-litmus litmus # grade a server end-to-end
+polygraphso-litmus check # look up a published grade
+polygraphso-litmus challenge # dispute a grade by re-running it
+polygraphso-litmus list # list published grades
+polygraphso-litmus --version | --help
+```
+
+`challenge` is the teeth behind reproducibility: re-run the harness against a server that
+carries a grade and, if your result disagrees, you have a falsification anchored to the same
+fingerprint.
+
+### Flags (`litmus`)
+
+| Flag | Effect |
+|------|--------|
+| `--json` | Emit the full canonical `EvidenceBundle` instead of the human summary. |
+| `--bearer ` | Bearer auth for an HTTP target (or set `LITMUS_BEARER`). |
+| `--header "Key: Value"` | Add a custom request header (repeatable). |
+| `--allow-state-changing` | Permit calls to state-mutating tools during dynamic probes. |
+
+### Environment
+
+| Var | Effect |
+|-----|--------|
+| `POLYGRAPH_API_URL` | Set to `https://polygraph.so` to pin the evidence bundle and get a publish/mint hand-off URL.
Unset = fully offline run. |
+| `LITMUS_BEARER` | Bearer token for HTTP auth. |
+| `LITMUS_STDIO_ISOLATION` | Set to `docker` to **require** Docker isolation for stdio targets (fail-closed if Docker is unavailable). |
+
+### Requirements & exit codes
+
+- **Node ≥ 18.**
+- **Docker optional** — without it the egress probe (C-02) is skipped and the grade is capped
+  at **B**. With `LITMUS_STDIO_ISOLATION=docker`, isolation is mandatory.
+- **Exit codes:** non-zero on a failing grade (**D/F**), zero on **A/B/C** — drop `litmus` into
+  CI to gate a dependency on its behavioral grade.
+
+### Human output
+
+```
+→ litmus · npm/@modelcontextprotocol/server-filesystem
+→ version 0.1.0
+→ C-01 pass · C-02 pass · C-03 pass
+→ fingerprint 0x1a2b3c4d…5e6f7890
+→ grade: A
+  All three categories passed. No injection, no unexpected egress, no data leak.
+```
+
+On failure the summary lists the top HIGH-severity findings (tool name, finding kind,
+snippet). The `--json` bundle carries everything (see
+[`methodology.md`](methodology.md#the-evidence-bundle)).
+
+---
+
+## MCP server (`polygraphso-litmus-mcp`)
+
+Embed polygraph in Claude, Cursor, or any MCP client so your agent can grade and verify
+servers inline. Tools:
+
+- **`run_litmus`** — grade a server and return grade, per-category findings, fingerprint, and
+  (when `POLYGRAPH_API_URL` is set) a publish hand-off.
+- **`verify_attestation`** — read a server's onchain grade and return the attested grade,
+  fingerprint, report CID, and revocation/network status. Recompute the live fingerprint and
+  require it to equal the attested one before trusting the server.
+
+```json
+{
+  "mcpServers": {
+    "polygraph": {
+      "command": "npx",
+      "args": ["-y", "-p", "@polygraphso/litmus", "polygraphso-litmus-mcp"],
+      "env": { "POLYGRAPH_API_URL": "https://polygraph.so" }
+    }
+  }
+}
+```
+
+See [`bankr-integration.md`](bankr-integration.md) for the verify-then-execute pattern.
+
+---
+
+## Programmatic use
+
+```ts
+import { runLitmus, gateDecision, liveFingerprint, readAttestation } from "@polygraphso/litmus";
+
+const bundle = await runLitmus("npm/@scope/server"); // → EvidenceBundle { grade, categories, fingerprint, … }
+
+const attestation = await readAttestation("npm/@scope/server");
+const live = await liveFingerprint("npm/@scope/server");
+const decision = gateDecision(attestation, live); // → { action: "pay" | "refuse", reason }
+```
diff --git a/polygraph/references/methodology.md b/polygraph/references/methodology.md
new file mode 100644
index 0000000000..5e06e36dda
--- /dev/null
+++ b/polygraph/references/methodology.md
@@ -0,0 +1,104 @@
+# Polygraph Methodology — how a server gets its grade
+
+Polygraph runs the **litmus** harness: connect to an MCP server the way an agent would,
+fingerprint its exact tool surface, run three behavioral probe categories, and assign an
+**A–F** grade with a deterministic, content-addressed evidence bundle. The harness is open
+source and the run is reproducible — that is what makes a grade trustworthy.
+
+## Connect & fingerprint
+
+- **Transport:** `stdio` for local packages (npm/PyPI/path), **Streamable HTTP** for remote
+  URLs.
+- **Fingerprint:** `tools/list` → canonical JSON of each tool's `{name, description,
+  inputSchema}` → `sha256` → `bytes32`. The fingerprint is the trust anchor: a grade is only
+  valid for the exact surface it was measured against. If a server is graded and then changes
+  its tools, the fingerprint no longer matches and any verifier should refuse (see
+  [`bankr-integration.md`](bankr-integration.md)).
+
+## The three probe categories
+
+### C-01 — Tool-output injection
+Does the server try to hijack the agent that calls it?
+- **Static (1.1):** scan every tool name, description, and `inputSchema` for injection-shaped
+  content — invisible/zero-width unicode, instruction mimicry ("ignore previous
+  instructions…"), and markdown tricks.
Deterministic; makes no calls.
+- **Dynamic (1.2):** issue benign bait calls to each tool and scan the outputs for injected
+  instructions echoed back to the agent.
+- **Fail:** any HIGH-severity finding in either probe.
+
+### C-02 — Permission / egress overreach
+Does the server do more than it declares?
+- **Declared-permission honesty (2.1):** flag tools that declare `readOnlyHint: true` but whose
+  names carry destructive verbs (`send`, `delete`, `swap`, `sign`, `transfer`, …).
+- **Unexpected egress (2.2):** run the server inside a hardened **default-deny Docker sandbox**
+  with a network sinkhole; any outbound attempt is a finding.
+- **Fail:** any HIGH-severity finding in 2.1, or any finding in 2.2.
+- **Skipped** (not failed) only when 2.1 passes and 2.2 could not run because no Docker sandbox
+  was available — which caps the grade at **B**.
+
+### C-03 — Sensitive-data handling
+Does the server leak secrets it was exposed to?
+- **Output leak (3.1):** plant canary values in the environment and seed the working directory
+  with fake secrets, exercise the tools, and scan outputs for any canary echo.
+- **Egress leak (3.2):** scan the egress-sandbox capture for canary bytes. Degrades to
+  `partial` without a sandbox; never silently dropped.
+- **Fail:** any canary surfacing in either probe.
+
+### Finding kinds
+`invisible-unicode`, `instruction-mimicry`, `markdown-trick` (C-01) · `permission-mislabel`,
+`egress` (C-02) · `canary` (C-03). Each finding carries a severity and a snippet.
+
+## Grade decision logic
+
+| Grade | Condition |
+|-------|-----------|
+| **A** | C-01 pass · C-02 pass · C-03 pass. |
+| **B** | C-01 pass · C-02 **skipped** (no sandbox / remote target) · C-03 pass. Injection passed; egress unverified. |
+| **C** | Reserved — not assigned by the current logic. |
+| **D** | C-02 **fail** (unexpected egress / overreach) while C-01 and C-03 pass. Egress is serious but not proven exfiltration, so the grade caps at D.
|
+| **F** | C-01 **fail** (injection) **or** C-03 **fail** (data leak). Active injection or a leak harms an agent that trusts the server. |
+
+A grade is always paired with a rationale string explaining *why* — the harness never emits a
+bare letter.
+
+## The evidence bundle
+
+`--json` (and the published record) emit a canonical `EvidenceBundle`:
+
+- `grade`, `gradeRationale`
+- `categories[]` — each with probe results and findings
+- `toolDefs[]` (canonicalized name/description/inputSchema) and `toolDefsFingerprint` (bytes32)
+- `methodologyVersion`, `ranAt`, resolved version
+- harness info (Docker availability, stdio isolation mode) and a reproducibility disclaimer
+
+Because the bundle is canonical and content-addressed (its CID is a hash of its bytes), two
+honest runs of the same harness against the same server produce the same bundle and the same
+fingerprint.
+
+## Publishing a grade
+
+The harness **grades and hands off**; it does not mint. Publishing a grade means recording it
+as an **EAS (Ethereum Attestation Service) attestation on Base**, which binds:
+
+`serverRef` · `toolDefsFingerprint` · overall grade · per-category verdicts (C-01/C-02/C-03) ·
+evidence CID (the bundle, pinned to IPFS) · methodology version · run timestamp · resolved
+version.
+
+Run the harness with `POLYGRAPH_API_URL=https://polygraph.so` to pin the evidence bundle and
+receive a browser hand-off to sign the attestation, or request publication at
+[polygraph.so](https://polygraph.so). Once attested, the grade is discoverable by ref and
+verifiable onchain by any agent.
+
+## How much to trust it (disclosed limits)
+
+- **Reproducibility is the anchor.** Open + deterministic harness ⇒ a false grade is
+  falsifiable by re-running it.
+- **A published grade is forgeable by its signer.** Trust comes from reproducibility and the
+  fingerprint recheck, not from the signature alone.
+- **Evasion is the residual limit:** a server that detects the test context could pass grading
+  and misbehave in production.
+- Independent/unforgeable upgrades (staked bonds, zkTLS, TEE-backed runs, independent
+  re-grading) are roadmap, not claimed today.
+
+The canonical, versioned methodology lives at **[polygraph.so](https://polygraph.so)**; the
+open-source harness is the source of truth for the exact probe and grade logic.

From 15bf70c4b8ccc17a264b3641ea268d2d879b5c7c Mon Sep 17 00:00:00 2001
From: Ruben Sousa Dinis
Date: Tue, 16 Jun 2026 12:07:43 +0100
Subject: [PATCH 2/8] polygraph skill: reflect live published grades
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Behavioral grades are now live via `polygraphso check` / `list` (A–F across graded servers), so replace the "rolling out / not yet available" framing and the stale example outputs with the real current CLI output, including the shipped grades.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 polygraph/SKILL.md          | 24 +++++++++++++-----------
 polygraph/references/cli.md | 24 ++++++++++++++++++------
 2 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/polygraph/SKILL.md b/polygraph/SKILL.md
index 67773b40f5..e6a9f09a8e 100644
--- a/polygraph/SKILL.md
+++ b/polygraph/SKILL.md
@@ -61,23 +61,25 @@ A sub-second lookup against published grades — **one command before your agent
 anything:**
 
 ```bash
-npx polygraphso check npm/@modelcontextprotocol/server-filesystem
+$ npx polygraphso check npm/@modelcontextprotocol/server-filesystem
+→ polygraph: A · litmus-v2 · 2026-06-11
+→ details → polygraph.so/#checks
 ```
 
-Refs are **registry-prefixed** (the prefix disambiguates — `redis` exists on npm, PyPI, and
-GitHub with different content):
+Published grades are **live** and span the full range.
Browse the graded set with
+`polygraphso list`:
 
 ```bash
-polygraphso check npm/@modelcontextprotocol/server-filesystem
-polygraphso check pypi/mcp-server-git
-polygraphso check github/anthropic/mcp-server-foo
-polygraphso list # every tracked server + adoption tier + status
-polygraphso list --json | jq '.servers[] | select(.adoption_tier == "top10")'
+$ polygraphso list
+npm/@modelcontextprotocol/server-filesystem A
+npm/@upstash/context7-mcp D
+npm/@playwright/mcp F
 ```
 
-Behavioral grades are rolling out — tracked-but-ungraded servers report
-`polygraph: not yet available` with a `notify` link, and the result lands once the litmus
-harness has graded them. Full CLI reference: [`references/cli.md`](references/cli.md).
+Refs are **registry-prefixed** — the prefix disambiguates (`redis` exists on npm, PyPI, and
+GitHub with different content): `npm/…`, `pypi/…`, `github/…`. A tracked-but-ungraded server
+reports `not available yet` with a notify link, and its grade lands as the litmus harness
+covers more of the ecosystem. Full CLI reference: [`references/cli.md`](references/cli.md).
+
 ---
diff --git a/polygraph/references/cli.md b/polygraph/references/cli.md
index 13c39a55de..9f70d74214 100644
--- a/polygraph/references/cli.md
+++ b/polygraph/references/cli.md
@@ -21,20 +21,32 @@ npx polygraphso check npm/@modelcontextprotocol/server-filesystem # sub-second
 npm i -g polygraphso # or install globally
 
 polygraphso check // # latest published grade
-polygraphso list [--json] # every tracked server + adoption tier + status
+polygraphso list [--json] # every graded server + its grade
 polygraphso --version
 polygraphso --help
 ```
 
-Example output:
+Example output (published grades are live):
 
 ```
-→ tracked · top 10 adoption
-→ polygraph: A · version 0.1.0 · https://base.easscan.org/attestation/view/
+$ polygraphso check npm/@modelcontextprotocol/server-filesystem
+→ polygraph: A · litmus-v2 · 2026-06-11
+→ details → polygraph.so/#checks
+
+$ polygraphso list
+npm/@modelcontextprotocol/server-filesystem A
+npm/@upstash/context7-mcp D
+npm/@playwright/mcp F
+
+$ polygraphso list --json | jq -r '.servers[] | "\(.polygraph) \(.server_ref)"'
+A npm/@modelcontextprotocol/server-filesystem
+D npm/@upstash/context7-mcp
+F npm/@playwright/mcp
 ```
 
-Tracked-but-ungraded servers report `polygraph: not yet available` with a notify link;
-behavioral grades are rolling out as the harness grades each server.
+A tracked-but-ungraded server reports `not available yet` with a
+`polygraph.so/notify?for=` link; its grade lands as the litmus harness covers more of the
+ecosystem.
 
 Config: `POLYGRAPH_API_URL` overrides the lookup endpoint (useful for local testing).

From 96de432aa57db844e4f5d7d9f200b0eb182ebae6 Mon Sep 17 00:00:00 2001
From: Ruben Sousa Dinis
Date: Tue, 16 Jun 2026 14:52:30 +0100
Subject: [PATCH 3/8] Fix SKILL.md YAML: remove colon in description

"Triggers on:" inside the plain-scalar description made YAML read it as a nested mapping ("mapping values are not allowed in this context").
Reword to "Triggers on mentions of" (no colon), matching the zerion skill convention.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 polygraph/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/polygraph/SKILL.md b/polygraph/SKILL.md
index e6a9f09a8e..f16c04e009 100644
--- a/polygraph/SKILL.md
+++ b/polygraph/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: polygraph
-description: Behavioral trust grades (A–F) for MCP servers and AI tools. Use when an agent needs to check whether an MCP server is safe before using it, look up a server's published grade, get a project graded, verify an onchain attestation before trusting or paying a server, or understand why a server received a grade. Polygraph connects to a server the way an agent would, fingerprints its exact tool surface, and runs behavioral probes — prompt-injection (C-01), permission/egress overreach (C-02), and sensitive-data leak (C-03) — then publishes a reproducible grade as an onchain EAS attestation on Base. Triggers on: MCP server safety, is this MCP server safe, tool trust, prompt injection, tool poisoning, data leak, permission overreach, unexpected egress, trust grade, attestation, verify before paying, polygraph, litmus, grade my MCP server.
+description: Behavioral trust grades (A–F) for MCP servers and AI tools. Use when an agent needs to check whether an MCP server is safe before using it, look up a server's published grade, get a project graded, verify an onchain attestation before trusting or paying a server, or understand why a server received a grade. Polygraph connects to a server the way an agent would, fingerprints its exact tool surface, and runs behavioral probes — prompt-injection (C-01), permission/egress overreach (C-02), and sensitive-data leak (C-03) — then publishes a reproducible grade as an onchain EAS attestation on Base.
Triggers on mentions of MCP server safety, is this MCP server safe, tool trust, prompt injection, tool poisoning, data leak, permission overreach, unexpected egress, trust grade, attestation, verify before paying, polygraph, litmus, grade my MCP server. emoji: πŸ§ͺ tags: [security, mcp, trust, grade, attestation, base, prompt-injection, agent-safety] visibility: public From 3eca93f745ce7ccbc61eac5646932b612d8169f4 Mon Sep 17 00:00:00 2001 From: Ruben Sousa Dinis Date: Tue, 16 Jun 2026 15:04:06 +0100 Subject: [PATCH 4/8] polygraph skill: tighten scope, grade scale, and trust framing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Address review feedback: - Scope to MCP servers (drop "AI tools" β€” the whole harness is MCP-specific). - Make the remote/Docker-less B-cap explicit and frame it as a property of the measurement, not a knock (a remote B is not "worse than" a local A). - Stop hardcoding named third-party grades; keep one live first-party A as proof and treat the live set / attestation as the point-in-time source of truth. - Present the live scale as A/B/D/F; note C/E are not assigned (C reserved). - Elevate runtime verify-before-trust above the get-graded CTA and surface the evasion caveat at the trust decision. Co-Authored-By: Claude Opus 4.8 (1M context) --- polygraph/SKILL.md | 94 ++++++++++++++++------------- polygraph/references/cli.md | 20 +++--- polygraph/references/methodology.md | 7 ++- 3 files changed, 68 insertions(+), 53 deletions(-) diff --git a/polygraph/SKILL.md b/polygraph/SKILL.md index f16c04e009..31cef575b1 100644 --- a/polygraph/SKILL.md +++ b/polygraph/SKILL.md @@ -1,6 +1,6 @@ --- name: polygraph -description: Behavioral trust grades (A–F) for MCP servers and AI tools. 
Use when an agent needs to check whether an MCP server is safe before using it, look up a server's published grade, get a project graded, verify an onchain attestation before trusting or paying a server, or understand why a server received a grade. Polygraph connects to a server the way an agent would, fingerprints its exact tool surface, and runs behavioral probes β€” prompt-injection (C-01), permission/egress overreach (C-02), and sensitive-data leak (C-03) β€” then publishes a reproducible grade as an onchain EAS attestation on Base. Triggers on mentions of MCP server safety, is this MCP server safe, tool trust, prompt injection, tool poisoning, data leak, permission overreach, unexpected egress, trust grade, attestation, verify before paying, polygraph, litmus, grade my MCP server. +description: Behavioral trust grades (A–F) for MCP servers. Use when an agent needs to check whether an MCP server is safe before using it, verify an onchain attestation before trusting or paying a server, look up a server's published grade, get a project graded, or understand why a server received a grade. Polygraph connects to an MCP server the way an agent would, fingerprints its exact tool surface, and runs behavioral probes β€” prompt-injection (C-01), permission/egress overreach (C-02), and sensitive-data leak (C-03) β€” then publishes a reproducible grade as an onchain EAS attestation on Base. Triggers on mentions of MCP server safety, is this MCP server safe, tool poisoning, prompt injection, data leak, permission overreach, unexpected egress, trust grade, attestation, verify before paying, polygraph, litmus, grade my MCP server. emoji: πŸ§ͺ tags: [security, mcp, trust, grade, attestation, base, prompt-injection, agent-safety] visibility: public @@ -9,7 +9,7 @@ visibility: public # Polygraph: Behavioral Trust Grades for MCP Servers Agents wire up third-party MCP servers and then trust whatever those servers' tools -return. 
Polygraph tests a server's **behavior** before your agent does, and assigns a +return. Polygraph tests an MCP server's **behavior** before your agent does, and assigns a letter grade **A–F** backed by reproducible evidence. A passing grade is a **measurement, not a guarantee** β€” it says "this exact tool surface @@ -44,14 +44,19 @@ Polygraph connects to a server the way an agent would β€” **stdio** for local pa | Grade | Meaning | |-------|---------| | **A** | Passed all three categories. No injection, no unexpected egress, no data leak. | -| **B** | Injection checks passed; egress **not verified** (no Docker sandbox, or a remote target). Capped at B by design. | -| **C** | Reserved β€” not currently assigned. | -| **D** | Unexpected egress / permission overreach, but no injection or leak. Serious, not proven exfiltration β†’ capped at D. | -| **F** | Disqualifying: active tool-output injection (C-01) or a sensitive-data leak (C-03). This is a server that would harm an agent that trusts it. | +| **B** | Injection and data-leak checks passed; **egress was not verified.** The ceiling for any run without a local Docker sandbox β€” including every remote (HTTP) server, which can't be sandboxed. | +| **D** | Unexpected egress / permission overreach, but no injection or leak. Serious, but not proven exfiltration β†’ capped at D. | +| **F** | Disqualifying: active tool-output injection (C-01) or a sensitive-data leak (C-03) β€” a server that would harm an agent that trusts it. | -(There is no E.) Every grade ships with a plain-English **rationale** β€” never a bare letter. -See [`references/methodology.md`](references/methodology.md) for the full decision logic and -each probe in depth. +**Reading a B.** Under the current methodology, egress can only be observed by running the +server in a local default-deny sandbox β€” so a **remote MCP server caps at B** no matter how +clean it is. 
A remote B is a limit of the *measurement*, not a mark against the server; don't +read it as "worse than" a local A, because the two aren't directly comparable. (Grades **C** +and **E** are not assigned today; **C** is reserved.) + +Every grade ships with a plain-English **rationale** β€” never a bare letter. See +[`references/methodology.md`](references/methodology.md) for the full decision logic and each +probe in depth. --- @@ -66,27 +71,46 @@ $ npx polygraphso check npm/@modelcontextprotocol/server-filesystem β†’ details β†’ polygraph.so/#checks ``` -Published grades are **live** and span the full range. Browse the graded set with -`polygraphso list`: - -```bash -$ polygraphso list -npm/@modelcontextprotocol/server-filesystem A -npm/@upstash/context7-mcp D -npm/@playwright/mcp F -``` +Grades are **live** and span the full range. Browse the current graded set with +`polygraphso list`, or at [polygraph.so](https://polygraph.so). A grade is **point-in-time +evidence** β€” treat your own run, or the live attestation, as the source of truth rather than +any letter copied into a doc. Refs are **registry-prefixed** β€” the prefix disambiguates (`redis` exists on npm, PyPI, and GitHub with different content): `npm/…`, `pypi/…`, `github/…`. A tracked-but-ungraded server -reports `not available yet` with a notify link, and its grade lands as the litmus harness -covers more of the ecosystem. Full CLI reference: [`references/cli.md`](references/cli.md). +reports `not available yet` with a notify link. Full CLI reference: +[`references/cli.md`](references/cli.md). + +--- + +## Verify before you trust (Bankr integration) + +The highest-value use at runtime: **gate an MCP server through its grade before your agent +uses it, pays it, or routes a transaction through it.** Polygraph is the *verify* step; Bankr +is the *execute* step. Two checks, both required: + +1. **Grade meets your bar.** Default: accept A/B, refuse D/F. 
(A remote server's ceiling is B β€” + see "Reading a B" above, and don't penalize it for that.) +2. **Fingerprint still matches.** An attestation is only valid for the exact tool surface it + graded. Recompute the server's **live** tool-surface fingerprint and require it to equal the + attested one before acting β€” a built-in rug-pull check against a graded-then-swapped server. + +Drop the `verify_attestation` MCP tool in front of execution, or use the `gateDecision` helper. + +> **Carry this into the decision:** a grade is a *measurement, not a guarantee.* A server that +> detects the test context could behave during grading and misbehave in production β€” **evasion** +> is the disclosed residual limit. Keep Bankr's own transaction-verification guards on, even +> for an A. + +Full patterns, the MCP server config, and a worked "verify-then-execute" example: +[`references/bankr-integration.md`](references/bankr-integration.md). --- ## β˜… Get your project graded -This is the CTA β€” **run the open harness on your own MCP server, get an A–F grade plus a -reproducible evidence bundle, and publish it onchain so agents can verify it:** +**Run the open harness on your own MCP server, get an A–F grade plus a reproducible evidence +bundle, and publish it onchain so agents can verify it:** ```bash # Grade your server end-to-end (npm ref, https URL, or local path) @@ -119,9 +143,10 @@ polygraphso-litmus litmus ./path/to/local-mcp-server --json ``` - **Node β‰₯ 18.** **Docker is optional** but recommended β€” without it the egress probe (C-02) - is skipped and the grade is **capped at B**. -- **Exit codes are CI-friendly:** non-zero on a failing grade (D/F), zero on A/B/C β€” drop it - into a pipeline to gate dependencies. + is skipped and the grade is **capped at B** (as is any remote/HTTP target, which can't be + sandboxed). +- **Exit codes are CI-friendly:** non-zero on a failing grade (D/F), zero on A/B β€” drop it into + a pipeline to gate dependencies. 
 
 Flags, env vars, `--json` output, and the `check` / `challenge` / `list` subcommands are all
 in [`references/cli.md`](references/cli.md).
@@ -143,23 +168,8 @@ the grade with a one-paragraph rationale:
 ```
 
 On a failure the report surfaces the top HIGH-severity findings (tool name, finding kind, the
-offending snippet). [`references/methodology.md`](references/methodology.md) maps every
-grade and finding kind to its cause.
-
----
-
-## Verify before you trust (Bankr integration)
-
-This is why polygraph matters for agents: **gate an MCP server through its grade before your
-agent uses it, pays it, or routes a transaction through it.** Polygraph is the *verify* step;
-Bankr is the *execute* step.
-
-The trust anchor is the **tool-surface fingerprint**: an attestation is only meaningful if the
-server you're about to call still has the surface that was graded. The agent recomputes the
-live fingerprint and requires it to equal the attested one before acting — a built-in
-rug-pull check. Drop the `verify_attestation` MCP tool in front of execution, or use the
-`gateDecision` helper. Full patterns, the MCP server config, and a worked
-"verify-then-execute" example: [`references/bankr-integration.md`](references/bankr-integration.md).
+offending snippet). [`references/methodology.md`](references/methodology.md) maps every grade
+and finding kind to its cause.
 
 ---
 
diff --git a/polygraph/references/cli.md b/polygraph/references/cli.md
index 9f70d74214..f75cde9174 100644
--- a/polygraph/references/cli.md
+++ b/polygraph/references/cli.md
@@ -26,22 +26,22 @@ polygraphso --version
 polygraphso --help
 ```
 
-Example output (published grades are live):
+Grades are live. Example output (the list rows are **illustrative** — a grade is point-in-time
+evidence, so the live set at `polygraphso list` / polygraph.so is the source of truth):
 
 ```
 $ polygraphso check npm/@modelcontextprotocol/server-filesystem
 → polygraph: A · litmus-v2 · 2026-06-11
 → details → polygraph.so/#checks
 
-$ polygraphso list
+$ polygraphso list # every graded server + its grade
 npm/@modelcontextprotocol/server-filesystem A
-npm/@upstash/context7-mcp D
-npm/@playwright/mcp F
+npm/@scope/example-search-mcp D
+npm/@scope/example-browser-mcp F
 
 $ polygraphso list --json | jq -r '.servers[] | "\(.polygraph) \(.server_ref)"'
 A npm/@modelcontextprotocol/server-filesystem
-D npm/@upstash/context7-mcp
-F npm/@playwright/mcp
+…
 ```
 
 A tracked-but-ungraded server reports `not available yet` with a
@@ -95,9 +95,11 @@ fingerprint.
 
 - **Node ≥ 18.**
 - **Docker optional** — without it the egress probe (C-02) is skipped and the grade is capped
-  at **B**. With `LITMUS_STDIO_ISOLATION=docker`, isolation is mandatory.
-- **Exit codes:** non-zero on a failing grade (**D/F**), zero on **A/B/C** — drop `litmus` into
-  CI to gate a dependency on its behavioral grade.
+  at **B**. A **remote/HTTP target also caps at B**, since it can't be sandboxed for egress —
+  that's a property of the measurement, not a knock against the server. With
+  `LITMUS_STDIO_ISOLATION=docker`, isolation is mandatory.
+- **Exit codes:** non-zero on a failing grade (**D/F**), zero on a passing grade (**A/B**) —
+  drop `litmus` into CI to gate a dependency on its behavioral grade.
 
 ### Human output
 
diff --git a/polygraph/references/methodology.md b/polygraph/references/methodology.md
index 5e06e36dda..e7ce6048b7 100644
--- a/polygraph/references/methodology.md
+++ b/polygraph/references/methodology.md
@@ -33,8 +33,11 @@ Does the server do more than it declares?
 - **Unexpected egress (2.2):** run the server inside a hardened **default-deny Docker sandbox**
   with a network sinkhole; any outbound attempt is a finding.
 - **Fail:** any HIGH-severity finding in 2.1, or any finding in 2.2.
-- **Skipped** (not failed) only when 2.1 passes and 2.2 could not run because no Docker sandbox
-  was available — which caps the grade at **B**.
+- **Skipped** (not failed) only when 2.1 passes and 2.2 could not run — because no Docker
+  sandbox was available, or because the target is a **remote/HTTP server** that can't be
+  sandboxed for egress. Either way the grade caps at **B**. This is a property of the
+  measurement, not a knock against the server: a remote B is not "worse than" a local A, since
+  egress was never observed on the remote one.
 
 ### C-03 — Sensitive-data handling
 Does the server leak secrets it was exposed to?

From b39aabf5630c4fec3f40256771b21ec9228535f2 Mon Sep 17 00:00:00 2001
From: Ruben Sousa Dinis
Date: Tue, 16 Jun 2026 15:15:21 +0100
Subject: [PATCH 5/8] polygraph skill: match published @polygraphso/litmus@0.2.1
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Verified the skill against the now-published package. Remove the `challenge` command and the `check ` form — neither exists in the published CLI (commands: litmus/check/list; flags --json/--bearer/--header/--allow-state-changing and env POLYGRAPH_API_URL/LITMUS_BEARER/LITMUS_STDIO_ISOLATION all confirmed present).
Co-Authored-By: Claude Opus 4.8 (1M context)
---
 polygraph/SKILL.md          | 4 ++--
 polygraph/references/cli.md | 7 +++----
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/polygraph/SKILL.md b/polygraph/SKILL.md
index 31cef575b1..65ee0008db 100644
--- a/polygraph/SKILL.md
+++ b/polygraph/SKILL.md
@@ -148,8 +148,8 @@ polygraphso-litmus litmus ./path/to/local-mcp-server --json
 - **Exit codes are CI-friendly:** non-zero on a failing grade (D/F), zero on A/B — drop it into
   a pipeline to gate dependencies.
 
-Flags, env vars, `--json` output, and the `check` / `challenge` / `list` subcommands are all
-in [`references/cli.md`](references/cli.md).
+Flags, env vars, `--json` output, and the `check` / `list` subcommands are all in
+[`references/cli.md`](references/cli.md).
 
 ---
 
diff --git a/polygraph/references/cli.md b/polygraph/references/cli.md
index f75cde9174..06baeb11e0 100644
--- a/polygraph/references/cli.md
+++ b/polygraph/references/cli.md
@@ -64,14 +64,13 @@ npx -y -p @polygraphso/litmus polygraphso-litmus litmus
 
 ```bash
 polygraphso-litmus litmus # grade a server end-to-end
-polygraphso-litmus check # look up a published grade
-polygraphso-litmus challenge # dispute a grade by re-running it
+polygraphso-litmus check # look up a published grade
 polygraphso-litmus list # list published grades
 polygraphso-litmus --version | --help
 ```
 
-`challenge` is the teeth behind reproducibility: re-run the harness against a server that
-carries a grade and, if your result disagrees, you have a falsification anchored to the same
+Reproducibility is the teeth: re-run `litmus` against a server that already carries a grade
+and, if your result disagrees, that's a falsification anchored to the same
 tool-surface fingerprint.
 
 ### Flags (`litmus`)

From d3b35d98dfc9c366aa87305f691ed089195403fa Mon Sep 17 00:00:00 2001
From: Ruben Dinis
Date: Thu, 25 Jun 2026 15:46:41 +0100
Subject: [PATCH 6/8] polygraph skill: add the CI gate (GitHub Action) for servers + skills

Add references/ci-gate.md (the polygraphso/litmus@v1 Action that fails a build when an MCP server or an Agent Skill grades D/F) and a 'Gate your CI on grades' section + reference link in SKILL.md.

Co-Authored-By: Claude
---
 polygraph/SKILL.md              |  23 +++++-
 polygraph/references/ci-gate.md | 139 ++++++++++++++++++++++++++++++++
 2 files changed, 161 insertions(+), 1 deletion(-)
 create mode 100644 polygraph/references/ci-gate.md

diff --git a/polygraph/SKILL.md b/polygraph/SKILL.md
index 65ee0008db..f52a18f2d6 100644
--- a/polygraph/SKILL.md
+++ b/polygraph/SKILL.md
@@ -153,6 +153,27 @@ Flags, env vars, `--json` output, and the `check` / `list` subcommands are all i
 
 ---
 
+## Gate your CI on grades
+
+Turn the grade into a build check: the **polygraph CI gate** fails a build when an MCP server or an
+Agent Skill grades D/F. Add the GitHub Action to a repo —
+
+```yaml
+- uses: polygraphso/litmus@v1
+  with:
+    servers: |
+      npm/@modelcontextprotocol/server-filesystem
+    skills: |
+      ./my-skill
+```
+
+— or run it anywhere with `npx @polygraphso/litmus ci`. It auto-discovers MCP servers
+(`.mcp.json` / `.vscode` / `.cursor`) and skills (`SKILL.md` dirs), grades each, and fails on D/F;
+un-gradeable targets warn unless `strict`. Full setup, inputs, and the run-anywhere command:
+[`references/ci-gate.md`](references/ci-gate.md).
+
+---
+
 ## Why a server got grade X
 
 Every run prints the methodology, the per-category verdict, the tool-surface fingerprint, and
@@ -194,4 +215,4 @@ and finding kind to its cause.
 - **Lookup CLI:** `npx polygraphso check //` · https://www.npmjs.com/package/polygraphso
 - **Grading harness:** `@polygraphso/litmus` (open source — see polygraph.so for the repo)
 - **Onchain proof:** EAS attestations on Base
-- **References:** [`methodology.md`](references/methodology.md) · [`cli.md`](references/cli.md) · [`bankr-integration.md`](references/bankr-integration.md)
+- **References:** [`methodology.md`](references/methodology.md) · [`cli.md`](references/cli.md) · [`bankr-integration.md`](references/bankr-integration.md) · [`ci-gate.md`](references/ci-gate.md)
diff --git a/polygraph/references/ci-gate.md b/polygraph/references/ci-gate.md
new file mode 100644
index 0000000000..a56a844923
--- /dev/null
+++ b/polygraph/references/ci-gate.md
@@ -0,0 +1,139 @@
+# Polygraph CI gate (GitHub Action)
+
+Polygraph grades MCP servers and Agent Skills; the **CI gate** turns that grade into a build
+check. Add it to a repo and the build **fails when an MCP server or a Skill it ships grades D/F** —
+the same falsifiable grade described in [`../SKILL.md`](../SKILL.md), enforced on every pull request.
+
+It wraps the open `@polygraphso/litmus` harness, so the gate is **reproducible**: anyone can re-run
+it and the verdict must match. A grade is a *measurement, not a guarantee* — the gate catches a
+target that misbehaves under the probes, not one that evades them.
+
+---
+
+## Add it to a repo
+
+```yaml
+# .github/workflows/mcp-gate.yml
+name: mcp-gate
+on: [pull_request]
+permissions:
+  contents: read
+jobs:
+  gate:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: polygraphso/litmus@v1
+        with:
+          # Auto-discovers MCP servers (.mcp.json / .vscode/mcp.json / .cursor/mcp.json)
+          # and skills (SKILL.md dirs). Or list them explicitly:
+          servers: |
+            npm/@modelcontextprotocol/server-filesystem
+          skills: |
+            ./my-skill
+          # min-grade: B # stricter than the default D/F gate
+          # strict: "true" # also fail on targets that can't be graded
+```
+
+That is the whole setup. On each PR the action grades every MCP server **and** every skill, and
+fails the job on any **D** or **F**.
+
+---
+
+## How the gate decides
+
+**MCP servers** — for each, in order:
+
+1. **Published-grade lookup** — a sub-second check for an existing polygraph grade (the same data
+   as `npx polygraphso check`). If one exists, it is used directly.
+2. **Behavioral run** — if the server is not graded yet, the action runs the open harness in CI.
+   GitHub runners provide Docker, so the egress probe is exercised for local/npm servers (no B cap),
+   and the server is graded fresh.
+
+**Agent Skills** — each `SKILL.md` bundle is graded by the **static** skill grader
+(`runSkillLitmus`): a scan of its bytes, no execution, no Docker, no network. Fast and deterministic.
+
+**Un-gradeable** — a target that can't be reached (a credential-gated server) or whose launch
+command can't be mapped to a ref is reported and **warns** (it does not fail the build) unless you
+set `strict: true`.
+
+Gate result (servers and skills share one gate and one exit code):
+
+| Outcome | Build |
+|---|---|
+| Every target grades **A / B** (or ≥ `min-grade`) | passes (exit 0) |
+| Any target grades **D / F** (or below `min-grade`) | **fails** (exit 1) |
+| A target cannot be graded | warns + passes, unless `strict: true` |
+
+A **remote (HTTP) server caps at B** and passes — that is a limit of the measurement, not a mark
+against the server (see "Reading a B" in [`../SKILL.md`](../SKILL.md)).
+
+---
+
+## Inputs
+
+| Input | Default | Description |
+|---|---|---|
+| `servers` | — | Explicit MCP refs (newline- or comma-separated). Merged with auto-discovery. |
+| `skills` | — | Explicit skill directories (newline- or comma-separated). Merged with auto-discovery. |
+| `discover` | `true` | Discover MCP servers from config files and skills from `SKILL.md`. |
+| `min-grade` | — | Minimum acceptable grade (`A`–`D`). Default gates on D/F. |
+| `strict` | `false` | Treat un-gradeable targets as failures, not warnings. |
+| `working-directory` | `.` | Directory scanned for MCP config files and `SKILL.md` bundles. |
+| `version` | pinned | `@polygraphso/litmus` version to run. |
+| `bearer` | — | Token passed through to a gated remote (HTTPS) server. |
+
+Outputs: `result` (`pass` / `fail`), `failed` (count), and `report` (a JSON array of per-target
+results, each with its `kind` of `server` or `skill`) — read them from a later step via
+`steps..outputs.*`.
+
+---
+
+## Discovery
+
+The action reads the standard MCP config files and maps each server's launch command to a
+registry-prefixed ref, and walks the repo for `SKILL.md` bundles:
+
+| Target | Discovered as |
+|---|---|
+| `{ "command": "npx", "args": ["-y", "@scope/srv"] }` | server `npm/@scope/srv` |
+| `{ "command": "uvx", "args": ["srv-mcp"] }` | server `pypi/srv-mcp` |
+| `{ "url": "https://example.com/mcp" }` | server — the HTTPS endpoint (remote) |
+| a directory containing `SKILL.md` | skill — that directory |
+| a bare binary / local script | reported as **un-gradeable** (never silently skipped) |
+
+`node_modules`/`.git`/`dist`/etc. are pruned from the skill walk, and anything that can't be mapped
+is surfaced rather than dropped — so coverage stays honest.
+
+---
+
+## Run it anywhere (not just GitHub)
+
+The gate is a plain command in the harness, so it also works in any other CI or as a pre-commit
+check:
+
+```bash
+# Gate the MCP servers and skills discovered in this repo:
+npx @polygraphso/litmus ci
+
+# Or name targets, fail below B, treat un-gradeable as a failure:
+npx @polygraphso/litmus ci --server npm/@scope/your-mcp --skill ./your-skill --min-grade B --strict
+```
+
+It exits non-zero on a gated target, so any pipeline can use it. `--json` emits the full per-target
+report; `--no-discover` and `--no-lookup` narrow what it does.
+
+---
+
+## Honest limits (carry these into your pipeline)
+
+- **Reproducibility is the trust anchor.** The harness is open and deterministic, so the gate's
+  verdict is falsifiable — not a black box.
+- A passing gate means *these targets did not misbehave under these probes* — **not** that they are
+  safe in every situation. A skill grade is a **static** read of its text and bundle; a server grade
+  is behavioral. **Evasion** (a server that detects the test context) is the disclosed residual limit.
+- The gate does not replace your own runtime guards (for example, Bankr's transaction-verification
+  checks — see [`bankr-integration.md`](bankr-integration.md)).
+
+See [`../SKILL.md`](../SKILL.md) for the grade scale and [`methodology.md`](methodology.md) for the
+probes behind each grade.
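
The gate table under "How the gate decides" in `ci-gate.md` can be read as a small decision function. The sketch below is illustrative only: `grade_rank` and `gate_decision` are hypothetical helpers, not part of `@polygraphso/litmus`, and modelling the default gate as "anything below C fails" is an assumption drawn from the "Default gates on D/F" wording.

```bash
# Hypothetical helper: map a letter grade to a comparable rank.
# Anything that is not A/B/C/D/F is treated as un-gradeable (-1).
grade_rank() {
  case "$1" in
    A) echo 4 ;;
    B) echo 3 ;;
    C) echo 2 ;;
    D) echo 1 ;;
    F) echo 0 ;;
    *) echo -1 ;;
  esac
}

# gate_decision MIN_GRADE STRICT GRADE...
# Prints "pass" or "fail" and returns 0 or 1, mirroring the exit codes in the table.
# MIN_GRADE=C models the assumed default gate (D and F fail).
gate_decision() {
  local min_grade="$1" strict="$2"
  shift 2
  local min_rank rank grade result="pass"
  min_rank="$(grade_rank "$min_grade")"
  for grade in "$@"; do
    rank="$(grade_rank "$grade")"
    if [ "$rank" -lt 0 ]; then
      # Un-gradeable target: always warn, fail only in strict mode.
      echo "warn: un-gradeable target" >&2
      if [ "$strict" = "true" ]; then
        result="fail"
      fi
    elif [ "$rank" -lt "$min_rank" ]; then
      # Graded below the minimum acceptable grade.
      result="fail"
    fi
  done
  echo "$result"
  [ "$result" = "pass" ]
}
```

Under these assumptions, `gate_decision C false A B` passes, `gate_decision C false A D` fails, and an un-gradeable entry only fails the gate when the second argument is `true`.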
From c84909a8bdadd4af47c9bb532fa2e458c934df2b Mon Sep 17 00:00:00 2001
From: Ruben Dinis
Date: Thu, 25 Jun 2026 15:55:45 +0100
Subject: [PATCH 7/8] =?UTF-8?q?polygraph=20skill:=20refresh=20to=20current?=
 =?UTF-8?q?=20=E2=80=94=20litmus-v9,=20the=20C-04=20probe=20category,=20th?=
 =?UTF-8?q?e=20ci=20command?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Bring the skill up to date with the published @polygraphso/litmus: four probe categories (adds C-04 adversarial-input handling), methodology version litmus-v9 in the illustrative outputs, and the ci command in the CLI reference.

Co-Authored-By: Claude
---
 polygraph/SKILL.md                  | 21 ++++++++++++++-------
 polygraph/references/cli.md         | 15 +++++++++------
 polygraph/references/methodology.md | 27 +++++++++++++++++++++------
 3 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/polygraph/SKILL.md b/polygraph/SKILL.md
index f52a18f2d6..4eaeb2d12d 100644
--- a/polygraph/SKILL.md
+++ b/polygraph/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: polygraph
-description: Behavioral trust grades (A–F) for MCP servers. Use when an agent needs to check whether an MCP server is safe before using it, verify an onchain attestation before trusting or paying a server, look up a server's published grade, get a project graded, or understand why a server received a grade. Polygraph connects to an MCP server the way an agent would, fingerprints its exact tool surface, and runs behavioral probes — prompt-injection (C-01), permission/egress overreach (C-02), and sensitive-data leak (C-03) — then publishes a reproducible grade as an onchain EAS attestation on Base. Triggers on mentions of MCP server safety, is this MCP server safe, tool poisoning, prompt injection, data leak, permission overreach, unexpected egress, trust grade, attestation, verify before paying, polygraph, litmus, grade my MCP server.
+description: Behavioral trust grades (A–F) for MCP servers. Use when an agent needs to check whether an MCP server is safe before using it, verify an onchain attestation before trusting or paying a server, look up a server's published grade, get a project graded, or understand why a server received a grade. Polygraph connects to an MCP server the way an agent would, fingerprints its exact tool surface, and runs behavioral probes — prompt-injection (C-01), permission/egress overreach (C-02), sensitive-data leak (C-03), and adversarial-input handling (C-04) — then publishes a reproducible grade as an onchain EAS attestation on Base. Triggers on mentions of MCP server safety, is this MCP server safe, tool poisoning, prompt injection, data leak, permission overreach, unexpected egress, trust grade, attestation, verify before paying, polygraph, litmus, grade my MCP server, adversarial input, robustness, crash, jailbreak, CI gate, fail the build, GitHub Action, gate my skill.
 emoji: 🧪
 tags: [security, mcp, trust, grade, attestation, base, prompt-injection, agent-safety]
 visibility: public
@@ -26,7 +26,7 @@ anyone can re-run it and disprove a bad grade. That falsifiability is the whole
 
 Polygraph connects to a server the way an agent would — **stdio** for local packages,
 **Streamable HTTP** for remote URLs — fingerprints its exact tool surface
-(`tools/list` → canonical JSON → sha256 → `bytes32`), then runs three probe categories:
+(`tools/list` → canonical JSON → sha256 → `bytes32`), then runs four probe categories:
 
 - **C-01 — Tool-output injection.** Does the server try to hijack the agent? Static scan of
   tool names/descriptions/schemas for injection-shaped content (invisible unicode,
@@ -38,14 +38,21 @@ Polygraph connects to a server the way an agent would — **stdio** for local pa
 - **C-03 — Sensitive-data handling.** Does the server leak secrets? Plants canary values in
   the environment and working directory, exercises the tools, and scans both tool outputs and
   egress for any canary that surfaces.
+- **C-04 — Adversarial-input handling.** Does the server stay robust under hostile input? Runs
+  two probes on non-state-changing tools, with no Docker required: stress-tests each tool with
+  malformed and oversized inputs (fails if the server crashes, hangs, or leaks an uncaught
+  stack trace — a clean validation error or benign result passes); and feeds jailbreak-pattern
+  strings and scans the server's **outputs** with the C-01 injection scanners, failing only if
+  the server emits injection-shaped content it did not merely reflect from the input (a verbatim
+  echo is excluded). A C-04 failure caps the overall grade at D.
 
 ### Grade scale
 
 | Grade | Meaning |
 |-------|---------|
-| **A** | Passed all three categories. No injection, no unexpected egress, no data leak. |
+| **A** | Passed all four categories. No injection, no unexpected egress, no data leak, no adversarial-input failure. |
 | **B** | Injection and data-leak checks passed; **egress was not verified.** The ceiling for any run without a local Docker sandbox — including every remote (HTTP) server, which can't be sandboxed. |
-| **D** | Unexpected egress / permission overreach, but no injection or leak. Serious, but not proven exfiltration → capped at D. |
+| **D** | Unexpected egress / permission overreach (C-02) **or** an adversarial-input robustness failure (C-04: crash, internals-leak, or amplification). No injection or leak → capped at D. |
 | **F** | Disqualifying: active tool-output injection (C-01) or a sensitive-data leak (C-03) — a server that would harm an agent that trusts it. |
 
 **Reading a B.** Under the current methodology, egress can only be observed by running the
@@ -67,7 +74,7 @@ anything:**
 
 ```bash
 $ npx polygraphso check npm/@modelcontextprotocol/server-filesystem
-→ polygraph: A · litmus-v2 · 2026-06-11
+→ polygraph: A · litmus-v9 · 2026-06-24
 → details → polygraph.so/#checks
 ```
 
@@ -182,10 +189,10 @@ the grade with a one-paragraph rationale:
 ```
 → litmus · npm/@modelcontextprotocol/server-filesystem
 → version 0.1.0
-→ C-01 pass · C-02 pass · C-03 pass
+→ C-01 pass · C-02 pass · C-03 pass · C-04 pass
 → fingerprint 0x1a2b3c4d…5e6f7890
 → grade: A
-  All three categories passed. No injection, no unexpected egress, no data leak.
+  All four categories passed. No injection, no unexpected egress, no data leak.
 ```
 
 On a failure the report surfaces the top HIGH-severity findings (tool name, finding kind, the
diff --git a/polygraph/references/cli.md b/polygraph/references/cli.md
index 06baeb11e0..fc6d0b184d 100644
--- a/polygraph/references/cli.md
+++ b/polygraph/references/cli.md
@@ -31,7 +31,7 @@ evidence, so the live set at `polygraphso list` / polygraph.so is the source of
 
 ```
 $ polygraphso check npm/@modelcontextprotocol/server-filesystem
-→ polygraph: A · litmus-v2 · 2026-06-11
+→ polygraph: A · litmus-v9 · 2026-06-24
 → details → polygraph.so/#checks
 
 $ polygraphso list # every graded server + its grade
@@ -63,12 +63,15 @@ npx -y -p @polygraphso/litmus polygraphso-litmus litmus
 ### Commands
 
 ```bash
-polygraphso-litmus litmus # grade a server end-to-end
-polygraphso-litmus check # look up a published grade
-polygraphso-litmus list # list published grades
+polygraphso-litmus litmus # grade a server end-to-end
+polygraphso-litmus check # look up a published grade
+polygraphso-litmus list # list published grades
+polygraphso-litmus ci [--server ] [--skill ] [--min-grade ] [--strict] # gate a build on D/F (servers + skills)
 polygraphso-litmus --version | --help
 ```
 
+The `ci` command gates a build on the grades of a repo's MCP servers and skills — see [`ci-gate.md`](ci-gate.md).
+
 Reproducibility is the teeth: re-run `litmus` against a server that already carries a grade
 and, if your result disagrees, that's a falsification anchored to the same
 tool-surface fingerprint.
@@ -105,10 +108,10 @@ fingerprint.
 ```
 → litmus · npm/@modelcontextprotocol/server-filesystem
 → version 0.1.0
-→ C-01 pass · C-02 pass · C-03 pass
+→ C-01 pass · C-02 pass · C-03 pass · C-04 pass
 → fingerprint 0x1a2b3c4d…5e6f7890
 → grade: A
-  All three categories passed. No injection, no unexpected egress, no data leak.
+  All four categories passed. No injection, no unexpected egress, no data leak.
 ```
 
 On failure the summary lists the top HIGH-severity findings (tool name, finding kind,
diff --git a/polygraph/references/methodology.md b/polygraph/references/methodology.md
index e7ce6048b7..6c51518fd3 100644
--- a/polygraph/references/methodology.md
+++ b/polygraph/references/methodology.md
@@ -1,7 +1,7 @@
 # Polygraph Methodology — how a server gets its grade
 
 Polygraph runs the **litmus** harness: connect to an MCP server the way an agent would,
-fingerprint its exact tool surface, run three behavioral probe categories, and assign an
+fingerprint its exact tool surface, run four behavioral probe categories, and assign an
 **A–F** grade with a deterministic, content-addressed evidence bundle. The harness is open
 source and the run is reproducible — that is what makes a grade trustworthy.
 
@@ -15,7 +15,7 @@ source and the run is reproducible — that is what makes a grade trustworthy.
 its tools, the fingerprint no longer matches and any verifier should refuse (see
 [`bankr-integration.md`](bankr-integration.md)).
 
-## The three probe categories
+## The four probe categories
 
 ### C-01 — Tool-output injection
 Does the server try to hijack the agent that calls it?
@@ -47,18 +47,33 @@ Does the server leak secrets it was exposed to?
`partial` without a sandbox; never silently dropped. - **Fail:** any canary surfacing in either probe. +### C-04 β€” Adversarial-input handling +Does the server stay robust under hostile input? Both probes run only on non-state-changing +tools and require no Docker sandbox. +- **Malformed/oversized (3.1):** stress each tool with malformed and oversized inputs. **Fail** + if the server crashes, hangs, or leaks an uncaught stack trace; a clean validation error or a + benign result passes. +- **Jailbreak amplification (3.2):** feed jailbreak-pattern strings and scan the server's + **outputs** with the C-01 injection scanners. **Fail only** if the server emits + injection-shaped content it did not merely reflect from the input β€” a verbatim echo is + excluded, so an honest echo/summarize tool is not penalized. +- **Fail:** any finding in 3.1 or any amplification finding in 3.2. +- A C-04 failure **caps the overall grade at D** (a robustness failure, not proven injection or + exfiltration). + ### Finding kinds `invisible-unicode`, `instruction-mimicry`, `markdown-trick` (C-01) Β· `permission-mislabel`, -`egress` (C-02) Β· `canary` (C-03). Each finding carries a severity and a snippet. +`egress` (C-02) Β· `canary` (C-03) Β· `crash`, `internals-leak`, `amplification` (C-04). +Each finding carries a severity and a snippet. ## Grade decision logic | Grade | Condition | |-------|-----------| -| **A** | C-01 pass Β· C-02 pass Β· C-03 pass. | +| **A** | C-01 pass Β· C-02 pass Β· C-03 pass Β· C-04 pass. | | **B** | C-01 pass Β· C-02 **skipped** (no sandbox / remote target) Β· C-03 pass. Injection passed; egress unverified. | | **C** | Reserved β€” not assigned by the current logic. | -| **D** | C-02 **fail** (unexpected egress / overreach) while C-01 and C-03 pass. Egress is serious but not proven exfiltration, so the grade caps at D. 
| +| **D** | C-02 **fail** (unexpected egress / overreach) **or** C-04 **fail** (crash / internals-leak / amplification) while C-01 and C-03 pass. A robustness or overreach failure is serious but not proven injection or exfiltration, so the grade caps at D. | | **F** | C-01 **fail** (injection) **or** C-03 **fail** (data leak). Active injection or a leak harms an agent that trusts the server. | A grade is always paired with a rationale string explaining *why* β€” the harness never emits a @@ -83,7 +98,7 @@ fingerprint. The harness **grades and hands off**; it does not mint. Publishing a grade means recording it as an **EAS (Ethereum Attestation Service) attestation on Base**, which binds: -`serverRef` Β· `toolDefsFingerprint` Β· overall grade Β· per-category verdicts (C-01/C-02/C-03) Β· +`serverRef` Β· `toolDefsFingerprint` Β· overall grade Β· per-category verdicts (C-01/C-02/C-03/C-04) Β· evidence CID (the bundle, pinned to IPFS) Β· methodology version Β· run timestamp Β· resolved version. From f353528e19e70771d2656376c89b27cdfd607273 Mon Sep 17 00:00:00 2001 From: Ruben Dinis Date: Fri, 26 Jun 2026 12:17:00 +0100 Subject: [PATCH 8/8] polygraph skill: add catalog.json + logo, list in README Adds the required catalog.json (slug=polygraph, install type bankr) so the skill appears in the Bankr Discover catalog, a square logo.svg, and a README table row. Rebased onto current main. --- README.md | 1 + polygraph/catalog.json | 24 ++++++++++++++++++++++++ polygraph/logo.svg | 33 +++++++++++++++++++++++++++++++++ 3 files changed, 58 insertions(+) create mode 100644 polygraph/catalog.json create mode 100644 polygraph/logo.svg diff --git a/README.md b/README.md index 2b9345819a..fd5bec5801 100644 --- a/README.md +++ b/README.md @@ -73,6 +73,7 @@ Bankr Skills equip builders with plug-and-play tools to build more powerful agen | [Zyfai](https://zyf.ai) | [zyfai](zyfai/) | Earn yield on any Ethereum wallet on Base, Arbitrum, and Plasma. 
Deploys a non-custodial Safe subaccount linked to the user's EOA with automated rebalancing across DeFi protocols. Session keys for gasless automation. | | [Aeon](https://github.com/aaronjmars/aeon) |
aeon suite (24 skills)
  • aeon-autoresearch — Evolve any installed skill by generating four improved variations along separate theses, scoring on a weighted rubric, applying the winner.
  • aeon-deal-flow — Weekly funding round tracker across configurable verticals — primary-source required, filters re-announcements and rumors.
  • aeon-deep-research — Exhaustive multi-source synthesis with explicit source credibility tiers, per-finding confidence, mandatory adversarial section.
  • aeon-defi-monitor — Watchlist monitor over DeFi pools, lending markets, and vaults with APR/utilization/TVL/health-factor thresholds.
  • aeon-defi-overview — Daily DeFi read with regime verdict (RISK-ON / NEUTRAL / RISK-OFF), sustainable-vs-incentive yield decomposition, fee leaders.
  • aeon-distribute-tokens — Batch token payouts via the Bankr Wallet API with per-recipient idempotency, two-phase resolve→execute, dry-run preview, and recovery from partial runs.
  • aeon-hacker-news-digest — Top HN stories filtered by configurable interests, with comment-mined insights and themed clustering.
  • aeon-huggingface-trending — Trending HF models, datasets, and spaces, filtered by license sanity and dedup with a "why notable" line per pick.
  • aeon-last30 — Cross-platform social research over the last 30 days across Reddit, X/Twitter, Hacker News, Polymarket, and the open web.
  • aeon-monitor-kalshi — Watchlist-driven Kalshi monitor with cross-venue arb detection vs paired Polymarket markets (fee + slippage adjusted).
  • aeon-monitor-polymarket — Watchlist-driven Polymarket monitor surfacing only markets with material 24h moves, volume spikes, or fresh watched-account comments.
  • aeon-monitor-runners — Top 5 tokens that ran hardest in the past 24h across major chains via GeckoTerminal, with pump-risk filters.
  • aeon-narrative-tracker — Track rising, peaking, and fading narratives with mindshare + velocity and explicit positioning calls (FRONT-RUN / RIDE / FADE / WATCH / IGNORE).
  • aeon-on-chain-monitor — Watchlist monitor over blockchain addresses and contracts: transfers, approvals, upgrades, gas spends, MEV interactions.
  • aeon-paper-pick — One AI/ML paper to read today from Hugging Face Papers — picks for novelty + falsifiability + reproducibility.
  • aeon-reg-monitor — Track legislation and regulatory actions affecting prediction markets, crypto, and AI agents — triaged by Stage × Impact.
  • aeon-rss-digest — Daily roll-up across configurable RSS/Atom/JSON feeds with canonical-URL dedup, themed clustering, weighted ranking.
  • aeon-skill-evals — Validate any installed skill's output against an assertion manifest. Bootstrap mode generates a starter manifest from recent runs.
  • aeon-skill-repair — Auto-diagnose and fix a failing or degraded installed skill; matched-playbook fixes with verification block.
  • aeon-skill-security-scan — Audit installed Bankr skills for shell injection, secret exfiltration, traversal, prompt-override payloads, obfuscation.
  • aeon-token-movers — Daily top movers, losers, and trending coins from CoinGecko with signal enrichment and pump-risk flags.
  • aeon-token-pick — One token and one prediction-market pick per run, scored on a quantified rubric with falsifiable thesis, sizing guidance, and kill criterion.
  • aeon-unlock-monitor — Weekly token unlock tracker quantifying supply pressure via Absorption Ratio with one-line market reads per event.
  • aeon-vuln-scanner — Audit trending repos for exploitable vulnerabilities with Semgrep + TruffleHog + osv-scanner + Slither and responsible-disclosure flow.
| Specialized agent skill suite: research, monitors, market/token picks, regulatory & deal-flow tracking, plus meta skills that scan, eval, repair, and evolve other installed skills. Each sub-skill installs independently from its own folder (expand the Skill column for the full list). |
 | [Starchild](https://starchild.software) | [starchild-dao](starchild-dao/) | Hold-to-govern for the open-source Starchild companion's $STARCHILD token on Base. List proposals, check your weight, and cast gasless for/against EIP-712 votes — propose by holding 10M (no staking, no locking; weight = your live balance). Public by design; the private app never touches it. |
+| [Polygraph](https://polygraph.so) | [polygraph](polygraph/) | Behavioral trust grades (A–F) for MCP servers and Agent Skills. Check a server before your agent uses it (`npx polygraphso check `), grade your own with the open litmus harness (tool-output injection, permission/egress, data-leak, and adversarial-input probes), verify the onchain EAS attestation on Base before executing, and gate CI on grades with the GitHub Action `polygraphso/litmus@v1`. Reproducible by design — anyone can re-run the harness and disprove a bad grade. |
 
 ## Adding a Skill
 
diff --git a/polygraph/catalog.json b/polygraph/catalog.json
new file mode 100644
index 0000000000..87d54654de
--- /dev/null
+++ b/polygraph/catalog.json
@@ -0,0 +1,24 @@
+{
+  "schemaVersion": 1,
+  "slug": "polygraph",
+  "provider": "Polygraph",
+  "providerUrl": "https://polygraph.so",
+  "logo": "logo.svg",
+  "demo": {
+    "title": "demo.sh",
+    "language": "bash",
+    "code": "# Check a server's published grade before your agent uses it\nnpx polygraphso check npm/@modelcontextprotocol/server-filesystem\n# → polygraph: A · litmus-v9\n\n# Grade your own MCP server end-to-end (A–F + reproducible evidence)\nnpx -y -p @polygraphso/litmus polygraphso-litmus litmus npm/@your-scope/your-mcp-server\n\n# Gate CI: fail the build on a D/F MCP server or skill\nnpx @polygraphso/litmus ci"
+  },
+  "setup": [
+    "Install: `install the polygraph skill from https://github.com/BankrBot/skills/tree/main/polygraph`",
+    "Check any MCP server before your agent uses it: `npx polygraphso check //`",
+    "Verify before you trust: require the live tool-surface fingerprint to match the attestation before executing",
+    "Gate CI on grades with the GitHub Action `polygraphso/litmus@v1`, or run `npx @polygraphso/litmus ci` in any pipeline",
+    "Source: https://github.com/BankrBot/skills/tree/main/polygraph"
+  ],
+  "install": {
+    "type": "bankr",
+    "repoPath": "polygraph",
+    "command": "install the polygraph skill from https://github.com/BankrBot/skills/tree/main/polygraph"
+  }
+}
diff --git a/polygraph/logo.svg b/polygraph/logo.svg
new file mode 100644
index 0000000000..94761cf72e
--- /dev/null
+++ b/polygraph/logo.svg
@@ -0,0 +1,33 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
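
As a reading aid for the "Grade decision logic" table in `polygraph/references/methodology.md`, here is a minimal Python sketch of how the four per-category verdicts could map to a letter grade. It is not the litmus harness's code. The function name `decide_grade`, the verdict strings, and the choice to raise on combinations the table does not list (for example a skipped C-01, or C-03 reported as `partial`) are assumptions made for illustration. Requiring a C-04 pass for a B is also an assumption, since the B row names only C-01, C-02, and C-03.

```python
from typing import Literal

# Hypothetical verdict vocabulary; the patch uses "pass", "fail", and "skipped".
Verdict = Literal["pass", "fail", "skipped"]


def decide_grade(c01: Verdict, c02: Verdict, c03: Verdict, c04: Verdict) -> str:
    """Map per-category verdicts to a grade, following the methodology table."""
    # F: active injection (C-01) or a sensitive-data leak (C-03) is disqualifying.
    if c01 == "fail" or c03 == "fail":
        return "F"
    # D: egress/overreach (C-02) or a robustness failure (C-04) caps the grade.
    if c02 == "fail" or c04 == "fail":
        return "D"
    # Assumption: the table only covers runs where C-01, C-03, and C-04 passed,
    # so anything else is rejected rather than guessed at.
    if c01 != "pass" or c03 != "pass" or c04 != "pass":
        raise ValueError("verdict combination not covered by the grade table")
    # B: egress was not verified (no sandbox, or a remote target).
    if c02 == "skipped":
        return "B"
    # A: all four categories passed. C is reserved and never assigned.
    return "A"


if __name__ == "__main__":
    print(decide_grade("pass", "pass", "pass", "pass"))
    print(decide_grade("pass", "skipped", "pass", "pass"))
    print(decide_grade("pass", "pass", "pass", "fail"))
```

Checking F before D mirrors the table's wording that a D applies only "while C-01 and C-03 pass", so a server that both leaks data and fails C-04 still lands on F.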