
docs: design spec for #119 — autonomous threat discovery#129

Open: yasirhamza wants to merge 3 commits into main from docs/issue-119-discover-design

yasirhamza (Owner) commented Apr 16, 2026

Summary

Captures the brainstorming-approved design for /update-rules discover: scrape 5 vendor sources (3 RSS + 2 HTML blog indices), extract named threats via regex + LLM hybrid, fan out parallel update-rules-research-threat subagents per discovered name. SIRs flow through the existing pipeline unchanged from Step 3 onwards.

Prerequisites #126 and #128 both merged 2026-04-16.

Key design choices (each selected from 2–3 alternatives during brainstorming)

  • Cursor granularity: per-source (each source advances independently; robust to partial failures)
  • Source list (v1): 5 sources via hybrid paths, URLs probed live on 2026-04-17:
    • RSS path (Python, deterministic): securelist, welivesecurity, google-tag
    • Web path (LLM via WebFetch): zimperium (RSS moved), lookout (Cloudflare-fronted)
  • Extraction: scripts/discover_extract.py (new) for the RSS path: regex (CVE + category-suffix + camelcase + two-word-with-context) + static denylist + rule-index cross-reference. The web path uses an LLM that follows the same rule set, as documented in the skill.
  • Research fan-out: parallel (up to N=5 concurrent subagents, per the existing dispatcher's established pattern)
  • Dropped from v1: full-with-discover composite mode (YAGNI; users can run full then discover back-to-back)
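
To make the RSS-path extraction concrete, here is a minimal sketch covering three of the named patterns (CVE, category-suffix, camelcase) plus the denylist and rule-index cross-reference. The regexes, the `DENYLIST` entries, and the `extract_candidates` name are illustrative assumptions and are not taken from scripts/discover_extract.py.

```python
import re

# Hypothetical, simplified patterns; the authoritative rule set lives in the spec.
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,7}\b")
# Capitalized word followed by a malware-category suffix, e.g. "Anatsa trojan".
CATEGORY_SUFFIX_RE = re.compile(
    r"\b([A-Z][a-z]+)\s+(?:trojan|banker|spyware|stealer|ransomware|RAT)\b"
)
# CamelCase tokens with at least two capitalized humps, e.g. "SpyNote".
CAMELCASE_RE = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")

# Assumed denylist entries: camelcase words that are not threat names.
DENYLIST = {"JavaScript", "GitHub", "WordPress"}


def extract_candidates(text: str, known_rules: set[str]) -> list[str]:
    """Return de-duplicated candidate names in pattern order.

    known_rules holds lowercased names that already have a rule (the
    rule-index cross-reference); those are skipped.
    """
    found = []
    found += CVE_RE.findall(text)
    found += CATEGORY_SUFFIX_RE.findall(text)
    found += CAMELCASE_RE.findall(text)

    seen = set()
    out = []
    for name in found:
        if name in DENYLIST or name.lower() in known_rules or name in seen:
            continue
        seen.add(name)
        out.append(name)
    return out
```

Because the function is pure and deterministic, the same input always yields the same output, which is the property that byte-identical golden tests rely on.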

Skeptical independent review

The spec went through a skeptical architect-review pass that probed the live RSS URLs and found 2 of 5 broken as originally specified. The amendment commit (4e80361) addresses:

  • Hybrid RSS-Python + Web-LLM instead of RSS-only (fixes broken zimperium + Cloudflare lookout)
  • Two-word threat-name pattern added (catches Silver Fox, Lazarus Group class)
  • Cursor semantics made precise (fetch-vs-parse failure, CMS migration fallback)
  • Denylist guard list expanded (guards against Silver/Fox/Cozy/Lazarus being accidentally added as "that's just English")
  • Python extraction golden-fixture tests catch regex drift that LLM-only markdown couldn't
  • Source-URL binding lint prevents a skill edit silently re-pointing a source at attacker.com
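
The source-URL binding lint could look roughly like the sketch below: each source id is pinned to one expected hostname, and any configured URL that points elsewhere is reported. The `EXPECTED_HOSTS` table, its hostnames, and the `lint_source_urls` name are assumptions for illustration; the spec defines the real binding.

```python
from urllib.parse import urlparse

# Assumed binding table: source id -> the only hostname it may point at.
EXPECTED_HOSTS = {
    "securelist": "securelist.com",
    "welivesecurity": "www.welivesecurity.com",
}


def lint_source_urls(configured: dict[str, str]) -> list[str]:
    """Return one error string per source whose URL breaks its binding."""
    errors = []
    for source_id, url in configured.items():
        parsed = urlparse(url)
        expected = EXPECTED_HOSTS.get(source_id)
        if expected is None:
            errors.append(f"{source_id}: unknown source id")
        elif parsed.scheme != "https" or parsed.hostname != expected:
            errors.append(f"{source_id}: {url} is not bound to https://{expected}")
    return errors
```

Comparing the parsed hostname exactly, instead of doing a substring check on the URL, means a lookalike such as `securelist.com.attacker.com` is also rejected.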

Original commit: e28e854. Amendment: 4e80361.

Closes #119

Implementation plan handling

This PR merges with Closes #119 to keep issue tracking flat (same pattern as #117's spec PR #123). The implementation plan produced by superpowers:writing-plans will live on a separate long-lived Draft PR that stays open through implementation work — the plan is versioned and reviewable in git without forcing it onto main before the feature actually ships.

Test plan

  • No code changes in this PR, docs-only
  • Spec self-review (placeholder / consistency / scope / ambiguity) passed
  • Independent skeptical architect review; all CRITICAL and MAJOR findings addressed in-commit
  • Implementation plan produced via `superpowers:writing-plans` after spec approval (lives on separate Draft PR, not merged with this)

🤖 Generated with Claude Code

Yasir and others added 3 commits April 17, 2026 00:28
Captures the brainstorming-approved design for /update-rules discover:

- New discover skill producing work list; existing research-threat skill
  does the heavy lifting per extracted threat name.
- 5 vendor-original RSS sources (securelist, welivesecurity, blog.zimperium,
  lookout threat-intel, blog.google/TAG). Aggregators and HTML-only sources
  deferred from v1.
- Per-source cursors in feed-state.json (partial-failure robust). Schema
  update in submodule adds a top-level 'discover' block.
- Regex extraction (CVE + category-suffix + camelcase) + static denylist
  + rule-index cross-reference + category-context boost for ranking.
- Parallel fan-out up to N=5 research-threat subagents; failures logged
  but don't abort the run.
- full-with-discover composite mode dropped (YAGNI).
- Prerequisites #126 and #128 both merged; this spec is unblocked.
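
The fan-out behaviour described above (at most N=5 concurrent workers, failures logged without aborting the run) can be sketched with a thread pool. This only illustrates the concurrency pattern: the real dispatch launches subagents, and `research_threat` here is a hypothetical callable standing in for that step.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

log = logging.getLogger("discover")
MAX_CONCURRENT = 5  # N=5 from the spec


def fan_out(names, research_threat):
    """Run research_threat(name) for every name, at most 5 at a time.

    Returns (results, failures). A failing name is logged and recorded,
    and the remaining names still run.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
        futures = {pool.submit(research_threat, n): n for n in names}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                results[name] = fut.result()
            except Exception as exc:  # deliberately broad: one failure must not abort the run
                log.warning("research failed for %s: %s", name, exc)
                failures[name] = str(exc)
    return results, failures
```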

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Revisions per independent architect review:

- **Hybrid extraction architecture** (reviewer a1, a2, h2): RSS-Python path
  for securelist/welivesecurity/google-tag via new
  scripts/discover_extract.py; Web-LLM path for zimperium/lookout via
  WebFetch on HTML blog indices. Zimperium's RSS no longer exists;
  Lookout is Cloudflare-fronted. WebFetch's browser path handles both.
- **Extraction pattern 4 added** (reviewer c1): two-word threat names
  (Silver Fox, Lazarus Group) with STRICT category-context requirement
  (must be within 5 words of a malware keyword; no context → dropped).
- **Cursor semantics made precise** (reviewer d1, d2): explicit
  fetch-vs-parse failure distinction; CMS-migration recovery via
  timestamp-only fallback when last_post_url vanishes and timestamp is
  90+ days stale.
- **Denylist guard list expanded** (reviewer c2): adds Silver, Fox, Cozy,
  Lazarus, Sandworm to the guard — natural-English words that are also
  real threat-family or APT names.
- **Python extraction golden-fixture tests** (reviewer e1): RSS path now
  has byte-identical-output tests over 3 committed fixture XMLs, catching
  regex drift / denylist-addition regressions that an LLM-only markdown
  approach couldn't.
- **Source-URL binding lint** (reviewer e1): guards against a skill edit
  re-pointing a source at attacker.com.
- **Source-diversity math dropped** (reviewer g1): ceil(N*0.4) over-
  engineered for N=5; simple first-N slice; add back if analysts report
  one source drowning others.
- **Input honesty** (reviewer a2): spec admits welivesecurity and
  google-tag expose description-only (no full body); extraction operates
  on whatever the source makes available.

No decisions reversed; just tightened against reviewer's concrete probes.
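
As an illustration of the CMS-migration recovery rule, the sketch below selects new posts for one source. It assumes the feed is newest-first and that the cursor stores a URL plus a timestamp; the `Post` and `Cursor` types and the `select_new_posts` name are hypothetical. The commit message does not say what happens when `last_post_url` is missing but the cursor is still fresh, so the `"needs-review"` branch is purely a placeholder assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=90)  # "90+ days stale" threshold from the revision notes


@dataclass
class Post:
    url: str
    published: datetime


@dataclass
class Cursor:
    last_post_url: str
    last_post_time: datetime


def select_new_posts(posts, cursor, now):
    """posts are assumed newest-first. Returns (new_posts, mode)."""
    urls = [p.url for p in posts]
    if cursor.last_post_url in urls:
        # Normal case: everything above the remembered post is new.
        return posts[: urls.index(cursor.last_post_url)], "url"
    if now - cursor.last_post_time >= STALE_AFTER:
        # Remembered URL vanished and the cursor is stale: compare timestamps only.
        newer = [p for p in posts if p.published > cursor.last_post_time]
        return newer, "timestamp-fallback"
    # Placeholder assumption: URL missing but cursor fresh; emit nothing and flag it.
    return [], "needs-review"
```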

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Major design revision driven by a second skeptical-reviewer round that
surfaced two critical gaps in the earlier LLM-primary design I proposed:

1. Single-word threat names (Bitter, Anatsa, Pegasus, Hook, Joker,
   Hermit, ...) — canonical APT and malware names — were being missed
   by the original 4-pattern regex. Per user direction, these must be
   discoverable.
2. LLM-primary extraction across all sources introduced an injection
   attack surface that didn't exist in the regex-only design.

Resolved by taking the reviewer's proposed middle-ground: regex-enhanced
(patterns 1-6) with conditional LLM fallback only for posts where regex
returned zero candidates. Key additions:

- **Pattern 5** — known-families reverse match. Hand-curated YAML list
  of verified-real names (~50 starting entries, expand organically
  when dogfood surfaces misses).
- **Pattern 6** — single-word capitalized + strict 3-word context gate.
  Catches Bitter APT, Hook banker, Joker trojan shapes.
- **LLM fallback** — conditional (only if regex returns zero hits for
  the post). ~20-75 LLM calls per run; trivial cost at weekly cadence.
- **XPIA resistance test** (per user ask for belt-and-suspenders) —
  fixture with 5 classes of prompt-injection payloads; test asserts
  LLM output is valid JSON, contains none of the adversary-chosen
  strings, and passes a structural token-shape validator that rejects
  candidates with shell metacharacters, URLs, or unreasonable length
  regardless of what the LLM returned. This structural validator is
  the non-LLM-dependent defense line.
- **Drift-honesty paragraph** — fuzzy LLM tests only cover curated
  threats; monthly log review is part of the maintenance contract.
- **Cost threshold** — design targets weekly-or-less cadence; any
  faster cadence requires revisiting.

User pushed back on full injection-defense hardening (verbatim-substring
filter, explicit threat-model section) on the basis that sources are
editorially-reviewed vendor blogs. Kept only the token-shape validator
(trivially cheap structural defense) and good-prompting hygiene
(structured prompt, strict JSON schema). Dropped verbatim-substring
filter and threat-model section as unwarranted ceremony at the trust
level of these 5 sources.
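
A minimal sketch of what such a token-shape validator might look like, applied to every candidate no matter what produced it. The length bound, the metacharacter set, and the allowed shape are assumptions, not values from the spec.

```python
import re

MAX_LEN = 40  # assumed upper bound on a plausible threat name
SHELL_META = set("`$|;&<>(){}[]\\\"'*?!~#\n\r")
URL_RE = re.compile(r"(?i)(?:https?://|www\.)")
# Allowed shape: alphanumeric runs joined by a single space, hyphen, or underscore.
SHAPE_RE = re.compile(r"[A-Za-z0-9]+(?:[ _-][A-Za-z0-9]+)*")


def is_valid_candidate(name: str) -> bool:
    """Structural check only; it never consults the LLM or the source text."""
    if not name or len(name) > MAX_LEN:
        return False
    if any(ch in SHELL_META for ch in name):
        return False
    if URL_RE.search(name):
        return False
    return SHAPE_RE.fullmatch(name) is not None
```

Since the check depends only on the candidate string, it holds even if an injected payload fully controls the LLM's output.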

Implementation phases reorganized: Phase 2 ships regex-only core
(byte-identical goldens preserve drift detection); Phase 3 lands the
skill + LLM fallback + XPIA tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>