docs: design spec for #119 — autonomous threat discovery#129
Open
yasirhamza wants to merge 3 commits into
Captures the brainstorming-approved design for /update-rules discover:

- New discover skill producing work list; existing research-threat skill does the heavy lifting per extracted threat name.
- 5 vendor-original RSS sources (securelist, welivesecurity, blog.zimperium, lookout threat-intel, blog.google/TAG). Aggregators and HTML-only sources deferred from v1.
- Per-source cursors in feed-state.json (partial-failure robust). Schema update in submodule adds a top-level 'discover' block.
- Regex extraction (CVE + category-suffix + camelcase) + static denylist + rule-index cross-reference + category-context boost for ranking.
- Parallel fan-out up to N=5 research-threat subagents; failures logged but don't abort the run.
- full-with-discover composite mode dropped (YAGNI).
- Prerequisites #126 and #128 both merged; this spec is unblocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
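As a rough illustration of the regex extraction step listed in the commit above (CVE + category-suffix + camelcase, followed by a static denylist), here is a minimal sketch. The PR text does not show the spec's actual regexes, so the patterns, the category keywords, the denylist entries, and the function name `extract_candidates` are all assumptions made for illustration.

```python
import re

# Hypothetical patterns: the spec's real regexes are not shown in this PR text.
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,7}\b")
# Assumed reading of "category-suffix": a capitalized token followed by a
# malware category word, e.g. "Anatsa banker".
CATEGORY_SUFFIX_RE = re.compile(
    r"\b([A-Z][A-Za-z0-9]{2,})\s+(?:trojan|banker|spyware|stealer|ransomware|RAT)\b"
)
# Tokens with an internal capital letter, e.g. "SpyNote".
CAMELCASE_RE = re.compile(r"\b([A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+)\b")

# Illustrative denylist entries only; the real list lives in the spec.
DENYLIST = frozenset({"JavaScript", "GitHub", "WhatsApp", "Android"})


def extract_candidates(text: str) -> list[str]:
    """Return de-duplicated candidate names in pattern order, minus denylisted ones."""
    found = []
    found += CVE_RE.findall(text)
    found += CATEGORY_SUFFIX_RE.findall(text)
    found += CAMELCASE_RE.findall(text)

    seen = set()
    candidates = []
    for name in found:
        if name in DENYLIST or name in seen:
            continue
        seen.add(name)
        candidates.append(name)
    return candidates
```

The rule-index cross-reference and category-context ranking boost mentioned in the commit are deliberately left out of this sketch.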
Revisions per independent architect review:

- **Hybrid extraction architecture** (reviewer a1, a2, h2): RSS-Python path for securelist/welivesecurity/google-tag via new scripts/discover_extract.py; Web-LLM path for zimperium/lookout via WebFetch on HTML blog indices. Zimperium's RSS no longer exists; Lookout is Cloudflare-fronted. WebFetch's browser path handles both.
- **Extraction pattern 4 added** (reviewer c1): two-word threat names (Silver Fox, Lazarus Group) with STRICT category-context requirement (must be within 5 words of a malware keyword; no context → dropped).
- **Cursor semantics made precise** (reviewer d1, d2): explicit fetch-vs-parse failure distinction; CMS-migration recovery via timestamp-only fallback when last_post_url vanishes and timestamp is 90+ days stale.
- **Denylist guard list expanded** (reviewer c2): adds Silver, Fox, Cozy, Lazarus, Sandworm to the guard. These are natural-English words that are also real threat-family or APT names.
- **Python extraction golden-fixture tests** (reviewer e1): RSS path now has byte-identical-output tests over 3 committed fixture XMLs, catching regex drift / denylist-addition regressions that an LLM-only markdown approach couldn't.
- **Source-URL binding lint** (reviewer e1): guards against a skill edit re-pointing a source at attacker.com.
- **Source-diversity math dropped** (reviewer g1): ceil(N*0.4) over-engineered for N=5; simple first-N slice; add back if analysts report one source drowning others.
- **Input honesty** (reviewer a2): spec admits welivesecurity and google-tag expose description-only (no full body); extraction operates on whatever the source makes available.

No decisions reversed; just tightened against reviewer's concrete probes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
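The strict context gate for pattern 4 (two-word names kept only when a malware keyword appears within 5 words) can be sketched as follows. This is an assumption-laden illustration, not the spec's implementation: the keyword set, the leading-stopword guard, the function name `two_word_candidates`, and the choice to measure the 5-word window from the edges of the name are all hypothetical.

```python
import re

WORD_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]*")
CAPITALIZED_RE = re.compile(r"[A-Z][a-z]+")

# Hypothetical keyword and stopword sets, for illustration only.
MALWARE_KEYWORDS = frozenset(
    {"malware", "trojan", "spyware", "backdoor", "ransomware", "apt", "banker", "stealer"}
)
LEADING_STOPWORDS = frozenset({"The", "A", "An", "This", "That", "These", "In", "On"})
WINDOW = 5  # words of context checked on each side of the two-word name


def two_word_candidates(text: str) -> list[str]:
    """Return capitalized two-word names that have a malware keyword nearby."""
    words = list(WORD_RE.finditer(text))
    lowered = [w.group(0).lower() for w in words]
    hits = []
    for i in range(len(words) - 1):
        first, second = words[i], words[i + 1]
        # The two words must be separated by exactly one space (no punctuation).
        if text[first.end():second.start()] != " ":
            continue
        if not (CAPITALIZED_RE.fullmatch(first.group(0))
                and CAPITALIZED_RE.fullmatch(second.group(0))):
            continue
        # Skip sentence-initial articles such as "The Silver".
        if first.group(0) in LEADING_STOPWORDS:
            continue
        context = lowered[max(0, i - WINDOW):i] + lowered[i + 2:i + 2 + WINDOW]
        # Strict gate: no keyword in the window means the candidate is dropped.
        if MALWARE_KEYWORDS.intersection(context):
            hits.append(f"{first.group(0)} {second.group(0)}")
    return hits
```

With this gate, "Silver Fox" next to "backdoor" is kept, while "Silver Fox" in an unrelated sentence is dropped.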
Major design revision driven by a second skeptical-reviewer round that surfaced two critical gaps in the earlier LLM-primary design I proposed:

1. Single-word threat names (Bitter, Anatsa, Pegasus, Hook, Joker, Hermit, ...), all canonical APT and malware names, were being missed by the original 4-pattern regex. Per user direction, these must be discoverable.
2. LLM-primary extraction across all sources introduced an injection attack surface that didn't exist in the regex-only design.

Resolved by taking the reviewer's proposed middle-ground: regex-enhanced (patterns 1-6) with conditional LLM fallback only for posts where regex returned zero candidates.

Key additions:

- **Pattern 5**: known-families reverse match. Hand-curated YAML list of verified-real names (~50 starting entries, expand organically when dogfood surfaces misses).
- **Pattern 6**: single-word capitalized + strict 3-word context gate. Catches Bitter APT, Hook banker, Joker trojan shapes.
- **LLM fallback**: conditional (only if regex returns zero hits for the post). ~20-75 LLM calls per run; trivial cost at weekly cadence.
- **XPIA resistance test** (per user ask for belt-and-suspenders): fixture with 5 classes of prompt-injection payloads; test asserts LLM output is valid JSON, contains none of the adversary-chosen strings, and passes a structural token-shape validator that rejects candidates with shell metacharacters, URLs, or unreasonable length regardless of what the LLM returned. This structural validator is the non-LLM-dependent defense line.
- **Drift-honesty paragraph**: fuzzy LLM tests only cover curated threats; monthly log review is part of the maintenance contract.
- **Cost threshold**: design targets weekly-or-less cadence; any faster cadence requires revisiting.

User pushed back on full injection-defense hardening (verbatim-substring filter, explicit threat-model section) on the basis that sources are editorially-reviewed vendor blogs. Kept only the token-shape validator (trivially cheap structural defense) and good-prompting hygiene (structured prompt, strict JSON schema). Dropped verbatim-substring filter and threat-model section as unwarranted ceremony at the trust level of these 5 sources.

Implementation phases reorganized: Phase 2 ships regex-only core (byte-identical goldens preserve drift detection); Phase 3 lands the skill + LLM fallback + XPIA tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
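The structural token-shape validator described above might look roughly like the sketch below. It assumes the LLM fallback returns a JSON array of strings; the length bound, the allowed character set, and the names `is_valid_candidate` and `filter_llm_candidates` are hypothetical and not taken from the spec.

```python
import json
import re

MAX_LEN = 40  # assumed upper bound for a plausible threat name

# Allowlist shape: alphanumeric words joined by single spaces or hyphens.
# Because it is an allowlist, shell metacharacters and URL punctuation
# (":", "/", ".", "$", ";", ...) are rejected without being enumerated.
TOKEN_SHAPE_RE = re.compile(r"[A-Za-z0-9]+(?:[ -][A-Za-z0-9]+)*")


def is_valid_candidate(value: object) -> bool:
    """Structural check that does not depend on what the LLM intended."""
    if not isinstance(value, str):
        return False
    if not (2 <= len(value) <= MAX_LEN):
        return False
    return TOKEN_SHAPE_RE.fullmatch(value) is not None


def filter_llm_candidates(raw_json: str) -> list[str]:
    """Parse the LLM output and keep only structurally plausible names."""
    try:
        parsed = json.loads(raw_json)
    except json.JSONDecodeError:
        return []  # invalid JSON yields no candidates
    if not isinstance(parsed, list):
        return []
    return [item for item in parsed if is_valid_candidate(item)]
```

An allowlist is used here instead of a list of forbidden characters so that an unanticipated payload shape fails closed.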
Summary

Captures the brainstorming-approved design for `/update-rules discover`: scrape 5 vendor sources (3 RSS + 2 HTML blog indices), extract named threats via regex + LLM hybrid, fan out parallel `update-rules-research-threat` subagents per discovered name. SIRs flow through the existing pipeline unchanged from Step 3 onwards.

Prerequisites both merged 2026-04-16:

- `threat_research` added to `allowed-sources.json` (submodule rules#7). Unblocks threat-research-sourced IOCs from reaching `ioc-data/*.yml`.

Key design choices (each selected from 2–3 alternatives during brainstorming)

- RSS path: `securelist`, `welivesecurity`, `google-tag`
- Web path: `zimperium` (RSS moved), `lookout` (Cloudflare-fronted)
- Extraction: `scripts/discover_extract.py` (new) for RSS: regex (CVE + category-suffix + camelcase + two-word-with-context) + static denylist + rule-index cross-reference. Web path uses LLM with same rule set documented in the skill.
- Dropped: `full-with-discover` composite mode (YAGNI; users can run `full` then `discover` back-to-back)

Skeptical independent review

Went through a skeptical architect-review pass that probed live RSS URLs and found 2 of 5 broken as originally specified. The amendment commit (`4e80361`) addresses the review findings.

Original commit: `e28e854`. Amendment: `4e80361`.

Closes #119

Implementation plan handling

This PR merges with `Closes #119` to keep issue tracking flat (same pattern as #117's spec PR #123). The implementation plan produced by `superpowers:writing-plans` will live on a separate long-lived Draft PR that stays open through implementation work; the plan is versioned and reviewable in git without forcing it onto main before the feature actually ships.

Test plan
🤖 Generated with Claude Code