Version: v2.3.9, repo HEAD 326a2b489411a20ed742ff13701be39ba00063c8, static mode (--no-llm), running via the repo's own Dockerfile.
Observation: scanning a private, hand-vetted skill library (ground truth: benign), ~90% of sampled HIGH findings were the pattern that #145 described — security documentation being flagged as threats. #145 is closed, but the pattern is still live at this HEAD. Examples:
- A credential-hygiene procedure document (markdown describing how to check .env files and key handling) → flagged Privilege Escalation (PE3), HIGH.
- A shell-script security comment (literally the text "URL is user-supplied and untrusted… never eval'd… no credentials") → flagged Rogue Agent (RA1), confidence 0.9.
- An uninstaller deleting its own downloaded models/cache → Tool Misuse (TM1), 0.95.
- Onboarding tone guidance in a docs file ("don't lecture users about principles") → Anti-Refusal (AR2).
We saw the same behavior on public corpora: scanning anthropics/skills as a presumed-benign control produced 260 findings and an overall CRITICAL rating, with OOXML .xsd schema files flagged as Prompt Injection.
Why it matters: documentation about security is a routine part of any well-maintained skill library, and it's exactly the corpus that scores worst. When vetted-benign and official first-party repos all rate CRITICAL/DO_NOT_INSTALL, the static verdict stops carrying signal and users learn to override it — which defeats the gate.
Suggested direction: context-gate the semantic string rules — e.g. downweight or exempt matches inside comments and prose/markdown documentation files, or require a code-execution context for PE/RA/AR rules to fire at HIGH.
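To make the suggestion concrete, here is a minimal sketch of what such a gate could look like as a post-processing step over findings. Everything in it is hypothetical: the `Finding` shape, the suffix and comment-prefix lists, and the choice to cap at MEDIUM are illustrative assumptions, not the scanner's actual data model or API.

```python
from dataclasses import dataclass, replace
from pathlib import PurePosixPath

# All names below are hypothetical; none are taken from the scanner's codebase.
DOC_SUFFIXES = {".md", ".markdown", ".rst", ".txt"}
COMMENT_PREFIXES = ("#", "//", "--", ";")
GATED_RULE_PREFIXES = ("PE", "RA", "AR")  # rule families named in this issue


@dataclass(frozen=True)
class Finding:
    rule_id: str    # e.g. "PE3"
    severity: str   # "LOW", "MEDIUM", "HIGH", or "CRITICAL"
    path: str       # file the match was found in
    line_text: str  # the source line containing the match


def is_non_executable_context(finding: Finding) -> bool:
    """Rough heuristic: prose/doc files, or a line that starts with a comment marker."""
    if PurePosixPath(finding.path).suffix.lower() in DOC_SUFFIXES:
        return True
    return finding.line_text.lstrip().startswith(COMMENT_PREFIXES)


def gate(finding: Finding) -> Finding:
    """Cap gated rule families at MEDIUM when the match is not in executable context."""
    if (
        finding.severity == "HIGH"
        and finding.rule_id.startswith(GATED_RULE_PREFIXES)
        and is_non_executable_context(finding)
    ):
        return replace(finding, severity="MEDIUM")
    return finding
```

Under these assumptions, a PE3 hit in a markdown file or an RA1 hit on a shell comment line would drop to MEDIUM, while the same rule matching an executable line would stay HIGH. The comment-prefix check is deliberately crude (it also treats a shebang line as a comment and misses trailing or block comments), so a real implementation would probably want language-aware parsing.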
Happy to provide the sanitized findings JSON if useful.