
Static analyzers mass-flag security documentation and .env references as Privilege Escalation / Rogue Agent (extends #145) #251

Description

@csgrullon

Version: v2.3.9, repo HEAD 326a2b489411a20ed742ff13701be39ba00063c8, static mode (--no-llm), running via the repo's own Dockerfile.

Observation: scanning a private, hand-vetted skill library (ground truth: benign), ~90% of sampled HIGH findings matched the pattern that #145 described: security documentation being flagged as a threat. #145 is closed, but the pattern still reproduces at this HEAD. Examples:

  • A credential-hygiene procedure document (markdown describing how to check .env files and key handling) → flagged Privilege Escalation (PE3), HIGH.
  • A shell-script security comment (literally the text "URL is user-supplied and untrusted… never eval'd… no credentials") → flagged Rogue Agent (RA1), confidence 0.9.
  • An uninstaller deleting its own downloaded models/cache → flagged Tool Misuse (TM1), confidence 0.95.
  • Onboarding tone guidance in a docs file ("don't lecture users about principles") → flagged Anti-Refusal (AR2).

We saw the same behavior on public corpora: scanning anthropics/skills as a presumed-benign control produced 260 findings / overall CRITICAL, with OOXML .xsd schema files flagged as Prompt Injection.

Why it matters: documentation about security is a routine part of any well-maintained skill library, and it's exactly the corpus that scores worst. When vetted-benign and official first-party repos all rate CRITICAL/DO_NOT_INSTALL, the static verdict stops carrying signal and users learn to override it — which defeats the gate.

Suggested direction: context-gate the semantic string rules — e.g. downweight or exempt matches inside comments and prose/markdown documentation files, or require a code-execution context for PE/RA/AR rules to fire at HIGH.
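
To make the suggestion concrete, here is a minimal sketch of what such a context gate could look like. It is not based on the scanner's actual code: `gate_severity`, `is_non_executable_context`, the suffix and comment-prefix lists, and the LOW/MEDIUM severity names are all hypothetical, and the comment detection is deliberately naive (line comments only).

```python
from pathlib import PurePosixPath

# Hypothetical: file suffixes treated as prose/documentation rather than code.
DOC_SUFFIXES = {".md", ".markdown", ".rst", ".txt"}
# Hypothetical: line-comment prefixes (shell/Python and C-like languages).
COMMENT_PREFIXES = ("#", "//")
# Rule families named in this report that would need a code-execution
# context to keep a HIGH rating.
GATED_FAMILIES = ("PE", "RA", "AR")
# Assumed severity scale; only HIGH and CRITICAL appear in the report.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def is_non_executable_context(path: str, matched_line: str) -> bool:
    """Return True if the match sits in a docs file or on a comment line."""
    if PurePosixPath(path).suffix.lower() in DOC_SUFFIXES:
        return True
    return matched_line.lstrip().startswith(COMMENT_PREFIXES)


def gate_severity(rule_id: str, severity: str, path: str, matched_line: str) -> str:
    """Cap gated rule families at MEDIUM when the match is not in executable code."""
    if not rule_id.startswith(GATED_FAMILIES):
        return severity  # e.g. TM rules are left untouched by this gate
    if not is_non_executable_context(path, matched_line):
        return severity  # match is in code: keep the original rating
    capped = min(SEVERITY_ORDER.index(severity), SEVERITY_ORDER.index("MEDIUM"))
    return SEVERITY_ORDER[capped]
```

With this gate, a PE3 match in a markdown file or an RA1 match on a shell comment line would be capped at MEDIUM, while the same rules matching a non-comment line of a script would stay HIGH. Capping rather than suppressing keeps the finding visible for review without letting it drive the overall verdict.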

Happy to provide the sanitized findings JSON if useful.
