Summary
Extend the blocklist functionality to include a static/curated blocklist for domains that serve low-quality or misinformation content. This is separate from the existing offenders list, which is reactive (populated after prompt injection detections).
Current Behavior
The offenders list only blocks domains after they've been flagged for prompt injections (detection_count >= 3, high confidence, etc.). There's no way to proactively block known low-quality sources.
Proposed Behavior
Add a static blocklist that:
- Blocks domains immediately without fetching
- Returns a clear response indicating the domain is blocked as a low-quality source
- Is configurable (via config.toml or a separate file)
- Runs before the fetch step (same place as the offenders check in core.py)
Use Case
Services like Grokopedia (Grok's Wikipedia alternative) that produce unreliable or AI-hallucinated "encyclopedia" content. When an agent hands off a Grokopedia link, Shutter should reject it immediately rather than:
- Wasting tokens fetching the content
- Potentially extracting and passing along misinformation
Suggested Implementation
- Add a blocklist table or config section with:
  - domain: The blocked domain
  - reason: Why it's blocked (e.g., "misinformation", "low-quality", "ai-slop")
  - added_date: When it was added
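A hypothetical config.toml fragment matching those fields; the [[blocklist]] array-of-tables name and keys are assumptions for illustration, not an existing Shutter schema:

```toml
# Illustrative only -- table name and keys are not Shutter's actual config schema
[[blocklist]]
domain = "grokipedia.com"
reason = "misinformation"
added_date = "2025-01-15"

[[blocklist]]
domain = "grokopedia.com"
reason = "misinformation"
added_date = "2025-01-15"
```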
- Check the static blocklist before the offenders list in core.py
- Return a distinct prompt_injection.type, "domain_blocklisted" (vs. "domain_blocked" for offenders)
- Provide CLI commands:
  - shutter blocklist add <domain> --reason <reason>
  - shutter blocklist remove <domain>
  - shutter blocklist list
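A minimal sketch of those subcommands using argparse; the in-memory dict stands in for whatever storage Shutter actually uses, and all names here are illustrative.

```python
import argparse
from datetime import date

# Stand-in storage: domain -> {"reason": ..., "added_date": ...}
blocklist: dict[str, dict] = {}

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="shutter")
    sub = parser.add_subparsers(dest="command", required=True)
    bl = sub.add_parser("blocklist").add_subparsers(dest="action", required=True)
    add = bl.add_parser("add")
    add.add_argument("domain")
    add.add_argument("--reason", default="low-quality")
    remove = bl.add_parser("remove")
    remove.add_argument("domain")
    bl.add_parser("list")
    return parser

def run(argv: list[str]) -> None:
    args = build_parser().parse_args(argv)
    if args.action == "add":
        blocklist[args.domain] = {
            "reason": args.reason,
            "added_date": date.today().isoformat(),
        }
    elif args.action == "remove":
        blocklist.pop(args.domain, None)
    elif args.action == "list":
        for domain, meta in blocklist.items():
            print(f"{domain}\t{meta['reason']}\t{meta['added_date']}")
```

For example, `run(["blocklist", "add", "grokipedia.com", "--reason", "misinformation"])` records the domain with today's date.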
Initial Blocklist Candidates
- grokopedia.com / grokipedia.com (Grok's Wikipedia)
- Other AI-generated encyclopedia/wiki farms
Notes
This differs from the offenders list in that:
- Offenders: Reactive, evidence-based (detected injections)
- Blocklist: Proactive, curated (known bad actors)
Both serve the goal of protecting downstream agents from bad content.