Part of the Nebula Forge security tools suite.
ClusterIQ groups ECS-lite alerts by signal fingerprint, then scores each cluster contextually to determine whether it should be escalated, reviewed, or suppressed as noise. Unlike naive deduplication, ClusterIQ never suppresses a cluster solely because its signals match — context always wins.
Two alerts with identical signals (same process name, same destination IP, same event action) receive different verdicts if one carries a TI tag or involves a rare user:
Cluster A — powershell.exe / C2_IP → ESCALATE (TI indicator present)
Cluster B — powershell.exe / C2_IP → REVIEW (off-hours, rare user)
Cluster C — powershell.exe / C2_IP → SUPPRESSED (known user, business hours, no TI)
- Signal fingerprinting — configurable cluster-by fields (process.name, event.action, destination IP, etc.)
- Fuzzy cluster merging — token-level Jaccard similarity above a configurable threshold
- Five context dimensions — TI indicators, critical asset heuristic, user anomaly, time-of-day, hit rate
- Context-first verdict engine — TI tags always escalate regardless of similarity score
- Exact deduplication — `/api/deduplicate` with sliding time-window
- Session library — persistent SQLite storage, search, pagination, export
- Export — JSON and Markdown per session
- CLI — offline analysis without the web UI
- Integration — accepts ECS-lite directly from LogNorm (port 5006)
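The fuzzy merging feature relies on token-level Jaccard similarity. A minimal sketch of that measure follows. The tokenizer (lowercase alphanumeric runs), the `|`-joined fingerprint strings, and the function names are assumptions made for illustration; ClusterIQ's own tokenization may differ.

```python
import re


def tokenize(fingerprint: str) -> set:
    """Split a fingerprint into lowercase alphanumeric tokens (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", fingerprint.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity |A & B| / |A | B| over the two token sets."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical fingerprints that differ only in the last octet of the IP.
left = "powershell.exe|network_connection|10.0.0.5"
right = "powershell.exe|network_connection|10.0.0.9"
score = jaccard(left, right)   # 6 shared tokens out of 8 distinct tokens
should_merge = score >= 0.75   # 0.75 is the documented default threshold
```

Under these assumptions the two fingerprints share 6 of 8 distinct tokens, so they sit exactly on the default threshold and would be merged.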
cd ClusterIQ
pip install -r requirements.txt
cp config.example.yaml config.yaml # optional
python app.py

Open http://127.0.0.1:5009.
- Paste or upload ECS-lite alerts (JSON array or NDJSON).
- Select the fields to cluster by and set the similarity threshold.
- Click Cluster Alerts.
- Results appear as color-coded cluster cards:
  - Red — Escalate
  - Yellow — Review
  - Grey — Suppressed
- Click any card to open the detail modal (Overview, Context Scores, Members).
# Cluster alerts, print summary
python cli.py --alerts alerts.json
# Custom fields and threshold
python cli.py --alerts alerts.json --fields process.name,event.action --threshold 0.8
# Save output as Markdown
python cli.py --alerts alerts.json --output session.md
# Deduplicate with 5-minute window
python cli.py --dedup --alerts alerts.json --window 300
# Print as JSON
python cli.py --alerts alerts.json --format json

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/health` | Health check |
| POST | `/api/cluster` | Cluster alerts |
| POST | `/api/deduplicate` | Remove exact duplicates |
| GET | `/api/sessions` | List sessions (paginated) |
| GET | `/api/session/<id>` | Get a single session |
| DELETE | `/api/session/<id>` | Delete a session |
| GET | `/api/session/<id>/export` | Export (JSON or Markdown) |
Request body for POST /api/cluster:

{
  "alerts": [{...ECS-lite...}],
  "similarity_threshold": 0.75,
  "cluster_by": ["process.name", "event.action", "network.destination.ip"],
  "label": "Monday SOC triage",
  "save": true
}

The endpoint also accepts alerts_json (a raw JSON string) and multipart/form-data.
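As a rough illustration of calling the clustering endpoint from a script, the sketch below builds the JSON request with the standard library only. The helper names `build_cluster_request` and `post_cluster` are hypothetical, the sample alert shape is an assumption about ECS-lite, and the base URL is the default port documented here.

```python
import json
import urllib.request


def build_cluster_request(alerts, base_url="http://127.0.0.1:5009"):
    """Build (but do not send) a POST /api/cluster request."""
    payload = {
        "alerts": alerts,
        "similarity_threshold": 0.75,
        "cluster_by": ["process.name", "event.action", "network.destination.ip"],
        "label": "Monday SOC triage",
        "save": True,
    }
    return urllib.request.Request(
        f"{base_url}/api/cluster",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_cluster(alerts):
    """Send the request to a running ClusterIQ instance and decode the JSON reply."""
    with urllib.request.urlopen(build_cluster_request(alerts), timeout=30) as resp:
        return json.load(resp)


# Assumed ECS-lite shape for a single alert; real alerts carry more fields.
request = build_cluster_request([{"process": {"name": "powershell.exe"}}])
```

Separating request construction from sending keeps the payload easy to inspect or log before anything touches the network.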
Response:
{
  "success": true,
  "session_id": "uuid",
  "clusters": [...],
  "original_count": 847,
  "cluster_count": 12,
  "suppressed_count": 821,
  "review_count": 18,
  "escalate_count": 8,
  "noise_reduction_pct": 96.8
}

Request body for POST /api/deduplicate:

{"alerts": [...], "window_seconds": 300}

Response: {"success": true, "unique": [...], "removed": 103, "original": 206}
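One plausible reading of exact deduplication with a sliding time window is sketched below. It assumes that two alerts are duplicates when every field except the timestamp is identical, that timestamps live in `@timestamp` as ISO 8601 strings, and that each repeat extends the window. None of these details is confirmed by this README, so treat the function as an illustration rather than the endpoint's actual logic.

```python
import json
from datetime import datetime


def deduplicate(alerts, window_seconds=300, ts_field="@timestamp"):
    """Drop alerts whose non-timestamp content repeats within the window."""
    last_seen = {}  # content key -> timestamp of the most recent occurrence
    unique = []
    ordered = sorted(alerts, key=lambda a: datetime.fromisoformat(a[ts_field]))
    for alert in ordered:
        ts = datetime.fromisoformat(alert[ts_field])
        body = {k: v for k, v in alert.items() if k != ts_field}
        key = json.dumps(body, sort_keys=True)
        previous = last_seen.get(key)
        if previous is None or (ts - previous).total_seconds() > window_seconds:
            unique.append(alert)
        last_seen[key] = ts  # sliding window: every occurrence resets the clock
    return {
        "success": True,
        "unique": unique,
        "removed": len(alerts) - len(unique),
        "original": len(alerts),
    }


sample = [
    {"@timestamp": "2024-01-08T10:00:00+00:00", "process.name": "powershell.exe"},
    {"@timestamp": "2024-01-08T10:02:00+00:00", "process.name": "powershell.exe"},
    {"@timestamp": "2024-01-08T10:20:00+00:00", "process.name": "powershell.exe"},
    {"@timestamp": "2024-01-08T10:01:00+00:00", "process.name": "cmd.exe"},
]
result = deduplicate(sample, window_seconds=300)
```

In this sample the 10:02 alert repeats the 10:00 alert inside the 300-second window and is dropped, while the 10:20 alert arrives after the window has lapsed and is kept.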
| Condition | Verdict |
|---|---|
| TI indicator in any member | escalate (always, overrides similarity) |
| Critical asset + risk ≥ 65% | escalate |
| Rare/unknown user ≥ 65% | escalate |
| Off-hours ≥ 30% of members | review |
| Uncommon user 20–65% | review |
| Elevated asset risk 30–65% | review |
| No anomalous context | suppressed |
Context overrides signal similarity at every level.
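The rule table above can be read as an ordered cascade in which the first matching condition decides the verdict. The sketch below encodes it that way. The `ClusterContext` class, its field names, the use of fractions between 0 and 1, and the inclusive lower bounds are assumptions chosen to mirror the table, not the engine's real data model.

```python
from dataclasses import dataclass


@dataclass
class ClusterContext:
    """Hypothetical per-cluster context scores, each expressed as a 0..1 fraction."""
    has_ti: bool = False
    has_critical_asset: bool = False
    asset_risk: float = 0.0
    user_anomaly: float = 0.0
    time_anomaly: float = 0.0


def assign_verdict(ctx: ClusterContext) -> str:
    """Evaluate escalate rules first, then review rules, else suppress."""
    if ctx.has_ti:
        return "escalate"  # a TI indicator always wins
    if ctx.has_critical_asset and ctx.asset_risk >= 0.65:
        return "escalate"
    if ctx.user_anomaly >= 0.65:
        return "escalate"
    if ctx.time_anomaly >= 0.30:
        return "review"
    if ctx.user_anomaly >= 0.20:
        return "review"
    if ctx.asset_risk >= 0.30:
        return "review"
    return "suppressed"


# The three powershell.exe clusters from the introduction, with assumed scores.
cluster_a = assign_verdict(ClusterContext(has_ti=True))
cluster_b = assign_verdict(ClusterContext(user_anomaly=0.4, time_anomaly=0.5))
cluster_c = assign_verdict(ClusterContext())
```

Because the checks run in priority order, a cluster that satisfies both an escalate rule and a review rule is always escalated.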
| Dimension | Description |
|---|---|
| `ti_tags` | Threat-intel indicator patterns in tags, rule.tags, threat.indicator.* |
| `has_critical_asset` | Hostname heuristic: dc-, exchange, sql, prod-, srv-, backup, etc. |
| `user_anomaly` | Users appearing in ≤ 1% of session alerts |
| `asset_risk` | Fraction of cluster members on critical assets |
| `time_anomaly` | Fraction of members outside Mon–Fri 09:00–17:00 |
| `hit_rate_anomaly` | Cluster size relative to session average |
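Two of these dimensions, user rarity and time of day, are simple enough to sketch directly. The code below assumes flat alert dictionaries with `user.name` and `@timestamp` keys, evaluates business hours in each timestamp's own offset, and defines the per-cluster user score as the fraction of members raised by a rare user. All three choices are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime


def rare_users(session_alerts, max_share=0.01):
    """Users whose alerts make up at most max_share of the whole session."""
    counts = Counter(a.get("user.name") for a in session_alerts)
    total = len(session_alerts)
    return {user for user, n in counts.items() if n / total <= max_share}


def user_anomaly(members, rare):
    """Fraction of cluster members raised by a rare user (assumed definition)."""
    if not members:
        return 0.0
    return sum(m.get("user.name") in rare for m in members) / len(members)


def is_off_hours(ts):
    """True outside Mon-Fri 09:00-17:00."""
    return ts.weekday() >= 5 or not (9 <= ts.hour < 17)


def time_anomaly(members):
    """Fraction of cluster members with an off-hours timestamp."""
    if not members:
        return 0.0
    off = sum(is_off_hours(datetime.fromisoformat(m["@timestamp"])) for m in members)
    return off / len(members)


# Hypothetical session: 199 alerts from one user, 1 from another.
session = [{"user.name": "alice", "@timestamp": "2024-01-08T10:00:00"}] * 199
session = session + [{"user.name": "mallory", "@timestamp": "2024-01-06T23:30:00"}]
rare = rare_users(session)
cluster = [session[0], session[-1]]
scores = (user_anomaly(cluster, rare), time_anomaly(cluster))
```

Here the second user accounts for 0.5% of the session and is therefore rare, and one of the two cluster members falls on a Saturday night, so both scores come out at 0.5.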
- Fingerprint each alert by extracting the `cluster_by` fields.
- Group exact matches by the SHA-256 hash of the fingerprint.
- Merge near-similar groups where token-level Jaccard similarity ≥ threshold.
- Score context per cluster across five dimensions.
- Assign verdict — escalate → review → suppressed in priority order.
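The first two steps of this pipeline, fingerprinting and exact grouping, might look like the sketch below. The `|` separator, the handling of missing fields as empty strings, the support for both nested and flat dotted keys, and all function names are assumptions rather than details taken from ClusterIQ's source.

```python
import hashlib
from collections import defaultdict


def get_field(alert, dotted):
    """Resolve 'process.name' as a flat key first, then as nested dictionaries."""
    if dotted in alert:
        return alert[dotted]
    node = alert
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def fingerprint(alert, cluster_by):
    """Join the selected field values into one string; missing fields become ''."""
    values = (get_field(alert, field) for field in cluster_by)
    return "|".join("" if v is None else str(v) for v in values)


def group_exact(alerts, cluster_by):
    """Group alerts whose fingerprints have the same SHA-256 digest."""
    groups = defaultdict(list)
    for alert in alerts:
        raw = fingerprint(alert, cluster_by).encode("utf-8")
        groups[hashlib.sha256(raw).hexdigest()].append(alert)
    return dict(groups)


fields = ["process.name", "event.action", "network.destination.ip"]
alerts = [
    {"process": {"name": "powershell.exe"}, "event": {"action": "connect"}},
    {"process.name": "powershell.exe", "event.action": "connect"},
    {"process": {"name": "cmd.exe"}, "event": {"action": "connect"}},
]
groups = group_exact(alerts, fields)
```

The first two sample alerts carry the same values in different shapes and land in the same group, while the third forms a group of its own.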
| Key | Default | Description |
|---|---|---|
| `port` | `5009` | HTTP port |
| `db_path` | `./clusteriq.db` | SQLite database |
| `clustering.default_threshold` | `0.75` | Similarity threshold |
| `clustering.default_fields` | `[process.name, event.action, network.destination.ip]` | Default cluster-by |
| `clustering.max_alerts` | `50000` | Input cap |
| `clustering.auto_save` | `true` | Persist sessions automatically |
| `integrations.lognorm_url` | `http://127.0.0.1:5006` | LogNorm endpoint |
Add to nebula-dashboard/config.yaml:
tools:
  clusteriq:
    label: "ClusterIQ"
    url: "http://127.0.0.1:5009"
    health_path: "/api/health"
    description: "Contextual alert clustering engine"
    category: "Detection"

This project is licensed under the MIT License — see the LICENSE file for details.
Built by Rootless-Ghost