ClusterIQ — Contextual Alert Clustering Engine

Part of the Nebula Forge security tools suite.

ClusterIQ groups ECS-lite alerts by signal fingerprint, then scores each cluster contextually to determine whether it should be escalated, reviewed, or suppressed as noise. Unlike naive deduplication, ClusterIQ never suppresses a cluster solely because its signals match — context always wins.

Core Differentiator

Alerts with identical signals (same process name, same destination IP, same event action) can receive different verdicts depending on their context, such as a TI tag, a rare user, or off-hours activity:

Cluster A — powershell.exe / C2_IP  →  ESCALATE  (TI indicator present)
Cluster B — powershell.exe / C2_IP  →  REVIEW    (off-hours, rare user)
Cluster C — powershell.exe / C2_IP  →  SUPPRESSED (known user, business hours, no TI)

Features

Signal fingerprinting — configurable cluster-by fields (process.name, event.action, destination IP, etc.)
Fuzzy cluster merging — token-level Jaccard similarity above a configurable threshold
Five context dimensions — TI indicators, critical asset heuristic, user anomaly, time-of-day, hit rate
Context-first verdict engine — TI tags always escalate regardless of similarity score
Exact deduplication — /api/deduplicate with sliding time-window
Session library — persistent SQLite storage, search, pagination, export
Export — JSON and Markdown per session
CLI — offline analysis without the web UI
Integration — accepts ECS-lite directly from LogNorm (port 5006)

Quick Start

cd ClusterIQ
pip install -r requirements.txt
cp config.example.yaml config.yaml   # optional
python app.py

Open http://127.0.0.1:5009.

Usage

Web UI

Paste or upload ECS-lite alerts (JSON array or NDJSON).
Select the fields to cluster by and set the similarity threshold.
Click Cluster Alerts.
Results appear as color-coded cluster cards:
- Red — Escalate
- Yellow — Review
- Grey — Suppressed
Click any card to open the detail modal (Overview, Context Scores, Members).

CLI

# Cluster alerts, print summary
python cli.py --alerts alerts.json

# Custom fields and threshold
python cli.py --alerts alerts.json --fields process.name,event.action --threshold 0.8

# Save output as Markdown
python cli.py --alerts alerts.json --output session.md

# Deduplicate with 5-minute window
python cli.py --dedup --alerts alerts.json --window 300

# Print as JSON
python cli.py --alerts alerts.json --format json

API Reference

Method	Endpoint	Description
GET	`/api/health`	Health check
POST	`/api/cluster`	Cluster alerts
POST	`/api/deduplicate`	Remove exact duplicates
GET	`/api/sessions`	List sessions (paginated)
GET	`/api/session/<id>`	Get a single session
DELETE	`/api/session/<id>`	Delete a session
GET	`/api/session/<id>/export`	Export (JSON or Markdown)

POST /api/cluster

{
  "alerts":               [{...ECS-lite...}],
  "similarity_threshold": 0.75,
  "cluster_by":           ["process.name", "event.action", "network.destination.ip"],
  "label":                "Monday SOC triage",
  "save":                 true
}

Also accepts alerts_json (raw JSON string) and multipart/form-data.

Response:

{
  "success":             true,
  "session_id":          "uuid",
  "clusters":            [...],
  "original_count":      847,
  "cluster_count":       12,
  "suppressed_count":    821,
  "review_count":        18,
  "escalate_count":      8,
  "noise_reduction_pct": 96.8
}
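
As a rough illustration of calling this endpoint from Python, the sketch below builds the request with the standard library only. The helper names `build_cluster_request` and `cluster_alerts` are hypothetical, and it assumes the server fills in defaults for any optional field left out of the body.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5009"  # default port from the Configuration table


def build_cluster_request(alerts, threshold=0.75, cluster_by=None, label=None, save=True):
    """Build (but do not send) a POST /api/cluster request. Hypothetical helper."""
    payload = {"alerts": alerts, "similarity_threshold": threshold, "save": save}
    # Optional fields are only included when set (assumes server-side defaults).
    if cluster_by:
        payload["cluster_by"] = cluster_by
    if label:
        payload["label"] = label
    return urllib.request.Request(
        BASE_URL + "/api/cluster",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def cluster_alerts(alerts, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_cluster_request(alerts, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

With ClusterIQ running locally, `cluster_alerts(alerts, label="Monday SOC triage")` should return a dictionary shaped like the response above.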

POST /api/deduplicate

{"alerts": [...], "window_seconds": 300}

Response: {"success": true, "unique": [...], "removed": 103, "original": 206}
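
One way to picture sliding-window deduplication is the sketch below. It is not the ClusterIQ implementation: it assumes that "exact duplicate" means every field except the timestamp matches, that the timestamp lives in an ISO 8601 `@timestamp` field, and that each repeat restarts the window. The function name `deduplicate` and the `ts_field` parameter are hypothetical.

```python
import json
from datetime import datetime


def deduplicate(alerts, window_seconds=300, ts_field="@timestamp"):
    """Drop alerts whose non-timestamp content repeats within the window."""

    def parse(alert):
        # Accept a trailing "Z" as UTC so fromisoformat can read it.
        return datetime.fromisoformat(alert[ts_field].replace("Z", "+00:00"))

    last_seen = {}  # content key -> timestamp of the most recent occurrence
    unique = []
    for alert in sorted(alerts, key=parse):
        ts = parse(alert)
        body = {k: v for k, v in alert.items() if k != ts_field}
        key = json.dumps(body, sort_keys=True)
        prev = last_seen.get(key)
        if prev is None or (ts - prev).total_seconds() > window_seconds:
            unique.append(alert)
        # Sliding window: every occurrence, kept or dropped, resets the timer.
        last_seen[key] = ts
    return {
        "success": True,
        "unique": unique,
        "removed": len(alerts) - len(unique),
        "original": len(alerts),
    }
```

Under these assumptions a steady stream of identical alerts collapses to a single entry, and the same alert is reported again only after a quiet gap longer than the window.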

Verdict Logic

Condition	Verdict
TI indicator in any member	escalate (always, overrides similarity)
Critical asset + risk ≥ 65%	escalate
Rare/unknown user ≥ 65%	escalate
Off-hours ≥ 30% of members	review
Uncommon user 20–65%	review
Elevated asset risk 30–65%	review
No anomalous context	suppressed

Context overrides signal similarity at every level.
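
The table can be read as an ordered rule list. The sketch below is one possible encoding, not the shipped verdict engine. The function name `assign_verdict` and the `ctx` dictionary are hypothetical. It also assumes that every score is a fraction between 0.0 and 1.0 and that `user_anomaly` is the share of cluster members attributed to rare users.

```python
def assign_verdict(ctx):
    """Map a cluster's context scores to a verdict, in table order."""
    asset_risk = ctx.get("asset_risk", 0.0)
    user_anomaly = ctx.get("user_anomaly", 0.0)
    time_anomaly = ctx.get("time_anomaly", 0.0)

    # Escalate rules are checked first; a TI hit wins outright.
    if ctx.get("ti_tags"):
        return "escalate"
    if ctx.get("has_critical_asset") and asset_risk >= 0.65:
        return "escalate"
    if user_anomaly >= 0.65:
        return "escalate"

    # Review rules are reached only when nothing escalated.
    if time_anomaly >= 0.30:
        return "review"
    if user_anomaly >= 0.20:
        return "review"
    if asset_risk >= 0.30:
        return "review"

    # No anomalous context at all.
    return "suppressed"
```

Because the checks run top to bottom, a cluster that matches both an escalate rule and a review rule is escalated.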

Context Score Dimensions

Dimension	Description
`ti_tags`	Threat-intel indicator patterns in tags, rule.tags, threat.indicator.*
`has_critical_asset`	Hostname heuristic: dc-, exchange, sql, prod-, srv-, backup, etc.
`user_anomaly`	Users appearing in ≤ 1% of session alerts
`asset_risk`	Fraction of cluster members on critical assets
`time_anomaly`	Fraction of members outside Mon–Fri 09:00–17:00
`hit_rate_anomaly`	Cluster size relative to session average
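
To make two of these dimensions concrete, here is a minimal sketch of the hostname heuristic and the off-hours fraction. The helper names are hypothetical. The hint list contains only the patterns named in the table (the real list is longer, per the "etc."), and the code assumes substring matching, ISO 8601 timestamps, and a business day that ends at 17:00 exclusive, evaluated in the timestamp's own time zone.

```python
from datetime import datetime

# Only the patterns listed in the table above; the actual list may be longer.
CRITICAL_HINTS = ("dc-", "exchange", "sql", "prod-", "srv-", "backup")


def is_critical_host(hostname):
    """Case-insensitive substring match against the hint list."""
    name = (hostname or "").lower()
    return any(hint in name for hint in CRITICAL_HINTS)


def is_off_hours(timestamp):
    """True outside Mon-Fri 09:00-17:00, using the timestamp's own clock."""
    ts = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if ts.weekday() >= 5:  # Saturday or Sunday
        return True
    return not (9 <= ts.hour < 17)


def fraction(members, predicate):
    """Share of cluster members for which the predicate holds."""
    if not members:
        return 0.0
    return sum(1 for m in members if predicate(m)) / len(members)
```

A `time_anomaly` value could then be computed as `fraction(members, lambda m: is_off_hours(m["@timestamp"]))`, assuming each member stores its timestamp under a flat `@timestamp` key.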

Clustering Algorithm

Fingerprint each alert by extracting the cluster_by fields.
Group exactly by SHA-256 hash of the fingerprint.
Merge near-similar groups where token-level Jaccard similarity ≥ threshold.
Score context per cluster across five dimensions.
Assign verdict — escalate → review → suppressed in priority order.
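
The first three steps can be sketched as follows. This is an illustration under stated assumptions rather than the code in `core`: the function names are hypothetical, fields are looked up as flat dotted keys first and nested paths second, tokens are lowercase alphanumeric runs, and merging is greedy against the first group in each cluster.

```python
import hashlib
import re
from collections import defaultdict


def get_field(alert, dotted):
    """Look up 'process.name' as a flat key first, then as a nested path."""
    if dotted in alert:
        return alert[dotted]
    node = alert
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return ""
        node = node[part]
    return node


def fingerprint(alert, cluster_by):
    """Join the selected field values into one normalized string."""
    return "|".join(str(get_field(alert, f)).lower() for f in cluster_by)


def tokens(fp):
    """Split a fingerprint into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", fp))


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(alerts, cluster_by, threshold=0.75):
    # Steps 1-2: exact grouping by SHA-256 of the fingerprint.
    groups = defaultdict(list)
    prints = {}
    for alert in alerts:
        fp = fingerprint(alert, cluster_by)
        key = hashlib.sha256(fp.encode("utf-8")).hexdigest()
        groups[key].append(alert)
        prints[key] = fp

    # Step 3: greedy merge of groups whose token sets are similar enough.
    merged = []  # list of (representative token set, members)
    for key, members in groups.items():
        toks = tokens(prints[key])
        for rep_tokens, rep_members in merged:
            if jaccard(rep_tokens, toks) >= threshold:
                rep_members.extend(members)
                break
        else:
            merged.append((toks, list(members)))
    return [members for _, members in merged]
```

Greedy merging keeps the sketch short but makes the result depend on input order. An implementation that needs order-independent clusters would compare all group pairs instead.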

Configuration

Key	Default	Description
`port`	`5009`	HTTP port
`db_path`	`./clusteriq.db`	SQLite database
`clustering.default_threshold`	`0.75`	Similarity threshold
`clustering.default_fields`	`[process.name, event.action, network.destination.ip]`	Default cluster-by
`clustering.max_alerts`	`50000`	Input cap
`clustering.auto_save`	`true`	Persist sessions automatically
`integrations.lognorm_url`	`http://127.0.0.1:5006`	LogNorm endpoint
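
Since config.yaml is optional, any keys it omits presumably fall back to the defaults above. The sketch below shows one way such a fallback could work once the YAML has been parsed into a dictionary. `DEFAULTS` mirrors the table, and `merge_config` is a hypothetical helper, not a function from the ClusterIQ codebase.

```python
import copy

# Defaults copied from the Configuration table.
DEFAULTS = {
    "port": 5009,
    "db_path": "./clusteriq.db",
    "clustering": {
        "default_threshold": 0.75,
        "default_fields": ["process.name", "event.action", "network.destination.ip"],
        "max_alerts": 50000,
        "auto_save": True,
    },
    "integrations": {"lognorm_url": "http://127.0.0.1:5006"},
}


def merge_config(overrides, defaults=DEFAULTS):
    """Return defaults with overrides applied, recursing into nested mappings."""
    merged = copy.deepcopy(defaults)
    for key, value in (overrides or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(value, merged[key])
        else:
            merged[key] = value
    return merged
```

For example, `merge_config({"clustering": {"default_threshold": 0.8}})` changes only the threshold and leaves the other clustering keys at their defaults.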

Nebula Forge Integration

Add to nebula-dashboard/config.yaml:

tools:
  clusteriq:
    label:       "ClusterIQ"
    url:         "http://127.0.0.1:5009"
    health_path: "/api/health"
    description: "Contextual alert clustering engine"
    category:    "Detection"

License

This project is licensed under the MIT License — see the LICENSE file for details.

Built by Rootless-Ghost
