Rogue — AI Agent Evaluator & Red Team Platform

Stress-test your AI agents before attackers do.

Discord Community · Quick Start · Documentation

Two Ways to Harden Your Agent

🎯 Automatic Evaluation

Test your agent against business policies and expected behaviors.

Define scenarios & expected outcomes
Verify compliance with business rules
Watch live conversations as Rogue probes your agent
Get detailed pass/fail reports with reasoning

Best for: Regression testing, behavior validation, policy compliance

🔴 Red Teaming

Simulate adversarial attacks to find security vulnerabilities.

75+ vulnerabilities across 12 security categories
20 attack techniques (encoding, social engineering, injection)
CVSS-based risk scoring
8 compliance frameworks (OWASP, MITRE, NIST, GDPR, EU AI Act)

Best for: Security audits, penetration testing, compliance reporting

Architecture

Rogue uses a client-server architecture with multiple interfaces:

Component	Description
Server	Core evaluation & red team logic
TUI	Modern terminal interface (Go + Bubble Tea)
Web UI	Gradio-based web interface
CLI	Non-interactive mode for CI/CD pipelines

Demo video: rogue-demo.mp4

Supported Protocols

Protocol	Transport	Description
A2A	HTTP	Google's Agent-to-Agent protocol
MCP	SSE, STREAMABLE_HTTP	Model Context Protocol via `send_message` tool

See examples in examples/ for reference implementations.

🔥 Quick Start

Prerequisites

uvx (installed with uv; install uv first)
Python 3.10+
LLM API key (OpenAI, Anthropic, or Google)

Installation

# TUI (recommended)
uvx rogue-ai

# Web UI
uvx rogue-ai ui

# CLI / CI/CD
uvx rogue-ai cli

Try It With the Example Agent

# All-in-one: starts both Rogue and a sample T-shirt store agent
uvx rogue-ai --example=tshirt_store

Configure in the UI:

Agent URL: http://localhost:10001
Mode: Choose Automatic Evaluation or Red Teaming

Running Modes

Mode	Command	Description
Default	`uvx rogue-ai`	Server + TUI
Server	`uvx rogue-ai server`	Backend only
TUI	`uvx rogue-ai tui`	Terminal client
Web UI	`uvx rogue-ai ui`	Gradio interface
CLI	`uvx rogue-ai cli`	Non-interactive (CI/CD)

Server Options

uvx rogue-ai server --host 0.0.0.0 --port 8000 --debug

CLI Options

uvx rogue-ai cli \
  --evaluated-agent-url http://localhost:10001 \
  --judge-llm openai/gpt-4o-mini \
  --business-context-file ./.rogue/business_context.md

Option	Description
`--config-file`	Path to config JSON
`--evaluated-agent-url`	Agent endpoint (required)
`--judge-llm`	LLM for evaluation (required)
`--business-context`	Business context as a string (use `--business-context-file` to pass a file instead)
`--input-scenarios-file`	Scenarios JSON
`--output-report-file`	Report output path
`--deep-test-mode`	Extended testing
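
For CI pipelines, the command above can also be assembled programmatically. The following is a minimal sketch, assuming `uvx` is on the PATH. The helper names `build_cli_command` and `run_scan` are hypothetical, and treating `--deep-test-mode` as a bare switch with no value is an assumption. Only flags listed in the table are used.

```python
import subprocess
from typing import List, Optional


def build_cli_command(
    agent_url: str,
    judge_llm: str,
    business_context_file: Optional[str] = None,
    output_report_file: Optional[str] = None,
    deep_test_mode: bool = False,
) -> List[str]:
    """Assemble a `uvx rogue-ai cli` argument list from the documented flags."""
    cmd = [
        "uvx", "rogue-ai", "cli",
        "--evaluated-agent-url", agent_url,
        "--judge-llm", judge_llm,
    ]
    if business_context_file:
        cmd += ["--business-context-file", business_context_file]
    if output_report_file:
        cmd += ["--output-report-file", output_report_file]
    if deep_test_mode:
        # Assumption: the flag is a bare switch that takes no value.
        cmd.append("--deep-test-mode")
    return cmd


def run_scan(cmd: List[str]) -> int:
    """Run the scan and return the process exit code (defined but not called here)."""
    return subprocess.run(cmd, check=False).returncode


command = build_cli_command(
    "http://localhost:10001",
    "openai/gpt-4o-mini",
    business_context_file="./.rogue/business_context.md",
)
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems. The exit code the CLI returns for a failed evaluation is not stated here, so confirm it before using `run_scan` to gate a build.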

Red Teaming

Scan Types

Type	Vulnerabilities	Attacks	Time
Basic	5 curated	6	~2-3 min
Full	75+	40+	~30-45 min
Custom	User-selected	User-selected	Varies

Compliance Frameworks

OWASP LLM Top 10 — Prompt injection, sensitive data exposure, excessive agency
MITRE ATLAS — Adversarial threat landscape for AI systems
NIST AI RMF — AI risk management framework
ISO/IEC 42001 — AI management system standard
EU AI Act — European AI regulation compliance
GDPR — Data protection requirements
OWASP API Top 10 — API security best practices

Attack Categories

Category	Examples
Encoding	Base64, ROT13, Leetspeak
Social Engineering	Roleplay, trust building
Injection	Prompt injection, SQL injection
Semantic	Goal redirection, context poisoning
Technical	Gray-box probing, permission escalation
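
To make the Encoding category concrete, here is a minimal sketch of how a single probe string could be rewritten in the three encodings named above. It uses only the Python standard library and does not reflect Rogue's internal implementation. The function name `encode_variants` and the leetspeak substitution table are hypothetical.

```python
import base64
import codecs

# Hypothetical substitution table; real leetspeak variants differ.
LEET_MAP = str.maketrans(
    {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}
)


def encode_variants(prompt: str) -> dict:
    """Return the prompt rewritten as Base64, ROT13, and leetspeak."""
    return {
        "base64": base64.b64encode(prompt.encode("utf-8")).decode("ascii"),
        "rot13": codecs.encode(prompt, "rot_13"),
        "leetspeak": prompt.lower().translate(LEET_MAP),
    }


variants = encode_variants("hello")
for name, text in variants.items():
    print(f"{name}: {text}")
```

An encoding attack checks whether the agent follows an instruction that would have been refused in plain text once that instruction is obscured in one of these forms.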

Risk Scoring (CVSS-based)

Each vulnerability receives a 0-10 risk score based on:

Impact — Severity if exploited
Exploitability — Likelihood that an attack succeeds
Human Factor — Manual exploitation potential
Complexity — Attack difficulty
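
The exact formula Rogue uses is not given here. As an illustration only, the sketch below shows one way four factors like these could be combined into a 0-10 score. The weights, the 0-10 input scale, the `RiskFactors` class, and the `risk_score` function are all assumptions made for this example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RiskFactors:
    impact: float          # 0-10, severity if exploited
    exploitability: float  # 0-10, likelihood that an attack succeeds
    human_factor: float    # 0-10, manual exploitation potential
    complexity: float      # 0-10, attack difficulty (higher means harder)


# Hypothetical weights that sum to 1.0; not taken from Rogue.
WEIGHTS = {
    "impact": 0.4,
    "exploitability": 0.3,
    "human_factor": 0.15,
    "complexity": 0.15,
}


def risk_score(factors: RiskFactors) -> float:
    """Weighted average on a 0-10 scale; a harder attack lowers the score."""
    for field in fields(factors):
        value = getattr(factors, field.name)
        if not 0.0 <= value <= 10.0:
            raise ValueError(f"{field.name} must be between 0 and 10")
    ease = 10.0 - factors.complexity  # invert: easy attacks are riskier
    raw = (
        WEIGHTS["impact"] * factors.impact
        + WEIGHTS["exploitability"] * factors.exploitability
        + WEIGHTS["human_factor"] * factors.human_factor
        + WEIGHTS["complexity"] * ease
    )
    return round(min(10.0, max(0.0, raw)), 1)


score = risk_score(RiskFactors(impact=8, exploitability=6, human_factor=4, complexity=2))
print(score)
```

Complexity is inverted before weighting so that an attack that is easy to carry out raises the score, which matches the intuition behind CVSS-style ratings.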

Reproducible Scans

# Fix the random seed so repeated scans produce the same results
uvx rogue-ai cli --random-seed 42

Perfect for regression testing and validating security fixes.
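
The idea behind seeding can be sketched in a few lines: a generator created from the same seed makes the same choices every time. This illustrates the general technique only and is not Rogue's selection logic. The attack names come from the categories table, and `pick_attacks` is a hypothetical helper.

```python
import random
from typing import List

# Names taken from the Attack Categories table; the selection itself is illustrative.
ATTACKS = [
    "Base64",
    "ROT13",
    "Leetspeak",
    "Roleplay",
    "Prompt injection",
    "Goal redirection",
]


def pick_attacks(seed: int, count: int) -> List[str]:
    """Choose `count` attacks using a private generator built from `seed`."""
    rng = random.Random(seed)  # independent of the global random state
    return rng.sample(ATTACKS, count)


first = pick_attacks(42, 3)
second = pick_attacks(42, 3)
print(first == second)
```

Because the selection depends only on the seed, a scan that found a vulnerability can be repeated with the same inputs after a fix is deployed.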

Configuration

Environment Variables

OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="sk-..."
GOOGLE_API_KEY="..."
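
Before starting a scan, it can help to confirm that at least one of these keys is set. The following is a minimal sketch, assuming the keys are exported in the current environment. The helper `configured_providers` is hypothetical and is not part of Rogue.

```python
import os
from typing import List, Mapping

# Provider name mapped to the environment variable listed above.
PROVIDER_KEYS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google": "GOOGLE_API_KEY",
}


def configured_providers(env: Mapping[str, str]) -> List[str]:
    """Return the providers whose API key variable is set and non-blank."""
    return [
        provider
        for provider, variable in PROVIDER_KEYS.items()
        if env.get(variable, "").strip()
    ]


providers = configured_providers(os.environ)
print("Configured providers:", ", ".join(providers) or "none")
```

The check only looks at whether a variable is present; it does not verify that the key is valid.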

Config File (`.rogue/user_config.json`)

{
  "evaluated_agent_url": "http://localhost:10001",
  "judge_llm": "openai/gpt-4o-mini"
}
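
The same file can be generated from a script, for example when provisioning a CI workspace. This is a minimal sketch that writes only the two keys shown above. The helper `write_user_config` is hypothetical, and the example runs in a temporary directory so that no real project file is overwritten.

```python
import json
import tempfile
from pathlib import Path


def write_user_config(project_dir: Path, agent_url: str, judge_llm: str) -> Path:
    """Write .rogue/user_config.json under project_dir and return its path."""
    config_path = project_dir / ".rogue" / "user_config.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config = {"evaluated_agent_url": agent_url, "judge_llm": judge_llm}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config_path


# Demonstrated in a temporary directory to avoid touching a real project.
with tempfile.TemporaryDirectory() as tmp:
    path = write_user_config(
        Path(tmp), "http://localhost:10001", "openai/gpt-4o-mini"
    )
    loaded = json.loads(path.read_text(encoding="utf-8"))
    print(loaded["judge_llm"])
```

Any other keys the config file accepts are not listed here, so the sketch leaves them out.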

Key Features

Feature	Description
🔄 Dynamic Scenarios	Auto-generate tests from business context
👀 Live Monitoring	Watch agent conversations in real-time
📊 Comprehensive Reports	Markdown, CSV, JSON exports
🔍 Multi-Faceted Testing	Policy compliance + security vulnerabilities
🤖 Model Support	OpenAI, Anthropic, Google (via LiteLLM)
🛡️ CVSS Scoring	Industry-standard risk assessment
🔁 Reproducible	Deterministic scans with random seeds

Documentation

Quick Reference — One-page cheat sheet
Red Team Workflow — Technical deep-dive
Implementation Status — Feature breakdown
Attack Mapping — Vulnerability coverage

Contributing

Fork the repository
Create a branch (git checkout -b feature/amazing-feature)
Commit changes (git commit -m 'Add amazing feature')
Push (git push origin feature/amazing-feature)
Open a Pull Request

License

Licensed under a proprietary license — see LICENSE.

Free for personal and internal use. Commercial hosting requires licensing. Contact: admin@qualifire.ai
