qualifire-dev/rogue

Rogue: AI Agent Evaluator & Red Team Platform

Stress-test your AI agents before attackers do.

Discord Community · Quick Start · Documentation


Two Ways to Harden Your Agent

🎯 Automatic Evaluation

Test your agent against business policies and expected behaviors.

  • Define scenarios & expected outcomes
  • Verify compliance with business rules
  • Watch live conversations as Rogue probes your agent
  • Get detailed pass/fail reports with reasoning

Best for: Regression testing, behavior validation, policy compliance

🔴 Red Teaming

Simulate adversarial attacks to find security vulnerabilities.

  • 75+ vulnerabilities across 12 security categories
  • 20 attack techniques (encoding, social engineering, injection)
  • CVSS-based risk scoring
  • 8 compliance frameworks (OWASP, MITRE, NIST, GDPR, EU AI Act)

Best for: Security audits, penetration testing, compliance reporting


Architecture

Rogue operates on a client-server architecture with multiple interfaces:

| Component | Description |
| --- | --- |
| Server | Core evaluation & red team logic |
| TUI | Modern terminal interface (Go + Bubble Tea) |
| Web UI | Gradio-based web interface |
| CLI | Non-interactive mode for CI/CD pipelines |

Supported Protocols

| Protocol | Transport | Description |
| --- | --- | --- |
| A2A | HTTP | Google's Agent-to-Agent protocol |
| MCP | SSE, STREAMABLE_HTTP | Model Context Protocol via send_message tool |

See examples in examples/ for reference implementations.


🔥 Quick Start

Prerequisites

  • uvx (ships with uv; install uv first)
  • Python 3.10+
  • LLM API key (OpenAI, Anthropic, or Google)

Installation

# TUI (recommended)
uvx rogue-ai

# Web UI
uvx rogue-ai ui

# CLI / CI/CD
uvx rogue-ai cli

Try It With the Example Agent

# All-in-one: starts both Rogue and a sample T-shirt store agent
uvx rogue-ai --example=tshirt_store

Configure in the UI:

  • Agent URL: http://localhost:10001
  • Mode: Choose Automatic Evaluation or Red Teaming

Running Modes

| Mode | Command | Description |
| --- | --- | --- |
| Default | uvx rogue-ai | Server + TUI |
| Server | uvx rogue-ai server | Backend only |
| TUI | uvx rogue-ai tui | Terminal client |
| Web UI | uvx rogue-ai ui | Gradio interface |
| CLI | uvx rogue-ai cli | Non-interactive (CI/CD) |

Server Options

uvx rogue-ai server --host 0.0.0.0 --port 8000 --debug

CLI Options

uvx rogue-ai cli \
  --evaluated-agent-url http://localhost:10001 \
  --judge-llm openai/gpt-4o-mini \
  --business-context-file ./.rogue/business_context.md

| Option | Description |
| --- | --- |
| --config-file | Path to config JSON |
| --evaluated-agent-url | Agent endpoint (required) |
| --judge-llm | LLM for evaluation (required) |
| --business-context | Business context as a string (use --business-context-file to read it from a file instead) |
| --input-scenarios-file | Scenarios JSON |
| --output-report-file | Report output path |
| --deep-test-mode | Extended testing |
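
For CI pipelines it can help to assemble the CLI invocation programmatically. The sketch below only builds the argument list from the options listed above; `build_rogue_cli_command` is a hypothetical helper, not part of Rogue, and it assumes `--deep-test-mode` is a bare switch that takes no value.

```python
import shlex
from typing import List, Optional


def build_rogue_cli_command(
    agent_url: str,
    judge_llm: str,
    business_context_file: Optional[str] = None,
    output_report_file: Optional[str] = None,
    deep_test_mode: bool = False,
) -> List[str]:
    """Assemble the argument list for a non-interactive Rogue run."""
    command = [
        "uvx", "rogue-ai", "cli",
        "--evaluated-agent-url", agent_url,  # required
        "--judge-llm", judge_llm,            # required
    ]
    if business_context_file:
        command += ["--business-context-file", business_context_file]
    if output_report_file:
        command += ["--output-report-file", output_report_file]
    if deep_test_mode:
        # Assumption: the flag is a switch without an argument.
        command.append("--deep-test-mode")
    return command


if __name__ == "__main__":
    cmd = build_rogue_cli_command(
        "http://localhost:10001",
        "openai/gpt-4o-mini",
        business_context_file="./.rogue/business_context.md",
    )
    # Print a copy-pasteable shell command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the command safe to hand to `subprocess.run` without shell quoting issues.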

Red Teaming

Scan Types

| Type | Vulnerabilities | Attacks | Time |
| --- | --- | --- | --- |
| Basic | 5 curated | 6 | ~2-3 min |
| Full | 75+ | 40+ | ~30-45 min |
| Custom | User-selected | User-selected | Varies |

Compliance Frameworks

  • OWASP LLM Top 10: Prompt injection, sensitive data exposure, excessive agency
  • MITRE ATLAS: Adversarial threat landscape for AI systems
  • NIST AI RMF: AI risk management framework
  • ISO/IEC 42001: AI management system standard
  • EU AI Act: European AI regulation compliance
  • GDPR: Data protection requirements
  • OWASP API Top 10: API security best practices

Attack Categories

| Category | Examples |
| --- | --- |
| Encoding | Base64, ROT13, Leetspeak |
| Social Engineering | Roleplay, trust building |
| Injection | Prompt injection, SQL injection |
| Semantic | Goal redirection, context poisoning |
| Technical | Gray-box probing, permission escalation |
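
To make the Encoding row concrete, here is a minimal sketch of how a single probe string could be rewritten with the three listed encodings. It uses only the Python standard library; `encode_variants` and the `LEET_MAP` substitution table are illustrative assumptions, not Rogue's actual attack implementation.

```python
import base64
import codecs

# Hypothetical leetspeak substitution table; real tools may use other mappings.
LEET_MAP = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"})


def encode_variants(prompt: str) -> dict:
    """Return the prompt rewritten with each encoding technique."""
    return {
        "base64": base64.b64encode(prompt.encode("utf-8")).decode("ascii"),
        "rot13": codecs.encode(prompt, "rot_13"),
        "leetspeak": prompt.lower().translate(LEET_MAP),
    }


if __name__ == "__main__":
    for name, text in encode_variants("hello").items():
        print(f"{name}: {text}")
```

An encoding attack sends such a variant to the agent and checks whether a filter that blocks the plain-text request still holds once the request is obfuscated.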

Risk Scoring (CVSS-based)

Each vulnerability receives a 0-10 risk score based on:

  • Impact: Severity if exploited
  • Exploitability: Success rate likelihood
  • Human Factor: Manual exploitation potential
  • Complexity: Attack difficulty
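
The exact formula is not given here, so the following is only an illustrative sketch of how four such factors could be folded into a single 0-10 score. The weights, the 0-10 input scale, and the names `RiskFactors` and `risk_score` are all assumptions made for this example; they are not Rogue's scoring method and not the official CVSS equations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskFactors:
    impact: float          # 0-10, severity if exploited
    exploitability: float  # 0-10, likelihood that the attack succeeds
    human_factor: float    # 0-10, manual exploitation potential
    complexity: float      # 0-10, attack difficulty (higher means harder)


# Hypothetical weights chosen for illustration; they sum to 1.0.
WEIGHTS = {"impact": 0.4, "exploitability": 0.3, "human_factor": 0.1, "complexity": 0.2}


def risk_score(factors: RiskFactors) -> float:
    """Combine the factors into a 0-10 score; easier attacks score higher."""
    raw = (
        WEIGHTS["impact"] * factors.impact
        + WEIGHTS["exploitability"] * factors.exploitability
        + WEIGHTS["human_factor"] * factors.human_factor
        + WEIGHTS["complexity"] * (10 - factors.complexity)  # invert difficulty
    )
    return round(min(10.0, max(0.0, raw)), 1)


if __name__ == "__main__":
    print(risk_score(RiskFactors(impact=9, exploitability=7, human_factor=5, complexity=3)))
```

Complexity is inverted because a harder attack should lower the overall risk, while the other three factors raise it.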

Reproducible Scans

# Use random seeds for reproducible results
uvx rogue-ai cli --random-seed 42

Perfect for regression testing and validating security fixes.
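
Why a fixed seed helps can be shown with a small, generic sketch. It assumes (as an illustration only, not a description of Rogue internals) that a scan randomly picks a subset of attacks; `pick_attacks` is a hypothetical helper.

```python
import random
from typing import List, Sequence


def pick_attacks(attacks: Sequence[str], count: int, seed: int) -> List[str]:
    """Select `count` attacks using a generator seeded for repeatability."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    return rng.sample(list(attacks), count)


if __name__ == "__main__":
    catalog = ["Base64", "ROT13", "Leetspeak", "Roleplay", "Prompt injection"]
    first = pick_attacks(catalog, 3, seed=42)
    second = pick_attacks(catalog, 3, seed=42)
    print(first == second)  # same seed, same selection
```

With the same seed, a scan run before a fix and a scan run after it exercise the same selection, so a change in results can be attributed to the fix rather than to chance.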


Configuration

Environment Variables

OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="sk-..."
GOOGLE_API_KEY="..."
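
Before starting a run it can be useful to confirm that at least one of these keys is set. The sketch below is a hypothetical pre-flight check (`configured_providers` is not part of Rogue) that only looks at the three variable names listed above.

```python
import os
from typing import List, Mapping

# Provider name mapped to the environment variable listed above.
PROVIDER_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google": "GOOGLE_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the providers whose API key variable is set and non-empty."""
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    found = configured_providers()
    if found:
        print("API keys found for:", ", ".join(found))
    else:
        print("No LLM API key set; export one of:", ", ".join(PROVIDER_ENV_VARS.values()))
```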

Config File (.rogue/user_config.json)

{
  "evaluated_agent_url": "http://localhost:10001",
  "judge_llm": "openai/gpt-4o-mini"
}
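
If the config file needs to be generated, for example in a CI setup step, a minimal sketch could look like this. `write_user_config` is a hypothetical helper; it writes only the two keys shown above and assumes the file lives under `.rogue/` in the project directory.

```python
import json
import tempfile
from pathlib import Path


def write_user_config(project_dir: str, agent_url: str, judge_llm: str) -> Path:
    """Write .rogue/user_config.json under project_dir and return its path."""
    config_path = Path(project_dir) / ".rogue" / "user_config.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config = {"evaluated_agent_url": agent_url, "judge_llm": judge_llm}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config_path


if __name__ == "__main__":
    # Demo in a throwaway directory so no real project is modified.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_user_config(tmp, "http://localhost:10001", "openai/gpt-4o-mini")
        print(path.read_text(encoding="utf-8"))
```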

Key Features

| Feature | Description |
| --- | --- |
| 🔄 Dynamic Scenarios | Auto-generate tests from business context |
| 👀 Live Monitoring | Watch agent conversations in real-time |
| 📊 Comprehensive Reports | Markdown, CSV, JSON exports |
| 🔍 Multi-Faceted Testing | Policy compliance + security vulnerabilities |
| 🤖 Model Support | OpenAI, Anthropic, Google (via LiteLLM) |
| 🛡️ CVSS Scoring | Industry-standard risk assessment |
| 🔁 Reproducible | Deterministic scans with random seeds |

Documentation


Contributing

  1. Fork the repository
  2. Create a branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

Licensed under a proprietary license; see LICENSE.

Free for personal and internal use. Commercial hosting requires licensing. Contact: admin@qualifire.ai