Skip to content

GhostIntruder/AgentRedTeam

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AgentRedTeam

An Agentic AI Red Teaming Framework for OWASP LLM Top 10 Vulnerabilities

"You cannot govern what you cannot test. You cannot test what you do not understand."

Overview

AgentRedTeam is an open-source red teaming framework designed to systematically surface security vulnerabilities in agentic AI systems: systems that take autonomous actions, chain tools, and operate with reduced human oversight. Most existing red teaming tools were built for static LLM deployments: a user sends a prompt, a model responds, done. Agentic systems are fundamentally different. They plan. They call external tools. They execute multi-step tasks. They sometimes act before a human can intervene. This changes the attack surface entirely. AgentRedTeam targets vulnerabilities defined in the OWASP LLM Top 10 with a specific focus on how those vulnerabilities behave and escalate in agentic pipelines built on frameworks like LangChain and hosted on cloud infrastructure like AWS Bedrock.

Why This Matters for Governance

This project is not just a security tool. It is a research instrument. Current AI governance frameworks, including the EU AI Act, the NIST AI Risk Management Framework, and OWASP's own guidance, were largely designed with static LLM deployments in mind. As agentic AI systems move into production across critical sectors, the governance gap widens. Systematic red teaming is one of the few empirical methods available to make that gap visible, nameable, and actionable for policymakers. AgentRedTeam is built with that dual purpose: to test systems, and to generate evidence that governance frameworks can use.

Target Vulnerabilities (OWASP LLM Top 10: Agentic Focus) #VulnerabilityAgentic Risk EscalationLLM01Prompt InjectionInjected instructions can hijack tool calls and multi-step plansLLM02Insecure Output HandlingUnvalidated outputs passed between agents create cascading failuresLLM06Sensitive Information DisclosureAgentic memory and retrieval surfaces expose data across sessionsLLM08Excessive AgencyAgents granted broad permissions can take irreversible real-world actionsLLM09OverrelianceSystems with reduced human-in-the-loop create blind spots for oversight

Architecture agentredteam/ core/

prompt_injection/ — LLM01 test modules excessive_agency/ — LLM08 test modules output_handling/ — LLM02 test modules sensitive_disclosure/ — LLM06 test modules

agents/

langchain_agent.py — Test target: LangChain-based agentic setup bedrock_agent.py — Test target: AWS Bedrock agent

reports/

report_generator.py — Structured output for findings

governance/

framework_mapper.py — Maps findings to governance framework gaps

tests/ requirements.txt README.md

Current Status ModuleStatusPrompt Injection (LLM01) In developmentExcessive Agency (LLM08) In developmentOutput Handling (LLM02) PlannedSensitive Disclosure (LLM06) PlannedGovernance Framework Mapper PlannedReport Generator Planned This project is under active development as part of ongoing AI safety research. Contributions and feedback welcome.

Tech Stack

Language: Python 3.11+ Agentic Framework: LangChain Cloud AI: AWS Bedrock (Claude, Titan) Testing: pytest Reporting: Markdown and JSON structured output

Roadmap

Complete prompt injection test suite for LangChain agents Complete excessive agency test suite Build governance framework mapper (EU AI Act, NIST AI RMF, OWASP) Add AWS Bedrock agent test targets Publish findings as a research note and preprint Add CI/CD pipeline for automated test runs

Motivation and Background This framework is being developed as part of a broader research agenda at the intersection of offensive AI security and AI governance. The central argument driving this work: governance frameworks cannot keep pace with agentic AI deployment if they lack empirical grounding in how these systems actually fail. Red teaming is not just a security practice. It is a policy research method. This project is particularly interested in how governance gaps affect Global South contexts, where agentic AI systems are being deployed with even less regulatory oversight and fewer institutional safeguards.

About the Author I'm Cynthia, a cybersecurity professional transitioning into AI safety research. My background is a mix of things that do not always go together: a Political Science degree with a foreign policy focus, five years leading a civil society organization, and the last few years working in offensive security. I built AgentRedTeam because I kept seeing the same gap: people writing AI governance policy without testing how these systems actually break, and security people testing without thinking about what the findings mean for policy. This project is my attempt to sit in that middle space. I am particularly interested in what AI governance gets wrong for Global South contexts, and I write and research at that intersection. LinkedIn: https://www.linkedin.com/in/jatto-cynthia? Email: jattohephzibah@gmail.com

Contributing This is an open research project. If you work in AI security, AI safety, or AI governance and want to collaborate, open an issue or reach out directly.

License MIT License. See LICENSE for details.

AgentRedTeam is built for research and learning. Please use it responsibly and only on systems you have explicit permission to test.

About

An agentic AI red teaming framework for OWASP LLM Top 10 vulnerabilities

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages