llm-jailbreaks

Here are 13 public repositories matching this topic...

msoedov / agentic_security

Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪

agent-framework ai-red-team prompt-testing llm-security llm-vulnerabilities llm-evaluation llm-fuzzing llm-evaluation-framework llm-guardrails llm-scanner llm-jailbreaks llm-fuzzer llm-fuzzer-aggregator agent-security

Updated Dec 24, 2025
Python

CryptoAILab / JailbreakEval

Star

[NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts.

llm-safety llm-jailbreaks

Updated Apr 1, 2025
Python

whitecircle-ai / circle-guard-bench

Star

First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and safeguards)

benchmarking benchmark ai jailbreak safeguard guardrail guardrails large-language-models llm large-language-model llm-security llm-eval llm-evaluation llm-as-a-judge llm-jailbreaks

Updated Dec 3, 2025
Python

BirdsAreFlyingCameras / GPT-5_Jailbreak_PoC

Star

A working POC of a GPT-5 jailbreak via PROMISQROUTE (Prompt-based Router Open-Mode Manipulation) with a barebones C2 server & agent generation demo.

proof-of-concept jailbreak malware poc working chatgpt gpt-5 chatgpt-jailbreak gpt5 malware-generation llm-jailbreaks gpt5-jailbreak gpt5-jailbreak-working llm-malware-generation

Updated Sep 21, 2025
C

TrustAI-laboratory / LMAP

Star

LMAP (large language model mapper) is like NMAP for LLM, is an LLM Vulnerability Scanner and Zero-day Vulnerability Fuzzer.

ai security-scanner vulnerability-scanner infosectools llm llms ai-red-team llm-security llm-vulnerabilities llm-fuzzing llm-guardrails owasp-llm-top-10 llm-scanner llm-jailbreaks llm-fuzzer llm-fuzzer-aggregator

Updated Oct 16, 2024

UCSB-NLP-Chang / SemanticSmooth

Star

Implementation of paper 'Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing'

large-language-models llm-jailbreaks

Updated Jun 9, 2024
Python

yiksiu-chan / SpeakEasy

Star

[ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions

machine-learning ai-safety large-language-models llm-jailbreaks

Updated Sep 27, 2025
Python

DifABD / Agent-Jailbreaking-Agents

Star

🔍 Investigate LLM agent jailbreaking using a dual-agent framework to analyze persuasive strategies and model resistance in a controlled environment.

sqlalchemy feedback jailbreak malware loop poc gpt messages lineage working ai-agents adversarial-attacks chatgpt llama-cpp chatgpt-jailbreak gpt5 llm-jailbreaks llm-malware-generation

Updated Jan 5, 2026
Python

SandyyyZheng / JailbreakSystem

Star

JailbreakSystem--2025 Graduate Design for HFUT

python vue graduate-project llm-jailbreaks

Updated May 12, 2025
Vue

RafaelParonis / jailbench

Star

🔍 Benchmark jailbreak resilience in LLMs with JailBench for clear insights and improved model defenses against jailbreak attempts.

python flask analytics openai alignment model-evaluation ai-safety security-testing red-teaming model-robustness anthropic litellm content-safety llm-jailbreaks tool-calling llm-benchmark ai-evals textual-tui

Updated Jan 5, 2026
Python

4n4s4zi / llm-jailbreaking

Star

Chain-of-thought hijacking via template token injection for LLM censorship bypass (GPT-OSS)

llm-jailbreaks gpt-oss

Updated Sep 27, 2025
Python

vibheksoni / jailbench

Star

Benchmark LLM jailbreak resilience across providers with standardized tests, adversarial mode, rich analytics, and a clean Web UI.

Updated Aug 12, 2025
Python

1lmao / TAP-Tree-of-Attacks-with-Pruning

Star

Debugged version for Tree of Attacks: Jailbreaking Black-Box LLMs Automatically paper and added GPU optimization.

llm-jailbreaks

Updated Nov 24, 2025
Python

Improve this page

Add a description, image, and links to the llm-jailbreaks topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the llm-jailbreaks topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

llm-jailbreaks

Here are 13 public repositories matching this topic...

msoedov / agentic_security

CryptoAILab / JailbreakEval

whitecircle-ai / circle-guard-bench

BirdsAreFlyingCameras / GPT-5_Jailbreak_PoC

TrustAI-laboratory / LMAP

UCSB-NLP-Chang / SemanticSmooth

yiksiu-chan / SpeakEasy

DifABD / Agent-Jailbreaking-Agents

SandyyyZheng / JailbreakSystem

RafaelParonis / jailbench

4n4s4zi / llm-jailbreaking

vibheksoni / jailbench

1lmao / TAP-Tree-of-Attacks-with-Pruning

Improve this page

Add this topic to your repo