A defensive security platform built for the Apart Research Defensive Acceleration Hackathon. AI Mesh enables rapid adversarial testing of untrusted AI models using real-time threat intelligence and AI-powered risk assessment.
AI Mesh addresses the critical need to quickly evaluate the security posture of untrusted AI models before deployment. The system:
- Collects real-time threat intelligence from security communities (Reddit r/ChatGPTJailbreak, HackerNews)
- Generates adversarial test prompts based on current attack techniques
- Tests untrusted models with these prompts
- Uses AI-powered judges to assess response safety
- Provides comprehensive risk reports in 10-20 seconds
Quick Model Testing
- Test any untrusted AI model in 10-20 seconds
- Register models via OpenAI-compatible or custom API endpoints
- Automatic adversarial prompt generation from latest threats
- AI judge evaluates responses for safety violations
Real-Time Threat Intelligence
- Live scanning of r/ChatGPTJailbreak (5 min intervals)
- HackerNews security monitoring (10 min intervals)
- Automatic threat enrichment via Groq AI
- Pattern detection for emerging attack vectors
Risk Assessment
- Multi-layer analysis (heuristic + AI-powered)
- Risk scoring: SAFE, LOW, MEDIUM, HIGH, CRITICAL
- Detailed vulnerability breakdown per test prompt
- Actionable security recommendations
Core Components:
ModelTester- Orchestrates 3-step testing workflow (generate prompts, test model, assess responses)ThreatIntelligence- Collects and enriches security threats from external sourcesGroqService- Powers AI judge and threat analysis using llama-3.1-8b-instantSecurityState- Central state management and event trackingUntrustedModelRegistry- Manages registered models and test results
Technology Stack:
- Next.js 16 with TypeScript
- Groq AI (llama-3.1-8b-instant)
- Tailwind CSS
- Server-Sent Events for real-time updates
# Clone repository
git clone https://github.com/ticsture/ai-mesh.git
cd ai-mesh
# Install dependencies
npm install
# Configure Groq API key
echo "GROQ_API_KEY=your_groq_api_key" > .env.local
# Build and run
npm run build
npm run devAccess at http://localhost:3000/models-test
Testing an Untrusted Model:
- Navigate to
/models-test - Register model:
- Name: Your identifier
- Endpoint: Model's API URL
- Provider: openai-compatible, custom, anthropic, or groq
- API Key (optional): Authentication token
- Click "Quick Test (10-20s)"
- Review risk assessment and detailed findings
Test Workflow:
1. Fetch top 5 high-severity threats (instant)
2. Generate 15 adversarial prompts via Groq (3-5 seconds)
3. Test untrusted model with prompts (5-10 seconds)
4. AI judge evaluates responses (3-5 seconds)
5. Display comprehensive risk report
# Test model
POST /api/test-model
Body: { modelId: string }
# Register model
POST /api/models/register
Body: { name, endpoint, provider, model?, apiKey? }
# List models
GET /api/models/listsrc/
├── lib/adaptive-security/
│ ├── ModelTester.ts # Core testing workflow
│ ├── ThreatIntelligence.ts # Threat collection
│ ├── SecurityState.ts # State management
│ └── UntrustedModelRegistry.ts
├── app/
│ ├── api/
│ │ ├── test-model/route.ts
│ │ └── models/
│ └── models-test/page.tsx # Testing UI
└── services/
└── GroqService.ts # AI integration
Minimal AI usage by design:
- No automatic probing (only on-demand testing)
- Reduced threat scanning (2 sources, 5-10 min intervals)
- Efficient batch processing of test prompts
- Single AI judge call per test session
Environment variables:
GROQ_API_KEY=your_groq_api_keyThis platform is designed for security researchers and organizations that need to:
- Quickly vet untrusted AI models before integration
- Identify vulnerability to current jailbreak techniques
- Make informed decisions about model deployment
- Maintain up-to-date threat awareness
The focus is speed and actionability - getting from "unknown model" to "comprehensive risk assessment" in under 30 seconds.
Built for the Apart Research Defensive Acceleration Hackathon to demonstrate practical AI safety tooling that bridges the gap between threat intelligence and model evaluation.