Behavioral smoke tests for deployed AI agents: a canary in the coal mine for your AI endpoints.

AgentCanary runs scheduled behavioral probes against your AI agent endpoints. Every 15 minutes it sends probe questions to your deployed agents, compares the responses to expected keywords, and alerts you when behavior drifts, ideally before your users notice.

    ┌─────────────┐   every 15 min    ┌───────────────────┐
    │   pg_cron   │ ────────────────► │   probe-runner    │
    │ (scheduler) │                   │  (Deno Edge Fn)   │
    └─────────────┘                   └─────────┬─────────┘
                                                │ POST probe questions
                                      ┌─────────▼─────────┐
                                      │   Your AI Agent   │
                                      │   endpoint_url    │
                                      └─────────┬─────────┘
                                                │ response
                                      ┌─────────▼─────────┐
                                      │   keyword check   │
                                      │  pass / drift /   │
                                      │       error       │
                                      └─────────┬─────────┘
                                                │
                                      ┌─────────▼─────────┐
                                      │    Supabase DB    │
                                      │   probe_runs +    │
                                      │   alerts table    │
                                      └───────────────────┘

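The keyword check step can be sketched roughly as follows. This is a minimal illustration, not the actual `probe-runner` code: the `classifyResponse` name, the `ProbeStatus` type, and the rule that a single case-insensitive keyword match counts as a pass are all assumptions.

```typescript
// Hypothetical sketch of the keyword check; not taken from probe-runner/index.ts.
type ProbeStatus = "pass" | "drift" | "error";

// `response` is the agent's reply text, or null when the request failed.
// Assumption: a reply passes if it contains at least one baseline keyword,
// compared case-insensitively.
function classifyResponse(
  response: string | null,
  baselineKeywords: string[],
): ProbeStatus {
  if (response === null) return "error";
  const haystack = response.toLowerCase();
  const matched = baselineKeywords.some((kw) =>
    haystack.includes(kw.toLowerCase())
  );
  return matched ? "pass" : "drift";
}

console.log(classifyResponse("2 + 2 equals Four.", ["4", "four"])); // pass
console.log(classifyResponse("I cannot help with that.", ["4", "four"])); // drift
console.log(classifyResponse(null, ["4", "four"])); // error
```

Requiring every keyword to match would be a stricter alternative; which rule the real check uses is not stated here.
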
Silent AI agent failure is a real production problem:
- 52% accuracy decline observed over 4 months in a published study of deployed LLMs
- A fintech team lost 12% conversion before detecting drift in their chat agent
- Standard uptime monitors (200 OK) miss behavioral regressions entirely

AgentCanary closes that gap.

- Scheduled probing: pg_cron fires a Supabase Edge Function every 15 minutes
- Keyword baseline matching: define expected keywords per probe question
- Drift detection: flags when responses stop containing expected patterns
- Live dashboard: single HTML file, deployable to GitHub Pages
- Webhook alerts: optional outbound webhook on drift/error
- Pass rate tracking: per-canary health metrics over time

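As a rough illustration of per-canary pass rate tracking, the sketch below groups probe results by canary and computes the fraction that passed. The `ProbeRun` shape and the `passRates` helper are hypothetical; the real `probe_runs` columns and the dashboard's calculation may differ.

```typescript
// Hypothetical shape of a probe result; the real probe_runs columns may differ.
interface ProbeRun {
  canaryId: string;
  status: "pass" | "drift" | "error";
}

// Returns the fraction of passing runs per canary, in the range 0..1.
function passRates(runs: ProbeRun[]): Record<string, number> {
  const totals: Record<string, { pass: number; all: number }> = {};
  for (const run of runs) {
    const t = totals[run.canaryId] ?? { pass: 0, all: 0 };
    t.all += 1;
    if (run.status === "pass") t.pass += 1;
    totals[run.canaryId] = t;
  }
  const rates: Record<string, number> = {};
  for (const id of Object.keys(totals)) {
    rates[id] = totals[id].pass / totals[id].all;
  }
  return rates;
}

const rates = passRates([
  { canaryId: "support-bot", status: "pass" },
  { canaryId: "support-bot", status: "drift" },
  { canaryId: "support-bot", status: "pass" },
  { canaryId: "support-bot", status: "error" },
  { canaryId: "faq-bot", status: "pass" },
]);
console.log(rates["support-bot"]); // 0.5
console.log(rates["faq-bot"]); // 1
```
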
Project layout:

    agent-canary/
    ├── index.html                          # Single-file dashboard (Tailwind + Supabase JS CDN)
    ├── supabase/
    │   ├── migrations/
    │   │   └── 20260612_initial_schema.sql # Full DB schema
    │   └── functions/
    │       └── probe-runner/
    │           └── index.ts                # Deno Edge Function
    └── README.md

Set up the project with the Supabase CLI or from the Supabase dashboard at supabase.com. Run `supabase/migrations/20260612_initial_schema.sql` in the SQL editor.

Deploy the Edge Function:

    supabase functions deploy probe-runner --no-verify-jwt

Add a canary and at least one probe question, replacing `<canary-id>` with the id of the canary row you just created:

    INSERT INTO canaries (name, endpoint_url) VALUES (
      'My GPT-4 Agent',
      'https://api.openai.com/v1/chat/completions'
    );

    INSERT INTO probe_questions (canary_id, question, baseline_keywords)
    VALUES (
      '<canary-id>',
      'What is 2+2?',
      ARRAY['4', 'four']
    );

Update the Supabase URL and anon key in `index.html`, then open it in a browser or deploy it to GitHub Pages.


| Table | Purpose |
|---|---|
| `canaries` | Agent endpoints to monitor |
| `probe_questions` | Questions and expected keywords per canary |
| `probe_runs` | Every probe result (pass/drift/error) |
| `alerts` | Drift/error events with optional webhook |
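
To make the relationship between `probe_runs` and `alerts` concrete, here is a small sketch of how a non-passing probe result might turn into an alert record. The row shapes, field names, and the `alertFor` helper are hypothetical and only illustrate the idea that drift and error results raise alerts while passes do not.

```typescript
// Hypothetical row shapes; the real column names are defined in the migration.
type ProbeStatus = "pass" | "drift" | "error";

interface ProbeRunRow {
  canary_id: string;
  question: string;
  status: ProbeStatus;
}

interface AlertRow {
  canary_id: string;
  kind: "drift" | "error";
  message: string;
}

// Assumption: only drift and error results produce an alert; a pass yields null.
function alertFor(run: ProbeRunRow): AlertRow | null {
  if (run.status === "pass") return null;
  return {
    canary_id: run.canary_id,
    kind: run.status,
    message: `Probe "${run.question}" returned ${run.status}`,
  };
}

const alert = alertFor({
  canary_id: "<canary-id>",
  question: "What is 2+2?",
  status: "drift",
});
console.log(alert?.message); // Probe "What is 2+2?" returned drift
```
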

RLASAF12 · Part of the ABC-TOM builder system.