Companion code for "Agents assemble. One agent is a hire. Many agents are a workforce.", The Pragmatic Architect newsletter edition.

A reference implementation of the six canonical multi-agent patterns, applied to autonomous incident response on a SaaS platform.
Incident response is the highest-leverage agentic workload in tech right now. It is short-lived and tool-heavy, requires coordination across many specialists, and has a non-negotiable human-in-the-loop gate before anything writes to production. If your multi-agent framework can handle this, it can handle the rest.

| File | Pattern | Role |
|---|---|---|
| `Program.cs` | — | Entry point that wires the full pipeline |
| `Agents/TriageAgent.cs` | Sequential | Classifies the alert, sets severity |
| `Agents/LogAgent.cs` | Concurrent | Reads logs (Loki/CloudWatch tool) |
| `Agents/MetricsAgent.cs` | Concurrent | Reads Prometheus metrics |
| `Agents/KbSearchAgent.cs` | Concurrent | Searches runbooks + past incidents |
| `Agents/DiagnosticAgent.cs` | Group Chat | Argues for a root cause |
| `Agents/KnowledgeAgent.cs` | Group Chat | Counter-argues from prior art |
| `Agents/LeadAgent.cs` | Group Chat | Decides when debate is over |
| `Agents/RemediationAgents.cs` | Handoff | DB / Network / App specialists |
| `Agents/CommsAgent.cs` | Sequential (final) | Drafts status-page update |
| `Orchestration/IncidentOrchestrator.cs` | Magnetic / Hierarchical | Top-level orchestrator |
| `Plugins/ObservabilityPlugin.cs` | — | Mock tools for logs/metrics/traces |
| `Plugins/RemediationPlugin.cs` | — | Mock tools for kubectl/db ops |
| `HumanGate/ApprovalGate.cs` | — | Human-in-the-loop approval |
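
The three Concurrent entries above run side by side and rejoin before the debate starts. A minimal sketch of that fan-out and fan-in, assuming each investigator can be modelled as an async function from an alert name to a findings string. The function names and canned return values are hypothetical stand-ins, not the actual agent classes in this repo.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-ins for LogAgent, MetricsAgent and KbSearchAgent.
// A real agent would call a model and its tools; these return canned text.
async Task<string> ReadLogsAsync(string alert)
{
    await Task.Delay(50); // simulated tool latency
    return $"logs: placeholder findings for '{alert}'";
}

async Task<string> ReadMetricsAsync(string alert)
{
    await Task.Delay(50);
    return $"metrics: placeholder findings for '{alert}'";
}

async Task<string> SearchKbAsync(string alert)
{
    await Task.Delay(50);
    return $"kb: placeholder findings for '{alert}'";
}

async Task<string[]> InvestigateAsync(string alert)
{
    // Fan out: start all three investigations before awaiting any of them.
    Task<string>[] running =
    {
        ReadLogsAsync(alert),
        ReadMetricsAsync(alert),
        SearchKbAsync(alert),
    };

    // Fan in: WhenAll yields results in the same order as the input tasks.
    return await Task.WhenAll(running);
}

string[] findings = await InvestigateAsync("db-cpu-spike");
foreach (string finding in findings)
    Console.WriteLine(finding);
```

Because the tasks are started before the first `await`, total latency is roughly that of the slowest investigator and not the sum of all three.
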
- Sequential — `Triage → Investigate → Debate → Remediate → Announce`
- Concurrent — `LogAgent ‖ MetricsAgent ‖ KbSearchAgent`
- Group Chat — `DiagnosticAgent ⇌ KnowledgeAgent` refereed by `LeadAgent`
- Handoff — Remediation routes to DB / Network / App specialist by symptom
- Magnetic / Orchestrator-Worker — `IncidentOrchestrator` re-plans on tool failure
- Hierarchical — Top orchestrator owns a team-of-teams; sub-managers expose summaries
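
The Handoff pattern in the list above comes down to a routing decision. A minimal sketch, assuming the root cause arrives as free text and that simple keyword rules are enough to pick a specialist. The keywords, the `RouteTo` helper, and the `"Human"` fallback are illustrative assumptions; the repo's `RemediationAgents.cs` may route quite differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical keyword rules. A real router would more likely use the
// structured diagnosis from the debate stage than raw substring matching.
var rules = new List<(string Keyword, string Specialist)>
{
    ("deadlock", "DB"),
    ("slow query", "DB"),
    ("packet loss", "Network"),
    ("dns", "Network"),
    ("oom", "App"),
    ("crashloop", "App"),
};

// Returns the first specialist whose keyword appears in the root-cause text,
// or "Human" when nothing matches, so unknown symptoms are escalated.
string RouteTo(string rootCause)
{
    foreach (var (keyword, specialist) in rules)
    {
        if (rootCause.Contains(keyword, StringComparison.OrdinalIgnoreCase))
            return specialist;
    }
    return "Human";
}

Console.WriteLine(RouteTo("Connection pool exhausted after slow query storm")); // DB
Console.WriteLine(RouteTo("Intermittent DNS resolution failures"));             // Network
Console.WriteLine(RouteTo("Disk full on build runner"));                        // Human
```

Rules are evaluated in order, so more specific keywords should come first if two could match the same text.
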
```bash
dotnet --version   # 8.0 or later
```

`appsettings.json`:

```json
{
  "AzureOpenAI": {
    "Endpoint": "https://YOUR-RESOURCE.openai.azure.com/",
    "DeploymentName": "gpt-4o",
    "ApiKey": "..."
  }
}
```

```bash
dotnet restore
dotnet run -- ./samples/alert-db-cpu-spike.json
```

You'll see a streamed transcript of every agent's reasoning, the parallel investigation results, the root-cause debate, and an approval prompt before any remediation tool fires.
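
The approval prompt can be thought of as a wrapper that refuses to call a remediation tool unless a human says yes. A minimal sketch under that assumption: `RunWithApprovalAsync`, `FakeKubectl`, and the example command string are hypothetical and are not taken from `HumanGate/ApprovalGate.cs`.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical gate: runs `action` only if `approve` returns true for the
// proposed command. Injecting the approver keeps the gate testable.
async Task<string> RunWithApprovalAsync(
    string proposedCommand,
    Func<string, bool> approve,
    Func<string, Task<string>> action)
{
    if (!approve(proposedCommand))
        return $"SKIPPED (not approved): {proposedCommand}";

    return await action(proposedCommand);
}

// Stub remediation tool that only reports what it would have executed.
Task<string> FakeKubectl(string command) =>
    Task.FromResult($"EXECUTED: {command}");

// Illustrative command string; nothing is actually run against a cluster.
string command = "kubectl rollout restart deploy/api";

string denied = await RunWithApprovalAsync(command, _ => false, FakeKubectl);
string approved = await RunWithApprovalAsync(command, _ => true, FakeKubectl);

Console.WriteLine(denied);   // SKIPPED (not approved): kubectl rollout restart deploy/api
Console.WriteLine(approved); // EXECUTED: kubectl rollout restart deploy/api
```

In an interactive run the `approve` delegate could print the command, read a line from the console, and return true only for an explicit "y", so that the default answer is to do nothing.
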
- Token budget per stage (hard caps)
- Tool allow-lists per agent (read-only by default)
- OpenTelemetry tracing on every invocation
- Nightly eval harness against historical incidents
- Feature-flagged kill switch ("advise-only" mode)
- Per-incident and per-day cost guardrails
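
Two of the items above, tool allow-lists and hard token caps, can be sketched as small deny-by-default checks. This is an illustration only: the agent keys, tool names, stage names, and cap values are made-up assumptions and do not reflect configuration shipped in this repo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-agent allow-lists. Agents absent from the map get no tools.
var allowedTools = new Dictionary<string, HashSet<string>>
{
    ["LogAgent"] = new HashSet<string> { "query_logs" },
    ["MetricsAgent"] = new HashSet<string> { "query_metrics" },
    ["RemediationAgent"] = new HashSet<string> { "query_logs", "restart_deployment" },
};

bool IsToolAllowed(string agent, string tool) =>
    allowedTools.TryGetValue(agent, out var tools) && tools.Contains(tool);

// Hypothetical hard caps per stage, in tokens.
var stageCaps = new Dictionary<string, int>
{
    ["Investigate"] = 8_000,
    ["Debate"] = 4_000,
};
var stageSpent = new Dictionary<string, int>();

// Returns true and records the spend only if the stage stays within its cap.
bool TrySpend(string stage, int tokens)
{
    stageSpent.TryGetValue(stage, out int already); // 0 when nothing spent yet
    int spent = already + tokens;
    if (!stageCaps.TryGetValue(stage, out int cap) || spent > cap)
        return false; // unknown stage or over budget: refuse the call

    stageSpent[stage] = spent;
    return true;
}

Console.WriteLine(IsToolAllowed("LogAgent", "restart_deployment")); // False
Console.WriteLine(TrySpend("Debate", 3_000));                       // True
Console.WriteLine(TrySpend("Debate", 2_000));                       // False
```

Both checks fail closed: an agent or stage that is missing from the configuration is refused, which matches the "read-only by default" stance in the list.
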
MIT — see LICENSE. This is reference code; harden before production use.