eagleeyethinker/multi-agent-incident-response-csharp

Multi-Agent Incident Response — Semantic Kernel (C#)

Companion code for the "Agents assemble. One agent is a hire. Many agents are a workforce." edition of The Pragmatic Architect newsletter. A reference implementation of the six canonical multi-agent patterns, applied to autonomous incident response on a SaaS platform.

Why this use case?

Incident response is one of the highest-leverage agentic workloads in tech right now. It is short-lived and tool-heavy, requires coordination across many specialists, and has a non-negotiable human-in-the-loop gate before anything writes to production. If your multi-agent framework can handle this, it can handle the rest.

What's in here

| File | Pattern | Role |
| --- | --- | --- |
| Program.cs | | Entry point that wires the full pipeline |
| Agents/TriageAgent.cs | Sequential | Classifies the alert, sets severity |
| Agents/LogAgent.cs | Concurrent | Reads logs (Loki/CloudWatch tool) |
| Agents/MetricsAgent.cs | Concurrent | Reads Prometheus metrics |
| Agents/KbSearchAgent.cs | Concurrent | Searches runbooks + past incidents |
| Agents/DiagnosticAgent.cs | Group Chat | Argues for a root cause |
| Agents/KnowledgeAgent.cs | Group Chat | Counter-argues from prior art |
| Agents/LeadAgent.cs | Group Chat | Decides when debate is over |
| Agents/RemediationAgents.cs | Handoff | DB / Network / App specialists |
| Agents/CommsAgent.cs | Sequential (final) | Drafts status-page update |
| Orchestration/IncidentOrchestrator.cs | Magentic / Hierarchical | Top-level orchestrator |
| Plugins/ObservabilityPlugin.cs | | Mock tools for logs/metrics/traces |
| Plugins/RemediationPlugin.cs | | Mock tools for kubectl/db ops |
| HumanGate/ApprovalGate.cs | | Human-in-the-loop approval |

Patterns covered

  1. Sequential: Triage → Investigate → Debate → Remediate → Announce
  2. Concurrent: LogAgent ‖ MetricsAgent ‖ KbSearchAgent
  3. Group Chat: DiagnosticAgent ⇌ KnowledgeAgent, refereed by LeadAgent
  4. Handoff: Remediation routes to the DB, Network, or App specialist by symptom
  5. Magentic / Orchestrator-Worker: IncidentOrchestrator re-plans on tool failure
  6. Hierarchical: Top orchestrator owns a team-of-teams; sub-managers expose summaries
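
To make two of these patterns concrete, here is a minimal sketch in plain C# of a concurrent fan-out and a symptom-based handoff. It deliberately avoids any Semantic Kernel API; `PatternSketch`, `InvestigateAsync`, `RouteRemediation`, and the keyword rules are hypothetical names and assumptions for illustration, not code from this repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical sketch: none of these members come from the repository or from Semantic Kernel.
public static class PatternSketch
{
    // Concurrent pattern: start every investigator at once, then wait for all of them.
    public static async Task<string[]> InvestigateAsync(
        string alert, IEnumerable<Func<string, Task<string>>> investigators)
    {
        Task<string>[] running = investigators.Select(run => run(alert)).ToArray();
        return await Task.WhenAll(running); // results come back in input order
    }

    // Handoff pattern: choose a specialist from the symptom text.
    // The keywords are assumptions; a real router would likely use the triage output.
    public static string RouteRemediation(string symptom)
    {
        string s = symptom.ToLowerInvariant();
        if (s.Contains("db") || s.Contains("query")) return "DB";
        if (s.Contains("dns") || s.Contains("latency")) return "Network";
        return "App";
    }
}
```

In a pipeline of this shape, the three investigation agents would be passed in as the `investigators` delegates, and the string returned by `RouteRemediation` would pick which remediation specialist receives the incident.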

Prerequisites

dotnet --version    # 8.0 or later

appsettings.json:

{
  "AzureOpenAI": {
    "Endpoint": "https://YOUR-RESOURCE.openai.azure.com/",
    "DeploymentName": "gpt-4o",
    "ApiKey": "..."
  }
}
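
One way to read and validate this section at startup is sketched below using `System.Text.Json`. The `AzureOpenAISettings` record and `SettingsLoader` class are hypothetical; the repository may bind its configuration differently (for example through `Microsoft.Extensions.Configuration`).

```csharp
using System;
using System.Text.Json;

// Hypothetical types for illustration; not taken from the repository.
public sealed record AzureOpenAISettings(string Endpoint, string DeploymentName, string ApiKey);

public static class SettingsLoader
{
    // Parses the "AzureOpenAI" section and fails fast on unusable values.
    public static AzureOpenAISettings Parse(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement section = doc.RootElement.GetProperty("AzureOpenAI");

        var settings = new AzureOpenAISettings(
            section.GetProperty("Endpoint").GetString() ?? "",
            section.GetProperty("DeploymentName").GetString() ?? "",
            section.GetProperty("ApiKey").GetString() ?? "");

        if (!Uri.TryCreate(settings.Endpoint, UriKind.Absolute, out _))
            throw new InvalidOperationException("AzureOpenAI:Endpoint must be an absolute URL.");
        if (string.IsNullOrWhiteSpace(settings.DeploymentName))
            throw new InvalidOperationException("AzureOpenAI:DeploymentName is required.");
        if (string.IsNullOrWhiteSpace(settings.ApiKey))
            throw new InvalidOperationException("AzureOpenAI:ApiKey is required.");

        return settings;
    }
}
```

Calling `SettingsLoader.Parse(File.ReadAllText("appsettings.json"))` before any agent is constructed surfaces a missing key or malformed endpoint immediately rather than at the first model call.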

Run

dotnet restore
dotnet run -- ./samples/alert-db-cpu-spike.json

You'll see a streamed transcript of every agent's reasoning, the parallel investigation results, the root-cause debate, and an approval prompt before any remediation tool fires.
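
The approval step can be pictured as a small wrapper that refuses to call a remediation delegate until a human says yes. The sketch below is an assumption about the general shape only; `ApprovalGateSketch` and its members are hypothetical, and the actual `HumanGate/ApprovalGate.cs` may look different.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical sketch; not the repository's ApprovalGate implementation.
public sealed class ApprovalGateSketch
{
    private readonly Func<string, Task<bool>> _askHuman;

    // askHuman receives a description of the proposed action and returns the decision.
    public ApprovalGateSketch(Func<string, Task<bool>> askHuman) => _askHuman = askHuman;

    // Runs the remediation only after explicit approval; otherwise reports that it was skipped.
    public async Task<string> RunIfApprovedAsync(
        string proposedAction, Func<Task<string>> remediation)
    {
        bool approved = await _askHuman(proposedAction);
        return approved
            ? await remediation()
            : $"Skipped (not approved): {proposedAction}";
    }
}
```

For a console run, `askHuman` could print the proposed action and return true only when the operator types `y`. Keeping the prompt behind a delegate also lets tests supply a fixed answer.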

Production checklist

  • Token budget per stage (hard caps)
  • Tool allow-lists per agent (read-only by default)
  • OpenTelemetry tracing on every invocation
  • Nightly eval harness against historical incidents
  • Feature-flagged kill switch ("advise-only" mode)
  • Per-incident and per-day cost guardrails
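
Three of these items (token budgets, tool allow-lists, and the advise-only kill switch) can be combined in one small guard object. The following is a rough sketch under stated assumptions: `AgentGuardrails` and its members are hypothetical names, token counts are assumed to be supplied by the caller, and nothing here is taken from the repository.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of three checklist items; names are illustrative only.
public sealed class AgentGuardrails
{
    private readonly HashSet<string> _allowedTools;
    private readonly bool _adviseOnly;
    private int _tokensLeft;

    public AgentGuardrails(IEnumerable<string> allowedTools, int tokenBudget, bool adviseOnly)
    {
        _allowedTools = new HashSet<string>(allowedTools, StringComparer.Ordinal);
        _tokensLeft = tokenBudget;
        _adviseOnly = adviseOnly;
    }

    // A tool may run only if it is allow-listed; write tools never run in advise-only mode.
    public bool CanInvoke(string toolName, bool writesToProduction) =>
        _allowedTools.Contains(toolName) && !(writesToProduction && _adviseOnly);

    // Hard cap: returns false and charges nothing when a request would exceed the budget.
    public bool TrySpend(int tokens)
    {
        if (tokens < 0 || tokens > _tokensLeft) return false;
        _tokensLeft -= tokens;
        return true;
    }
}
```

Creating one instance per agent and per stage keeps each budget and allow-list independent, so a runaway debate cannot spend the tokens reserved for remediation.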

License

MIT — see LICENSE. This is reference code; harden before production use.
