"Do not hand a steering wheel to an engine."
This repository is machine-to-machine infrastructure. It is not a TypeScript framework like NestJS, nor is it a CLI tool for human engineers. It is a Cognitive Harness: an executable protocol designed by humans, but read, interpreted, and executed exclusively by Large Language Models (LLMs).
By encoding engineering discipline (Lifecycle State Machines, Role Matrices, Vector-less Knowledge Graphs) into LLM-native formats, it transforms the AI from a simple code-completion oracle into an autonomous, self-navigating, and self-correcting engineering agent.
Effect Harness Agent is an agent-driven engineering framework for TypeScript/Effect projects, designed for sustainable software development. It integrates an Intent Gateway, a 6-phase Lifecycle State Machine, Contract-first OpenSpec design, and a drill-down LLM Wiki (Knowledge Graph) to prevent context bloat, enabling AI agents to autonomously build, test, and self-correct production-ready code.
Effect Harness Agent is an agent-driven development workflow that bridges the gap between natural language requirements and production-ready TypeScript/Effect code. Built on an Intent Gateway, a Lifecycle State Machine, a Knowledge Graph (LLM Wiki), and a Specialized Skills Matrix, it enables closed engineering loops that are sustainable, interruptible, self-correcting, and resistant to bloat.
- 🎯 Intent-Driven: Natural language → Structured intent queues → Executable tasks
- 🔄 Lifecycle State Machine: Explorer → Propose → Review → Approval Gate (HITL) → Implement → Validation Gate → QA → Archive
- 🧠 Knowledge Graph: Hierarchical wiki system with bidirectional navigation
- 🛡️ Self-Correcting: Automatic guard hooks, failure recovery, human-in-the-loop checkpoints
- 📊 Contract-First: OpenSpec-based design before implementation
- 🔌 Skills Matrix: 25+ specialized skills providing domain expert capabilities
- 📈 Anti-Bloat Mechanism: Automatic knowledge extraction and archival to prevent information overload
Given that Effect Harness Agent is an agentic framework, it inherently consumes more tokens than a simple code completion tool. However, its architecture shifts costs from Rework & Blind Search to Planning & Guardrails, resulting in highly predictable and stable overall costs for complex tasks.
- Each turn requires the LLM to output the `<Cognitive_Brake>` block and read mandatory system contexts (e.g., `LIFECYCLE.md`, `AGENTS.md`). This adds a fixed baseline "thinking tax" of ~500 Output Tokens and ~2000 Input Tokens per interaction.
- The `Propose` phase explicitly requires drafting `explore_report.md` and `openspec.md`, consuming an extra ~1500 Output Tokens before a single line of code is written.
Note: Estimates assume a typical modern flagship LLM (e.g., GPT-4o, Claude 3.5/3.7 Sonnet, Gemini 1.5 Pro) pricing model.
| Paradigm | Behavior | Input Tokens | Output Tokens | Hidden Costs / Risks | Verdict |
|---|---|---|---|---|---|
| Pure Chat / Copilot | Jumps straight to coding with limited context. | ~5k | ~1k | High Rework Rate. Misses transaction boundaries, forgets existing enums. Requires human prompt corrections. | Cheap in Tokens, Expensive in Human Time. |
| Unconstrained Auto-Agent | Blindly searches the entire codebase (e.g., SearchCodebase or Grep without limits), loops endlessly on compile errors. | 100k+ | 10k+ | Disastrous. Burns through budget via massive context bloat and infinite loops before hitting platform limits. | Unpredictable & Dangerous. |
| Effect Harness Agent | Pays the "Thinking Tax", limits searches (Wiki≤3, Code≤8), designs openspec.md, and STOPs at Approval Gates. | ~30k | ~6k | Highly predictable. Architectural errors are intercepted early by humans; syntax errors are digested by Shift-Left Validation. | The Sweet Spot. Optimized for high-quality delivery with controlled token spend. |
| Scenario Profile | Typical Turns | Input Tokens | Output Tokens | Expected Cost / Task |
|---|---|---|---|---|
| `@patch` (Small Bugfix) | 1-2 Turns | ~5k - 8k | ~1k - 2k | $0.05 - $0.15 |
| `@standard` (New Feature) | 4-6 Turns | ~20k - 40k | ~4k - 8k | $0.30 - $0.80 |
| `@learn` (Doc QA) | 1 Turn | ~3k - 5k | ~500 | $0.02 - $0.05 |
Total Token Cost = (Base Context + Context Funnel Payload) × Turns + (Artifact Generation + Code Generation + Cognitive Brake)
To minimize costs:
- Use Shortcuts: Use `@patch` instead of `@standard` for trivial changes to skip Phase 1-3.
- Provide Explicit Scopes: Include `--scope src/Foo.java` in your prompt. This triggers Rule 0 (Direct Read), bypassing the entire Knowledge Funnel drill-down process and saving thousands of Input Tokens.
- Respect the Brakes: When the agent STOPs at the Validation Gate, ensure your local environment is ready before allowing it to compile, preventing retry loops.
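The cost formula above can be sketched as a small estimator. This is only an illustration: the field names are hypothetical, and the sample numbers are picked to be in the same range as the estimates quoted in this section, not measured from a real run. The first term of the formula is treated as input tokens and the second as output tokens.

```typescript
// Hypothetical field names; they mirror the terms of the cost formula.
interface TaskTokenProfile {
  baseContext: number;        // input tokens re-read on every turn
  funnelPayload: number;      // input tokens pulled in by the Context Funnel per turn
  turns: number;
  artifactGeneration: number; // output tokens for explore_report.md / openspec.md
  codeGeneration: number;     // output tokens for the code itself
  cognitiveBrake: number;     // total output tokens spent on <Cognitive_Brake> blocks
}

function estimateTokens(p: TaskTokenProfile): { input: number; output: number; total: number } {
  const input = (p.baseContext + p.funnelPayload) * p.turns;
  const output = p.artifactGeneration + p.codeGeneration + p.cognitiveBrake;
  return { input, output, total: input + output };
}

// Illustrative @standard task: 5 turns, ~500 brake tokens per turn.
const standardTask = estimateTokens({
  baseContext: 2000,
  funnelPayload: 4000,
  turns: 5,
  artifactGeneration: 1500,
  codeGeneration: 2000,
  cognitiveBrake: 2500,
});
console.log(standardTask); // { input: 30000, output: 6000, total: 36000 }
```

Because the input term is multiplied by the number of turns, trimming the per-turn payload (for example with an explicit `--scope`) has a larger effect than trimming one-off artifact output.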
Three Fundamental Problems Solved by Effect Harness Agent:
- Context Bloat Out of Control: LLM blind searching in large codebases leads to token waste and attention dispersion → Solved through Knowledge Graph + Budgeted Navigation
- Requirement Drift & Unauthorized Modifications: Agent free-play causes cross-domain pollution and contract corruption → Solved through Intent Gateway + Role Matrix Guards
- Knowledge Fragmentation & Unsustainability: Conversation memory loss, documentation desynchronization, index bloat → Solved through WAL Write-back + Auto-Refactoring
Design Philosophy: Encode engineering discipline into LLM-executable protocols, enabling machine-to-machine self-coordination, self-correction, and self-evolution.
flowchart TB
subgraph Input["📥 Input Layer"]
User[👤 User Requirements]
Shortcut["⚡ Shortcuts<br/>@read/@patch/@standard"]
end
subgraph Gateway["🎯 Intent Gateway Layer (ROUTER)"]
IG[Intent Gateway<br/>Intent Classifier]
Profile{Execution Profile Selection}
LEARN[LEARN<br/>Read-only Q&A]
PATCH[PATCH<br/>Small Changes]
STANDARD[STANDARD<br/>Full Lifecycle]
end
subgraph Context["🔍 Context Collection Layer (FUNNEL)"]
DirectRead[Direct Read<br/>When scope explicit]
Funnel[Knowledge Funnel<br/>Sitemap→Index→Doc]
Budget[Budget Control<br/>Wiki≤3, Code≤8]
Escalation[Escalation Protocol<br/>Escalation Card]
end
subgraph Knowledge["🧠 Knowledge Graph Layer (LLM Wiki)"]
KG[KNOWLEDGE_GRAPH.md<br/>Root Node]
DomainIndex["Domain Indices<br/>api/data/domain etc."]
Docs[Specific Documents]
Archive[Archive Zone<br/>Cold Storage]
end
subgraph Lifecycle["⚙️ Lifecycle Engine Layer (WORKFLOW)"]
LaunchSpec[Launch Spec<br/>State Machine Table]
Phase1[1_Explorer<br/>Clarify Requirements]
Phase2[2_Propose<br/>Freeze Contracts]
Phase3[3_Review<br/>Technical Review]
ApprovalGate[Approval Gate<br/>HITL Checkpoint]
Phase4[4_Implement<br/>Implement per Contract]
Phase5[5_QA<br/>Test Validation]
Phase6[6_Archive<br/>Knowledge Extraction]
end
subgraph Roles["🎭 Role Matrix Layer (ROLE MATRIX)"]
Ambiguity[Ambiguity Gatekeeper<br/>Ambiguity Guard]
ReqEngineer[Requirement Engineer<br/>Requirements]
SysArchitect[System Architect<br/>Architecture]
LeadEngineer[Lead Engineer<br/>Code & Shift-Left]
CodeReviewer[Code Reviewer<br/>Quality QA]
FocusGuard[Focus Guard<br/>Anti-Drift Guard]
KnowledgeExt[Knowledge Extractor<br/>Unified WAL]
SecuritySentinel[Security Sentinel<br/>Security Sentinel]
DocCurator[Documentation Curator<br/>Doc Curator]
SkillCurator[Skill Graph Curator<br/>Skills]
Librarian[Librarian<br/>GC & Compaction]
KnowledgeArch[Knowledge Architect<br/>Knowledge Architect]
end
subgraph Hooks["🛡️ Hook Correction Layer (HOOKS)"]
PreHook[pre_hook<br/>Load Rule Sets]
GuardHook[guard_hook<br/>Execution Guard]
PostHook[post_hook<br/>Post-Audit]
FailHook[fail_hook<br/>Failure Rollback]
LoopHook[loop_hook<br/>Queue Loop]
end
subgraph Skills["🔧 Skills Matrix Layer (SKILLS)"]
SkillIndex[trae-skill-index<br/>Master Skill Index]
EffectSkills[TypeScript/Effect Skills<br/>25+ Professional Capabilities]
end
subgraph Scripts["📜 Script Tools Layer (SCRIPTS)"]
Gates["Gate Scripts<br/>ambiguity_gate.py etc."]
WikiTools[Wiki Tools<br/>linter/compactor]
Engine[Engine Helper<br/>engine.py]
end
User --> IG
Shortcut --> IG
IG --> Profile
Profile -->|LEARN| DirectRead
Profile -->|PATCH| LaunchSpec
Profile -->|STANDARD| LaunchSpec
DirectRead --> Funnel
Funnel --> Budget
Budget --> Escalation
KG --> DomainIndex
DomainIndex --> Docs
Docs --> Archive
LaunchSpec --> Phase1
Phase1 --> Phase2
Phase2 --> Phase3
Phase3 --> ApprovalGate
ApprovalGate --> Phase4
Phase4 --> Phase5
Phase5 --> Phase6
Phase6 --> LaunchSpec
Phase1 -.->|Mount| Ambiguity
Phase1 -.->|Mount| ReqEngineer
Phase1 -.->|Mount| FocusGuard
Phase2 -.->|Mount| SysArchitect
Phase3 -.->|Mount| SysArchitect
Phase4 -.->|Mount| LeadEngineer
Phase4 -.->|Mount| FocusGuard
Phase4 -.->|Mount| SecuritySentinel
Phase5 -.->|Mount| CodeReviewer
Phase5 -.->|Mount| DocCurator
Phase6 -.->|Mount| KnowledgeExt
Phase6 -.->|Mount| DocCurator
Phase6 -.->|Mount| SkillCurator
Phase6 -.->|Mount| Librarian
Phase1 -.->|Trigger| PreHook
Phase4 -.->|Trigger| GuardHook
Phase5 -.->|Trigger| PostHook
Phase3 -.->|Trigger| FailHook
Phase5 -.->|Trigger| FailHook
Phase6 -.->|Trigger| LoopHook
Ambiguity -.->|Invoke| Gates
GuardHook -.->|Invoke| Gates
PostHook -.->|Invoke| WikiTools
KnowledgeArch -.->|Invoke| WikiTools
Phase1 -.->|Query| SkillIndex
Phase2 -.->|Query| SkillIndex
Phase4 -.->|Query| EffectSkills
| Layer | Component | Responsibility | Key File |
|---|---|---|---|
| Input | Intent Gateway | Natural language → Structured intents + Execution profiles | ROUTER.md |
| Context | Knowledge Funnel | Bidirectional navigation (forward retrieval + reverse write-back) | CONTEXT_FUNNEL.md |
| Knowledge | LLM Wiki | Fractal knowledge graph (Sitemap/Index/Docs/Archive) | KNOWLEDGE_GRAPH.md |
| Process | Lifecycle Engine | 6-phase state machine + breakpoint resume | LIFECYCLE.md |
| Roles | Role Matrix | Dynamic virtual role mounting + gate guards | ROLE_MATRIX.md |
| Correction | Hooks System | Pre/guard/post/fail/loop interception | HOOKS.md |
| Capability | Skills Matrix | 25+ domain-specific expert capabilities | trae-skill-index |
| Tools | Script Tools | Deterministic quality checks + auxiliary tools | scripts/ |
- TypeScript 5.0+
- Python 3.8+ (for script tools)
- Git
Start with AGENTS.md - the master entry point defining execution discipline with hard constraints and navigation rules.
Core Constraints Quick Reference:
- Budget Limits: Wiki ≤ 3 docs, Code ≤ 8 files (same-file pagination doesn't count)
- Cognitive Brake: Mandatory `<Cognitive_Brake>` XML block before any action to enforce Role, Scope, and Budget awareness
- Approval & Validation Gates: Must STOP for human confirmation before writing code (`Approval Gate`) and before heavy compilation (`Validation Gate`)
- Anti-Looping: Max 3 retries for scripts/linters; STRICT MAX 2 retries for compilation. Exceeding thresholds MUST request human intervention
- Scope Guard: Cannot modify files outside the `focus_card.md` agreed scope without explicit authorization
The Intent Gateway transforms natural language into executable queues, supporting three execution profiles:
| Profile | Use Case | Lifecycle Entry | Artifacts |
|---|---|---|---|
| LEARN | Read-only explanation, code understanding | ❌ No | None |
| PATCH | Small changes, bug fixes (LOW risk) | ✅ Minimal | Slim Spec + Change Log |
| STANDARD | MEDIUM/HIGH risk, wide blast radius | ✅ Full 6-phase | Full OpenSpec + Approval Gate |
Shortcuts (Explicit Routing):
@read / @learn → Force LEARN mode (read-only)
@patch / @quickfix → Force PATCH mode (small changes)
@standard → Force STANDARD mode (full lifecycle)
Shortcut DSL Examples:
@learn --scope src/foo/bar.ts --direct --depth deep -- explain this file
@patch --risk low --slim --test "pnpm test -t 'OrderServiceTest'" -- fix NPE in createOrder
@standard --risk high --launch -- implement tenant permission checks for order list API
Rule 0: Direct Read when scope is explicit (MUST)
- If user provides explicit scope (file path, class/method name, pasted snippet) and goal is learning:
- ✅ Do direct read first
- ❌ Do NOT start with Knowledge Graph drill-down
Rule 1: Otherwise, use Knowledge Funnel (MUST)
- Read root: KNOWLEDGE_GRAPH.md
- Drill down via: CONTEXT_FUNNEL.md
- If unsure which skill to use, consult: trae-skill-index
Common Domain Indices:
- API Design → `.agents/llm_wiki/wiki/api/index.md`
- Data Models → `.agents/llm_wiki/wiki/data/index.md`
- Domain Logic → `.agents/llm_wiki/wiki/domain/index.md`
- Architecture → `.agents/llm_wiki/wiki/architecture/index.md`
Complete one STANDARD task following the Lifecycle:
stateDiagram-v2
[*] --> Explorer: Clarify Requirements
Explorer --> Propose: Freeze Contracts
Propose --> Review: Technical Review
Review --> ApprovalGate: HITL Checkpoint
ApprovalGate --> Implement: Implement per Contract
Implement --> ValidationGate: STOP & Request Compile
ValidationGate --> QA: Test Validation
QA --> Archive: Knowledge Extraction (Same Session)
Archive --> [*]: Queue Complete
Review --> Propose: fail_hook(review failed)
QA --> Implement: fail_hook(compile/test failed, max 2 retries)
note right of ApprovalGate
MEDIUM/HIGH Risk:
Must wait for human confirmation
Status=WAITING_APPROVAL
end note
classDef explorerClass fill:#e1f5ff,stroke:#333
classDef proposeClass fill:#fff4e6,stroke:#333
classDef reviewClass fill:#ffe6e6,stroke:#333
classDef approvalClass fill:#fff9e6,stroke:#333
classDef implementClass fill:#e6ffe6,stroke:#333
classDef validationClass fill:#fff9e6,stroke:#333
classDef qaClass fill:#f0e6ff,stroke:#333
classDef archiveClass fill:#e6f0ff,stroke:#333
class Explorer explorerClass
class Propose proposeClass
class Review reviewClass
class ApprovalGate approvalClass
class Implement implementClass
class ValidationGate validationClass
class QA qaClass
class Archive archiveClass
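The transitions in the diagram above can be written down as a lookup table. The sketch below is an illustration, not the harness's own engine: the `nextPhase` function and the `"Escalate"` result for phases that have no `fail_hook` edge in the diagram are assumptions made here.

```typescript
type Phase =
  | "Explorer" | "Propose" | "Review" | "ApprovalGate"
  | "Implement" | "ValidationGate" | "QA" | "Archive";

// Forward edges of the state diagram.
const happyPath: Record<Phase, Phase | "Done"> = {
  Explorer: "Propose",
  Propose: "Review",
  Review: "ApprovalGate",
  ApprovalGate: "Implement",
  Implement: "ValidationGate",
  ValidationGate: "QA",
  QA: "Archive",
  Archive: "Done",
};

// fail_hook edges of the state diagram.
const failurePath: Partial<Record<Phase, Phase>> = {
  Review: "Propose", // review failed
  QA: "Implement",   // compile/test failed
};

function nextPhase(current: Phase, outcome: "ok" | "fail"): Phase | "Done" | "Escalate" {
  if (outcome === "ok") return happyPath[current];
  // Assumption: a failure without a rollback edge is handed to a human.
  return failurePath[current] ?? "Escalate";
}

function walkHappyPath(start: Phase): Phase[] {
  const visited: Phase[] = [];
  let cursor: Phase | "Done" | "Escalate" = start;
  while (cursor !== "Done" && cursor !== "Escalate") {
    visited.push(cursor);
    cursor = nextPhase(cursor, "ok");
  }
  return visited;
}

console.log(walkHappyPath("Explorer").join(" -> "));
console.log(nextPhase("QA", "fail")); // Implement
```

The retry cap on the QA to Implement edge is not modeled here; it would sit on top of this table as a counter.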
Breakpoint Resume Mechanism:
- Launch Spec persisted at `router/runs/launch_spec_*.md`
- First action after session interruption: read this file to restore state
- Status enum: `PENDING`, `IN_PROGRESS`, `DONE`, `WAITING_APPROVAL`, `FAILED`
Goal: Create read-only endpoints (DTO/Controller/Service) without table structure changes
graph LR
A[Explorer<br/>Clarify Requirements] --> B[Propose<br/>OpenSpec]
B --> C[Review<br/>Technical Review]
C --> D[Approval<br/>HITL Checkpoint] --> E[Implement<br/>Per Contract]
E --> V[Validation<br/>STOP Gate]
V --> F[QA<br/>Test Validation]
F --> G[Archive<br/>Update Index]
style A fill:#e1f5ff,stroke:#333,stroke-width:2px
style B fill:#fff4e6,stroke:#333,stroke-width:2px
style C fill:#ffe6e6,stroke:#333,stroke-width:2px
style D fill:#fff9e6,stroke:#333,stroke-width:2px
style E fill:#e6ffe6,stroke:#333,stroke-width:2px
style V fill:#fff9e6,stroke:#333,stroke-width:2px
style F fill:#f0e6ff,stroke:#333,stroke-width:2px
style G fill:#e6f0ff,stroke:#333,stroke-width:2px
Key Deliverables:
- ✅ `explore_report.md` - Scope & impact analysis + Core Context Anchors
- ✅ `openspec.md` - API contract with JSON examples, acceptance criteria
- ✅ Implementation following contract (no over-engineering)
- ✅ Unit tests with coverage evidence
- ✅ Update API index in `wiki/api/` (WAL mechanism)
Mounted Roles (resolved from role_matrix.json):
- Explorer: `ambiguity_gatekeeper`, `requirement_engineer`, `focus_guard`
- Propose: `system_architect`
- Implement: `lead_engineer`, `focus_guard`, `security_sentinel`
- QA: `code_reviewer`, `documentation_curator`, `accessibility_auditor`, `visual_critic`, `performance_warden`
- Archive: `knowledge_extractor`, `documentation_curator`, `skill_graph_curator`, `librarian`
Goal: New endpoint with table structure & index modifications
Critical Path:
- Propose: Freeze both API & Data contracts simultaneously (field semantics, constraints, index design, compatibility strategy)
- Review: SQL risk assessment, index utilization, implicit conversion checks, authorization risks
- QA: Regression tests covering core queries & edge cases
- Archive: Update both `wiki/api/` and `wiki/data/` indices, synchronize ER diagrams
Activated Skills (frontend / Effect-native; backend SQL skills are archived):
- `effect-schema-as-contract` - Schema design for shared contracts (frontend ↔ API)
- `consumed-api-contracts` - Pinning + tracking server contracts the client consumes
- `effect-schema-form` - Form binding when the schema also drives a UI
Goal: Fix defects ensuring reproducibility, regressability, and traceability
stateDiagram-v2
[*] --> Explorer
Explorer --> Implement: Identify Root Cause
Implement --> QA: Write Failing Test First
QA --> Implement: fail_hook (test failed)
Implement --> QA: Fix to Pass Test
QA --> Archive: Regression Test Suite
Archive --> [*]
note right of QA
TDD Approach:
1. Write failing test first
2. Fix to pass test
3. Add regression tests
end note
Workflow:
- Explorer: Minimal reproduction path + root cause hypothesis + impact analysis (whether Propose/contract update needed)
- QA: Write failing test BEFORE fix (TDD approach)
- Implement: Fix implementation to pass test
- Archive: Record pattern in `wiki/testing/` or `reviews/`, update related API/Domain indices if necessary
Profile: PATCH (LOW risk) or STANDARD (MEDIUM/HIGH risk)
Goal: Optimize SQL/performance without changing external behavior
Focus Areas:
- Propose: Document "behavior unchanged" constraints + rollback strategy
- Review: SQL standards & index utilization as top priority
- QA: Comparative evidence (performance benchmarks + correctness)
- Archive: Extract reusable performance rules to `preferences/`
Activated Skills:
- `bundle-budget-standard` - Bundle-size guards (top priority for client perf)
- `effect-fiber-and-stream` - Concurrency / scheduling primitives when refactoring hot paths
- `code-review-checklist` - Refactoring review surface
Goal: Improve maintainability without introducing requirement drift
Guardrails:
- Explicit "what's in / what's out" scope definition (Focus Card)
- Cross-domain modifications require explicit authorization (guard_hook)
- Architecture decisions written back to `wiki/architecture/`
Activated Roles:
- Ambiguity Gatekeeper - Ambiguity guard
- Focus Guard - Anti-drift guard
- Knowledge Architect - Knowledge architect (if Wiki refactoring needed)
Goal: Server-led delivery with optional UI/QA parallel work
sequenceDiagram
participant S as Server Agent
participant H as Human (HITL)
participant U as UI Agent
participant Q as QA Agent
S->>S: Explorer → Propose
S->>H: Request Approval (Freeze Contract)
H->>S: ✅ Approved
par Parallel Execution
S->>S: Implement Code
U->>U: Build UI from API Contract
Q->>Q: Write Tests from Acceptance Criteria
end
S->>S: QA → Archive
Key Handoff Points:
- Approval Gate Phase: Frozen OpenSpec becomes single source of truth, acts as "starting gun" for parallel collaboration
- Minimal Handoff: API Contract (JSON examples), Acceptance Criteria (Given/When/Then), Error Codes
- Server Cohesion: Internal details remain encapsulated (not forced outward)
Goal: Perform read-only analysis and assessment of the codebase, producing structured audit reports
Constraints:
- ❌ No code modifications
- ❌ No Wiki writes
- ❌ No launch spec generation
- ❌ No lifecycle entry
Allowed Operations:
- ✅ Read-only retrieval and reading
- ✅ Run tests/builds (but do not modify any tracked files)
Output Requirements: Each conclusion must include evidence (file path + line range) and impact/recommendations
Typical Scenarios: Architecture review, code quality scanning, technical debt assessment
- Goal: Answer questions based on Wiki/requirement documents
- Method: Drill down through knowledge funnel, output answers with citations
- Citations: Wiki/requirement paragraphs, supplement with code references when needed
- Does NOT trigger lifecycle
- Goal: Convert Q&A conclusions into executable intent queues
- Critical Step: Must ask user whether to "launch" first
- After Confirmation: Generate launch spec and enter lifecycle
- Without Confirmation: Output answer only, no side effects
Typical Scenarios: Query business rules, understand API usage, confirm architecture decisions
The Intent Gateway transforms natural language requirements into structured intent queues that drive the entire lifecycle.
Not every request needs the full lifecycle. The gateway selects an execution profile:
| Profile | When to Use | Lifecycle Entry | Artifacts |
|---|---|---|---|
| LEARN | Read-only explanation, code understanding | No | None |
| PATCH | Small changes, bug fixes (LOW risk) | Minimal | Slim Spec or Change Log |
| STANDARD | MEDIUM/HIGH risk, wide blast radius | Full 6-phase | Full OpenSpec + Approval Gate |
Users can override automatic routing with explicit shortcuts:
- `@read` / `@learn`: Force Profile `LEARN` (read-only, no write-back)
- `@patch` / `@quickfix`: Force Profile `PATCH` (small change mode)
- `@standard`: Force Profile `STANDARD` (full lifecycle)
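As a rough sketch of the routing described above, profile selection reduces to three checks: an explicit shortcut wins, read-only requests go to LEARN, and the rest are split by risk. The `RoutingInput` shape is hypothetical. How the real gateway derives `readOnly` and `risk` from natural language is not shown here.

```typescript
type Profile = "LEARN" | "PATCH" | "STANDARD";
type Risk = "low" | "medium" | "high";

// Hypothetical input shape for illustration only.
interface RoutingInput {
  shortcut?: Profile; // set when the user typed @learn / @patch / @standard
  readOnly: boolean;  // explanation or code understanding, no changes requested
  risk: Risk;
}

function selectProfile(input: RoutingInput): Profile {
  if (input.shortcut !== undefined) return input.shortcut; // explicit override
  if (input.readOnly) return "LEARN";
  return input.risk === "low" ? "PATCH" : "STANDARD";
}

console.log(selectProfile({ readOnly: false, risk: "low" }));  // PATCH
console.log(selectProfile({ readOnly: false, risk: "high" })); // STANDARD
console.log(selectProfile({ readOnly: false, risk: "low", shortcut: "STANDARD" })); // STANDARD
```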
Shortcuts can be composed with flags to express common workflows as a small DSL.
Syntax:
@<profile> <flags...> -- <natural language request or question>
Flags (order-independent):
- Scope / read:
  - `--scope <path|glob|symbol>`: explicit scope (file/dir/symbol)
  - `--direct`: force direct reads (do not start with Knowledge Graph drill-down)
  - `--funnel`: force the funnel even if scope is explicit
  - `--depth shallow|normal|deep`: explanation depth (LEARN only)
- Risk / artifacts:
  - `--risk low|medium|high`: explicit risk override
  - `--slim`: force Slim Spec (PATCH only, or STANDARD with `--risk low`)
  - `--changelog`: use Change Log only (PATCH only)
  - `--evidence required|optional|none`: evidence requirement (default: PATCH=required)
- Launch / write-back:
  - `--launch`: force lifecycle launch (STANDARD only)
  - `--no-launch`: force no launch
  - `--writeback`: allow wiki/WAL write-back (not allowed for LEARN)
  - `--no-writeback`: forbid write-back (default)
- Verification:
  - `--test "<cmd>"`: required verification command + evidence
  - `--no-test`: skip tests (LEARN only; PATCH requires an explicit justification)
- DocQA actionize:
  - `--actionize`: convert DocQA into an executable STANDARD queue (requires confirmation)
  - `--yes`: auto-confirm `--actionize` / `--launch` (team use with caution)
Conflict rules (MUST enforce):
- `@learn` MUST NOT be combined with `--launch` or `--writeback`.
- `--launch` MUST be used with `@standard` only.
- `--slim` requires `--risk low` (or implied low risk in PATCH).
- `--actionize` MUST ask for confirmation unless `--yes` is present.
Examples:
@learn --scope src/foo/bar.ts --direct --depth deep -- explain this file
@patch --risk low --slim --test "pnpm test -t 'OrderServiceTest'" -- fix NPE in createOrder
@standard --risk high --launch -- implement tenant permission checks for order list API
@learn --funnel --actionize -- what is the API design standard?
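To make the syntax and conflict rules concrete, here is a minimal parser sketch. It is not the gateway's implementation (the gateway is an LLM-read protocol). `parseShortcut` and its result shape are hypothetical, and "implied low risk in PATCH" is read here as "PATCH with no explicit `--risk`".

```typescript
type Profile = "LEARN" | "PATCH" | "STANDARD";

interface ParsedShortcut {
  profile: Profile;
  flags: Map<string, string | true>; // flag name -> value, or true for bare switches
  request: string;
  violations: string[];
  needsConfirmation: boolean;
}

const PROFILE_BY_SHORTCUT = new Map<string, Profile>([
  ["@read", "LEARN"], ["@learn", "LEARN"],
  ["@patch", "PATCH"], ["@quickfix", "PATCH"],
  ["@standard", "STANDARD"],
]);

function parseShortcut(input: string): ParsedShortcut {
  const sep = input.indexOf(" -- ");
  if (sep === -1) throw new Error('expected "@<profile> <flags...> -- <request>"');
  const request = input.slice(sep + 4).trim();

  // Tokens are either double-quoted strings or runs of non-whitespace.
  const tokens: string[] = input.slice(0, sep).match(/"[^"]*"|\S+/g) ?? [];
  const profile = PROFILE_BY_SHORTCUT.get(tokens[0] ?? "");
  if (profile === undefined) throw new Error(`unknown shortcut: ${tokens[0]}`);

  const flags = new Map<string, string | true>();
  const rest = tokens.slice(1);
  while (rest.length > 0) {
    const flag = rest.shift() as string;
    if (!flag.startsWith("--")) throw new Error(`unexpected token: ${flag}`);
    const next = rest[0];
    if (next !== undefined && !next.startsWith("--")) {
      rest.shift();
      flags.set(flag, next.replace(/^"|"$/g, "")); // strip surrounding quotes
    } else {
      flags.set(flag, true);
    }
  }

  const violations: string[] = [];
  if (profile === "LEARN" && (flags.has("--launch") || flags.has("--writeback"))) {
    violations.push("@learn must not be combined with --launch or --writeback");
  }
  if (flags.has("--launch") && profile !== "STANDARD") {
    violations.push("--launch must be used with @standard only");
  }
  const lowRisk =
    flags.get("--risk") === "low" || (profile === "PATCH" && !flags.has("--risk"));
  if (flags.has("--slim") && !lowRisk) violations.push("--slim requires --risk low");

  const needsConfirmation = flags.has("--actionize") && !flags.has("--yes");
  return { profile, flags, request, violations, needsConfirmation };
}

const parsed = parseShortcut("@learn --scope src/foo/bar.ts --direct --depth deep -- explain this file");
console.log(parsed.profile, [...parsed.flags.entries()], parsed.request);
console.log(parseShortcut("@learn --launch -- explain this file").violations);
```

Splitting on the first ` -- ` keeps everything after it as free text, so a request that happens to contain `--` is not mistaken for a flag.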
The gateway maps requests to a small set of top-level intents:
| Intent | When to Use | Default Profile | Launch Spec | Write-back |
|---|---|---|---|---|
| `Learn` | "Explain/read/understand this code" with explicit scope | LEARN | No | No |
| `Change` | "Modify code" (feature, refactor, bugfix) | PATCH or STANDARD | Yes (STANDARD only) | Optional (Archive) |
| `DocQA` | "What is the rule/process/template?" | LEARN | No | No (unless actionized) |
| `Audit` | "Assess the codebase" (read-only review/risk scan) | LEARN | No | No |
Rule 0: Direct Read when scope is explicit (MUST)
- If user provides explicit scope (file path, class/method name, pasted snippet) and goal is learning:
- ✅ Do direct read first
- ❌ Do NOT start with Knowledge Graph drill-down
- Use funnel only if background context needed after first read
Rule 1: Otherwise, use Knowledge Funnel (MUST)
- Read root: KNOWLEDGE_GRAPH.md
- Drill down via: CONTEXT_FUNNEL.md
- If unsure which skill to use, consult: trae-skill-index
Budgeted Navigation (MUST)
For Change and Audit intents, uncontrolled exploration is forbidden.
Default budgets:
- Wiki budget: 3 documents
- Code budget: 8 files
- Pagination reads within the same file do NOT count as additional file reads
Saturation Gate (Stop Reading When Enough)
Stop reading and move to decision/implementation when ANY is met:
- Template acquired: any 2 of (route shape, DTO validation style, service entry pattern, mapper/sql pattern, table field pattern)
- Integration point acquired: a concrete example of the dependency usage
- Executable chain acquired: a known good call chain exists and the remaining work is a mechanical extension
Stop-Wiki (MUST)
If 3 consecutive wiki reads are "no-gain", the Agent MUST stop wiki navigation and proceed with a minimal, standards-compliant decision.
Stop-Code (MUST)
Code reading must monotonically shrink scope. If scope does not shrink for 2 consecutive code reads, the Agent MUST stop reading and trigger Escalation Protocol.
Escalation Protocol (MUST)
If budgets are exhausted OR stop rules trigger and success criteria are not met, the Agent MUST request human help instead of continuing to read.
Escalation Card format:
- Consumed: `wiki X/3`, `code Y/8`
- Confirmed facts (<= 5 bullets)
- Missing info (<= 2 bullets, must be specific)
- Why it is blocking (one sentence)
- Proposed next targets (<= 5 file paths / keywords)
- Request: `wiki +1` or `code +2` (small step)
- Fallback if still missing: pick one of:
  - ask 1 critical question
  - request a concrete anchor (class/table/entrypoint) from human
  - deliver a minimal viable plan with explicit risks
When escalation blocks the workflow, set the intent row in launch_spec_*.md to WAITING_APPROVAL and include a link to the relevant artifact.
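The budget and stop rules above can be sketched as a small tracker. This is an illustration under assumptions: the `BudgetTracker` class and the `gained` flag are hypothetical. `gained` stands for "this wiki read produced new information" or "this code read shrank the scope". The success-criteria check from the Escalation Protocol is left out, and a paginated re-read still counts toward the no-gain streak.

```typescript
type Decision = "continue" | "stop-wiki" | "escalate";

interface Read {
  kind: "wiki" | "code";
  path: string;
  gained: boolean; // hypothetical signal, see lead-in
}

class BudgetTracker {
  readonly limits = { wiki: 3, code: 8 };
  private readonly seen = { wiki: new Set<string>(), code: new Set<string>() };
  private readonly stall = { wiki: 0, code: 0 };

  record(read: Read): Decision {
    const seen = this.seen[read.kind];
    // Pagination within an already-read file does not consume budget.
    if (!seen.has(read.path)) {
      if (seen.size >= this.limits[read.kind]) return "escalate"; // budget exhausted
      seen.add(read.path);
    }
    this.stall[read.kind] = read.gained ? 0 : this.stall[read.kind] + 1;
    if (read.kind === "wiki" && this.stall.wiki >= 3) return "stop-wiki"; // Stop-Wiki
    if (read.kind === "code" && this.stall.code >= 2) return "escalate";  // Stop-Code
    return "continue";
  }

  // First line of the Escalation Card.
  consumed(): string {
    return `wiki ${this.seen.wiki.size}/${this.limits.wiki}, code ${this.seen.code.size}/${this.limits.code}`;
  }
}

const tracker = new BudgetTracker();
tracker.record({ kind: "code", path: "src/a.ts", gained: true });
tracker.record({ kind: "code", path: "src/a.ts", gained: true }); // same file, no extra budget
tracker.record({ kind: "code", path: "src/b.ts", gained: false });
const decision = tracker.record({ kind: "code", path: "src/c.ts", gained: false });
console.log(decision, "|", tracker.consumed()); // escalate | wiki 0/3, code 3/8
```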
When Profile is STANDARD, the Change intent expands into:
| Code | Phase | Notes |
|---|---|---|
| `Explore.Req` | Explorer | Clarify requirements + scope anchors |
| `Propose.API` | Propose → Review | API contract and design |
| `Propose.Data` | Propose → Review | Database schema changes |
| `Implement.Code` | Implement → QA | Code changes |
| `QA.Test` | QA | Tests + evidence |
Status values: PENDING, IN_PROGRESS, DONE, WAITING_APPROVAL, FAILED
# Launch Spec - {YYYYMMDD_HHMMSS}
## State Machine
| Intent | Status | Phase | Artifact/Log | Failed_Reason |
|---|---|---|---|---|
| Explore.Req | IN_PROGRESS | 1_Explorer | `explore_report.md` | - |
| Propose.API | PENDING | - | - | - |
| Implement.Code | PENDING | - | - | - |
## Breakpoint Resume
- If session interrupted/human delayed: First action is to read this file upon wake-up.
- If `WAITING_APPROVAL` exists: Enter Approval checkpoint, read corresponding `openspec.md`, wait for human confirmation, then switch status to `IN_PROGRESS` and proceed to Implement.
- If `FAILED` exists: Stop automatic progression, report `Failed_Reason` to human and request intervention.

Key Discipline: The state machine table drives workflow progression. Only update Status/Phase/Failed_Reason fields to avoid checkbox matching failures and state confusion.
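The breakpoint-resume rules can be sketched as a reader for the state machine table. This is a hypothetical helper, not part of the harness scripts. Checking `FAILED` before `WAITING_APPROVAL` is an ordering assumed here, and the sample rows are the ones from the template above.

```typescript
type Status = "PENDING" | "IN_PROGRESS" | "DONE" | "WAITING_APPROVAL" | "FAILED";
const STATUSES: readonly Status[] = ["PENDING", "IN_PROGRESS", "DONE", "WAITING_APPROVAL", "FAILED"];

interface IntentRow {
  intent: string;
  status: Status;
  phase: string;
  artifact: string;
  failedReason: string;
}

type ResumeAction =
  | { kind: "report-failure"; intent: string; reason: string }
  | { kind: "await-approval"; intent: string }
  | { kind: "continue"; intent: string }
  | { kind: "complete" };

function parseStateMachine(markdown: string): IntentRow[] {
  const rows: IntentRow[] = [];
  for (const line of markdown.split("\n")) {
    // "| a | b | c | d | e |" splits into 7 pieces (empty first and last).
    const cells = line.split("|").map((cell) => cell.trim());
    if (cells.length !== 7) continue;
    const [, intent = "", status = "", phase = "", artifact = "", failedReason = ""] = cells;
    // Header and separator rows have no valid status and are skipped.
    if (!STATUSES.includes(status as Status)) continue;
    rows.push({ intent, status: status as Status, phase, artifact, failedReason });
  }
  return rows;
}

function resumeAction(rows: IntentRow[]): ResumeAction {
  const failed = rows.find((r) => r.status === "FAILED");
  if (failed) return { kind: "report-failure", intent: failed.intent, reason: failed.failedReason };
  const waiting = rows.find((r) => r.status === "WAITING_APPROVAL");
  if (waiting) return { kind: "await-approval", intent: waiting.intent };
  const next =
    rows.find((r) => r.status === "IN_PROGRESS") ?? rows.find((r) => r.status === "PENDING");
  return next ? { kind: "continue", intent: next.intent } : { kind: "complete" };
}

const sampleSpec = [
  "| Intent | Status | Phase | Artifact/Log | Failed_Reason |",
  "|---|---|---|---|---|",
  "| Explore.Req | IN_PROGRESS | 1_Explorer | `explore_report.md` | - |",
  "| Propose.API | PENDING | - | - | - |",
  "| Implement.Code | PENDING | - | - | - |",
].join("\n");

const sampleRows = parseStateMachine(sampleSpec);
console.log(resumeAction(sampleRows)); // { kind: "continue", intent: "Explore.Req" }
```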
| Mechanism | Trigger Point | Trigger Condition | Effect | Evaluation Method |
|---|---|---|---|---|
| Cognitive_Brake | Before any action | Protocol enforcement | Forces LLM to explicitly reason about roles, boundaries, budgets, and next steps before generating tools or code | XML CoT parsing |
| pre_hook | Before entering new phase | Phase transition | Load relevant rule sets + output Decision-First Preflight + budgets | Required output format |
| guard_hook | During implementation/modification | Style violations, permission breaches, cross-domain pollution, budget exhaustion | Immediate block, require rewrite or authorization; enforce Anti-runaway guard | Standard skill review + Budget rules |
| fail_hook | Any phase failure | Compilation/test/review failures | State downgrade rollback; log failure reason to openspec.md; trigger retry count | Objective logs (compilation/test output) |
| Max Retries | Inside fail_hook | Same phase consecutive failures reach threshold | Force stop and request human intervention (Max 3 for scripts, STRICT MAX 2 for compilation) | Failure count reaches threshold |
| Approval Gate (HITL) | After Review passes | Need to enter Implement | "Freeze contract", human authorizes whether to proceed | Human confirmation (YES/NO + modification feedback) |
| Doc Consistency Gate | post_hook / Archive | Wiki hallucination & contract corruption risk | Read-only validation (schema_checker.py + wiki_linter.py), trigger fail_hook on FAIL | Script exit codes (non-zero = FAIL) |
| Archive Write-back | Task completion | New/changed knowledge needs persistence | Extract stable knowledge from Spec, archive hot documents, update indices (WAL mechanism) | Rule validation, connectivity check |
| Preferences Memory | Before/after Archive | Representative human ratings/feedback | Persist experience as preferences/anti-patterns to wiki/preferences/index.md, effective in next pre_hook | Human rating + textual reasoning |
| Non-Convergence Fallback | Workflow stuck repeating same action | Doc rewrite or linter failure loop | Stop repeating, run deterministic verification, report mismatch, request human intervention | Evidence-based mismatch detection |
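The Max Retries row can be illustrated with a small counter. The sketch assumes that "max N retries" means N retries are granted and the next failure goes to a human; the table's wording ("failures reach threshold") could also be read one step stricter. The `FailHook` class is hypothetical.

```typescript
type CheckKind = "script" | "compile";

// Limits from the table: 3 for scripts/linters, strict 2 for compilation.
const MAX_RETRIES: Record<CheckKind, number> = { script: 3, compile: 2 };

class FailHook {
  private readonly retriesUsed: Record<CheckKind, number> = { script: 0, compile: 0 };

  // Called after each failed run; decides whether another attempt is allowed.
  onFailure(kind: CheckKind): "retry" | "request-human" {
    if (this.retriesUsed[kind] >= MAX_RETRIES[kind]) return "request-human";
    this.retriesUsed[kind] += 1;
    return "retry";
  }

  // A passing run ends the streak of consecutive failures.
  onSuccess(kind: CheckKind): void {
    this.retriesUsed[kind] = 0;
  }
}

const hook = new FailHook();
const compileDecisions = [
  hook.onFailure("compile"),
  hook.onFailure("compile"),
  hook.onFailure("compile"),
];
console.log(compileDecisions.join(", ")); // retry, retry, request-human
```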
The harness ships 35 active skills under .agents/skills/. The canonical list is trae-skill-index/SKILL.md — what follows is a tour of the major sections.
- intent-gateway — entry node for every task. Routes natural language into the correct intent + profile + role mounts. Also handles `Audit.Codebase`, `QA.Doc`, `QA.Doc.Actionize`.
- trae-skill-index — central index for skill discovery.
- skill-graph-manager — maintains bidirectional links between related skills.
- react-component-architecture — composition, prop contracts, server/client boundary, ref forwarding.
- effect-react-patterns — entry node for the seven `effect-*` skills below.
- effect-runtime-and-layers — single `ManagedRuntime` + Layer composition. Read FIRST before any other Effect skill.
- effect-schema-as-contract — `Schema.Struct` as the contract between frontend and API.
- effect-tagged-errors-and-match — typed error unions + exhaustive matching.
- effect-platform-httpclient — `@effect/platform` HTTP client conventions.
- effect-fiber-and-stream — concurrency, schedules, and Streams.
- effect-vitest-testing — Vitest + Effect testing patterns (TestClock, Layer mocking).
- effect-schema-form — schema-driven form binding.
- effect-tsgo-diagnostics — catalog of `@effect/tsgo` diagnostics mapped to harness role / phase / severity.
- assertion-not-snapshot-tests — assert intent, not output shape.
- zustand-slice-architecture — slice patterns when Zustand is the chosen store.
- god-component-decomposition — heuristics for breaking up oversized React components.
- schema-migration-versioned-documents — versioned-document migration pattern.
- form-validation-zod — Zod-based validation when projects choose Zod over Effect Schema.
- global-engineering-standards — master index for code-generation standards.
- utils-usage-standard — utility usage patterns.
- tsdoc-standard — unified TSDoc style.
- eslint-prettier-standard — code-style enforcement.
- linter-severity-standard — severity rules (FAIL / WARN / IGNORE) for downstream linters.
- code-review-checklist — review checklist (security / performance / maintainability).
- api-documentation-rules — API doc generation & archival.
- consumed-api-contracts — pinning + tracking server contracts the client consumes.
- accessibility-wcag-aa-standard — WCAG 2.2 AA conformance rules.
- design-token-standard — design token discipline.
- responsive-breakpoint-standard — breakpoint conventions.
- bundle-budget-standard — bundle-size budgets per route.
- visual-regression-standard — Lost-Pixel discipline.
- e2e-playwright-standard — Playwright conventions.
- state-management-pattern — when to reach for Zustand vs Context vs Effect.
- devops-lifecycle-master → use `intent-gateway` + `.agents/PROTOCOL.md`.
- devops-testing-standard → use `effect-vitest-testing`.

8 first-generation `devops-*` skills, plus `prd-task-splitter` and `product-manager-expert`, were archived 2026-05-09. Their responsibilities are now covered by roles in `role_matrix.json`. See `.agents/skills/_archive/README.md`.
The skill-to-phase mapping is now driven by role_matrix.json, not by skill names. The phase mounts roles; the roles invoke skills + gates. See .agents/workflow/ROLE_MATRIX.md (auto-generated from JSON) for the canonical mapping.
| Phase | Mounted roles |
|---|---|
| Explorer | ambiguity_gatekeeper, requirement_engineer, focus_guard |
| Propose | system_architect |
| Review | system_architect |
| Approval (HITL) | — (human checkpoint) |
| Implement | lead_engineer, focus_guard, security_sentinel |
| QA | code_reviewer, documentation_curator, accessibility_auditor, visual_critic, performance_warden |
| Archive | knowledge_extractor, documentation_curator, skill_graph_curator, librarian |
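As a rough illustration of the phase-to-role lookup, the sketch below hard-codes the table above as a Python dict. The real mapping lives in `role_matrix.json`, whose schema is not reproduced here and may look quite different; `PHASE_ROLES`, `roles_for_phase`, and `phases_for_role` are hypothetical names used only for this example.

```python
# Hypothetical in-memory copy of the phase table above. The source of truth
# is role_matrix.json; its actual JSON shape may differ from this dict.
PHASE_ROLES: dict[str, list[str]] = {
    "Explorer": ["ambiguity_gatekeeper", "requirement_engineer", "focus_guard"],
    "Propose": ["system_architect"],
    "Review": ["system_architect"],
    "Approval (HITL)": [],  # human checkpoint, no roles mounted
    "Implement": ["lead_engineer", "focus_guard", "security_sentinel"],
    "QA": [
        "code_reviewer",
        "documentation_curator",
        "accessibility_auditor",
        "visual_critic",
        "performance_warden",
    ],
    "Archive": [
        "knowledge_extractor",
        "documentation_curator",
        "skill_graph_curator",
        "librarian",
    ],
}


def roles_for_phase(phase: str) -> list[str]:
    """Return a copy of the roles mounted on a phase, or raise on an unknown phase."""
    if phase not in PHASE_ROLES:
        raise KeyError(f"unknown phase: {phase!r}")
    return list(PHASE_ROLES[phase])


def phases_for_role(role: str) -> list[str]:
    """Return every phase that mounts the given role, in lifecycle order."""
    return [phase for phase, roles in PHASE_ROLES.items() if role in roles]
```

The reverse lookup makes it easy to see that some roles, such as `focus_guard` and `documentation_curator`, are mounted in more than one phase.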
effect-harness-agent/
├── .agents/
│ ├── PROTOCOL.md # 📜 SSOT for routing, lifecycle, hooks, scenarios
│ ├── QUICKSTART.md # 1-page agent quickstart (eager-load on session start)
│ │
│ ├── router/
│ │ ├── CONTEXT_FUNNEL.md # Knowledge funnel + reverse write-back
│ │ ├── ROUTER.md # ⚠️ Deprecated stub — see PROTOCOL.md
│ │ └── runs/ # Launch specs (gitignored)
│ │
│ ├── workflow/
│ │ ├── role_matrix.json # 🎭 SSOT — roles, gate mounts, phase routing
│ │ ├── role_matrix.schema.json # JSON Schema for IDE validation
│ │ ├── ROLE_MATRIX.md # Human-readable view (auto-generated)
│ │ ├── LIFECYCLE.md # ⚠️ Deprecated stub — see PROTOCOL.md
│ │ ├── HOOKS.md # ⚠️ Deprecated stub — see PROTOCOL.md
│ │ ├── ARCHIVE_WAL.md # WAL conventions
│ │ ├── bundle-budget.json # Default bundle budgets
│ │ ├── lighthouse-budget.json # Default Web Vitals budgets
│ │ └── runs/ # Gates reports + engine_state.json (gitignored)
│ │
│ ├── llm_wiki/
│ │ ├── KNOWLEDGE_GRAPH.md # 🗺️ Root node (mandatory entry)
│ │ ├── purpose.md
│ │ ├── schema/
│ │ │ ├── openspec_schema.md
│ │ │ └── subagent_contract_schema.md
│ │ ├── wiki/ # Active knowledge domains
│ │ │ ├── api-contracts/ # External APIs the client consumes
│ │ │ ├── routes/ # Frontend route map
│ │ │ ├── components/ # Component catalog
│ │ │ ├── design-system/ # Tokens, primitives
│ │ │ ├── flows/ # User flows
│ │ │ ├── runtime/ # ManagedRuntime composition
│ │ │ ├── layers/ # Effect Layer definitions
│ │ │ ├── services/ # Service interfaces
│ │ │ ├── schemas/ # Effect Schema definitions
│ │ │ ├── architecture/ # ADRs (WAL fragments)
│ │ │ ├── specs/ # Active openspec.md
│ │ │ └── preferences/ # Project preferences
│ │ └── archive/ # Cold storage (extracted specs)
│ │
│ ├── skills/ # 35 active skills + _archive/
│ │ ├── intent-gateway/
│ │ ├── trae-skill-index/ # Central index (every skill linked here)
│ │ ├── effect-runtime-and-layers/
│ │ ├── effect-schema-as-contract/
│ │ ├── effect-vitest-testing/
│ │ └── ... (30+ more — see trae-skill-index)
│ │
│ └── scripts/
│ ├── gates/ # Process (.py) + frontend (.ts) gates
│ │ ├── run.py # Unified runner (resolves mounts from role_matrix.json)
│ │ ├── ambiguity_gate.py, scope_guard.py, secrets_linter.py, ...
│ │ ├── skill_schema_checker.py, openspec_gate.py, ...
│ │ └── a11y_gate.ts, visual_regression_gate.ts, tsgo_gate.ts, ...
│ ├── harness/
│ │ ├── engine.py # Lifecycle queue + state machine
│ │ └── self_test.py # Role-matrix consistency
│ ├── wiki/ # Knowledge-graph linters + compactor
│ └── tools/ # GC + snapshot helpers
│
├── tests/ # Pytest contract tests for parsing gates
├── examples/
│ └── standard-run-walkthrough/ # Documentation-only end-to-end run example
│
├── AGENTS.md # 📌 Master rule entry (read every session)
├── CLAUDE.md # Claude Code-specific entry point
├── PROTOCOL.md # (mirror lives at .agents/PROTOCOL.md)
├── ENGINEERING_MANUAL.md # Detailed engineering manual
├── CONTRIBUTING.md # Contributor workflow
├── CODE_OF_CONDUCT.md # Contributor Covenant 2.1 (by reference)
├── SECURITY.md # Vulnerability disclosure policy
├── SKILLS_GOVERNANCE.md # Skill lifecycle + schema rules
├── CHANGELOG.md # Significant changes by date
└── LICENSE # MIT
These scripts provide deterministic quality checks (report only, don't modify files):
- `python .agents/scripts/wiki/wiki_linter.py`
  Checks: dead links, orphaned files, index length warnings
- `python .agents/scripts/wiki/schema_checker.py`
  Checks: missing key sections, JSON example presence
- `python .agents/scripts/wiki/pref_tag_checker.py`
  Checks: rule tag conventions for precise retrieval
- `python .agents/scripts/gates/run.py --intent <intent> --profile <profile> --phase <phase>`
  Function: automatically runs the relevant gate scripts for the current phase
Always start from Knowledge Graph Root → drill down through indices. Fallback search only when indices fail.
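The drill-down rule can be pictured with a toy index tree. This is only a sketch under stated assumptions: the nested-dict layout, the example paths, and the function names `drill_down`, `fallback_search`, and `locate` are all hypothetical and do not describe how the harness actually stores or reads its indices.

```python
from typing import Optional

# Hypothetical miniature index tree: each key is a topic, each value is
# either a nested index (dict) or a leaf document path (str).
INDEX = {
    "services": {"auth": "wiki/services/auth.md"},
    "schemas": {"user": "wiki/schemas/user.md"},
}


def drill_down(index: dict, topics: list[str]) -> Optional[str]:
    """Follow topics from the root; return the leaf path, or None on a miss."""
    node = index
    for topic in topics:
        if not isinstance(node, dict) or topic not in node:
            return None
        node = node[topic]
    return node if isinstance(node, str) else None


def fallback_search(index: dict, keyword: str) -> list[str]:
    """Scan every leaf path for the keyword; meant only for index misses."""
    hits: list[str] = []
    for value in index.values():
        if isinstance(value, dict):
            hits.extend(fallback_search(value, keyword))
        elif keyword in value:
            hits.append(value)
    return hits


def locate(index: dict, topics: list[str], keyword: str) -> list[str]:
    """Prefer index navigation; fall back to a full scan only when it fails."""
    found = drill_down(index, topics)
    return [found] if found is not None else fallback_search(index, keyword)
```

Keeping the fallback in a separate function makes the ordering explicit: the exhaustive scan is never reached while the indices can answer the query.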
Cross-domain modifications require explicit authorization in openspec.md and confirmation during Review/HITL phases.
On failure, roll back and retry up to a fixed threshold: 3 attempts for scripts, and a strict maximum of 2 for compilation. When the threshold is reached, stop and request human intervention. Never loop indefinitely.
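A bounded retry loop of this kind might look like the sketch below. It assumes the thresholds count total attempts rather than retries after the first try, which the text does not settle; `MAX_ATTEMPTS`, `HumanInterventionRequired`, and `run_with_retry` are illustrative names, not part of the harness.

```python
from typing import Callable

# Thresholds taken from the rule above, read here as total attempts.
MAX_ATTEMPTS = {"script": 3, "compilation": 2}


class HumanInterventionRequired(RuntimeError):
    """Raised when the retry budget for a task kind is exhausted."""


def run_with_retry(
    kind: str,
    action: Callable[[], bool],
    rollback: Callable[[], None],
) -> int:
    """Run action until it succeeds or the budget for kind runs out.

    Rolls back after every failed attempt. Returns the number of attempts
    used on success; raises HumanInterventionRequired otherwise.
    """
    limit = MAX_ATTEMPTS[kind]
    for attempt in range(1, limit + 1):
        if action():
            return attempt
        rollback()  # undo partial changes before the next attempt
    raise HumanInterventionRequired(
        f"{kind} failed {limit} times; stopping for human review"
    )
```

Because the loop is a `for` over a fixed range, there is no code path that retries without a bound.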
- Specs must be archived after extraction
- Stable knowledge must be extracted to indices
- Indices exceeding 500 lines must be split into subdirectories
- Archive execution transitions automatically in the same session, using targeted `git diff <files>` or `openspec.md` to avoid context overload.
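The 500-line split rule is easy to check mechanically. The sketch below is not the harness's own linter; it assumes every Markdown file under the given directory counts as an index, and `oversized_indices` is a hypothetical helper name.

```python
from pathlib import Path

MAX_INDEX_LINES = 500  # threshold stated in the rule above


def oversized_indices(
    root: Path, limit: int = MAX_INDEX_LINES
) -> list[tuple[Path, int]]:
    """Return (path, line_count) for each Markdown file under root over the limit."""
    oversized = []
    for path in sorted(root.rglob("*.md")):
        with path.open(encoding="utf-8") as handle:
            count = sum(1 for _ in handle)  # count lines without loading the file
        if count > limit:
            oversized.append((path, count))
    return oversized
```

A file with exactly 500 lines passes. Only files that exceed the limit are reported, which matches the wording "exceeding 500 lines".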
This is a pre-1.0 harness. Adopt with eyes open:
- Pre-release status — current version is `v0.x`. The protocol surface (lifecycle phases, role matrix schema, gate exit codes) is mostly stable but not yet frozen. Pin a commit if you depend on byte-for-byte stability.
- `tsgo_gate.ts` enforces tsgo configuration, not diagnostics directly — it verifies that `@effect/tsgo` is wired into `tsconfig.json` so that tsserver-aware editors (VS Code, JetBrains, Zed) surface the ~70 Effect-specific diagnostics in real time. Upstream `@effect/tsgo` is currently a Language Service plugin without a CLI runner, so harness-side diagnostic enforcement is upstream-blocked. The gate is feature-complete for the current upstream surface; when a `tsgo check` CLI ships, the gate will swap to a diagnostic runner (see the migration note in the gate header). For authoritative type errors today, use your project's `tsc --noEmit`.
- Effect Schema drift detection is opt-in — `effect_schema_drift_gate.ts` only runs when you wire it into `role_matrix.json` for your project. There is no global default because schema drift policy depends on whether your project treats schemas as published contracts.
- TS gates assume peer dependencies — `a11y_gate`, `visual_regression_gate`, `bundle_budget_gate`, `web_vitals_gate`, `console_error_gate` need Playwright + axe + Lost-Pixel + Lighthouse + ts-morph installed in the target project. See `.agents/scripts/gates/package.json` for the peer-dep list.
- Skill governance is manual — the 40+ skills in `.agents/skills/` follow a stable → deprecated → archive lifecycle (see SKILLS_GOVERNANCE.md), but moves are still operator-driven. Frontmatter is now CI-validated by `skill_schema_checker.py`.
- LLM-native by design — there is no human CLI, web UI, or REPL. If you are evaluating this with "developer tool" expectations, see the Critical Positioning Statement at the top of this file.
- Windows path quoting — gate scripts run on Linux/macOS/Windows, but if you invoke them from PowerShell, double-quote any path containing spaces. WSL is recommended on Windows for the smoothest experience.
If you hit a limitation not listed here, open an issue — known-limitations is a living section.
- 📌 Master rules — AGENTS.md (read every session)
- 📜 Operational protocol (SSOT) — .agents/PROTOCOL.md (routing, lifecycle, hooks, scenarios)
- 🚀 Agent quickstart — .agents/QUICKSTART.md
- 🤖 Claude Code entry — CLAUDE.md
- 📘 Engineering Manual — ENGINEERING_MANUAL.md
- 🗺️ Knowledge Graph — .agents/llm_wiki/KNOWLEDGE_GRAPH.md
- 📝 Spec template — .agents/llm_wiki/schema/openspec_schema.md
- 🔍 Context Funnel — .agents/router/CONTEXT_FUNNEL.md
- 🎭 Role Matrix (SSOT) — .agents/workflow/role_matrix.json (human view: ROLE_MATRIX.md)
- 📚 Skill governance — SKILLS_GOVERNANCE.md
- 🎬 Walkthrough example — examples/standard-run-walkthrough/
Legacy `.agents/router/ROUTER.md`, `.agents/workflow/LIFECYCLE.md`, and `.agents/workflow/HOOKS.md` are deprecated stubs that point at PROTOCOL.md sections — do not add new content there.
Contributions are welcome. Start with CONTRIBUTING.md for the full PR workflow, gate/skill authoring guide, and self-test instructions. By participating you agree to the Code of Conduct. Security issues should follow SECURITY.md (do not open a public issue for vulnerabilities).
Quick checklist:
- Read First: AGENTS.md, .agents/PROTOCOL.md, and ENGINEERING_MANUAL.md for deeper context.
- Follow Lifecycle: All non-trivial changes go through the 6-phase lifecycle.
- Update Knowledge: Extract stable knowledge to the appropriate domain index.
- Run Diagnostics: `python3 .agents/scripts/harness/self_test.py` and the relevant gates.
- Submit PR: Include `openspec.md` (or a Slim Spec for `@patch`) for any change with MEDIUM or HIGH risk.
The harness follows semver. The current release is v0.1.0 — the first tagged release after a feature-complete polish pass on the React + Effect frontend scope. Pre-1.0 minor versions may include breaking changes to the protocol surface (lifecycle phases, role matrix shape, gate exit codes); pin a specific tag in your downstream projects if you need byte-for-byte stability. From v1.0.0 onward, breaking changes will follow a documented deprecation window of one minor release.
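For downstream projects that pin a tag, the versioning policy above can be turned into a small pre-upgrade check. This sketch encodes one reading of that policy: before 1.0, any minor bump may break the protocol surface, and a major bump may always break it. Apart from `v0.1.0`, the tags in the usage lines are hypothetical, and `parse_version` and `may_break` are illustrative names.

```python
def parse_version(tag: str) -> tuple[int, int, int]:
    """Parse a tag such as 'v0.1.0' into (major, minor, patch)."""
    major, minor, patch = tag.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def may_break(current: str, candidate: str) -> bool:
    """Return True if moving from current to candidate may change the protocol surface."""
    cur, new = parse_version(current), parse_version(candidate)
    if cur[0] != new[0]:
        return True  # a major bump can always carry breaking changes
    # Within 0.x, minor releases may break; patch releases are treated as safe.
    return cur[0] == 0 and cur[1] != new[1]


# Hypothetical upgrade paths from the current v0.1.0 release.
print(may_break("v0.1.0", "v0.1.1"))
print(may_break("v0.1.0", "v0.2.0"))
```

The check is deliberately conservative for 0.x versions and does not model the post-1.0 deprecation window.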
See CHANGELOG.md for the full release history.
This project is licensed under the MIT License - see the LICENSE file for details.
This framework draws inspiration from:
- OpenSpec: Contract-first development methodology
- Harness: Lifecycle state machines & hook systems
- LLM Wiki: Evolvable knowledge graphs with anti-bloat mechanisms
- Agentic Patterns: Autonomous agent workflows with human-in-the-loop checkpoints