Effect Harness Agent 🚀

An Agent-Driven Engineering Framework for TypeScript & Effect

⚠️ Critical Positioning Statement: An LLM-Native Operating System

"Do not hand a steering wheel to an engine."

This repository is machine-to-machine infrastructure. It is not a TypeScript framework like NestJS, nor is it a CLI tool for human engineers. It is a Cognitive Harness: an executable protocol designed by humans, but read, interpreted, and executed exclusively by Large Language Models (LLMs).

By encoding engineering discipline (Lifecycle State Machines, Role Matrices, Vector-less Knowledge Graphs) into LLM-native formats, it transforms the AI from a simple code-completion oracle into an autonomous, self-navigating, and self-correcting engineering agent.

Effect Harness Agent is an agent-driven engineering framework for TypeScript/Effect projects, designed for sustainable software development. It integrates an Intent Gateway, a 6-phase Lifecycle State Machine, Contract-first OpenSpec design, and a drill-down LLM Wiki (Knowledge Graph) to prevent context bloat, enabling AI agents to autonomously build, test, and self-correct production-ready code.


📖 Overview

Effect Harness Agent is an agent-driven development workflow that turns natural language requirements into production-ready TypeScript/Effect code. It is built on four parts: the Intent Gateway, the Lifecycle State Machine, the Knowledge Graph (LLM Wiki), and the Specialized Skills Matrix. Together they form an engineering loop that is sustainable, interruptible, self-correcting, and resistant to context bloat.

✨ Key Features

🎯 Intent-Driven: Natural language → Structured intent queues → Executable tasks
🔄 Lifecycle State Machine: Explorer → Propose → Review → Approval Gate (HITL) → Implement → Validation Gate → QA → Archive
🧠 Knowledge Graph: Hierarchical wiki system with bidirectional navigation
🛡️ Self-Correcting: Automatic guard hooks, failure recovery, human-in-the-loop checkpoints
📊 Contract-First: OpenSpec-based design before implementation
🔌 Skills Matrix: 25+ specialized skills providing domain expert capabilities
📈 Anti-Bloat Mechanism: Automatic knowledge extraction and archival to prevent information overload

💰 Token Economics & Cost Model

Because Effect Harness Agent is an agentic framework, it consumes more tokens than a simple code completion tool. However, its architecture shifts spend from Rework & Blind Search to Planning & Guardrails, which keeps the overall cost of complex tasks predictable and stable.

1. The "Thinking Tax" (Where Costs Increase)

Each turn requires the LLM to output the <Cognitive_Brake> and read mandatory system contexts (e.g., LIFECYCLE.md, AGENTS.md). This adds a fixed baseline "thinking tax" of ~500 Output Tokens and ~2000 Input Tokens per interaction.
The Propose phase explicitly requires drafting explore_report.md and openspec.md, consuming an extra ~1500 Output Tokens before a single line of code is written.

2. The ROI: Comparing 3 Paradigms for a Complex Feature (e.g., Cross-Table Transaction)

Note: Estimates assume a typical modern flagship LLM (e.g., GPT-4o, Claude 3.5/3.7 Sonnet, Gemini 1.5 Pro) pricing model.

| Paradigm | Behavior | Input Tokens | Output Tokens | Hidden Costs / Risks | Verdict |
|---|---|---|---|---|---|
| Pure Chat / Copilot | Jumps straight to coding with limited context. | ~5k | ~1k | High Rework Rate. Misses transaction boundaries, forgets existing enums. Requires human prompt corrections. | Cheap in Tokens, Expensive in Human Time. |
| Unconstrained Auto-Agent | Blindly searches the entire codebase (e.g., `SearchCodebase` or `Grep` without limits), loops endlessly on compile errors. | 100k+ | 10k+ | Disastrous. Burns through budget via massive context bloat and infinite loops before hitting platform limits. | Unpredictable & Dangerous. |
| Effect Harness Agent | Pays the "Thinking Tax", limits searches (`Wiki≤3, Code≤8`), designs `openspec.md`, and STOPs at Approval Gates. | ~30k | ~6k | Highly predictable. Architectural errors are intercepted early by humans; syntax errors are digested by Shift-Left Validation. | The Sweet Spot. Optimized for high-quality delivery with controlled token spend. |

3. Scenario-Based Token Estimates (Using Latest Flagship Models)

| Scenario Profile | Typical Turns | Input Tokens | Output Tokens | Expected Cost / Task |
|---|---|---|---|---|
| `@patch` (Small Bugfix) | 1-2 Turns | ~5k - 8k | ~1k - 2k | $0.05 - $0.15 |
| `@standard` (New Feature) | 4-6 Turns | ~20k - 40k | ~4k - 8k | $0.30 - $0.80 |
| `@learn` (Doc QA) | 1 Turn | ~3k - 5k | ~500 | $0.02 - $0.05 |

4. Token Optimization Formula

Total Token Cost = (Base Context + Context Funnel Payload) × Turns + (Artifact Generation + Code Generation + Cognitive Brake)
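
As a rough illustration, the formula can be written as a small TypeScript function. The field names and the `estimateTotalTokens` helper are hypothetical, and the sample values are illustrative: only the ~2000, ~1500, and ~500 figures come from the "Thinking Tax" section above, while the funnel payload, turn count, and code size are assumptions.

```typescript
// Hypothetical field names; each one mirrors a term of the formula above.
interface TokenEstimateInput {
  baseContext: number;        // mandatory system context read on every turn
  funnelPayload: number;      // Context Funnel Payload read on every turn
  turns: number;
  artifactGeneration: number; // e.g. explore_report.md + openspec.md
  codeGeneration: number;
  cognitiveBrake: number;
}

function estimateTotalTokens(input: TokenEstimateInput): number {
  // Per-turn terms are multiplied by the number of turns.
  const perTurn = input.baseContext + input.funnelPayload;
  // The remaining terms are added once, exactly as the formula is written.
  const oneOff =
    input.artifactGeneration + input.codeGeneration + input.cognitiveBrake;
  return perTurn * input.turns + oneOff;
}

// Illustrative values only (see the note above about which ones are assumed).
const standardTask = estimateTotalTokens({
  baseContext: 2000,
  funnelPayload: 3000,
  turns: 5,
  artifactGeneration: 1500,
  codeGeneration: 4000,
  cognitiveBrake: 500,
});
console.log(standardTask); // 31000
```

The sketch makes the main lever visible: every extra turn repeats the whole per-turn context, so fewer turns and a smaller funnel payload reduce cost faster than trimming the one-off terms.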

To minimize costs:

Use Shortcuts: Use @patch instead of @standard for trivial changes to skip Phases 1-3.
Provide Explicit Scopes: Include an explicit scope such as --scope src/foo/bar.ts in your prompt. This triggers Rule 0 (Direct Read), which bypasses the Knowledge Funnel drill-down and saves thousands of Input Tokens.
Respect the Brakes: When the agent STOPs at the Validation Gate, ensure your local environment is ready before allowing it to compile, preventing retry loops.

🏗️ Architecture

Core Philosophy

Three Fundamental Problems Solved by Effect Harness Agent:

Context Bloat Out of Control: LLM blind searching in large codebases leads to token waste and attention dispersion → Solved through Knowledge Graph + Budgeted Navigation
Requirement Drift & Unauthorized Modifications: Agent free-play causes cross-domain pollution and contract corruption → Solved through Intent Gateway + Role Matrix Guards
Knowledge Fragmentation & Unsustainability: Conversation memory loss, documentation desynchronization, index bloat → Solved through WAL Write-back + Auto-Refactoring

Design Philosophy: Encode engineering discipline into LLM-executable protocols, enabling machine-to-machine self-coordination, self-correction, and self-evolution.

System Architecture Diagram

flowchart TB
    subgraph Input["📥 Input Layer"]
        User[👤 User Requirements]
        Shortcut["⚡ Shortcuts<br/>@read/@patch/@standard"]
    end
    
    subgraph Gateway["🎯 Intent Gateway Layer (ROUTER)"]
        IG[Intent Gateway<br/>Intent Classifier]
        Profile{Execution Profile Selection}
        LEARN[LEARN<br/>Read-only Q&A]
        PATCH[PATCH<br/>Small Changes]
        STANDARD[STANDARD<br/>Full Lifecycle]
    end
    
    subgraph Context["🔍 Context Collection Layer (FUNNEL)"]
        DirectRead[Direct Read<br/>When scope explicit]
        Funnel[Knowledge Funnel<br/>Sitemap→Index→Doc]
        Budget[Budget Control<br/>Wiki≤3, Code≤8]
        Escalation[Escalation Protocol<br/>Escalation Card]
    end
    
    subgraph Knowledge["🧠 Knowledge Graph Layer (LLM Wiki)"]
        KG[KNOWLEDGE_GRAPH.md<br/>Root Node]
        DomainIndex["Domain Indices<br/>api/data/domain etc."]
        Docs[Specific Documents]
        Archive[Archive Zone<br/>Cold Storage]
    end
    
    subgraph Lifecycle["⚙️ Lifecycle Engine Layer (WORKFLOW)"]
        LaunchSpec[Launch Spec<br/>State Machine Table]
        Phase1[1_Explorer<br/>Clarify Requirements]
        Phase2[2_Propose<br/>Freeze Contracts]
        Phase3[3_Review<br/>Technical Review]
        ApprovalGate[Approval Gate<br/>HITL Checkpoint]
        Phase4[4_Implement<br/>Implement per Contract]
        Phase5[5_QA<br/>Test Validation]
        Phase6[6_Archive<br/>Knowledge Extraction]
    end
    
    subgraph Roles["🎭 Role Matrix Layer (ROLE MATRIX)"]
        Ambiguity[Ambiguity Gatekeeper<br/>Ambiguity Guard]
        ReqEngineer[Requirement Engineer<br/>Requirements]
        SysArchitect[System Architect<br/>Architecture]
        LeadEngineer[Lead Engineer<br/>Code & Shift-Left]
        CodeReviewer[Code Reviewer<br/>Quality QA]
        FocusGuard[Focus Guard<br/>Anti-Drift Guard]
        KnowledgeExt[Knowledge Extractor<br/>Unified WAL]
        SecuritySentinel[Security Sentinel<br/>Security Sentinel]
        DocCurator[Documentation Curator<br/>Doc Curator]
        SkillCurator[Skill Graph Curator<br/>Skills]
        Librarian[Librarian<br/>GC & Compaction]
        KnowledgeArch[Knowledge Architect<br/>Knowledge Architect]
    end
    
    subgraph Hooks["🛡️ Hook Correction Layer (HOOKS)"]
        PreHook[pre_hook<br/>Load Rule Sets]
        GuardHook[guard_hook<br/>Execution Guard]
        PostHook[post_hook<br/>Post-Audit]
        FailHook[fail_hook<br/>Failure Rollback]
        LoopHook[loop_hook<br/>Queue Loop]
    end
    
    subgraph Skills["🔧 Skills Matrix Layer (SKILLS)"]
        SkillIndex[trae-skill-index<br/>Master Skill Index]
        EffectSkills[TypeScript/Effect Skills<br/>25+ Professional Capabilities]
    end
    
    subgraph Scripts["📜 Script Tools Layer (SCRIPTS)"]
        Gates["Gate Scripts<br/>ambiguity_gate.py etc."]
        WikiTools[Wiki Tools<br/>linter/compactor]
        Engine[Engine Helper<br/>engine.py]
    end
    
    User --> IG
    Shortcut --> IG
    IG --> Profile
    Profile -->|LEARN| DirectRead
    Profile -->|PATCH| LaunchSpec
    Profile -->|STANDARD| LaunchSpec
    
    DirectRead --> Funnel
    Funnel --> Budget
    Budget --> Escalation
    
    KG --> DomainIndex
    DomainIndex --> Docs
    Docs --> Archive
    
    LaunchSpec --> Phase1
    Phase1 --> Phase2
    Phase2 --> Phase3
    Phase3 --> ApprovalGate
    ApprovalGate --> Phase4
    Phase4 --> Phase5
    Phase5 --> Phase6
    Phase6 --> LaunchSpec
    
    Phase1 -.->|Mount| Ambiguity
    Phase1 -.->|Mount| ReqEngineer
    Phase1 -.->|Mount| FocusGuard
    Phase2 -.->|Mount| SysArchitect
    Phase3 -.->|Mount| SysArchitect
    Phase4 -.->|Mount| LeadEngineer
    Phase4 -.->|Mount| FocusGuard
    Phase4 -.->|Mount| SecuritySentinel
    Phase5 -.->|Mount| CodeReviewer
    Phase5 -.->|Mount| DocCurator
    Phase6 -.->|Mount| KnowledgeExt
    Phase6 -.->|Mount| DocCurator
    Phase6 -.->|Mount| SkillCurator
    Phase6 -.->|Mount| Librarian
    
    Phase1 -.->|Trigger| PreHook
    Phase4 -.->|Trigger| GuardHook
    Phase5 -.->|Trigger| PostHook
    Phase3 -.->|Trigger| FailHook
    Phase5 -.->|Trigger| FailHook
    Phase6 -.->|Trigger| LoopHook
    
    Ambiguity -.->|Invoke| Gates
    GuardHook -.->|Invoke| Gates
    PostHook -.->|Invoke| WikiTools
    KnowledgeArch -.->|Invoke| WikiTools
    
    Phase1 -.->|Query| SkillIndex
    Phase2 -.->|Query| SkillIndex
    Phase4 -.->|Query| EffectSkills

Core Components Breakdown

| Layer | Component | Responsibility | Key File |
|---|---|---|---|
| Input | Intent Gateway | Natural language → Structured intents + Execution profiles | ROUTER.md |
| Context | Knowledge Funnel | Bidirectional navigation (forward retrieval + reverse write-back) | CONTEXT_FUNNEL.md |
| Knowledge | LLM Wiki | Fractal knowledge graph (Sitemap/Index/Docs/Archive) | KNOWLEDGE_GRAPH.md |
| Process | Lifecycle Engine | 6-phase state machine + breakpoint resume | LIFECYCLE.md |
| Roles | Role Matrix | Dynamic virtual role mounting + gate guards | ROLE_MATRIX.md |
| Correction | Hooks System | Pre/guard/post/fail/loop interception | HOOKS.md |
| Capability | Skills Matrix | 25+ domain-specific expert capabilities | trae-skill-index |
| Tools | Script Tools | Deterministic quality checks + auxiliary tools | scripts/ |

🚀 Quick Start

Prerequisites

TypeScript 5.0+
Python 3.8+ (for script tools)
Git

3-Minute Onboarding Guide

Step 1: Read Project Rules ⚡

Start with AGENTS.md - the master entry point defining execution discipline with hard constraints and navigation rules.

Core Constraints Quick Reference:

Budget Limits: Wiki ≤ 3 docs, Code ≤ 8 files (same-file pagination doesn't count)
Cognitive Brake: Mandatory <Cognitive_Brake> XML block before any action to enforce Role, Scope, and Budget awareness
Approval & Validation Gates: Must STOP for human confirmation before writing code (Approval Gate) and before heavy compilation (Validation Gate)
Anti-Looping: Max 3 retries for scripts/linters; STRICT MAX 2 retries for compilation. Exceeding thresholds MUST request human intervention
Scope Guard: Cannot modify files outside focus_card.md agreed scope without explicit authorization
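
The Anti-Looping limits above can be pictured as a small lookup plus one decision function. This is a minimal sketch, not the harness's implementation: the names `CheckKind`, `MAX_RETRIES`, and `decideAfterFailure` are hypothetical, and it assumes the limits count retries rather than the initial attempt.

```typescript
type CheckKind = "script" | "linter" | "compilation";

// Limits quoted from the Anti-Looping rule: 3 for scripts/linters, 2 for compilation.
const MAX_RETRIES: Record<CheckKind, number> = {
  script: 3,
  linter: 3,
  compilation: 2,
};

type RetryDecision = "RETRY" | "REQUEST_HUMAN";

// `retriesSoFar` counts retries already spent (assumption: the first attempt is not a retry).
function decideAfterFailure(kind: CheckKind, retriesSoFar: number): RetryDecision {
  return retriesSoFar < MAX_RETRIES[kind] ? "RETRY" : "REQUEST_HUMAN";
}
```

For example, a third consecutive compilation failure after two retries would yield `REQUEST_HUMAN` instead of another compile.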

Step 2: Understand Intent Gateway 🎯

The Intent Gateway transforms natural language into executable queues, supporting three execution profiles:

| Profile | Use Case | Lifecycle Entry | Artifacts |
|---|---|---|---|
| LEARN | Read-only explanation, code understanding | ❌ No | None |
| PATCH | Small changes, bug fixes (LOW risk) | ✅ Minimal | Slim Spec or Change Log |
| STANDARD | MEDIUM/HIGH risk, wide blast radius | ✅ Full 6-phase | Full OpenSpec + Approval Gate |

Shortcuts (Explicit Routing):

@read / @learn     → Force LEARN mode (read-only)
@patch / @quickfix → Force PATCH mode (small changes)
@standard          → Force STANDARD mode (full lifecycle)

Shortcut DSL Examples:

@learn --scope src/foo/bar.ts --direct --depth deep -- explain this file
@patch --risk low --slim --test "pnpm test -t 'OrderServiceTest'" -- fix NPE in createOrder
@standard --risk high --launch -- implement tenant permission checks for order list API

Step 3: Navigate Knowledge Graph 🗺️

Rule 0: Direct Read when scope is explicit (MUST)

If user provides explicit scope (file path, class/method name, pasted snippet) and goal is learning:
- ✅ Do direct read first
- ❌ Do NOT start with Knowledge Graph drill-down

Rule 1: Otherwise, use Knowledge Funnel (MUST)

Read root: KNOWLEDGE_GRAPH.md
Drill down via: CONTEXT_FUNNEL.md
If unsure which skill to use, consult: trae-skill-index

Common Domain Indices:

API Design → .agents/llm_wiki/wiki/api/index.md
Data Models → .agents/llm_wiki/wiki/data/index.md
Domain Logic → .agents/llm_wiki/wiki/domain/index.md
Architecture → .agents/llm_wiki/wiki/architecture/index.md

Step 4: Run Your First Complete Cycle 🔄

Complete one STANDARD task following the Lifecycle:

stateDiagram-v2
    [*] --> Explorer: Clarify Requirements
    Explorer --> Propose: Freeze Contracts
    Propose --> Review: Technical Review
    Review --> ApprovalGate: HITL Checkpoint
    ApprovalGate --> Implement: Implement per Contract
    Implement --> ValidationGate: STOP & Request Compile
    ValidationGate --> QA: Test Validation
    QA --> Archive: Knowledge Extraction (Same Session)
    Archive --> [*]: Queue Complete
    
    Review --> Propose: fail_hook(review failed)
    QA --> Implement: fail_hook(compile/test failed, max 2 retries)
    
    note right of ApprovalGate
        MEDIUM/HIGH Risk:
        Must wait for human confirmation
        Status=WAITING_APPROVAL
    end note
    
    classDef explorerClass fill:#e1f5ff,stroke:#333
    classDef proposeClass fill:#fff4e6,stroke:#333
    classDef reviewClass fill:#ffe6e6,stroke:#333
    classDef approvalClass fill:#fff9e6,stroke:#333
    classDef implementClass fill:#e6ffe6,stroke:#333
    classDef validationClass fill:#fff9e6,stroke:#333
    classDef qaClass fill:#f0e6ff,stroke:#333
    classDef archiveClass fill:#e6f0ff,stroke:#333
    
    class Explorer explorerClass
    class Propose proposeClass
    class Review reviewClass
    class ApprovalGate approvalClass
    class Implement implementClass
    class ValidationGate validationClass
    class QA qaClass
    class Archive archiveClass

Breakpoint Resume Mechanism:

Launch Spec persisted at router/runs/launch_spec_*.md
First action after session interruption: read this file to restore state
Status enum: PENDING, IN_PROGRESS, DONE, WAITING_APPROVAL, FAILED

💡 Usage Scenarios

Scenario A: New Query API (No DB Changes)

Goal: Create read-only endpoints (DTO/Controller/Service) without table structure changes

graph LR
    A[Explorer<br/>Clarify Requirements] --> B[Propose<br/>OpenSpec]
    B --> C[Review<br/>Technical Review]
    C --> D[Approval<br/>HITL Checkpoint] --> E[Implement<br/>Per Contract]
    E --> V[Validation<br/>STOP Gate]
    V --> F[QA<br/>Test Validation]
    F --> G[Archive<br/>Update Index]
    
    style A fill:#e1f5ff,stroke:#333,stroke-width:2px
    style B fill:#fff4e6,stroke:#333,stroke-width:2px
    style C fill:#ffe6e6,stroke:#333,stroke-width:2px
    style D fill:#fff9e6,stroke:#333,stroke-width:2px
    style E fill:#e6ffe6,stroke:#333,stroke-width:2px
    style V fill:#fff9e6,stroke:#333,stroke-width:2px
    style F fill:#f0e6ff,stroke:#333,stroke-width:2px
    style G fill:#e6f0ff,stroke:#333,stroke-width:2px

Key Deliverables:

✅ explore_report.md - Scope & impact analysis + Core Context Anchors
✅ openspec.md - API contract with JSON examples, acceptance criteria
✅ Implementation following contract (no over-engineering)
✅ Unit tests with coverage evidence
✅ Update API index in wiki/api/ (WAL mechanism)

Mounted Roles (resolved from role_matrix.json):

Explorer: ambiguity_gatekeeper, requirement_engineer, focus_guard
Propose: system_architect
Implement: lead_engineer, focus_guard, security_sentinel
QA: code_reviewer, documentation_curator, accessibility_auditor, visual_critic, performance_warden
Archive: knowledge_extractor, documentation_curator, skill_graph_curator, librarian

Scenario B: API + Database Schema Changes

Goal: New endpoint with table structure & index modifications

Critical Path:

Propose: Freeze both API & Data contracts simultaneously (field semantics, constraints, index design, compatibility strategy)
Review: SQL risk assessment, index utilization, implicit conversion checks, authorization risks
QA: Regression tests covering core queries & edge cases
Archive: Update both wiki/api/ and wiki/data/ indices, synchronize ER diagrams

Activated Skills (frontend / Effect-native; backend SQL skills are archived):

effect-schema-as-contract - Schema design for shared contracts (frontend ↔ API)
consumed-api-contracts - Pinning + tracking server contracts the client consumes
effect-schema-form - Form binding when the schema also drives a UI

Scenario C: Bug Fix (Reproduce First, Then Test)

Goal: Fix defects ensuring reproducibility, regressability, and traceability

stateDiagram-v2
    [*] --> Explorer
    Explorer --> Implement: Identify Root Cause
    Implement --> QA: Write Failing Test First
    QA --> Implement: fail_hook (test failed)
    Implement --> QA: Fix to Pass Test
    QA --> Archive: Regression Test Suite
    Archive --> [*]
    
    note right of QA
        TDD Approach:
        1. Write failing test first
        2. Fix to pass test
        3. Add regression tests
    end note

Workflow:

Explorer: Minimal reproduction path + root cause hypothesis + impact analysis (whether Propose/contract update needed)
QA: Write failing test BEFORE fix (TDD approach)
Implement: Fix implementation to pass test
Archive: Record pattern in wiki/testing/ or reviews/, update related API/Domain indices if necessary

Profile: PATCH (LOW risk) or STANDARD (MEDIUM/HIGH risk)

Scenario D: Performance Optimization

Goal: Optimize SQL/performance without changing external behavior

Focus Areas:

Propose: Document "behavior unchanged" constraints + rollback strategy
Review: SQL standards & index utilization as top priority
QA: Comparative evidence (performance benchmarks + correctness)
Archive: Extract reusable performance rules to preferences/

Activated Skills:

bundle-budget-standard - Bundle-size guards (top priority for client perf)
effect-fiber-and-stream - Concurrency / scheduling primitives when refactoring hot paths
code-review-checklist - Refactoring review surface

Scenario E: Refactoring (With Boundary Guards)

Goal: Improve maintainability without introducing requirement drift

Guardrails:

Explicit "what's in / what's out" scope definition (Focus Card)
Cross-domain modifications require explicit authorization (guard_hook)
Architecture decisions written back to wiki/architecture/

Activated Roles:

Ambiguity Gatekeeper - Blocks ambiguous requirements before work proceeds
Focus Guard - Anti-drift guard
Knowledge Architect - Mounted only if Wiki refactoring is needed

Scenario F: Parallel Collaboration

Goal: Server-led delivery with optional UI/QA parallel work

sequenceDiagram
    participant S as Server Agent
    participant H as Human (HITL)
    participant U as UI Agent
    participant Q as QA Agent
    
    S->>S: Explorer → Propose
    S->>H: Request Approval (Freeze Contract)
    H->>S: ✅ Approved
    par Parallel Execution
        S->>S: Implement Code
        U->>U: Build UI from API Contract
        Q->>Q: Write Tests from Acceptance Criteria
    end
    S->>S: QA → Archive

Key Handoff Points:

Approval Gate Phase: Frozen OpenSpec becomes single source of truth, acts as "starting gun" for parallel collaboration
Minimal Handoff: API Contract (JSON examples), Acceptance Criteria (Given/When/Then), Error Codes
Server Cohesion: Internal details remain encapsulated (not forced outward)

Scenario G: Read-Only Audit (Audit.Codebase)

Goal: Perform read-only analysis and assessment of the codebase, producing structured audit reports

Constraints:

❌ No code modifications
❌ No Wiki writes
❌ No launch spec generation
❌ No lifecycle entry

Allowed Operations:

✅ Read-only retrieval and reading
✅ Run tests/builds (but do not modify any tracked files)

Output Requirements: Each conclusion must include evidence (file path + line range) and impact/recommendations

Typical Scenarios: Architecture review, code quality scanning, technical debt assessment

Scenario H: Documentation Q&A (QA.Doc / QA.Doc.Actionize)

QA.Doc (Pure Q&A)

Goal: Answer questions based on Wiki/requirement documents
Method: Drill down through knowledge funnel, output answers with citations
Citations: Wiki/requirement paragraphs, supplement with code references when needed
Does NOT trigger lifecycle

QA.Doc.Actionize (Q&A to Action)

Goal: Convert Q&A conclusions into executable intent queues
Critical Step: Must ask user whether to "launch" first
After Confirmation: Generate launch spec and enter lifecycle
Without Confirmation: Output answer only, no side effects

Typical Scenarios: Query business rules, understand API usage, confirm architecture decisions

🚦 Intent Gateway: From Natural Language to Executable Queues

The Intent Gateway transforms natural language requirements into structured intent queues that drive the entire lifecycle.

Execution Profiles

Not every request needs the full lifecycle. The gateway selects an execution profile:

| Profile | When to Use | Lifecycle Entry | Artifacts |
|---|---|---|---|
| LEARN | Read-only explanation, code understanding | No | None |
| PATCH | Small changes, bug fixes (LOW risk) | Minimal | Slim Spec or Change Log |
| STANDARD | MEDIUM/HIGH risk, wide blast radius | Full 6-phase | Full OpenSpec + Approval Gate |

Shortcuts (Explicit Routing)

Users can override automatic routing with explicit shortcuts:

@read / @learn: Force Profile LEARN (read-only, no write-back)
@patch / @quickfix: Force Profile PATCH (small change mode)
@standard: Force Profile STANDARD (full lifecycle)

Shortcut DSL (Composable)

Shortcuts can be composed with flags to express common workflows as a small DSL.

Syntax:

@<profile> <flags...> -- <natural language request or question>

Flags (order-independent):

Scope / read:
- --scope <path|glob|symbol>: explicit scope (file/dir/symbol)
- --direct: force direct reads (do not start with Knowledge Graph drill-down)
- --funnel: force the funnel even if scope is explicit
- --depth shallow|normal|deep: explanation depth (LEARN only)
Risk / artifacts:
- --risk low|medium|high: explicit risk override
- --slim: force Slim Spec (PATCH only, or STANDARD with --risk low)
- --changelog: use Change Log only (PATCH only)
- --evidence required|optional|none: evidence requirement (default: PATCH=required)
Launch / write-back:
- --launch: force lifecycle launch (STANDARD only)
- --no-launch: force no launch
- --writeback: allow wiki/WAL write-back (not allowed for LEARN)
- --no-writeback: forbid write-back (default)
Verification:
- --test "<cmd>": required verification command + evidence
- --no-test: skip tests (LEARN only; PATCH requires an explicit justification)
DocQA actionize:
- --actionize: convert DocQA into an executable STANDARD queue (requires confirmation)
- --yes: auto-confirm --actionize / --launch (team use with caution)

Conflict rules (MUST enforce):

@learn MUST NOT be combined with --launch or --writeback.
--launch MUST be used with @standard only.
--slim requires --risk low (or implied low risk in PATCH).
--actionize MUST ask for confirmation unless --yes is present.

Examples:

@learn --scope src/foo/bar.ts --direct --depth deep -- explain this file
@patch --risk low --slim --test "pnpm test -t 'OrderServiceTest'" -- fix NPE in createOrder
@standard --risk high --launch -- implement tenant permission checks for order list API
@learn --funnel --actionize -- what is the API design standard?
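
To make the syntax and conflict rules concrete, here is a minimal parsing sketch in TypeScript. It is not the gateway's real implementation: `parseShortcut`, `conflictErrors`, and `needsConfirmation` are hypothetical names, and the sketch assumes that flag values containing spaces are wrapped in double quotes and that the request is separated by a standalone ` -- `.

```typescript
type Profile = "LEARN" | "PATCH" | "STANDARD";

interface ParsedShortcut {
  profile: Profile;
  flags: Map<string, string | true>;
  request: string;
}

const PROFILE_BY_SHORTCUT = new Map<string, Profile>([
  ["@read", "LEARN"],
  ["@learn", "LEARN"],
  ["@patch", "PATCH"],
  ["@quickfix", "PATCH"],
  ["@standard", "STANDARD"],
]);

// Flags that take a value, per the flag list above; all others are switches.
const VALUE_FLAGS = new Set(["--scope", "--depth", "--risk", "--evidence", "--test"]);

// Splits on whitespace but keeps "double quoted" segments together.
function tokenize(input: string): string[] {
  const tokens: string[] = [];
  const pattern = /"([^"]*)"|(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    tokens.push(match[1] ?? match[2] ?? "");
  }
  return tokens;
}

function parseShortcut(input: string): ParsedShortcut {
  const separator = input.indexOf(" -- ");
  if (separator === -1) throw new Error('missing " -- " before the request');
  const [shortcut, ...rest] = tokenize(input.slice(0, separator));
  const profile = PROFILE_BY_SHORTCUT.get(shortcut ?? "");
  if (profile === undefined) throw new Error(`unknown shortcut: ${shortcut}`);

  const flags = new Map<string, string | true>();
  for (let i = 0; i < rest.length; i++) {
    const flag = rest[i];
    if (flag === undefined || !flag.startsWith("--")) {
      throw new Error(`unexpected token: ${flag}`);
    }
    if (VALUE_FLAGS.has(flag)) {
      const value = rest[++i];
      if (value === undefined) throw new Error(`${flag} needs a value`);
      flags.set(flag, value);
    } else {
      flags.set(flag, true);
    }
  }
  return { profile, flags, request: input.slice(separator + 4).trim() };
}

// Checks the first three conflict rules; an empty array means no conflict.
function conflictErrors(cmd: ParsedShortcut): string[] {
  const errors: string[] = [];
  const has = (flag: string) => cmd.flags.has(flag);
  if (cmd.profile === "LEARN" && (has("--launch") || has("--writeback"))) {
    errors.push("@learn MUST NOT be combined with --launch or --writeback");
  }
  if (has("--launch") && cmd.profile !== "STANDARD") {
    errors.push("--launch MUST be used with @standard only");
  }
  const risk = cmd.flags.get("--risk");
  const lowRisk = risk === "low" || (risk === undefined && cmd.profile === "PATCH");
  if (has("--slim") && !lowRisk) {
    errors.push("--slim requires --risk low (or implied low risk in PATCH)");
  }
  return errors;
}

// Fourth rule: --actionize asks for confirmation unless --yes is present.
function needsConfirmation(cmd: ParsedShortcut): boolean {
  return cmd.flags.has("--actionize") && !cmd.flags.has("--yes");
}
```

Keeping parsing separate from validation means a conflicting command can still be parsed and reported back with every violated rule at once, rather than failing on the first one.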

Core Intent Types

The gateway maps requests to a small set of top-level intents:

| Intent | When to Use | Default Profile | Launch Spec | Write-back |
|---|---|---|---|---|
| `Learn` | "Explain/read/understand this code" with explicit scope | LEARN | No | No |
| `Change` | "Modify code" (feature, refactor, bugfix) | PATCH or STANDARD | Yes (STANDARD only) | Optional (Archive) |
| `DocQA` | "What is the rule/process/template?" | LEARN | No | No (unless actionized) |
| `Audit` | "Assess the codebase" (read-only review/risk scan) | LEARN | No | No |
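
Read as code, the table above is a small routing function. The sketch below is illustrative only: `routeIntent` and the `Route` shape are hypothetical, the LOW → PATCH and MEDIUM/HIGH → STANDARD split is taken from the Execution Profiles table, and the actionized DocQA path is left out for brevity.

```typescript
type Intent = "Learn" | "Change" | "DocQA" | "Audit";
type Risk = "low" | "medium" | "high";
type Profile = "LEARN" | "PATCH" | "STANDARD";

interface Route {
  profile: Profile;
  launchSpec: boolean;
  writeBack: "no" | "optional";
}

// `risk` only influences the Change intent; the other intents are read-only by default.
function routeIntent(intent: Intent, risk: Risk): Route {
  if (intent !== "Change") {
    return { profile: "LEARN", launchSpec: false, writeBack: "no" };
  }
  const profile: Profile = risk === "low" ? "PATCH" : "STANDARD";
  // A launch spec is generated for STANDARD only; write-back happens in Archive.
  return { profile, launchSpec: profile === "STANDARD", writeBack: "optional" };
}
```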

Context Collection Rules

Rule 0: Direct Read when scope is explicit (MUST)

If user provides explicit scope (file path, class/method name, pasted snippet) and goal is learning:
- ✅ Do direct read first
- ❌ Do NOT start with Knowledge Graph drill-down
- Use funnel only if background context needed after first read

Rule 1: Otherwise, use Knowledge Funnel (MUST)

Read root: KNOWLEDGE_GRAPH.md
Drill down via: CONTEXT_FUNNEL.md
If unsure which skill to use, consult: trae-skill-index

Budgeted Navigation & Escalation

Budgeted Navigation (MUST): For Change and Audit intents, uncontrolled exploration is forbidden.

Default budgets:

Wiki budget: 3 documents
Code budget: 8 files
Pagination reads within the same file do NOT count as additional file reads
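
One way to picture the budget rule is a tracker that counts distinct paths, so that paginated reads of an already-counted file stay free. This is a sketch under that assumption; the `ReadBudget` class and its method names are hypothetical and not part of the harness.

```typescript
class ReadBudget {
  private readonly wikiDocs = new Set<string>();
  private readonly codeFiles = new Set<string>();
  private readonly wikiLimit: number;
  private readonly codeLimit: number;

  // Defaults follow the documented budgets: 3 wiki documents, 8 code files.
  constructor(wikiLimit = 3, codeLimit = 8) {
    this.wikiLimit = wikiLimit;
    this.codeLimit = codeLimit;
  }

  // Returns false when the read would exceed the budget; the caller should then escalate.
  tryRead(kind: "wiki" | "code", path: string): boolean {
    const seen = kind === "wiki" ? this.wikiDocs : this.codeFiles;
    const limit = kind === "wiki" ? this.wikiLimit : this.codeLimit;
    if (seen.has(path)) return true; // pagination within a counted file is free
    if (seen.size >= limit) return false;
    seen.add(path);
    return true;
  }

  // Matches the "Consumed" line of the Escalation Card.
  summary(): string {
    return `wiki ${this.wikiDocs.size}/${this.wikiLimit}, code ${this.codeFiles.size}/${this.codeLimit}`;
  }
}
```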

Saturation Gate (Stop Reading When Enough): Stop reading and move to decision or implementation when ANY of the following is met:

Template acquired: any 2 of (route shape, DTO validation style, service entry pattern, mapper/sql pattern, table field pattern)
Integration point acquired: a concrete example of the dependency usage
Executable chain acquired: a known good call chain exists and the remaining work is a mechanical extension
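
The three conditions above combine with a logical OR, which the following sketch spells out. The `ContextState` shape and `isSaturated` name are hypothetical; the five template kinds are copied from the list.

```typescript
type TemplateKind =
  | "route shape"
  | "DTO validation style"
  | "service entry pattern"
  | "mapper/sql pattern"
  | "table field pattern";

interface ContextState {
  templates: ReadonlySet<TemplateKind>; // template kinds found so far
  hasIntegrationExample: boolean;       // a concrete example of the dependency usage
  hasKnownGoodCallChain: boolean;       // remaining work is a mechanical extension
}

// True as soon as ANY of the three saturation conditions holds.
function isSaturated(state: ContextState): boolean {
  return (
    state.templates.size >= 2 ||
    state.hasIntegrationExample ||
    state.hasKnownGoodCallChain
  );
}
```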

Stop-Wiki (MUST): If 3 consecutive wiki reads are "no-gain", the Agent MUST stop wiki navigation and proceed with a minimal, standards-compliant decision.

Stop-Code (MUST): Code reading must monotonically shrink scope. If scope does not shrink for 2 consecutive code reads, the Agent MUST stop reading and trigger Escalation Protocol.

Escalation Protocol (MUST): If budgets are exhausted OR stop rules trigger and success criteria are not met, the Agent MUST request human help instead of continuing to read.
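
The two stop rules can be sketched as checks over a short read history. How "no-gain" and "scope" are measured is not specified here, so the inputs below are assumptions: a boolean per wiki read, and a numeric scope size (for example, the number of candidate files still in play) recorded before the first code read and after each read. The function names are hypothetical.

```typescript
// noGain[i] is true when the i-th wiki read added nothing useful.
function shouldStopWiki(noGain: readonly boolean[]): boolean {
  const lastThree = noGain.slice(-3);
  return lastThree.length === 3 && lastThree.every((flag) => flag);
}

// scopeSizes[0] is the scope before any code read; each later entry is the
// scope after one more read. Stop when the last two reads failed to shrink it.
function shouldStopCode(scopeSizes: readonly number[]): boolean {
  let streak = 0;
  for (let i = 1; i < scopeSizes.length; i++) {
    const previous = scopeSizes[i - 1];
    const current = scopeSizes[i];
    if (previous === undefined || current === undefined) continue;
    streak = current < previous ? 0 : streak + 1;
  }
  return streak >= 2;
}
```

When `shouldStopCode` returns true, the next step is the Escalation Protocol rather than another read; `shouldStopWiki` instead ends wiki navigation and moves on to a decision.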

Escalation Card format:

Consumed: wiki X/3, code Y/8
Confirmed facts (<= 5 bullets)
Missing info (<= 2 bullets, must be specific)
Why it is blocking (one sentence)
Proposed next targets (<= 5 file paths / keywords)
Request: wiki +1 or code +2 (small step)
Fallback if still missing: pick one of:
- ask 1 critical question
- request a concrete anchor (class/table/entrypoint) from human
- deliver a minimal viable plan with explicit risks

When escalation blocks the workflow, set the intent row in launch_spec_*.md to WAITING_APPROVAL and include a link to the relevant artifact.
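
The card format lends itself to a typed structure with its bullet limits checked before the card is shown to a human. The following is a sketch only: the `EscalationCard` interface, `validateCard`, and `renderCard` are hypothetical names, and the rendered layout simply follows the field order listed above with the default 3/8 budgets.

```typescript
// Hypothetical shape; field names are not defined by the harness.
interface EscalationCard {
  consumed: { wiki: number; code: number };
  confirmedFacts: string[]; // <= 5 bullets
  missingInfo: string[];    // <= 2 bullets, must be specific
  whyBlocking: string;      // one sentence
  nextTargets: string[];    // <= 5 file paths / keywords
  request: "wiki +1" | "code +2";
  fallback: string;         // one of the three documented fallbacks
}

// Returns the list of limit violations; an empty array means the card is well-formed.
function validateCard(card: EscalationCard): string[] {
  const problems: string[] = [];
  if (card.confirmedFacts.length > 5) problems.push("confirmed facts: at most 5 bullets");
  if (card.missingInfo.length > 2) problems.push("missing info: at most 2 bullets");
  if (card.nextTargets.length > 5) problems.push("next targets: at most 5 entries");
  return problems;
}

function renderCard(card: EscalationCard): string {
  const bullets = (items: string[]) => items.map((item) => `- ${item}`).join("\n");
  return [
    `Consumed: wiki ${card.consumed.wiki}/3, code ${card.consumed.code}/8`,
    `Confirmed facts:\n${bullets(card.confirmedFacts)}`,
    `Missing info:\n${bullets(card.missingInfo)}`,
    `Why it is blocking: ${card.whyBlocking}`,
    `Proposed next targets:\n${bullets(card.nextTargets)}`,
    `Request: ${card.request}`,
    `Fallback if still missing: ${card.fallback}`,
  ].join("\n");
}
```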

Internal Lifecycle Queue Codes (STANDARD Profile Only)

When Profile is STANDARD, the Change intent expands into:

| Code | Phase | Notes |
|---|---|---|
| `Explore.Req` | Explorer | Clarify requirements + scope anchors |
| `Propose.API` | Propose → Review | API contract and design |
| `Propose.Data` | Propose → Review | Database schema changes |
| `Implement.Code` | Implement → QA | Code changes |
| `QA.Test` | QA | Tests + evidence |

Launch Spec Template (Machine-Friendly, Supports Breakpoint Resume)

Status values: PENDING, IN_PROGRESS, DONE, WAITING_APPROVAL, FAILED

# Launch Spec - {YYYYMMDD_HHMMSS}

## State Machine
| Intent | Status | Phase | Artifact/Log | Failed_Reason |
|---|---|---|---|---|
| Explore.Req | IN_PROGRESS | 1_Explorer | `explore_report.md` | - |
| Propose.API | PENDING | - | - | - |
| Implement.Code | PENDING | - | - | - |

## Breakpoint Resume
- If session interrupted/human delayed: First action is to read this file upon wake-up.
- If `WAITING_APPROVAL` exists: Enter Approval checkpoint, read corresponding `openspec.md`, wait for human confirmation, then switch status to `IN_PROGRESS` and proceed to Implement.
- If `FAILED` exists: Stop automatic progression, report `Failed_Reason` to human and request intervention.

Key Discipline: The state machine table drives workflow progression. Only update Status/Phase/Failed_Reason fields to avoid checkbox matching failures and state confusion.
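
To show how the table can drive breakpoint resume, here is a minimal sketch that parses the State Machine rows and picks the next action. It is an illustration, not the engine's code: `parseStateMachine` and `resumeAction` are hypothetical, and checking `FAILED` before `WAITING_APPROVAL` is an assumption about priority.

```typescript
type Status = "PENDING" | "IN_PROGRESS" | "DONE" | "WAITING_APPROVAL" | "FAILED";

interface IntentRow {
  intent: string;
  status: Status;
  phase: string;
  artifact: string;
  failedReason: string;
}

const STATUSES: readonly Status[] = [
  "PENDING", "IN_PROGRESS", "DONE", "WAITING_APPROVAL", "FAILED",
];

// Reads the markdown table rows; header and separator rows are skipped
// because their second cell is not a known Status value.
function parseStateMachine(markdown: string): IntentRow[] {
  const rows: IntentRow[] = [];
  for (const line of markdown.split("\n")) {
    if (!line.trim().startsWith("|")) continue;
    const cells = line.split("|").slice(1, -1).map((cell) => cell.trim());
    const [intent = "", status = "", phase = "", artifact = "", failedReason = ""] = cells;
    const known = STATUSES.find((s) => s === status);
    if (known === undefined) continue;
    rows.push({ intent, status: known, phase, artifact, failedReason });
  }
  return rows;
}

type ResumeAction =
  | { kind: "REPORT_FAILURE"; intent: string; reason: string }
  | { kind: "WAIT_FOR_APPROVAL"; intent: string }
  | { kind: "CONTINUE"; intent: string }
  | { kind: "QUEUE_COMPLETE" };

function resumeAction(rows: readonly IntentRow[]): ResumeAction {
  const failed = rows.find((row) => row.status === "FAILED");
  if (failed) {
    return { kind: "REPORT_FAILURE", intent: failed.intent, reason: failed.failedReason };
  }
  const waiting = rows.find((row) => row.status === "WAITING_APPROVAL");
  if (waiting) return { kind: "WAIT_FOR_APPROVAL", intent: waiting.intent };
  const next =
    rows.find((row) => row.status === "IN_PROGRESS") ??
    rows.find((row) => row.status === "PENDING");
  return next ? { kind: "CONTINUE", intent: next.intent } : { kind: "QUEUE_COMPLETE" };
}
```

Because the parser reads only the fixed columns, the discipline of editing nothing but Status, Phase, and Failed_Reason keeps the file machine-readable across sessions.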

🛡️ Self-Correction Mechanisms

| Mechanism | Trigger Point | Trigger Condition | Effect | Evaluation Method |
|---|---|---|---|---|
| Cognitive_Brake | Before any action | Protocol enforcement | Forces LLM to explicitly reason about roles, boundaries, budgets, and next steps before generating tools or code | XML CoT parsing |
| pre_hook | Before entering new phase | Phase transition | Load relevant rule sets + output Decision-First Preflight + budgets | Required output format |
| guard_hook | During implementation/modification | Style violations, permission breaches, cross-domain pollution, budget exhaustion | Immediate block, require rewrite or authorization; enforce Anti-runaway guard | Standard skill review + Budget rules |
| fail_hook | Any phase failure | Compilation/test/review failures | State downgrade rollback; log failure reason to `openspec.md`; trigger retry count | Objective logs (compilation/test output) |
| Max Retries | Inside fail_hook | Same phase consecutive failures reach threshold | Force stop and request human intervention (Max 3 for scripts, STRICT MAX 2 for compilation) | Failure count reaches threshold |
| Approval Gate (HITL) | After Review passes | Need to enter Implement | "Freeze contract", human authorizes whether to proceed | Human confirmation (YES/NO + modification feedback) |
| Doc Consistency Gate | post_hook / Archive | Wiki hallucination & contract corruption risk | Read-only validation (`schema_checker.py` + `wiki_linter.py`), trigger `fail_hook` on FAIL | Script exit codes (non-zero = FAIL) |
| Archive Write-back | Task completion | New/changed knowledge needs persistence | Extract stable knowledge from Spec, archive hot documents, update indices (WAL mechanism) | Rule validation, connectivity check |
| Preferences Memory | Before/after Archive | Representative human ratings/feedback | Persist experience as preferences/anti-patterns to `wiki/preferences/index.md`, effective in next pre_hook | Human rating + textual reasoning |
| Non-Convergence Fallback | Workflow stuck repeating same action | Doc rewrite or linter failure loop | Stop repeating, run deterministic verification, report mismatch, request human intervention | Evidence-based mismatch detection |

🔧 Skills Matrix

The harness ships 35 active skills under .agents/skills/. The canonical list is trae-skill-index/SKILL.md — what follows is a tour of the major sections.

Entry & routing

intent-gateway — entry node for every task. Routes natural language into the correct intent + profile + role mounts. Also handles Audit.Codebase, QA.Doc, QA.Doc.Actionize.
trae-skill-index — central index for skill discovery.
skill-graph-manager — maintains bidirectional links between related skills.

React + Effect frontend (the current target)

react-component-architecture — composition, prop contracts, server/client boundary, ref forwarding.
effect-react-patterns — entry node for the seven effect-* skills below.
effect-runtime-and-layers — single ManagedRuntime + Layer composition. Read FIRST before any other Effect skill.
effect-schema-as-contract — Schema.Struct as the contract between frontend and API.
effect-tagged-errors-and-match — typed error unions + exhaustive matching.
effect-platform-httpclient — @effect/platform HTTP client conventions.
effect-fiber-and-stream — concurrency, schedules, and Streams.
effect-vitest-testing — Vitest + Effect testing patterns (TestClock, Layer mocking).
effect-schema-form — schema-driven form binding.
effect-tsgo-diagnostics — catalog of @effect/tsgo diagnostics mapped to harness role / phase / severity.

Code-health patterns (project-agnostic)

assertion-not-snapshot-tests — assert intent, not output shape.
zustand-slice-architecture — slice patterns when Zustand is the chosen store.
god-component-decomposition — heuristics for breaking up oversized React components.
schema-migration-versioned-documents — versioned-document migration pattern.
form-validation-zod — Zod-based validation when projects choose Zod over Effect Schema.

Code standards & engineering

global-engineering-standards — master index for code-generation standards.
utils-usage-standard — utility usage patterns.
tsdoc-standard — unified TSDoc style.
eslint-prettier-standard — code-style enforcement.
linter-severity-standard — severity rules (FAIL / WARN / IGNORE) for downstream linters.
code-review-checklist — review checklist (security / performance / maintainability).
api-documentation-rules — API doc generation & archival.
consumed-api-contracts — pinning + tracking server contracts the client consumes.

Frontend QA & UX standards

accessibility-wcag-aa-standard — WCAG 2.2 AA conformance rules.
design-token-standard — design token discipline.
responsive-breakpoint-standard — breakpoint conventions.
bundle-budget-standard — bundle-size budgets per route.
visual-regression-standard — Lost-Pixel discipline.
e2e-playwright-standard — Playwright conventions.
state-management-pattern — when to reach for Zustand vs Context vs Effect.

Deprecated (pending archival)

devops-lifecycle-master → use intent-gateway + .agents/PROTOCOL.md.
devops-testing-standard → use effect-vitest-testing.

Eight first-generation devops-* skills, plus prd-task-splitter and product-manager-expert, were archived on 2026-05-09. Their responsibilities are now covered by roles in role_matrix.json. See .agents/skills/_archive/README.md.

Phase → mounted roles

The skill-to-phase mapping is now driven by role_matrix.json, not by skill names. The phase mounts roles; the roles invoke skills + gates. See .agents/workflow/ROLE_MATRIX.md (auto-generated from JSON) for the canonical mapping.

Phase	Mounted roles
Explorer	`ambiguity_gatekeeper`, `requirement_engineer`, `focus_guard`
Propose	`system_architect`
Review	`system_architect`
Approval (HITL)	— (human checkpoint)
Implement	`lead_engineer`, `focus_guard`, `security_sentinel`
QA	`code_reviewer`, `documentation_curator`, `accessibility_auditor`, `visual_critic`, `performance_warden`
Archive	`knowledge_extractor`, `documentation_curator`, `skill_graph_curator`, `librarian`
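
The phase → role → gate resolution described above can be sketched as a lookup over a parsed matrix. This is an illustration under stated assumptions: the JSON shape (`phases`, `roles`, `gates` keys) and the pairing of roles with specific gate scripts are hypothetical, and the real role_matrix.json schema may differ.

```python
import json

# Hypothetical excerpt. Role and gate names appear in this README, but the
# key layout and the role-to-gate pairing are assumptions for illustration.
ROLE_MATRIX_JSON = """
{
  "phases": {
    "Explorer": ["ambiguity_gatekeeper", "requirement_engineer", "focus_guard"],
    "Implement": ["lead_engineer", "focus_guard", "security_sentinel"]
  },
  "roles": {
    "ambiguity_gatekeeper": {"gates": ["ambiguity_gate.py"]},
    "requirement_engineer": {"gates": []},
    "focus_guard": {"gates": ["scope_guard.py"]},
    "lead_engineer": {"gates": []},
    "security_sentinel": {"gates": ["secrets_linter.py"]}
  }
}
"""


def gates_for_phase(matrix: dict, phase: str) -> list[str]:
    """Collect the gates of every role mounted on `phase`, in order, without duplicates."""
    gates: list[str] = []
    for role in matrix["phases"].get(phase, []):
        for gate in matrix["roles"][role]["gates"]:
            if gate not in gates:
                gates.append(gate)
    return gates


matrix = json.loads(ROLE_MATRIX_JSON)
print(gates_for_phase(matrix, "Implement"))  # ['scope_guard.py', 'secrets_linter.py']
```

Keeping the mapping in data rather than in skill names means a phase can gain or lose a role without any skill being renamed.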

📂 Project Structure

effect-harness-agent/
├── .agents/
│   ├── PROTOCOL.md              # 📜 SSOT for routing, lifecycle, hooks, scenarios
│   ├── QUICKSTART.md            # 1-page agent quickstart (eager-load on session start)
│   │
│   ├── router/
│   │   ├── CONTEXT_FUNNEL.md    # Knowledge funnel + reverse write-back
│   │   ├── ROUTER.md            # ⚠️ Deprecated stub — see PROTOCOL.md
│   │   └── runs/                # Launch specs (gitignored)
│   │
│   ├── workflow/
│   │   ├── role_matrix.json         # 🎭 SSOT — roles, gate mounts, phase routing
│   │   ├── role_matrix.schema.json  # JSON Schema for IDE validation
│   │   ├── ROLE_MATRIX.md           # Human-readable view (auto-generated)
│   │   ├── LIFECYCLE.md             # ⚠️ Deprecated stub — see PROTOCOL.md
│   │   ├── HOOKS.md                 # ⚠️ Deprecated stub — see PROTOCOL.md
│   │   ├── ARCHIVE_WAL.md           # WAL conventions
│   │   ├── bundle-budget.json       # Default bundle budgets
│   │   ├── lighthouse-budget.json   # Default Web Vitals budgets
│   │   └── runs/                    # Gate reports + engine_state.json (gitignored)
│   │
│   ├── llm_wiki/
│   │   ├── KNOWLEDGE_GRAPH.md   # 🗺️ Root node (mandatory entry)
│   │   ├── purpose.md
│   │   ├── schema/
│   │   │   ├── openspec_schema.md
│   │   │   └── subagent_contract_schema.md
│   │   ├── wiki/                # Active knowledge domains
│   │   │   ├── api-contracts/   # External APIs the client consumes
│   │   │   ├── routes/          # Frontend route map
│   │   │   ├── components/      # Component catalog
│   │   │   ├── design-system/   # Tokens, primitives
│   │   │   ├── flows/           # User flows
│   │   │   ├── runtime/         # ManagedRuntime composition
│   │   │   ├── layers/          # Effect Layer definitions
│   │   │   ├── services/        # Service interfaces
│   │   │   ├── schemas/         # Effect Schema definitions
│   │   │   ├── architecture/    # ADRs (WAL fragments)
│   │   │   ├── specs/           # Active openspec.md
│   │   │   └── preferences/     # Project preferences
│   │   └── archive/             # Cold storage (extracted specs)
│   │
│   ├── skills/                  # 35 active skills + _archive/
│   │   ├── intent-gateway/
│   │   ├── trae-skill-index/    # Central index (every skill linked here)
│   │   ├── effect-runtime-and-layers/
│   │   ├── effect-schema-as-contract/
│   │   ├── effect-vitest-testing/
│   │   └── ... (30+ more — see trae-skill-index)
│   │
│   └── scripts/
│       ├── gates/               # Process (.py) + frontend (.ts) gates
│       │   ├── run.py           # Unified runner (resolves mounts from role_matrix.json)
│       │   ├── ambiguity_gate.py, scope_guard.py, secrets_linter.py, ...
│       │   ├── skill_schema_checker.py, openspec_gate.py, ...
│       │   └── a11y_gate.ts, visual_regression_gate.ts, tsgo_gate.ts, ...
│       ├── harness/
│       │   ├── engine.py        # Lifecycle queue + state machine
│       │   └── self_test.py     # Role-matrix consistency
│       ├── wiki/                # Knowledge-graph linters + compactor
│       └── tools/               # GC + snapshot helpers
│
├── tests/                       # Pytest contract tests for parsing gates
├── examples/
│   └── standard-run-walkthrough/  # Documentation-only end-to-end run example
│
├── AGENTS.md                # 📌 Master rule entry (read every session)
├── CLAUDE.md                # Claude Code-specific entry point
├── PROTOCOL.md              # (mirror lives at .agents/PROTOCOL.md)
├── ENGINEERING_MANUAL.md    # Detailed engineering manual
├── CONTRIBUTING.md          # Contributor workflow
├── CODE_OF_CONDUCT.md       # Contributor Covenant 2.1 (by reference)
├── SECURITY.md              # Vulnerability disclosure policy
├── SKILLS_GOVERNANCE.md     # Skill lifecycle + schema rules
├── CHANGELOG.md             # Significant changes by date
└── LICENSE                  # MIT

🔍 Optional Diagnostic Tools

These scripts run deterministic quality checks. They only report findings and never modify files:

Graph Health Check

python .agents/scripts/wiki/wiki_linter.py

Checks: Dead links, orphaned files, index length warnings
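
To make the dead-link and orphan checks concrete, here is a minimal sketch over an in-memory map of pages. It is not the logic of wiki_linter.py: the function name `lint_graph` is hypothetical, and the sketch assumes link targets are written exactly as the page keys (no relative-path resolution).

```python
import re

# Captures the target of a Markdown link, stopping at ")", "#", or whitespace.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+)")


def lint_graph(pages: dict[str, str], root: str) -> tuple[set[str], set[str]]:
    """Return (dead_links, orphans) for a mapping of page name -> Markdown text."""
    dead: set[str] = set()
    linked: set[str] = set()
    for text in pages.values():
        for target in LINK_RE.findall(text):
            if target.startswith(("http://", "https://")):
                continue  # external links are out of scope for this sketch
            if target in pages:
                linked.add(target)
            else:
                dead.add(target)
    # A page is orphaned when nothing links to it and it is not the root node.
    orphans = set(pages) - linked - {root}
    return dead, orphans


pages = {
    "KNOWLEDGE_GRAPH.md": "[Runtime](runtime.md) [Layers](layers.md)",
    "runtime.md": "[Back](KNOWLEDGE_GRAPH.md)",
    "notes.md": "no links here",
}
dead, orphans = lint_graph(pages, root="KNOWLEDGE_GRAPH.md")
print(dead)     # {'layers.md'}
print(orphans)  # {'notes.md'}
```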

Contract Structure Validation

python .agents/scripts/wiki/schema_checker.py

Checks: Missing key sections, JSON example presence
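
A structural check of this kind can be sketched in a few lines. The required section headings below are hypothetical placeholders, not the actual openspec schema, and `check_spec` is an illustrative name rather than the script's real entry point.

```python
# Hypothetical required headings; the real list lives in the openspec schema.
REQUIRED_SECTIONS = ("## Intent", "## Contract", "## Risks")

# Built programmatically so this example does not nest code fences.
FENCE = "`" * 3


def check_spec(text: str, required: tuple[str, ...] = REQUIRED_SECTIONS) -> list[str]:
    """Return a list of problems: missing headings and a missing JSON example."""
    lines = [line.strip() for line in text.splitlines()]
    problems = [f"missing section: {heading}" for heading in required if heading not in lines]
    if FENCE + "json" not in lines:
        problems.append("missing JSON example")
    return problems


spec = "\n".join([
    "# Spec",
    "## Intent",
    "Add a login form.",
    "## Contract",
    FENCE + "json",
    '{"route": "/login"}',
    FENCE,
])
print(check_spec(spec))  # ['missing section: ## Risks']
```

Like the real diagnostics, the sketch only reports: it returns a list of findings and leaves the document untouched.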

Preference Tag Inspection

python .agents/scripts/wiki/pref_tag_checker.py

Checks: Rule tag conventions for precise retrieval

Unified Gate Runner

python .agents/scripts/gates/run.py --intent <intent> --profile <profile> --phase <phase>

Function: Automatically runs the gate scripts that apply to the given intent, profile, and phase

🎯 Engineering Red Lines

🚫 No Blind Search

Always start from the Knowledge Graph root and drill down through the indices. Fall back to search only when the indices fail.

🚫 No Unauthorized Access

Cross-domain modifications require explicit authorization in openspec.md and confirmation during Review/HITL phases.

🚫 No Runaway Loops

Every failure triggers a rollback and counts against a retry threshold: 3 attempts for scripts and a strict maximum of 2 for compilation. When the threshold is reached, stop and request human intervention. Never loop indefinitely.
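
The retry budget can be sketched as a bounded loop. The limits (3 for scripts, 2 for compilation) come from the rule above; everything else, including the names `run_bounded` and `HumanInterventionRequired`, is an assumption for illustration rather than the engine's actual code.

```python
from typing import Callable

# Limits taken from the red line above: 3 attempts for scripts, 2 for compilation.
MAX_ATTEMPTS = {"script": 3, "compile": 2}


class HumanInterventionRequired(RuntimeError):
    """Raised when the retry budget is exhausted (hypothetical name)."""


def run_bounded(kind: str, step: Callable[[], bool], rollback: Callable[[], None]) -> int:
    """Run `step` until it succeeds or the budget for `kind` is spent.

    Returns the number of attempts used and rolls back after every failure.
    """
    limit = MAX_ATTEMPTS[kind]
    for attempt in range(1, limit + 1):
        if step():
            return attempt
        rollback()
    raise HumanInterventionRequired(f"{kind} failed {limit} times; stopping for human review")


rollbacks: list[str] = []
try:
    # A compilation step that never succeeds exhausts its budget of 2.
    run_bounded("compile", lambda: False, lambda: rollbacks.append("reverted"))
except HumanInterventionRequired as exc:
    print(exc)         # compile failed 2 times; stopping for human review
print(len(rollbacks))  # 2
```

Because the loop is a `for` over a fixed range, it cannot run forever: it either returns on success or raises once the budget is spent.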

🚫 No Knowledge Bloat

Specs must be archived after extraction
Stable knowledge must be extracted to indices
Indices exceeding 500 lines must be split into subdirectories
The Archive phase starts automatically in the same session and reads only a targeted git diff <files> or openspec.md to avoid context overload.
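
The 500-line rule lends itself to a deterministic check. The sketch below is illustrative only: `oversized_indices` is a hypothetical helper, and the index paths in the example are made up to show the boundary case.

```python
# Threshold from the rule above: indices exceeding 500 lines must be split.
MAX_INDEX_LINES = 500


def oversized_indices(indices: dict[str, str], limit: int = MAX_INDEX_LINES) -> list[str]:
    """Return the names of indices whose line count exceeds `limit`, sorted."""
    return sorted(
        name for name, text in indices.items() if len(text.splitlines()) > limit
    )


# Hypothetical index files: one just over the limit, one exactly at it.
indices = {
    "wiki/components/index.md": "entry\n" * 501,
    "wiki/routes/index.md": "entry\n" * 500,
}
print(oversized_indices(indices))  # ['wiki/components/index.md']
```

An index at exactly 500 lines passes, since the rule applies only to indices exceeding that size.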

⚠️ Known Limitations

This is a pre-1.0 harness. Adopt with eyes open:

Pre-release status — current version is v0.x. The protocol surface (lifecycle phases, role matrix schema, gate exit codes) is mostly stable but not yet frozen. Pin a commit if you depend on byte-for-byte stability.
tsgo_gate.ts enforces tsgo configuration, not diagnostics directly — it verifies that @effect/tsgo is wired into tsconfig.json so that tsserver-aware editors (VS Code, JetBrains, Zed) surface the ~70 Effect-specific diagnostics in real time. Upstream @effect/tsgo is currently a Language Service plugin without a CLI runner, so harness-side diagnostic enforcement is upstream-blocked. The gate is feature-complete for the current upstream surface; when a tsgo check CLI ships, the gate will swap to a diagnostic runner (see the migration note in the gate header). For authoritative type errors today, use your project's tsc --noEmit.
Effect Schema drift detection is opt-in — effect_schema_drift_gate.ts only runs when you wire it into role_matrix.json for your project. There is no global default because schema drift policy depends on whether your project treats schemas as published contracts.
TS gates assume peer dependencies — a11y_gate, visual_regression_gate, bundle_budget_gate, web_vitals_gate, console_error_gate need Playwright + axe + Lost-Pixel + Lighthouse + ts-morph installed in the target project. See .agents/scripts/gates/package.json for the peer-dep list.
Skill governance is manual — the 35 active skills in .agents/skills/ follow a stable → deprecated → archive lifecycle (see SKILLS_GOVERNANCE.md), but moves are still operator-driven. Frontmatter is now CI-validated by skill_schema_checker.py.
LLM-native by design — there is no human CLI, web UI, or REPL. If you are evaluating this with "developer tool" expectations, see the Critical Positioning Statement at the top of this file.
Windows path quoting — gate scripts run on Linux/macOS/Windows, but if you invoke them from PowerShell, double-quote any path containing spaces. WSL is recommended on Windows for the smoothest experience.

If you hit a limitation not listed here, open an issue — known-limitations is a living section.

📖 Related Documentation

📌 Master rules — AGENTS.md (read every session)
📜 Operational protocol (SSOT) — .agents/PROTOCOL.md (routing, lifecycle, hooks, scenarios)
🚀 Agent quickstart — .agents/QUICKSTART.md
🤖 Claude Code entry — CLAUDE.md
📘 Engineering Manual — ENGINEERING_MANUAL.md
🗺️ Knowledge Graph — .agents/llm_wiki/KNOWLEDGE_GRAPH.md
📝 Spec template — .agents/llm_wiki/schema/openspec_schema.md
🔍 Context Funnel — .agents/router/CONTEXT_FUNNEL.md
🎭 Role Matrix (SSOT) — .agents/workflow/role_matrix.json (human view: ROLE_MATRIX.md)
📚 Skill governance — SKILLS_GOVERNANCE.md
🎬 Walkthrough example — examples/standard-run-walkthrough/

Legacy .agents/router/ROUTER.md, .agents/workflow/LIFECYCLE.md, and .agents/workflow/HOOKS.md are deprecated stubs that point at PROTOCOL.md sections — do not add new content there.

🤝 Contributing

Contributions are welcome. Start with CONTRIBUTING.md for the full PR workflow, gate/skill authoring guide, and self-test instructions. By participating you agree to the Code of Conduct. Security issues should follow SECURITY.md (do not open a public issue for vulnerabilities).

Quick checklist:

Read First: AGENTS.md, .agents/PROTOCOL.md, and ENGINEERING_MANUAL.md for deeper context.
Follow Lifecycle: All non-trivial changes go through the 6-phase lifecycle.
Update Knowledge: Extract stable knowledge to the appropriate domain index.
Run Diagnostics: python3 .agents/scripts/harness/self_test.py and the relevant gates.
Submit PR: Include openspec.md (or a Slim Spec for @patch) for any change with MEDIUM or HIGH risk.

🏷️ Versioning

The harness follows semver. The current release is v0.1.0 — the first tagged release after a feature-complete polish pass on the React + Effect frontend scope. Pre-1.0 minor versions may include breaking changes to the protocol surface (lifecycle phases, role matrix shape, gate exit codes); pin a specific tag in your downstream projects if you need byte-for-byte stability. From v1.0.0 onward, breaking changes will follow a documented deprecation window of one minor release.

See CHANGELOG.md for the full release history.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

This framework draws inspiration from:

OpenSpec: Contract-first development methodology
Harness: Lifecycle state machines & hook systems
LLM Wiki: Evolvable knowledge graphs with anti-bloat mechanisms
Agentic Patterns: Autonomous agent workflows with human-in-the-loop checkpoints

⬆ Back to Top
