ManzoliW/effect-harness-agent

Effect Harness Agent 🚀

An Agent-Driven Engineering Framework for TypeScript & Effect

⚠️ Critical Positioning Statement: An LLM-Native Operating System

"Do not hand a steering wheel to an engine."

This repository is machine-to-machine infrastructure. It is not a TypeScript framework like NestJS, nor is it a CLI tool for human engineers. It is a Cognitive Harness: an executable protocol designed by humans, but read, interpreted, and executed exclusively by Large Language Models (LLMs).

By encoding engineering discipline (Lifecycle State Machines, Role Matrices, Vector-less Knowledge Graphs) into LLM-native formats, it transforms the AI from a simple code-completion oracle into an autonomous, self-navigating, and self-correcting engineering agent.

Effect Harness Agent is an agent-driven engineering framework for TypeScript/Effect projects, designed for sustainable software development. It integrates an Intent Gateway, a 6-phase Lifecycle State Machine, Contract-first OpenSpec design, and a drill-down LLM Wiki (Knowledge Graph) to prevent context bloat, enabling AI agents to autonomously build, test, and self-correct production-ready code.

Engineering Manual | Quick Start

📖 Overview

Effect Harness Agent is an agent-driven development workflow that bridges the gap between natural language requirements and production-ready TypeScript/Effect code. Built on an Intent Gateway, a Lifecycle State Machine, a Knowledge Graph (LLM Wiki), and a Specialized Skills Matrix, it supports closed engineering loops that are sustainable, interruptible, self-correcting, and resistant to context bloat.

✨ Key Features

  • 🎯 Intent-Driven: Natural language → Structured intent queues → Executable tasks
  • 🔄 Lifecycle State Machine: Explorer → Propose → Review → Approval Gate (HITL) → Implement → Validation Gate → QA → Archive
  • 🧠 Knowledge Graph: Hierarchical wiki system with bidirectional navigation
  • 🛡️ Self-Correcting: Automatic guard hooks, failure recovery, human-in-the-loop checkpoints
  • 📊 Contract-First: OpenSpec-based design before implementation
  • 🔌 Skills Matrix: 25+ specialized skills providing domain expert capabilities
  • 📈 Anti-Bloat Mechanism: Automatic knowledge extraction and archival to prevent information overload

💰 Token Economics & Cost Model

Given that Effect Harness Agent is an agentic framework, it inherently consumes more tokens than a simple code completion tool. However, its architecture shifts costs from Rework & Blind Search to Planning & Guardrails, resulting in highly predictable and stable overall costs for complex tasks.

1. The "Thinking Tax" (Where Costs Increase)

  • Each turn requires the LLM to output the <Cognitive_Brake> and read mandatory system contexts (e.g., LIFECYCLE.md, AGENTS.md). This adds a fixed baseline "thinking tax" of ~500 Output Tokens and ~2000 Input Tokens per interaction.
  • The Propose phase explicitly requires drafting explore_report.md and openspec.md, consuming an extra ~1500 Output Tokens before a single line of code is written.

2. The ROI: Comparing 3 Paradigms for a Complex Feature (e.g., Cross-Table Transaction)

Note: Estimates assume the pricing model of a typical modern flagship LLM (e.g., GPT-4o, Claude 3.5/3.7 Sonnet, Gemini 1.5 Pro).

| Paradigm | Behavior | Input Tokens | Output Tokens | Hidden Costs / Risks | Verdict |
|---|---|---|---|---|---|
| Pure Chat / Copilot | Jumps straight to coding with limited context. | ~5k | ~1k | High Rework Rate. Misses transaction boundaries, forgets existing enums. Requires human prompt corrections. | Cheap in Tokens, Expensive in Human Time. |
| Unconstrained Auto-Agent | Blindly searches the entire codebase (e.g., SearchCodebase or Grep without limits), loops endlessly on compile errors. | 100k+ | 10k+ | Disastrous. Burns through budget via massive context bloat and infinite loops before hitting platform limits. | Unpredictable & Dangerous. |
| Effect Harness Agent | Pays the "Thinking Tax", limits searches (Wiki≤3, Code≤8), designs openspec.md, and STOPs at Approval Gates. | ~30k | ~6k | Highly predictable. Architectural errors are intercepted early by humans; syntax errors are digested by Shift-Left Validation. | The Sweet Spot. Optimized for high-quality delivery with controlled token spend. |

3. Scenario-Based Token Estimates (Using Latest Flagship Models)

| Scenario Profile | Typical Turns | Input Tokens | Output Tokens | Expected Cost / Task |
|---|---|---|---|---|
| @patch (Small Bugfix) | 1-2 Turns | ~5k - 8k | ~1k - 2k | $0.05 - $0.15 |
| @standard (New Feature) | 4-6 Turns | ~20k - 40k | ~4k - 8k | $0.30 - $0.80 |
| @learn (Doc QA) | 1 Turn | ~3k - 5k | ~500 | $0.02 - $0.05 |

4. Token Optimization Formula

Total Token Cost = (Base Context + Context Funnel Payload) × Turns + (Artifact Generation + Code Generation + Cognitive Brake)
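
As a worked example, the formula can be evaluated directly. The sketch below is only illustrative: the field names are hypothetical, the ~2000-token base context, ~1500-token artifact cost, and ~500-token brake per turn come from the "Thinking Tax" figures above, and the funnel payload and code-generation numbers are assumed values rather than measurements.

```typescript
// Hypothetical input shape for the formula above; all values are estimates you supply.
interface TokenEstimateInput {
  baseContext: number;        // mandatory system context read each turn
  funnelPayload: number;      // wiki/code pulled in by the Context Funnel each turn
  turns: number;              // number of agent turns
  artifactGeneration: number; // explore_report.md, openspec.md, ...
  codeGeneration: number;     // emitted source code
  cognitiveBrake: number;     // total <Cognitive_Brake> output across the task
}

// (Base Context + Context Funnel Payload) × Turns
//   + (Artifact Generation + Code Generation + Cognitive Brake)
function estimateTotalTokens(input: TokenEstimateInput): number {
  const perTurn = input.baseContext + input.funnelPayload;
  const oneOff =
    input.artifactGeneration + input.codeGeneration + input.cognitiveBrake;
  return perTurn * input.turns + oneOff;
}

// A 5-turn @standard task with assumed payload and code sizes.
const standardTask = estimateTotalTokens({
  baseContext: 2000,
  funnelPayload: 4000, // assumption
  turns: 5,
  artifactGeneration: 1500,
  codeGeneration: 2500, // assumption
  cognitiveBrake: 500 * 5,
});

console.log(standardTask); // 36500
```

With these inputs the per-turn term contributes 30,000 tokens and the one-off term 6,500, which falls inside the @standard ranges quoted in the scenario estimates.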

To minimize costs:

  1. Use Shortcuts: Use @patch instead of @standard for trivial changes to skip Phases 1-3.
  2. Provide Explicit Scopes: Include an explicit scope such as --scope src/foo/bar.ts in your prompt. This triggers Rule 0 (Direct Read), bypassing the entire Knowledge Funnel drill-down process and saving thousands of Input Tokens.
  3. Respect the Brakes: When the agent STOPs at the Validation Gate, ensure your local environment is ready before allowing it to compile, preventing retry loops.

🏗️ Architecture

Core Philosophy

Three Fundamental Problems Solved by Effect Harness Agent:

  1. Context Bloat Out of Control: LLM blind searching in large codebases leads to token waste and attention dispersion → Solved through Knowledge Graph + Budgeted Navigation
  2. Requirement Drift & Unauthorized Modifications: Agent free-play causes cross-domain pollution and contract corruption → Solved through Intent Gateway + Role Matrix Guards
  3. Knowledge Fragmentation & Unsustainability: Conversation memory loss, documentation desynchronization, index bloat → Solved through WAL Write-back + Auto-Refactoring

Design Philosophy: Encode engineering discipline into LLM-executable protocols, enabling machine-to-machine self-coordination, self-correction, and self-evolution.

System Architecture Diagram

flowchart TB
    subgraph Input["📥 Input Layer"]
        User[👤 User Requirements]
        Shortcut["⚡ Shortcuts<br/>@read/@patch/@standard"]
    end
    
    subgraph Gateway["🎯 Intent Gateway Layer (ROUTER)"]
        IG[Intent Gateway<br/>Intent Classifier]
        Profile{Execution Profile Selection}
        LEARN[LEARN<br/>Read-only Q&A]
        PATCH[PATCH<br/>Small Changes]
        STANDARD[STANDARD<br/>Full Lifecycle]
    end
    
    subgraph Context["🔍 Context Collection Layer (FUNNEL)"]
        DirectRead[Direct Read<br/>When scope explicit]
        Funnel[Knowledge Funnel<br/>Sitemap→Index→Doc]
        Budget[Budget Control<br/>Wiki≤3, Code≤8]
        Escalation[Escalation Protocol<br/>Escalation Card]
    end
    
    subgraph Knowledge["🧠 Knowledge Graph Layer (LLM Wiki)"]
        KG[KNOWLEDGE_GRAPH.md<br/>Root Node]
        DomainIndex["Domain Indices<br/>api/data/domain etc."]
        Docs[Specific Documents]
        Archive[Archive Zone<br/>Cold Storage]
    end
    
    subgraph Lifecycle["⚙️ Lifecycle Engine Layer (WORKFLOW)"]
        LaunchSpec[Launch Spec<br/>State Machine Table]
        Phase1[1_Explorer<br/>Clarify Requirements]
        Phase2[2_Propose<br/>Freeze Contracts]
        Phase3[3_Review<br/>Technical Review]
        ApprovalGate[Approval Gate<br/>HITL Checkpoint]
        Phase4[4_Implement<br/>Implement per Contract]
        Phase5[5_QA<br/>Test Validation]
        Phase6[6_Archive<br/>Knowledge Extraction]
    end
    
    subgraph Roles["🎭 Role Matrix Layer (ROLE MATRIX)"]
        Ambiguity[Ambiguity Gatekeeper<br/>Ambiguity Guard]
        ReqEngineer[Requirement Engineer<br/>Requirements]
        SysArchitect[System Architect<br/>Architecture]
        LeadEngineer[Lead Engineer<br/>Code & Shift-Left]
        CodeReviewer[Code Reviewer<br/>Quality QA]
        FocusGuard[Focus Guard<br/>Anti-Drift Guard]
        KnowledgeExt[Knowledge Extractor<br/>Unified WAL]
        SecuritySentinel[Security Sentinel<br/>Security Sentinel]
        DocCurator[Documentation Curator<br/>Doc Curator]
        SkillCurator[Skill Graph Curator<br/>Skills]
        Librarian[Librarian<br/>GC & Compaction]
        KnowledgeArch[Knowledge Architect<br/>Knowledge Architect]
    end
    
    subgraph Hooks["🛡️ Hook Correction Layer (HOOKS)"]
        PreHook[pre_hook<br/>Load Rule Sets]
        GuardHook[guard_hook<br/>Execution Guard]
        PostHook[post_hook<br/>Post-Audit]
        FailHook[fail_hook<br/>Failure Rollback]
        LoopHook[loop_hook<br/>Queue Loop]
    end
    
    subgraph Skills["🔧 Skills Matrix Layer (SKILLS)"]
        SkillIndex[trae-skill-index<br/>Master Skill Index]
        EffectSkills[TypeScript/Effect Skills<br/>25+ Professional Capabilities]
    end
    
    subgraph Scripts["📜 Script Tools Layer (SCRIPTS)"]
        Gates["Gate Scripts<br/>ambiguity_gate.py etc."]
        WikiTools[Wiki Tools<br/>linter/compactor]
        Engine[Engine Helper<br/>engine.py]
    end
    
    User --> IG
    Shortcut --> IG
    IG --> Profile
    Profile -->|LEARN| DirectRead
    Profile -->|PATCH| LaunchSpec
    Profile -->|STANDARD| LaunchSpec
    
    DirectRead --> Funnel
    Funnel --> Budget
    Budget --> Escalation
    
    KG --> DomainIndex
    DomainIndex --> Docs
    Docs --> Archive
    
    LaunchSpec --> Phase1
    Phase1 --> Phase2
    Phase2 --> Phase3
    Phase3 --> ApprovalGate
    ApprovalGate --> Phase4
    Phase4 --> Phase5
    Phase5 --> Phase6
    Phase6 --> LaunchSpec
    
    Phase1 -.->|Mount| Ambiguity
    Phase1 -.->|Mount| ReqEngineer
    Phase1 -.->|Mount| FocusGuard
    Phase2 -.->|Mount| SysArchitect
    Phase3 -.->|Mount| SysArchitect
    Phase4 -.->|Mount| LeadEngineer
    Phase4 -.->|Mount| FocusGuard
    Phase4 -.->|Mount| SecuritySentinel
    Phase5 -.->|Mount| CodeReviewer
    Phase5 -.->|Mount| DocCurator
    Phase6 -.->|Mount| KnowledgeExt
    Phase6 -.->|Mount| DocCurator
    Phase6 -.->|Mount| SkillCurator
    Phase6 -.->|Mount| Librarian
    
    Phase1 -.->|Trigger| PreHook
    Phase4 -.->|Trigger| GuardHook
    Phase5 -.->|Trigger| PostHook
    Phase3 -.->|Trigger| FailHook
    Phase5 -.->|Trigger| FailHook
    Phase6 -.->|Trigger| LoopHook
    
    Ambiguity -.->|Invoke| Gates
    GuardHook -.->|Invoke| Gates
    PostHook -.->|Invoke| WikiTools
    KnowledgeArch -.->|Invoke| WikiTools
    
    Phase1 -.->|Query| SkillIndex
    Phase2 -.->|Query| SkillIndex
    Phase4 -.->|Query| EffectSkills

Core Components Breakdown

| Layer | Component | Responsibility | Key File |
|---|---|---|---|
| Input | Intent Gateway | Natural language → Structured intents + Execution profiles | ROUTER.md |
| Context | Knowledge Funnel | Bidirectional navigation (forward retrieval + reverse write-back) | CONTEXT_FUNNEL.md |
| Knowledge | LLM Wiki | Fractal knowledge graph (Sitemap/Index/Docs/Archive) | KNOWLEDGE_GRAPH.md |
| Process | Lifecycle Engine | 6-phase state machine + breakpoint resume | LIFECYCLE.md |
| Roles | Role Matrix | Dynamic virtual role mounting + gate guards | ROLE_MATRIX.md |
| Correction | Hooks System | Pre/guard/post/fail/loop interception | HOOKS.md |
| Capability | Skills Matrix | 25+ domain-specific expert capabilities | trae-skill-index |
| Tools | Script Tools | Deterministic quality checks + auxiliary tools | scripts/ |

🚀 Quick Start

Prerequisites

  • TypeScript 5.0+
  • Python 3.8+ (for script tools)
  • Git

3-Minute Onboarding Guide

Step 1: Read Project Rules ⚡

Start with AGENTS.md - the master entry point defining execution discipline with hard constraints and navigation rules.

Core Constraints Quick Reference:

  • Budget Limits: Wiki ≤ 3 docs, Code ≤ 8 files (same-file pagination doesn't count)
  • Cognitive Brake: Mandatory <Cognitive_Brake> XML block before any action to enforce Role, Scope, and Budget awareness
  • Approval & Validation Gates: Must STOP for human confirmation before writing code (Approval Gate) and before heavy compilation (Validation Gate)
  • Anti-Looping: Max 3 retries for scripts/linters; STRICT MAX 2 retries for compilation. Exceeding thresholds MUST request human intervention
  • Scope Guard: Cannot modify files outside focus_card.md agreed scope without explicit authorization

Step 2: Understand Intent Gateway 🎯

The Intent Gateway transforms natural language into executable queues, supporting three execution profiles:

| Profile | Use Case | Lifecycle Entry | Artifacts |
|---|---|---|---|
| LEARN | Read-only explanation, code understanding | ❌ No | None |
| PATCH | Small changes, bug fixes (LOW risk) | ✅ Minimal | Slim Spec + Change Log |
| STANDARD | MEDIUM/HIGH risk, wide blast radius | ✅ Full 6-phase | Full OpenSpec + Approval Gate |

Shortcuts (Explicit Routing):

@read / @learn     → Force LEARN mode (read-only)
@patch / @quickfix → Force PATCH mode (small changes)
@standard          → Force STANDARD mode (full lifecycle)

Shortcut DSL Examples:

@learn --scope src/foo/bar.ts --direct --depth deep -- explain this file
@patch --risk low --slim --test "pnpm test -t 'OrderServiceTest'" -- fix NPE in createOrder
@standard --risk high --launch -- implement tenant permission checks for order list API

Step 3: Navigate Knowledge Graph 🗺️

Rule 0: Direct Read when scope is explicit (MUST)

  • If user provides explicit scope (file path, class/method name, pasted snippet) and goal is learning:
    • ✅ Do direct read first
    • ❌ Do NOT start with Knowledge Graph drill-down

Rule 1: Otherwise, use Knowledge Funnel (MUST)

  1. Read root: KNOWLEDGE_GRAPH.md
  2. Drill down via: CONTEXT_FUNNEL.md
  3. If unsure which skill to use, consult: trae-skill-index

Common Domain Indices:

Step 4: Run Your First Complete Cycle 🔄

Complete one STANDARD task following the Lifecycle:

stateDiagram-v2
    [*] --> Explorer: Clarify Requirements
    Explorer --> Propose: Freeze Contracts
    Propose --> Review: Technical Review
    Review --> ApprovalGate: HITL Checkpoint
    ApprovalGate --> Implement: Implement per Contract
    Implement --> ValidationGate: STOP & Request Compile
    ValidationGate --> QA: Test Validation
    QA --> Archive: Knowledge Extraction (Same Session)
    Archive --> [*]: Queue Complete
    
    Review --> Propose: fail_hook(review failed)
    QA --> Implement: fail_hook(compile/test failed, max 2 retries)
    
    note right of ApprovalGate
        MEDIUM/HIGH Risk:
        Must wait for human confirmation
        Status=WAITING_APPROVAL
    end note
    
    classDef explorerClass fill:#e1f5ff,stroke:#333
    classDef proposeClass fill:#fff4e6,stroke:#333
    classDef reviewClass fill:#ffe6e6,stroke:#333
    classDef approvalClass fill:#fff9e6,stroke:#333
    classDef implementClass fill:#e6ffe6,stroke:#333
    classDef validationClass fill:#fff9e6,stroke:#333
    classDef qaClass fill:#f0e6ff,stroke:#333
    classDef archiveClass fill:#e6f0ff,stroke:#333
    
    class Explorer explorerClass
    class Propose proposeClass
    class Review reviewClass
    class ApprovalGate approvalClass
    class Implement implementClass
    class ValidationGate validationClass
    class QA qaClass
    class Archive archiveClass
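
The transitions in the diagram above can be read as two lookup tables: one for forward edges and one for fail_hook rollbacks. This is a sketch under assumptions: the `advance` function and the `Done` terminal state are hypothetical names, `passed` stands for whatever a phase or gate reports (human approval at the gates), and retry limits are not modelled here.

```typescript
type LifecyclePhase =
  | "Explorer" | "Propose" | "Review" | "ApprovalGate"
  | "Implement" | "ValidationGate" | "QA" | "Archive" | "Done";

// Forward edges of the diagram ("Done" stands for the final [*] state).
const NEXT_PHASE: Record<LifecyclePhase, LifecyclePhase> = {
  Explorer: "Propose",
  Propose: "Review",
  Review: "ApprovalGate",
  ApprovalGate: "Implement",
  Implement: "ValidationGate",
  ValidationGate: "QA",
  QA: "Archive",
  Archive: "Done",
  Done: "Done",
};

// fail_hook edges of the diagram: where a failed phase rolls back to.
const ROLLBACK_PHASE: Partial<Record<LifecyclePhase, LifecyclePhase>> = {
  Review: "Propose",
  QA: "Implement",
};

function advance(phase: LifecyclePhase, passed: boolean): LifecyclePhase {
  if (passed) return NEXT_PHASE[phase];
  const target = ROLLBACK_PHASE[phase];
  if (target === undefined) throw new Error(`no fail_hook edge from ${phase}`);
  return target;
}

console.log(advance("Review", true));  // ApprovalGate
console.log(advance("Review", false)); // Propose
console.log(advance("QA", false));     // Implement
```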

Breakpoint Resume Mechanism:

  • Launch Spec persisted at router/runs/launch_spec_*.md
  • First action after session interruption: read this file to restore state
  • Status enum: PENDING, IN_PROGRESS, DONE, WAITING_APPROVAL, FAILED

💡 Usage Scenarios

Scenario A: New Query API (No DB Changes)

Goal: Create read-only endpoints (DTO/Controller/Service) without table structure changes

graph LR
    A[Explorer<br/>Clarify Requirements] --> B[Propose<br/>OpenSpec]
    B --> C[Review<br/>Technical Review]
    C --> D[Approval<br/>HITL Checkpoint] --> E[Implement<br/>Per Contract]
    E --> V[Validation<br/>STOP Gate]
    V --> F[QA<br/>Test Validation]
    F --> G[Archive<br/>Update Index]
    
    style A fill:#e1f5ff,stroke:#333,stroke-width:2px
    style B fill:#fff4e6,stroke:#333,stroke-width:2px
    style C fill:#ffe6e6,stroke:#333,stroke-width:2px
    style D fill:#fff9e6,stroke:#333,stroke-width:2px
    style E fill:#e6ffe6,stroke:#333,stroke-width:2px
    style V fill:#fff9e6,stroke:#333,stroke-width:2px
    style F fill:#f0e6ff,stroke:#333,stroke-width:2px
    style G fill:#e6f0ff,stroke:#333,stroke-width:2px

Key Deliverables:

  • explore_report.md - Scope & impact analysis + Core Context Anchors
  • openspec.md - API contract with JSON examples, acceptance criteria
  • ✅ Implementation following contract (no over-engineering)
  • ✅ Unit tests with coverage evidence
  • ✅ Update API index in wiki/api/ (WAL mechanism)

Mounted Roles (resolved from role_matrix.json):

  • Explorer: ambiguity_gatekeeper, requirement_engineer, focus_guard
  • Propose: system_architect
  • Implement: lead_engineer, focus_guard, security_sentinel
  • QA: code_reviewer, documentation_curator, accessibility_auditor, visual_critic, performance_warden
  • Archive: knowledge_extractor, documentation_curator, skill_graph_curator, librarian

Scenario B: API + Database Schema Changes

Goal: New endpoint with table structure & index modifications

Critical Path:

  1. Propose: Freeze both API & Data contracts simultaneously (field semantics, constraints, index design, compatibility strategy)
  2. Review: SQL risk assessment, index utilization, implicit conversion checks, authorization risks
  3. QA: Regression tests covering core queries & edge cases
  4. Archive: Update both wiki/api/ and wiki/data/ indices, synchronize ER diagrams

Activated Skills (frontend / Effect-native; backend SQL skills are archived):

  • effect-schema-as-contract - Schema design for shared contracts (frontend ↔ API)
  • consumed-api-contracts - Pinning + tracking server contracts the client consumes
  • effect-schema-form - Form binding when the schema also drives a UI

Scenario C: Bug Fix (Reproduce First, Then Test)

Goal: Fix defects ensuring reproducibility, regressability, and traceability

stateDiagram-v2
    [*] --> Explorer
    Explorer --> Implement: Identify Root Cause
    Implement --> QA: Write Failing Test First
    QA --> Implement: fail_hook (test failed)
    Implement --> QA: Fix to Pass Test
    QA --> Archive: Regression Test Suite
    Archive --> [*]
    
    note right of QA
        TDD Approach:
        1. Write failing test first
        2. Fix to pass test
        3. Add regression tests
    end note

Workflow:

  1. Explorer: Minimal reproduction path + root cause hypothesis + impact analysis (whether Propose/contract update needed)
  2. QA: Write failing test BEFORE fix (TDD approach)
  3. Implement: Fix implementation to pass test
  4. Archive: Record pattern in wiki/testing/ or reviews/, update related API/Domain indices if necessary

Profile: PATCH (LOW risk) or STANDARD (MEDIUM/HIGH risk)


Scenario D: Performance Optimization

Goal: Optimize SQL/performance without changing external behavior

Focus Areas:

  • Propose: Document "behavior unchanged" constraints + rollback strategy
  • Review: SQL standards & index utilization as top priority
  • QA: Comparative evidence (performance benchmarks + correctness)
  • Archive: Extract reusable performance rules to preferences/

Activated Skills:

  • bundle-budget-standard - Bundle-size guards (top priority for client perf)
  • effect-fiber-and-stream - Concurrency / scheduling primitives when refactoring hot paths
  • code-review-checklist - Refactoring review surface

Scenario E: Refactoring (With Boundary Guards)

Goal: Improve maintainability without introducing requirement drift

Guardrails:

  • Explicit "what's in / what's out" scope definition (Focus Card)
  • Cross-domain modifications require explicit authorization (guard_hook)
  • Architecture decisions written back to wiki/architecture/

Activated Roles:

  • Ambiguity Gatekeeper - Ambiguity guard
  • Focus Guard - Anti-drift guard
  • Knowledge Architect - Knowledge architect (if Wiki refactoring needed)

Scenario F: Parallel Collaboration

Goal: Server-led delivery with optional UI/QA parallel work

sequenceDiagram
    participant S as Server Agent
    participant H as Human (HITL)
    participant U as UI Agent
    participant Q as QA Agent
    
    S->>S: Explorer → Propose
    S->>H: Request Approval (Freeze Contract)
    H->>S: ✅ Approved
    par Parallel Execution
        S->>S: Implement Code
        U->>U: Build UI from API Contract
        Q->>Q: Write Tests from Acceptance Criteria
    end
    S->>S: QA → Archive

Key Handoff Points:

  • Approval Gate Phase: Frozen OpenSpec becomes single source of truth, acts as "starting gun" for parallel collaboration
  • Minimal Handoff: API Contract (JSON examples), Acceptance Criteria (Given/When/Then), Error Codes
  • Server Cohesion: Internal details remain encapsulated (not forced outward)

Scenario G: Read-Only Audit (Audit.Codebase)

Goal: Perform read-only analysis and assessment of the codebase, producing structured audit reports

Constraints:

  • ❌ No code modifications
  • ❌ No Wiki writes
  • ❌ No launch spec generation
  • ❌ No lifecycle entry

Allowed Operations:

  • ✅ Read-only retrieval and reading
  • ✅ Run tests/builds (but do not modify any tracked files)

Output Requirements: Each conclusion must include evidence (file path + line range) and impact/recommendations

Typical Scenarios: Architecture review, code quality scanning, technical debt assessment


Scenario H: Documentation Q&A (QA.Doc / QA.Doc.Actionize)

QA.Doc (Pure Q&A)

  • Goal: Answer questions based on Wiki/requirement documents
  • Method: Drill down through knowledge funnel, output answers with citations
  • Citations: Wiki/requirement paragraphs, supplement with code references when needed
  • Does NOT trigger lifecycle

QA.Doc.Actionize (Q&A to Action)

  • Goal: Convert Q&A conclusions into executable intent queues
  • Critical Step: Must ask user whether to "launch" first
  • After Confirmation: Generate launch spec and enter lifecycle
  • Without Confirmation: Output answer only, no side effects

Typical Scenarios: Query business rules, understand API usage, confirm architecture decisions


🚦 Intent Gateway: From Natural Language to Executable Queues

The Intent Gateway transforms natural language requirements into structured intent queues that drive the entire lifecycle.

Execution Profiles

Not every request needs the full lifecycle. The gateway selects an execution profile:

| Profile | When to Use | Lifecycle Entry | Artifacts |
|---|---|---|---|
| LEARN | Read-only explanation, code understanding | No | None |
| PATCH | Small changes, bug fixes (LOW risk) | Minimal | Slim Spec or Change Log |
| STANDARD | MEDIUM/HIGH risk, wide blast radius | Full 6-phase | Full OpenSpec + Approval Gate |

Shortcuts (Explicit Routing)

Users can override automatic routing with explicit shortcuts:

  • @read / @learn: Force Profile LEARN (read-only, no write-back)
  • @patch / @quickfix: Force Profile PATCH (small change mode)
  • @standard: Force Profile STANDARD (full lifecycle)
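
Profile selection with shortcut override can be sketched as a lookup followed by a fallback. The inputs `readOnly` and `risk` are hypothetical stand-ins for whatever the gateway's intent classifier decides; the README does not specify how that classification is produced.

```typescript
type ExecutionProfile = "LEARN" | "PATCH" | "STANDARD";
type RiskLevel = "low" | "medium" | "high";

// Shortcut table from the list above.
const SHORTCUT_PROFILES = new Map<string, ExecutionProfile>([
  ["@read", "LEARN"],
  ["@learn", "LEARN"],
  ["@patch", "PATCH"],
  ["@quickfix", "PATCH"],
  ["@standard", "STANDARD"],
]);

// An explicit shortcut wins; otherwise fall back to the automatic decision.
function selectProfile(request: string, readOnly: boolean, risk: RiskLevel): ExecutionProfile {
  const firstToken = request.trim().split(/\s+/)[0] ?? "";
  const forced = SHORTCUT_PROFILES.get(firstToken);
  if (forced !== undefined) return forced;
  if (readOnly) return "LEARN";
  return risk === "low" ? "PATCH" : "STANDARD";
}

console.log(selectProfile("@patch fix typo in README", false, "high")); // PATCH
console.log(selectProfile("explain src/foo/bar.ts", true, "low"));       // LEARN
console.log(selectProfile("add tenant checks", false, "medium"));        // STANDARD
```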

Shortcut DSL (Composable)

Shortcuts can be composed with flags to express common workflows as a small DSL.

Syntax:

@<profile> <flags...> -- <natural language request or question>

Flags (order-independent):

  • Scope / read:
    • --scope <path|glob|symbol>: explicit scope (file/dir/symbol)
    • --direct: force direct reads (do not start with Knowledge Graph drill-down)
    • --funnel: force the funnel even if scope is explicit
    • --depth shallow|normal|deep: explanation depth (LEARN only)
  • Risk / artifacts:
    • --risk low|medium|high: explicit risk override
    • --slim: force Slim Spec (PATCH only, or STANDARD with --risk low)
    • --changelog: use Change Log only (PATCH only)
    • --evidence required|optional|none: evidence requirement (default: PATCH=required)
  • Launch / write-back:
    • --launch: force lifecycle launch (STANDARD only)
    • --no-launch: force no launch
    • --writeback: allow wiki/WAL write-back (not allowed for LEARN)
    • --no-writeback: forbid write-back (default)
  • Verification:
    • --test "<cmd>": required verification command + evidence
    • --no-test: skip tests (LEARN only; PATCH requires an explicit justification)
  • DocQA actionize:
    • --actionize: convert DocQA into an executable STANDARD queue (requires confirmation)
    • --yes: auto-confirm --actionize / --launch (team use with caution)

Conflict rules (MUST enforce):

  • @learn MUST NOT be combined with --launch or --writeback.
  • --launch MUST be used with @standard only.
  • --slim requires --risk low (or implied low risk in PATCH).
  • --actionize MUST ask for confirmation unless --yes is present.

Examples:

@learn --scope src/foo/bar.ts --direct --depth deep -- explain this file
@patch --risk low --slim --test "pnpm test -t 'OrderServiceTest'" -- fix NPE in createOrder
@standard --risk high --launch -- implement tenant permission checks for order list API
@learn --funnel --actionize -- what is the API design standard?
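
To make the syntax and conflict rules concrete, here is a minimal parser and validator sketch. It is not the harness's actual implementation: `parseShortcut`, `validateShortcut`, and `needsConfirmation` are hypothetical names, the tokenizer assumes double-quoted values contain no nested double quotes, and treating @read like @learn in the conflict check is an assumption based on both forcing the LEARN profile.

```typescript
interface ParsedShortcut {
  shortcut: string;                  // e.g. "@patch"
  flags: Map<string, string | true>; // flag -> value, or true for boolean flags
  request: string;                   // natural-language part after " -- "
}

// Flags that take a value, per the flag list above.
const VALUE_FLAGS = new Set(["--scope", "--depth", "--risk", "--evidence", "--test"]);

function parseShortcut(line: string): ParsedShortcut {
  const sep = line.indexOf(" -- ");
  if (sep < 0) throw new Error("missing ' -- ' separator");
  const request = line.slice(sep + 4).trim();
  // Tokens are either double-quoted strings or runs of non-space characters.
  const tokens: string[] = line.slice(0, sep).match(/"[^"]*"|\S+/g) ?? [];
  const shortcut = tokens.shift();
  if (shortcut === undefined || !shortcut.startsWith("@")) {
    throw new Error("expected @<profile> first");
  }
  const flags = new Map<string, string | true>();
  while (tokens.length > 0) {
    const flag = tokens.shift() as string;
    if (!VALUE_FLAGS.has(flag)) {
      flags.set(flag, true);
      continue;
    }
    const value = tokens.shift();
    if (value === undefined) throw new Error(`${flag} needs a value`);
    flags.set(flag, value.replace(/^"|"$/g, "")); // strip surrounding quotes
  }
  return { shortcut, flags, request };
}

// Returns one message per violated conflict rule (empty array = valid).
function validateShortcut(p: ParsedShortcut): string[] {
  const errors: string[] = [];
  const isLearn = p.shortcut === "@learn" || p.shortcut === "@read";
  const isPatch = p.shortcut === "@patch" || p.shortcut === "@quickfix";
  if (isLearn && (p.flags.has("--launch") || p.flags.has("--writeback"))) {
    errors.push("@learn MUST NOT be combined with --launch or --writeback");
  }
  if (p.flags.has("--launch") && p.shortcut !== "@standard") {
    errors.push("--launch MUST be used with @standard only");
  }
  const risk = p.flags.get("--risk");
  const lowRisk = risk === "low" || (risk === undefined && isPatch);
  if (p.flags.has("--slim") && !lowRisk) {
    errors.push("--slim requires --risk low (or implied low risk in PATCH)");
  }
  return errors;
}

// --actionize must be confirmed by the user unless --yes is present.
function needsConfirmation(p: ParsedShortcut): boolean {
  return p.flags.has("--actionize") && !p.flags.has("--yes");
}

const okCmd = parseShortcut(
  `@patch --risk low --slim --test "pnpm test -t 'OrderServiceTest'" -- fix NPE in createOrder`,
);
console.log(validateShortcut(okCmd).length); // 0

const badCmd = parseShortcut("@learn --launch -- explain this file");
console.log(validateShortcut(badCmd).length); // 2
```

The second command violates two rules at once: @learn combined with --launch, and --launch used outside @standard.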

Core Intent Types

The gateway maps requests to a small set of top-level intents:

| Intent | When to Use | Default Profile | Launch Spec | Write-back |
|---|---|---|---|---|
| Learn | "Explain/read/understand this code" with explicit scope | LEARN | No | No |
| Change | "Modify code" (feature, refactor, bugfix) | PATCH or STANDARD | Yes (STANDARD only) | Optional (Archive) |
| DocQA | "What is the rule/process/template?" | LEARN | No | No (unless actionized) |
| Audit | "Assess the codebase" (read-only review/risk scan) | LEARN | No | No |

Context Collection Rules

Rule 0: Direct Read when scope is explicit (MUST)

  • If user provides explicit scope (file path, class/method name, pasted snippet) and goal is learning:
    • ✅ Do direct read first
    • ❌ Do NOT start with Knowledge Graph drill-down
    • Use funnel only if background context needed after first read

Rule 1: Otherwise, use Knowledge Funnel (MUST)

  1. Read root: KNOWLEDGE_GRAPH.md
  2. Drill down via: CONTEXT_FUNNEL.md
  3. If unsure which skill to use, consult: trae-skill-index

Budgeted Navigation & Escalation

Budgeted Navigation (MUST): For Change and Audit intents, uncontrolled exploration is forbidden.

Default budgets:

  • Wiki budget: 3 documents
  • Code budget: 8 files
  • Pagination reads within the same file do NOT count as additional file reads

Saturation Gate (Stop Reading When Enough): Stop reading and move to decision/implementation when ANY of the following is met:

  • Template acquired: any 2 of (route shape, DTO validation style, service entry pattern, mapper/sql pattern, table field pattern)
  • Integration point acquired: a concrete example of the dependency usage
  • Executable chain acquired: a known good call chain exists and the remaining work is a mechanical extension
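
The Saturation Gate is an OR over three conditions, the first of which is itself an "any 2 of 5" check. A minimal sketch follows; the `SaturationEvidence` shape is hypothetical, and how the agent decides that a signal has been acquired is outside its scope.

```typescript
// The five template signals named in the first condition above.
type TemplateSignal =
  | "route shape"
  | "DTO validation style"
  | "service entry pattern"
  | "mapper/sql pattern"
  | "table field pattern";

// Hypothetical record of what the agent has learned so far.
interface SaturationEvidence {
  templateSignals: Set<TemplateSignal>;
  integrationExample: boolean;  // concrete example of the dependency usage
  knownGoodCallChain: boolean;  // remaining work is a mechanical extension
}

function isSaturated(e: SaturationEvidence): boolean {
  const templateAcquired = e.templateSignals.size >= 2;
  return templateAcquired || e.integrationExample || e.knownGoodCallChain;
}

console.log(isSaturated({
  templateSignals: new Set<TemplateSignal>(["route shape"]),
  integrationExample: false,
  knownGoodCallChain: false,
})); // false

console.log(isSaturated({
  templateSignals: new Set<TemplateSignal>(["route shape", "DTO validation style"]),
  integrationExample: false,
  knownGoodCallChain: false,
})); // true
```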

Stop-Wiki (MUST): If 3 consecutive wiki reads are "no-gain", the Agent MUST stop wiki navigation and proceed with a minimal, standards-compliant decision.

Stop-Code (MUST): Code reading must monotonically shrink scope. If scope does not shrink for 2 consecutive code reads, the Agent MUST stop reading and trigger the Escalation Protocol.

Escalation Protocol (MUST): If budgets are exhausted OR stop rules trigger and success criteria are not met, the Agent MUST request human help instead of continuing to read.
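
The budgets and stop rules above can be tracked with a few counters. The class below is a sketch under assumptions: `NavigationBudget` and its method names are hypothetical, the caller supplies the "gained" and "scope shrank" judgments, and paginated re-reads of a file are assumed to still count toward the no-shrink streak even though they do not consume the file budget.

```typescript
type NavVerdict = "CONTINUE" | "STOP_WIKI" | "ESCALATE";

class NavigationBudget {
  private readonly wikiLimit = 3; // default wiki budget
  private readonly codeLimit = 8; // default code budget
  private readonly wikiDocs = new Set<string>();
  private readonly codeFiles = new Set<string>();
  private noGainWikiStreak = 0;
  private noShrinkCodeStreak = 0;

  readWiki(doc: string, gained: boolean): NavVerdict {
    if (!this.wikiDocs.has(doc) && this.wikiDocs.size >= this.wikiLimit) {
      return "ESCALATE"; // budget exhausted
    }
    this.wikiDocs.add(doc);
    this.noGainWikiStreak = gained ? 0 : this.noGainWikiStreak + 1;
    return this.noGainWikiStreak >= 3 ? "STOP_WIKI" : "CONTINUE";
  }

  readCode(file: string, scopeShrank: boolean): NavVerdict {
    if (!this.codeFiles.has(file) && this.codeFiles.size >= this.codeLimit) {
      return "ESCALATE"; // budget exhausted
    }
    this.codeFiles.add(file); // paginating within a known file adds nothing
    this.noShrinkCodeStreak = scopeShrank ? 0 : this.noShrinkCodeStreak + 1;
    return this.noShrinkCodeStreak >= 2 ? "ESCALATE" : "CONTINUE";
  }

  consumed(): string {
    return `wiki ${this.wikiDocs.size}/${this.wikiLimit}, code ${this.codeFiles.size}/${this.codeLimit}`;
  }
}

const budget = new NavigationBudget();
console.log(budget.readCode("src/a.ts", true));  // CONTINUE
console.log(budget.readCode("src/a.ts", false)); // CONTINUE
console.log(budget.readCode("src/b.ts", false)); // ESCALATE
console.log(budget.consumed());                  // wiki 0/3, code 2/8
```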

Escalation Card format:

  • Consumed: wiki X/3, code Y/8
  • Confirmed facts (<= 5 bullets)
  • Missing info (<= 2 bullets, must be specific)
  • Why it is blocking (one sentence)
  • Proposed next targets (<= 5 file paths / keywords)
  • Request: wiki +1 or code +2 (small step)
  • Fallback if still missing: pick one of:
    • ask 1 critical question
    • request a concrete anchor (class/table/entrypoint) from human
    • deliver a minimal viable plan with explicit risks

When escalation blocks the workflow, set the intent row in launch_spec_*.md to WAITING_APPROVAL and include a link to the relevant artifact.
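
The Escalation Card format maps naturally onto a typed record with the bullet limits enforced by a validator. This is an illustrative sketch: the field names, the helper functions, and the sample card contents are all hypothetical; only the limits and the card sections come from the format above.

```typescript
interface EscalationCard {
  consumed: { wiki: number; code: number };
  confirmedFacts: string[]; // <= 5 bullets
  missingInfo: string[];    // <= 2 bullets, must be specific
  whyBlocking: string;      // one sentence
  nextTargets: string[];    // <= 5 file paths / keywords
  request: "wiki +1" | "code +2";
  fallback: "ask 1 critical question" | "request a concrete anchor" | "minimal viable plan";
}

// Returns a list of format violations (empty array = card is well-formed).
function validateEscalationCard(card: EscalationCard): string[] {
  const problems: string[] = [];
  if (card.confirmedFacts.length > 5) problems.push("confirmed facts: at most 5 bullets");
  if (card.missingInfo.length > 2) problems.push("missing info: at most 2 bullets");
  if (card.nextTargets.length > 5) problems.push("next targets: at most 5 entries");
  if (card.whyBlocking.trim() === "") problems.push("blocking reason is required");
  return problems;
}

function renderEscalationCard(card: EscalationCard): string {
  const bullets = (items: string[]): string =>
    items.map((item) => `  - ${item}`).join("\n");
  return [
    `Consumed: wiki ${card.consumed.wiki}/3, code ${card.consumed.code}/8`,
    `Confirmed facts:\n${bullets(card.confirmedFacts)}`,
    `Missing info:\n${bullets(card.missingInfo)}`,
    `Why it is blocking: ${card.whyBlocking}`,
    `Proposed next targets:\n${bullets(card.nextTargets)}`,
    `Request: ${card.request}`,
    `Fallback if still missing: ${card.fallback}`,
  ].join("\n");
}

// Sample card with made-up contents, for illustration only.
const sampleCard: EscalationCard = {
  consumed: { wiki: 3, code: 6 },
  confirmedFacts: ["Order list route exists", "Tenant id is on the session"],
  missingInfo: ["Where tenant filtering is applied in the query layer"],
  whyBlocking: "The permission check cannot be placed without the query entry point.",
  nextTargets: ["src/orders/", "tenantFilter"],
  request: "code +2",
  fallback: "request a concrete anchor",
};

console.log(validateEscalationCard(sampleCard).length); // 0
console.log(renderEscalationCard(sampleCard).split("\n")[0]); // Consumed: wiki 3/3, code 6/8
```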

Internal Lifecycle Queue Codes (STANDARD Profile Only)

When Profile is STANDARD, the Change intent expands into:

| Code | Phase | Notes |
|---|---|---|
| Explore.Req | Explorer | Clarify requirements + scope anchors |
| Propose.API | Propose → Review | API contract and design |
| Propose.Data | Propose → Review | Database schema changes |
| Implement.Code | Implement → QA | Code changes |
| QA.Test | QA | Tests + evidence |

Launch Spec Template (Machine-Friendly, Supports Breakpoint Resume)

Status values: PENDING, IN_PROGRESS, DONE, WAITING_APPROVAL, FAILED

# Launch Spec - {YYYYMMDD_HHMMSS}

## State Machine
| Intent | Status | Phase | Artifact/Log | Failed_Reason |
|---|---|---|---|---|
| Explore.Req | IN_PROGRESS | 1_Explorer | `explore_report.md` | - |
| Propose.API | PENDING | - | - | - |
| Implement.Code | PENDING | - | - | - |

## Breakpoint Resume
- If session interrupted/human delayed: First action is to read this file upon wake-up.
- If `WAITING_APPROVAL` exists: Enter Approval checkpoint, read corresponding `openspec.md`, wait for human confirmation, then switch status to `IN_PROGRESS` and proceed to Implement.
- If `FAILED` exists: Stop automatic progression, report `Failed_Reason` to human and request intervention.

Key Discipline: The state machine table drives workflow progression. Only update Status/Phase/Failed_Reason fields to avoid checkbox matching failures and state confusion.
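
As an illustration of breakpoint resume, the state machine table can be parsed and turned into a next action. This is a sketch, not the harness's own logic: `parseLaunchSpec` and `resumeAction` are hypothetical helpers, and checking FAILED before WAITING_APPROVAL is an assumed priority that the template does not spell out.

```typescript
type IntentStatus = "PENDING" | "IN_PROGRESS" | "DONE" | "WAITING_APPROVAL" | "FAILED";

interface LaunchRow {
  intent: string;
  status: IntentStatus;
  phase: string;
  artifact: string;
  failedReason: string;
}

const STATUSES: readonly string[] = ["PENDING", "IN_PROGRESS", "DONE", "WAITING_APPROVAL", "FAILED"];

// Extracts data rows from the "State Machine" table of a launch spec.
function parseLaunchSpec(markdown: string): LaunchRow[] {
  const rows: LaunchRow[] = [];
  for (const rawLine of markdown.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("|")) continue;
    const cells = line.replace(/^\||\|$/g, "").split("|").map((c) => c.trim());
    if (cells.length !== 5) continue;
    const [intent = "", status = "", phase = "", artifact = "", failedReason = ""] = cells;
    if (!STATUSES.includes(status)) continue; // skips header and separator rows
    rows.push({
      intent,
      status: status as IntentStatus,
      phase,
      artifact: artifact.replace(/`/g, ""),
      failedReason,
    });
  }
  return rows;
}

type ResumeAction =
  | { kind: "REPORT_FAILURE"; intent: string; reason: string }
  | { kind: "WAIT_FOR_APPROVAL"; intent: string }
  | { kind: "CONTINUE"; intent: string }
  | { kind: "QUEUE_COMPLETE" };

function resumeAction(rows: LaunchRow[]): ResumeAction {
  const failed = rows.find((r) => r.status === "FAILED");
  if (failed) return { kind: "REPORT_FAILURE", intent: failed.intent, reason: failed.failedReason };
  const waiting = rows.find((r) => r.status === "WAITING_APPROVAL");
  if (waiting) return { kind: "WAIT_FOR_APPROVAL", intent: waiting.intent };
  const next =
    rows.find((r) => r.status === "IN_PROGRESS") ?? rows.find((r) => r.status === "PENDING");
  return next ? { kind: "CONTINUE", intent: next.intent } : { kind: "QUEUE_COMPLETE" };
}

const specText = [
  "| Intent | Status | Phase | Artifact/Log | Failed_Reason |",
  "|---|---|---|---|---|",
  "| Explore.Req | IN_PROGRESS | 1_Explorer | `explore_report.md` | - |",
  "| Propose.API | PENDING | - | - | - |",
  "| Implement.Code | PENDING | - | - | - |",
].join("\n");

const specRows = parseLaunchSpec(specText);
console.log(specRows.length);        // 3
console.log(resumeAction(specRows)); // { kind: 'CONTINUE', intent: 'Explore.Req' }
```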


🛡️ Self-Correction Mechanisms

| Mechanism | Trigger Point | Trigger Condition | Effect | Evaluation Method |
|---|---|---|---|---|
| Cognitive_Brake | Before any action | Protocol enforcement | Forces LLM to explicitly reason about roles, boundaries, budgets, and next steps before generating tools or code | XML CoT parsing |
| pre_hook | Before entering new phase | Phase transition | Load relevant rule sets + output Decision-First Preflight + budgets | Required output format |
| guard_hook | During implementation/modification | Style violations, permission breaches, cross-domain pollution, budget exhaustion | Immediate block, require rewrite or authorization; enforce Anti-runaway guard | Standard skill review + Budget rules |
| fail_hook | Any phase failure | Compilation/test/review failures | State downgrade rollback; log failure reason to openspec.md; trigger retry count | Objective logs (compilation/test output) |
| Max Retries | Inside fail_hook | Same phase consecutive failures reach threshold | Force stop and request human intervention (Max 3 for scripts, STRICT MAX 2 for compilation) | Failure count reaches threshold |
| Approval Gate (HITL) | After Review passes | Need to enter Implement | "Freeze contract", human authorizes whether to proceed | Human confirmation (YES/NO + modification feedback) |
| Doc Consistency Gate | post_hook / Archive | Wiki hallucination & contract corruption risk | Read-only validation (schema_checker.py + wiki_linter.py), trigger fail_hook on FAIL | Script exit codes (non-zero = FAIL) |
| Archive Write-back | Task completion | New/changed knowledge needs persistence | Extract stable knowledge from Spec, archive hot documents, update indices (WAL mechanism) | Rule validation, connectivity check |
| Preferences Memory | Before/after Archive | Representative human ratings/feedback | Persist experience as preferences/anti-patterns to wiki/preferences/index.md, effective in next pre_hook | Human rating + textual reasoning |
| Non-Convergence Fallback | Workflow stuck repeating same action | Doc rewrite or linter failure loop | Stop repeating, run deterministic verification, report mismatch, request human intervention | Evidence-based mismatch detection |

🔧 Skills Matrix

The harness ships 35 active skills under .agents/skills/. The canonical list is trae-skill-index/SKILL.md — what follows is a tour of the major sections.

Entry & routing

  • intent-gateway — entry node for every task. Routes natural language into the correct intent + profile + role mounts. Also handles Audit.Codebase, QA.Doc, QA.Doc.Actionize.
  • trae-skill-index — central index for skill discovery.
  • skill-graph-manager — maintains bidirectional links between related skills.

React + Effect frontend (the current target)

Code-health patterns (project-agnostic)

Code standards & engineering

Frontend QA & UX standards

Deprecated (pending archival)

  • devops-lifecycle-master → use intent-gateway + .agents/PROTOCOL.md.
  • devops-testing-standard → use effect-vitest-testing.

8 first-generation devops-* skills, plus prd-task-splitter and product-manager-expert, were archived 2026-05-09. Their responsibilities are now covered by roles in role_matrix.json. See .agents/skills/_archive/README.md.

Phase → mounted roles

The skill-to-phase mapping is now driven by role_matrix.json, not by skill names. The phase mounts roles; the roles invoke skills + gates. See .agents/workflow/ROLE_MATRIX.md (auto-generated from JSON) for the canonical mapping.

Phase | Mounted roles
--- | ---
Explorer | ambiguity_gatekeeper, requirement_engineer, focus_guard
Propose | system_architect
Review | system_architect
Approval (HITL) | none (human checkpoint)
Implement | lead_engineer, focus_guard, security_sentinel
QA | code_reviewer, documentation_curator, accessibility_auditor, visual_critic, performance_warden
Archive | knowledge_extractor, documentation_curator, skill_graph_curator, librarian
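
The phase-to-role table above can be read as a plain lookup structure. The sketch below mirrors it in an in-memory dictionary to show both directions of the lookup. The names `PHASE_ROLES`, `mounted_roles`, and `phases_for_role` are illustrative assumptions, and the actual shape of role_matrix.json may differ.

```python
from typing import Dict, List

# Hypothetical in-memory mirror of the table above; not the real role_matrix.json shape.
PHASE_ROLES: Dict[str, List[str]] = {
    "Explorer": ["ambiguity_gatekeeper", "requirement_engineer", "focus_guard"],
    "Propose": ["system_architect"],
    "Review": ["system_architect"],
    "Approval (HITL)": [],  # human checkpoint, no roles mounted
    "Implement": ["lead_engineer", "focus_guard", "security_sentinel"],
    "QA": [
        "code_reviewer",
        "documentation_curator",
        "accessibility_auditor",
        "visual_critic",
        "performance_warden",
    ],
    "Archive": [
        "knowledge_extractor",
        "documentation_curator",
        "skill_graph_curator",
        "librarian",
    ],
}


def mounted_roles(phase: str) -> List[str]:
    """Return the roles mounted for a phase, failing loudly on unknown phases."""
    if phase not in PHASE_ROLES:
        raise KeyError(f"unknown phase: {phase!r}")
    return list(PHASE_ROLES[phase])


def phases_for_role(role: str) -> List[str]:
    """Return every phase that mounts the given role, in lifecycle order."""
    return [phase for phase, roles in PHASE_ROLES.items() if role in roles]
```

For example, `phases_for_role("focus_guard")` lists Explorer and Implement, which matches the two rows where that role appears.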

📂 Project Structure

effect-harness-agent/
├── .agents/
│   ├── PROTOCOL.md              # 📜 SSOT for routing, lifecycle, hooks, scenarios
│   ├── QUICKSTART.md            # 1-page agent quickstart (eager-load on session start)
│   │
│   ├── router/
│   │   ├── CONTEXT_FUNNEL.md    # Knowledge funnel + reverse write-back
│   │   ├── ROUTER.md            # ⚠️ Deprecated stub — see PROTOCOL.md
│   │   └── runs/                # Launch specs (gitignored)
│   │
│   ├── workflow/
│   │   ├── role_matrix.json         # 🎭 SSOT — roles, gate mounts, phase routing
│   │   ├── role_matrix.schema.json  # JSON Schema for IDE validation
│   │   ├── ROLE_MATRIX.md           # Human-readable view (auto-generated)
│   │   ├── LIFECYCLE.md             # ⚠️ Deprecated stub — see PROTOCOL.md
│   │   ├── HOOKS.md                 # ⚠️ Deprecated stub — see PROTOCOL.md
│   │   ├── ARCHIVE_WAL.md           # WAL conventions
│   │   ├── bundle-budget.json       # Default bundle budgets
│   │   ├── lighthouse-budget.json   # Default Web Vitals budgets
│   │   └── runs/                    # Gates reports + engine_state.json (gitignored)
│   │
│   ├── llm_wiki/
│   │   ├── KNOWLEDGE_GRAPH.md   # 🗺️ Root node (mandatory entry)
│   │   ├── purpose.md
│   │   ├── schema/
│   │   │   ├── openspec_schema.md
│   │   │   └── subagent_contract_schema.md
│   │   ├── wiki/                # Active knowledge domains
│   │   │   ├── api-contracts/   # External APIs the client consumes
│   │   │   ├── routes/          # Frontend route map
│   │   │   ├── components/      # Component catalog
│   │   │   ├── design-system/   # Tokens, primitives
│   │   │   ├── flows/           # User flows
│   │   │   ├── runtime/         # ManagedRuntime composition
│   │   │   ├── layers/          # Effect Layer definitions
│   │   │   ├── services/        # Service interfaces
│   │   │   ├── schemas/         # Effect Schema definitions
│   │   │   ├── architecture/    # ADRs (WAL fragments)
│   │   │   ├── specs/           # Active openspec.md
│   │   │   └── preferences/     # Project preferences
│   │   └── archive/             # Cold storage (extracted specs)
│   │
│   ├── skills/                  # 35 active skills + _archive/
│   │   ├── intent-gateway/
│   │   ├── trae-skill-index/    # Central index (every skill linked here)
│   │   ├── effect-runtime-and-layers/
│   │   ├── effect-schema-as-contract/
│   │   ├── effect-vitest-testing/
│   │   └── ... (30+ more — see trae-skill-index)
│   │
│   └── scripts/
│       ├── gates/               # Process (.py) + frontend (.ts) gates
│       │   ├── run.py           # Unified runner (resolves mounts from role_matrix.json)
│       │   ├── ambiguity_gate.py, scope_guard.py, secrets_linter.py, ...
│       │   ├── skill_schema_checker.py, openspec_gate.py, ...
│       │   └── a11y_gate.ts, visual_regression_gate.ts, tsgo_gate.ts, ...
│       ├── harness/
│       │   ├── engine.py        # Lifecycle queue + state machine
│       │   └── self_test.py     # Role-matrix consistency
│       ├── wiki/                # Knowledge-graph linters + compactor
│       └── tools/               # GC + snapshot helpers
│
├── tests/                       # Pytest contract tests for parsing gates
├── examples/
│   └── standard-run-walkthrough/  # Documentation-only end-to-end run example
│
├── AGENTS.md                # 📌 Master rule entry (read every session)
├── CLAUDE.md                # Claude Code-specific entry point
├── PROTOCOL.md              # (mirror lives at .agents/PROTOCOL.md)
├── ENGINEERING_MANUAL.md    # Detailed engineering manual
├── CONTRIBUTING.md          # Contributor workflow
├── CODE_OF_CONDUCT.md       # Contributor Covenant 2.1 (by reference)
├── SECURITY.md              # Vulnerability disclosure policy
├── SKILLS_GOVERNANCE.md     # Skill lifecycle + schema rules
├── CHANGELOG.md             # Significant changes by date
└── LICENSE                  # MIT

🔍 Optional Diagnostic Tools

These scripts provide deterministic quality checks (report only, don't modify files):

Graph Health Check

python .agents/scripts/wiki/wiki_linter.py

Checks: Dead links, orphaned files, index length warnings

Contract Structure Validation

python .agents/scripts/wiki/schema_checker.py

Checks: Missing key sections, JSON example presence

Preference Tag Inspection

python .agents/scripts/wiki/pref_tag_checker.py

Checks: Rule tag conventions for precise retrieval

Unified Gate Runner

python .agents/scripts/gates/run.py --intent <intent> --profile <profile> --phase <phase>

Function: Automatically run relevant gate scripts based on current phase


🎯 Engineering Red Lines

🚫 No Blind Search

Always start from the Knowledge Graph root and drill down through the indices. Fall back to search only when the indices fail.

🚫 No Unauthorized Access

Cross-domain modifications require explicit authorization in openspec.md and confirmation during Review/HITL phases.

🚫 No Runaway Loops

Failure rollback + max retry threshold (3 attempts for scripts, STRICT MAX 2 for compilation). Stop and request human intervention when the threshold is reached. Never loop indefinitely.
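
The retry rule above amounts to a bounded loop with a hard stop. Here is a minimal sketch under stated assumptions: the category keys, the `HumanInterventionRequired` exception, and `run_with_retry_budget` are hypothetical names, and rollback between attempts is left out.

```python
from typing import Callable, Dict

# Thresholds taken from the rule above; the category keys are illustrative.
MAX_ATTEMPTS: Dict[str, int] = {"script": 3, "compilation": 2}


class HumanInterventionRequired(Exception):
    """Raised when the retry budget for a category is exhausted."""


def run_with_retry_budget(kind: str, action: Callable[[int], bool]) -> int:
    """Call action(attempt) until it succeeds or the budget for `kind` runs out.

    Returns the 1-based attempt number that succeeded.
    """
    limit = MAX_ATTEMPTS[kind]
    for attempt in range(1, limit + 1):
        if action(attempt):
            return attempt
    # Budget exhausted: stop instead of looping, and escalate to a human.
    raise HumanInterventionRequired(
        f"{kind} failed {limit} times; stopping and requesting human intervention"
    )
```

Because the loop is a `for` over a fixed range, the action can never run more often than the budget allows, whatever it returns.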

🚫 No Knowledge Bloat

  • Specs must be archived after extraction
  • Stable knowledge must be extracted to indices
  • Indices exceeding 500 lines must be split into subdirectories
  • Archive execution transitions automatically in the same session, using targeted git diff <files> or openspec.md to avoid context overload.
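
The 500-line limit in the list above is easy to check mechanically. The sketch below assumes that indices are files named index.md (as with wiki/preferences/index.md) and that "exceeding" means strictly more than 500 lines. The helper names are hypothetical, and this is not the shipped wiki_linter.py.

```python
from pathlib import Path
from typing import List

MAX_INDEX_LINES = 500  # limit stated in the rule above


def count_lines(text: str) -> int:
    """Count lines the way a text editor would, ignoring a trailing newline."""
    return len(text.splitlines())


def oversized_indices(root: Path, limit: int = MAX_INDEX_LINES) -> List[Path]:
    """Return index.md files under `root` that have more than `limit` lines."""
    return sorted(
        path
        for path in root.rglob("index.md")  # assumed index file name
        if count_lines(path.read_text(encoding="utf-8")) > limit
    )
```

Like the diagnostic tools described earlier, this is report-only: it lists candidates for splitting and leaves the decision about subdirectory layout to the Archive phase.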

⚠️ Known Limitations

This is a pre-1.0 harness. Adopt with eyes open:

  • Pre-release status — current version is v0.x. The protocol surface (lifecycle phases, role matrix schema, gate exit codes) is mostly stable but not yet frozen. Pin a commit if you depend on byte-for-byte stability.
  • tsgo_gate.ts enforces tsgo configuration, not diagnostics directly — it verifies that @effect/tsgo is wired into tsconfig.json so that tsserver-aware editors (VS Code, JetBrains, Zed) surface the ~70 Effect-specific diagnostics in real time. Upstream @effect/tsgo is currently a Language Service plugin without a CLI runner, so harness-side diagnostic enforcement is upstream-blocked. The gate is feature-complete for the current upstream surface; when a tsgo check CLI ships, the gate will swap to a diagnostic runner (see the migration note in the gate header). For authoritative type errors today, use your project's tsc --noEmit.
  • Effect Schema drift detection is opt-in: effect_schema_drift_gate.ts only runs when you wire it into role_matrix.json for your project. There is no global default because schema drift policy depends on whether your project treats schemas as published contracts.
  • TS gates assume peer dependencies: a11y_gate, visual_regression_gate, bundle_budget_gate, web_vitals_gate, and console_error_gate need Playwright + axe + Lost-Pixel + Lighthouse + ts-morph installed in the target project. See .agents/scripts/gates/package.json for the peer-dep list.
  • Skill governance is manual: the skills in .agents/skills/ (35 active, plus the _archive/ directory) follow a stable → deprecated → archive lifecycle (see SKILLS_GOVERNANCE.md), but moves are still operator-driven. Frontmatter is now CI-validated by skill_schema_checker.py.
  • LLM-native by design — there is no human CLI, web UI, or REPL. If you are evaluating this with "developer tool" expectations, see the Critical Positioning Statement at the top of this file.
  • Windows path quoting — gate scripts run on Linux/macOS/Windows, but if you invoke them from PowerShell, double-quote any path containing spaces. WSL is recommended on Windows for the smoothest experience.

If you hit a limitation not listed here, open an issue — known-limitations is a living section.


📖 Related Documentation

Legacy .agents/router/ROUTER.md, .agents/workflow/LIFECYCLE.md, and .agents/workflow/HOOKS.md are deprecated stubs that point at PROTOCOL.md sections — do not add new content there.


🤝 Contributing

Contributions are welcome. Start with CONTRIBUTING.md for the full PR workflow, gate/skill authoring guide, and self-test instructions. By participating you agree to the Code of Conduct. Security issues should follow SECURITY.md (do not open a public issue for vulnerabilities).

Quick checklist:

  1. Read First: AGENTS.md, .agents/PROTOCOL.md, and ENGINEERING_MANUAL.md for deeper context.
  2. Follow Lifecycle: All non-trivial changes go through the 6-phase lifecycle.
  3. Update Knowledge: Extract stable knowledge to the appropriate domain index.
  4. Run Diagnostics: python3 .agents/scripts/harness/self_test.py and the relevant gates.
  5. Submit PR: Include openspec.md (or a Slim Spec for @patch) for any change with MEDIUM or HIGH risk.

🏷️ Versioning

The harness follows semver. The current release is v0.1.0 — the first tagged release after a feature-complete polish pass on the React + Effect frontend scope. Pre-1.0 minor versions may include breaking changes to the protocol surface (lifecycle phases, role matrix shape, gate exit codes); pin a specific tag in your downstream projects if you need byte-for-byte stability. From v1.0.0 onward, breaking changes will follow a documented deprecation window of one minor release.
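
To make the pinning advice concrete, the sketch below classifies an upgrade as potentially breaking under the policy described above. It assumes tags of the form vMAJOR.MINOR.PATCH and that patch releases are non-breaking, which the text does not state. `parse_version` and `upgrade_may_break` are hypothetical helpers, not part of the harness.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, int, int]:
    """Parse a tag such as 'v0.1.0' into (major, minor, patch)."""
    major, minor, patch = tag.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def upgrade_may_break(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` may change the protocol surface."""
    cur_major, cur_minor, _ = parse_version(current)
    new_major, new_minor, _ = parse_version(target)
    if new_major != cur_major:
        return True  # a major bump may always break
    if cur_major == 0:
        return new_minor != cur_minor  # pre-1.0: minor bumps may break
    return False  # 1.x and later: minor and patch bumps are assumed compatible
```

Under these assumptions, going from v0.1.0 to v0.2.0 is flagged as potentially breaking, while going from v0.1.0 to v0.1.1 is not.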

See CHANGELOG.md for the full release history.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

This framework draws inspiration from:

  • OpenSpec: Contract-first development methodology
  • Harness: Lifecycle state machines & hook systems
  • LLM Wiki: Evolvable knowledge graphs with anti-bloat mechanisms
  • Agentic Patterns: Autonomous agent workflows with human-in-the-loop checkpoints
