Autonoma-AI · tomaspiaggio · Apr 23, 2026 · Apr 7, 2026
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -14,12 +14,12 @@
         "repo": "Autonoma-AI/test-planner-plugin",
         "ref": "production"
       },
-      "description": "Generates comprehensive E2E test cases through a validated 4-step pipeline with deterministic validation"
+      "description": "Generates comprehensive E2E test cases through a validated multi-step pipeline with deterministic validation. Includes generate-tests (full suite) and generate-adhoc-tests (focused topic) commands."
     },
     {
       "name": "autonoma-test-planner-development",
       "source": "./",
-      "description": "[DEVELOPMENT] Generates comprehensive E2E test cases through a validated 4-step pipeline with deterministic validation"
+      "description": "[DEVELOPMENT] Generates comprehensive E2E test cases through a validated multi-step pipeline with deterministic validation. Includes generate-tests (full suite) and generate-adhoc-tests (focused topic) commands."
     }
   ]
 }
diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
@@ -1,11 +1,8 @@
 {
   "name": "autonoma-test-planner",
   "description": "Generates comprehensive E2E test cases for a codebase through a validated multi-step pipeline with deterministic validation at each step",
-  "version": "1.1.0",
+  "version": "1.13.1",
   "author": {
     "name": "Autonoma"
-  },
-  "commands": [
-    "./commands"
-  ]
+  }
 }
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
@@ -14,5 +14,5 @@ jobs:
       - uses: actions/setup-python@v5
         with:
           python-version: "3.11"
-      - run: pip install pytest pyyaml
+      - run: pip install pytest pyyaml Faker
       - run: pytest tests/ -v
diff --git a/.release-please-manifest.json b/.release-please-manifest.json
@@ -1,3 +1,3 @@
 {
-  ".": "1.4.0"
+  ".": "1.14.0"
 }
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -1,56 +1,80 @@
 # Autonoma Test Planner Plugin
 
-Claude Code plugin that generates E2E test suites through a 4-step deterministic pipeline.
+Claude Code plugin that generates E2E test suites through a deterministic multi-step pipeline.
 
 ## Project Structure
 
-```
-.claude-plugin/           # Plugin manifest (plugin.json, marketplace.json)
-commands/generate-tests.md  # Entry point — dispatches the 4-step pipeline
-skills/generate-tests/SKILL.md  # Orchestrator skill
-agents/                   # Isolated subagents (one per step)
-  kb-generator.md         # Step 1: Knowledge base → autonoma/AUTONOMA.md + features.json
-  scenario-generator.md   # Step 2: Scenarios → autonoma/scenarios.md
-  test-case-generator.md  # Step 3: Tests → autonoma/qa-tests/INDEX.md + test files
-  env-factory-generator.md # Step 4: Environment factory endpoint
+```text
+.claude-plugin/              # Plugin manifest
+commands/generate-tests.md   # Full pipeline command
+commands/generate-adhoc-tests.md
+skills/generate-tests/SKILL.md
+skills/generate-adhoc-tests/SKILL.md
+agents/
+  kb-generator.md              # Step 1: Knowledge base
+  entity-audit-generator.md    # Step 2: Entity creation audit
+  scenario-generator.md        # Step 3: Scenarios
+  env-factory-generator.md     # Step 4: Environment Factory implementation
+  scenario-validator.md        # Step 5: Scenario lifecycle validation
+  test-case-generator.md       # Step 6: E2E tests
+  focused-test-case-generator.md
 hooks/
-  hooks.json              # PostToolUse hook config (triggers on Write)
-  validate-pipeline-output.sh  # Bash dispatcher → routes to Python validators
-  validators/             # Python scripts that validate YAML frontmatter
+  hooks.json
+  pipeline-kickoff.sh
+  pretool-heartbeat.sh
+  transcript-streamer.py
+  validate-pipeline-output.sh
+  preflight_scenario_recipes.py
+  validators/
+    evals/
+tests/
 ```
 
-## How the Pipeline Works
+## Pipeline
 
-Each step spawns an isolated subagent. After each Write, the PostToolUse hook in `hooks/hooks.json` runs `validate-pipeline-output.sh`, which pattern-matches the file path and runs the appropriate Python validator. Validators exit 0 (OK) or 2 (block with error message).
+1. Knowledge Base
+2. Entity Creation Audit
+3. Scenarios
+4. Implement Environment Factory
+5. Validate Scenario Lifecycle
+6. Generate E2E Tests
 
-Steps 1-3 require user confirmation before advancing. Step 4 is the final step (no gate).
+The full pipeline is interactive. After each of steps 1-5, Claude presents that step's summary and waits for user confirmation before continuing. Lifecycle reporting is handled by plugin hooks, not by ad hoc agent `curl` calls.
 
 ## Validation
 
-Validators are in `hooks/validators/`. They parse YAML frontmatter and check required fields, types, and cross-file consistency. All validators print "OK" on success or an error message on failure.
+Validators are in `hooks/validators/`.
 
 | Validator | File matched | Key checks |
 |-----------|-------------|------------|
-| `validate_kb.py` | `*/autonoma/AUTONOMA.md` | app_name, app_description (≥20 chars), core_flows with at least one `core: true` |
-| `validate_features.py` | `*/autonoma/features.json` | features array length matches total_features, valid types, at least one core feature |
-| `validate_scenarios.py` | `*/autonoma/scenarios.md` | scenario_count ≥ 3, standard/empty/large scenarios present, entity_types |
-| `validate_test_index.py` | `*/autonoma/qa-tests/INDEX.md` | test totals match folder sums, criticality sums, cross-checks against features.json |
-| `validate_test_file.py` | `*/autonoma/qa-tests/*/[!I]*.md` | title, description, criticality (critical/high/mid/low), scenario, flow |
+| `validate_kb.py` | `*/autonoma/AUTONOMA.md` | frontmatter and core-flow structure |
+| `validate_features.py` | `*/autonoma/features.json` | feature inventory schema |
+| `validate_entity_audit.py` | `*/autonoma/entity-audit.md` | model creation classification and owner links |
+| `validate_scenarios.py` | `*/autonoma/scenarios.md` | scenario count, metadata, required sections |
+| `validate_endpoint_implemented.py` | `*/autonoma/.endpoint-implemented` | handler path and factory integrity |
+| `validate_creation_file_immutable.py` | `*/autonoma/.endpoint-implemented` | accepted audit creation files were not rewritten unsafely |
+| `validate_factory_fidelity.py` | `*/autonoma/.endpoint-implemented` | semantic per-model factory fidelity |
+| `validate_scenario_validation.py` | `*/autonoma/.scenario-validation.json` | Step 5 terminal-state contract |
+| `validate_scenario_recipes.py` | `*/autonoma/scenario-recipes.json` | recipe schema |
+| `validate_test_index.py` | `*/autonoma/qa-tests/INDEX.md` | test totals and folder sums |
+| `validate_directory_structure.py` | `*/autonoma/qa-tests/INDEX.md` | test directory structure |
+| `validate_test_file.py` | `*/autonoma/qa-tests/*/[!I]*.md` | test frontmatter |
+
+Scenario recipes also run live endpoint preflight through `hooks/preflight_scenario_recipes.py`.
+
+Test file writes are blocked until `autonoma/.endpoint-validated` exists.
 
 ## Development
 
 ```bash
-# Run plugin locally without installing
 claude --plugin-dir ./
-
-# Validate plugin structure
 claude plugin validate ./
+pytest
 ```
 
-## Dependencies
-
-- Python 3 + PyYAML (auto-installed by the hook if missing)
-
-## Known Issues
+## Notes
 
-- `commands/generate-tests.md` has unresolved merge conflicts between the AskUserQuestion approach and the end-turn approach for user confirmation between steps. Resolve before merging to main.
+- Step 4 implements the Environment Factory and may edit target backend code.
+- Step 4 writes `autonoma/.endpoint-implemented` only after the `discover` smoke check and the factory-integrity checks pass.
+- Step 5 validates signed `discover` / `up` / `down` for every scenario and may fix handler bugs or reconcile `scenarios.md`.
+- Step 6 is gated on `autonoma/.endpoint-validated`.
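Review note on the validator table in the CLAUDE.md hunk above: the path-to-validator routing it describes can be sketched roughly as follows. This is a minimal illustration, not the plugin's dispatcher. The `ROUTES` list and the `validators_for` helper are hypothetical names. The globs and script names are copied from the table. Matching with Python's `fnmatch` semantics and running validators in table order are both assumptions, since the real routing lives in `hooks/validate-pipeline-output.sh`.

```python
from fnmatch import fnmatchcase

# Glob -> validator scripts, copied from the CLAUDE.md table in this diff.
# The list structure and ordering are illustrative assumptions.
ROUTES = [
    ("*/autonoma/AUTONOMA.md", ["validate_kb.py"]),
    ("*/autonoma/features.json", ["validate_features.py"]),
    ("*/autonoma/entity-audit.md", ["validate_entity_audit.py"]),
    ("*/autonoma/scenarios.md", ["validate_scenarios.py"]),
    ("*/autonoma/.endpoint-implemented", [
        "validate_endpoint_implemented.py",
        "validate_creation_file_immutable.py",
        "validate_factory_fidelity.py",
    ]),
    ("*/autonoma/.scenario-validation.json", ["validate_scenario_validation.py"]),
    ("*/autonoma/scenario-recipes.json", ["validate_scenario_recipes.py"]),
    ("*/autonoma/qa-tests/INDEX.md", [
        "validate_test_index.py",
        "validate_directory_structure.py",
    ]),
    ("*/autonoma/qa-tests/*/[!I]*.md", ["validate_test_file.py"]),
]


def validators_for(path: str) -> list[str]:
    """Return every validator whose glob matches the written file path."""
    matched: list[str] = []
    for pattern, names in ROUTES:
        # fnmatchcase keeps matching case-sensitive on every platform.
        if fnmatchcase(path, pattern):
            matched.extend(names)
    return matched


if __name__ == "__main__":
    for written in (
        "/repo/autonoma/qa-tests/INDEX.md",
        "/repo/autonoma/qa-tests/auth/login.md",
        "/repo/README.md",
    ):
        print(written, "->", validators_for(written))
```

One consequence worth checking in review: three validators share the `.endpoint-implemented` marker and two share `INDEX.md`, so a single write to either file fans out to several checks.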
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
@@ -4,8 +4,8 @@ This guide explains how to test changes from a branch without publishing to the
 
 ## Prerequisites
 
-- [Claude Code](https://claude.ai/code) installed
-- Your branch pushed to GitHub
+- [Claude Code](https://claude.ai/code)
+- your branch pushed to GitHub
 
 ## Install from a branch
 
@@ -31,15 +31,22 @@ Push new commits to your branch, then reinstall:
 
 ## Environment variables
 
-The plugin requires three environment variables to be set in the project where you run it:
+The plugin itself requires these values in the target project session:
 
 | Variable | Description |
 | --- | --- |
-| `AUTONOMA_API_KEY` | Your Autonoma API key (get it from the dashboard under Settings > API Keys) |
-| `AUTONOMA_PROJECT_ID` | The application ID from the Autonoma dashboard |
-| `AUTONOMA_API_URL` | API base URL - use `http://localhost:4000` for local dev |
+| `AUTONOMA_API_KEY` | Autonoma API key |
+| `AUTONOMA_PROJECT_ID` | Application ID from the Autonoma dashboard |
+| `AUTONOMA_API_URL` | API base URL, for example `http://localhost:4000` in local dev |
 
-Add them to the `.env` file or export them in your shell before running Claude Code in the target project.
+You do **not** need to pre-set `AUTONOMA_SDK_ENDPOINT`, `AUTONOMA_SHARED_SECRET`, or `AUTONOMA_SIGNING_SECRET`.
+Step 1 creates or discovers those values in the target repo by editing `.env` and `.env.example`.
+
+Use `AUTONOMA_AUTO_ADVANCE=true` as the canonical launch mode while testing. The older
+confirmation flag is still honored: `AUTONOMA_REQUIRE_CONFIRMATION=false` is treated as
+equivalent to auto-advance.
+
+After the generated PR is merged, the user still needs to deploy those env changes.
 
 ## References