diff --git a/CLAUDE.md b/CLAUDE.md
index 7685d42..7bb5b8b 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,37 +1,45 @@
 # Autonoma Test Planner Plugin
 
-Claude Code plugin that generates E2E test suites through a deterministic 5-step pipeline.
+Claude Code plugin that generates E2E test suites through a deterministic multi-step pipeline.
 
 ## Project Structure
 
 ```text
 .claude-plugin/  # Plugin manifest
-commands/generate-tests.md  # Command entry
+commands/generate-tests.md  # Full pipeline command
+commands/generate-adhoc-tests.md
 skills/generate-tests/SKILL.md
+skills/generate-adhoc-tests/SKILL.md
 agents/
-  sdk-integrator.md  # Step 1: SDK integration
-  kb-generator.md  # Step 2: Knowledge base
-  scenario-generator.md  # Step 3: Scenarios
-  test-case-generator.md  # Step 4: E2E tests
-  scenario-validator.md  # Step 5: Scenario validation
+  kb-generator.md  # Step 1: Knowledge base
+  entity-audit-generator.md  # Step 2: Entity creation audit
+  scenario-generator.md  # Step 3: Scenarios
+  env-factory-generator.md  # Step 4: Environment Factory implementation
+  scenario-validator.md  # Step 5: Scenario lifecycle validation
+  test-case-generator.md  # Step 6: E2E tests
+  focused-test-case-generator.md
 hooks/
   hooks.json
+  pipeline-kickoff.sh
+  pretool-heartbeat.sh
+  transcript-streamer.py
   validate-pipeline-output.sh
   preflight_scenario_recipes.py
   validators/
+    evals/
 tests/
 ```
 
 ## Pipeline
 
-1. SDK Integration
-2. Knowledge Base
+1. Knowledge Base
+2. Entity Creation Audit
 3. Scenarios
-4. E2E Tests
-5. Scenario Validation
+4. Implement Environment Factory
+5. Validate Scenario Lifecycle
+6. Generate E2E Tests
 
-The canonical launch mode is `AUTONOMA_AUTO_ADVANCE=true`. If you are still using the older flag,
-`AUTONOMA_REQUIRE_CONFIRMATION=false` is treated as the same auto-advance behavior. Step 5 is final.
+The full pipeline is interactive. After steps 1-5, Claude presents the step summary and waits for user confirmation before continuing.
 Lifecycle reporting is handled by plugin hooks, not by ad hoc agent curl calls.
 
 ## Validation
@@ -39,19 +47,23 @@ Validators are in `hooks/validators/`.
 
 | Validator | File matched | Key checks |
 |-----------|-------------|------------|
-| `validate_kb.py` | `*/autonoma/AUTONOMA.md` | app_name, app_description, core_flows |
-| `validate_discover.py` | `*/autonoma/discover.json` | schema object, models, edges, relations, scopeField |
-| `validate_sdk_endpoint.py` | `*/autonoma/.sdk-endpoint` | absolute http/https URL |
-| `validate_sdk_integration.py` | `*/autonoma/.sdk-integration.json` | Step 1 handoff contract |
+| `validate_kb.py` | `*/autonoma/AUTONOMA.md` | frontmatter and core-flow structure |
 | `validate_features.py` | `*/autonoma/features.json` | feature inventory schema |
-| `validate_scenarios.py` | `*/autonoma/scenarios.md` | scenario count and metadata |
+| `validate_entity_audit.py` | `*/autonoma/entity-audit.md` | model creation classification and owner links |
+| `validate_scenarios.py` | `*/autonoma/scenarios.md` | scenario count, metadata, required sections |
+| `validate_endpoint_implemented.py` | `*/autonoma/.endpoint-implemented` | handler path and factory integrity |
+| `validate_creation_file_immutable.py` | `*/autonoma/.endpoint-implemented` | accepted audit creation files were not rewritten unsafely |
+| `validate_factory_fidelity.py` | `*/autonoma/.endpoint-implemented` | semantic per-model factory fidelity |
 | `validate_scenario_validation.py` | `*/autonoma/.scenario-validation.json` | Step 5 terminal-state contract |
 | `validate_scenario_recipes.py` | `*/autonoma/scenario-recipes.json` | recipe schema |
 | `validate_test_index.py` | `*/autonoma/qa-tests/INDEX.md` | test totals and folder sums |
+| `validate_directory_structure.py` | `*/autonoma/qa-tests/INDEX.md` | test directory structure |
 | `validate_test_file.py` | `*/autonoma/qa-tests/*/[!I]*.md` | test frontmatter |
 
 Scenario recipes also run live endpoint preflight through `hooks/preflight_scenario_recipes.py`.
 
+Test file writes are blocked until `autonoma/.endpoint-validated` exists.
+
 ## Development
 
 ```bash
@@ -62,6 +74,7 @@ pytest
 
 ## Notes
 
-- Step 1 installs the SDK from package managers only.
-- The SDK reference repo is read-only context.
-- Step 5 validates the live integration and does not edit backend code.
+- Step 4 implements the Environment Factory and may edit target backend code.
+- Step 4 writes `autonoma/.endpoint-implemented` only after discover smoke and factory-integrity checks pass.
+- Step 5 validates signed `discover` / `up` / `down` for every scenario and may fix handler bugs or reconcile `scenarios.md`.
+- Step 6 is gated on `autonoma/.endpoint-validated`.
diff --git a/README.md b/README.md
index 6ef2b28..28fb2b5 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
 # Autonoma Test Planner
 
-A Claude Code plugin that generates comprehensive E2E test suites for your codebase through a validated 5-step pipeline.
+A Claude Code plugin that generates comprehensive E2E test suites for your codebase through a validated 6-step pipeline.
 
-Each step runs in an isolated subagent with deterministic validation. The first step now integrates the Autonoma SDK directly into the target project, and the final step validates scenarios against that live endpoint without editing backend code.
+Each step runs in an isolated subagent with deterministic validation. The pipeline audits how application entities are created, implements an Autonoma Environment Factory against the target app, validates scenario lifecycles through the live endpoint, and only then generates E2E tests.
 
 ## Install
 
@@ -19,55 +19,73 @@ Inside any project with Claude Code:
 /autonoma-test-planner:generate-tests
 ```
 
-The canonical launch mode is `AUTONOMA_AUTO_ADVANCE=true`, which keeps the plugin moving after
-Steps 1-4. If you are still using the older confirmation flag, `AUTONOMA_REQUIRE_CONFIRMATION=false`
-is treated as the same auto-advance behavior.
+The full pipeline is interactive. After steps 1-5, Claude presents the step summary and waits for your confirmation before continuing.
+
+Lifecycle reporting is hook-driven:
+
+- `hooks/pipeline-kickoff.sh` creates the setup record and writes `autonoma/.docs-url` plus `autonoma/.generation-id`.
+- `hooks/validate-pipeline-output.sh` validates artifacts, emits step events, uploads artifacts, and enforces the test-generation gate.
+- `hooks/pretool-heartbeat.sh` keeps dashboard activity reporting alive while tools are running.
 
 ## Pipeline
 
-### Step 1: SDK Integration
+### Step 1: Knowledge Base
 
-Detects the project stack, installs the Autonoma SDK from package managers, wires the endpoint, ensures secrets exist, starts or reuses a local dev server, verifies signed `discover` / `up` / `down`, and writes `autonoma/.sdk-endpoint` plus `autonoma/.sdk-integration.json`.
+Analyzes the app and produces `autonoma/AUTONOMA.md`, `autonoma/skills/*.md`, and `autonoma/features.json`.
 
-It may also create a branch, commit the integration, and open a PR when `gh` is available.
+**You review**: the core flows table.
 
-**You review**: detected stack, installed packages, endpoint URL, generated env vars, and PR status.
+### Step 2: Entity Creation Audit
 
-### Step 2: Knowledge Base
+Audits every database model and records how each model comes into existence in `autonoma/entity-audit.md`.
 
-Analyzes the app and produces `autonoma/AUTONOMA.md` and `autonoma/features.json`.
+Models marked `independently_created: true` become Environment Factory factories that call the app's real creation functions. Dependent-only models use the SDK's raw SQL fallback and are torn down through their owner model.
 
-**You review**: the core flows table.
+**You review**: factory-backed models, dependent-only models, and any dual-creation models.
 
 ### Step 3: Scenarios
 
-Fetches `discover` from the Step 1 endpoint and produces `autonoma/discover.json` plus `autonoma/scenarios.md`.
+Reads the knowledge base and `autonoma/entity-audit.md`, then produces `autonoma/scenarios.md`.
 
-**You review**: entity names, counts, relationships, and which values should stay concrete versus variable.
+Scenarios include `standard`, `empty`, and `large`, track variable fields that must vary across runs, and use nested create trees rooted at the scope entity.
 
-### Step 4: E2E Tests
+**You review**: entity names, counts, relationships, variable fields, and via-owner versus standalone creation choices.
 
-Generates markdown test files in `autonoma/qa-tests/` plus `INDEX.md`.
+### Step 4: Implement Environment Factory
 
-**You review**: test distribution and coverage correlation.
+Installs and configures the Autonoma SDK endpoint, then registers a factory for every `independently_created: true` model from `entity-audit.md`.
+
+This step runs a signed `discover` smoke test and factory-integrity checks, then writes `autonoma/.endpoint-implemented`. It does **not** run full `up` / `down`; lifecycle validation happens in Step 5.
 
-### Step 5: Scenario Validation
+**You review**: handler path, installed packages, factories registered, and required secrets.
 
-Validates `standard`, `empty`, and `large` against the live SDK endpoint, writes `autonoma/scenario-recipes.json` plus `autonoma/.scenario-validation.json`, runs endpoint preflight, and uploads the approved recipes to the setup API only after all checks pass.
+### Step 5: Validate Scenario Lifecycle
 
-This step does **not** implement backend code. It only validates the existing integration.
+Runs signed `discover` / `up` / `down` against every scenario. The validator may fix handler bugs or reconcile `autonoma/scenarios.md` with real endpoint behavior.
+
+On success, it writes `autonoma/scenario-recipes.json`, `autonoma/.scenario-validation.json`, and `autonoma/.endpoint-validated`. The `.endpoint-validated` sentinel gates Step 6; test files cannot be written before it exists.
+
+**You review**: scenarios passed, scenario edits, preflight result, and recipe upload status.
+
+### Step 6: Generate E2E Tests
+
+Generates markdown test files in `autonoma/qa-tests/` plus `autonoma/qa-tests/INDEX.md`.
+
+**You review**: test distribution and coverage correlation.
 
 ## Key Outputs
 
-- `autonoma/.sdk-endpoint`: validated SDK endpoint URL
-- `autonoma/.sdk-integration.json`: Step 1 machine-readable handoff
 - `autonoma/AUTONOMA.md`
+- `autonoma/skills/*.md`
 - `autonoma/features.json`
-- `autonoma/discover.json`
+- `autonoma/entity-audit.md`
 - `autonoma/scenarios.md`
-- `autonoma/qa-tests/INDEX.md`
-- `autonoma/.scenario-validation.json`: Step 5 terminal-state artifact
+- `autonoma/.factory-plan.md`
+- `autonoma/.endpoint-implemented`
 - `autonoma/scenario-recipes.json`
+- `autonoma/.scenario-validation.json`
+- `autonoma/.endpoint-validated`
+- `autonoma/qa-tests/INDEX.md`
 
 ## Ad Hoc Test Generation
 
@@ -89,7 +107,7 @@ Or invoke without arguments and the command will suggest focus areas based on yo
 
 ### How it works
 
-**Subsequent runs** (scenarios already configured in Autonoma): fetches scenarios and existing tests from the Autonoma, then runs only focused test generation (Step 3). Steps 1, 2, and 4 are skipped.
+**Subsequent runs** (active scenarios and recipes already exist in Autonoma): fetches existing scenario, skill, and test context from Autonoma, then runs only focused test generation for the requested topic.
 
 Tests are written to `autonoma/qa-tests/{focus-slug}/` so they sit alongside your existing test suite without overwriting it.
 
@@ -108,50 +126,40 @@ autonoma/qa-tests/
 Provide these before running the plugin:
 
 ```bash
+AUTONOMA_DOCS_URL=
 AUTONOMA_API_KEY=
 AUTONOMA_PROJECT_ID=
 AUTONOMA_API_URL=
 ```
 
-Canonical:
-
-```bash
-AUTONOMA_AUTO_ADVANCE=true
-```
+`AUTONOMA_DOCS_URL` is required so subagents can fetch the latest Autonoma instructions. `AUTONOMA_API_KEY`, `AUTONOMA_PROJECT_ID`, and `AUTONOMA_API_URL` are required for dashboard setup records, lifecycle events, artifact uploads, and recipe uploads.
 
-Compatibility alias:
-
-```bash
-AUTONOMA_REQUIRE_CONFIRMATION=false
-```
-
-You no longer need to pre-provide `AUTONOMA_SDK_ENDPOINT` or `AUTONOMA_SHARED_SECRET`. Step 1 creates or discovers them in the target project.
-
-The integration step updates `.env` and `.env.example` in the target repo with:
+The Environment Factory step generates or discovers these target-app values and updates `.env` and `.env.example` when applicable:
 
 ```bash
 AUTONOMA_SHARED_SECRET=
 AUTONOMA_SIGNING_SECRET=
 ```
 
-Those changes still need to be deployed after PR creation or merge.
+`AUTONOMA_SDK_ENDPOINT` is needed by scenario validation and recipe preflight once the endpoint exists. Generated environment changes still need to be deployed with the target app.
 
 ## Validation
 
 Every pipeline output is validated by shell-dispatched Python validators.
 
-| File | Validation |
-| --- | --- |
-| `AUTONOMA.md` | frontmatter and core-flow structure |
-| `features.json` | feature inventory schema |
-| `discover.json` | SDK discover schema |
-| `.sdk-endpoint` | absolute `http` or `https` URL |
-| `.sdk-integration.json` | Step 1 handoff contract |
-| `scenarios.md` | scenario schema and required sections |
-| `.scenario-validation.json` | Step 5 terminal-state contract |
-| `scenario-recipes.json` | recipe schema plus live endpoint preflight |
-| `INDEX.md` | test totals and folder breakdown |
-| test files | required frontmatter |
+| File | Validator | Validation |
+| --- | --- | --- |
+| `AUTONOMA.md` | `validate_kb.py` | frontmatter and core-flow structure |
+| `features.json` | `validate_features.py` | feature inventory schema |
+| `entity-audit.md` | `validate_entity_audit.py` | model creation classification, factory counts, and owner links |
+| `scenarios.md` | `validate_scenarios.py` | scenario schema and required sections |
+| `.endpoint-implemented` | `validate_endpoint_implemented.py`, `validate_creation_file_immutable.py`, `validate_factory_fidelity.py` | handler path, factory integrity, immutable audit snapshot, and semantic factory fidelity |
+| `.scenario-validation.json` | `validate_scenario_validation.py` | Step 5 terminal-state contract |
+| `scenario-recipes.json` | `validate_scenario_recipes.py` | recipe schema plus live endpoint preflight |
+| `INDEX.md` | `validate_test_index.py`, `validate_directory_structure.py` | test totals, folder breakdown, and directory structure |
+| test files | `validate_test_file.py` | required frontmatter |
+
+Test files are blocked until `autonoma/.endpoint-validated` exists.
 
 ## Local Development
 
@@ -167,26 +175,25 @@ pytest
 autonoma-test-planner/
 ├── .claude-plugin/
 ├── commands/generate-tests.md
+├── commands/generate-adhoc-tests.md
 ├── skills/generate-tests/SKILL.md
+├── skills/generate-adhoc-tests/SKILL.md
 ├── agents/
-│   ├── sdk-integrator.md
 │   ├── kb-generator.md
+│   ├── entity-audit-generator.md
 │   ├── scenario-generator.md
+│   ├── env-factory-generator.md
 │   ├── test-case-generator.md
+│   ├── focused-test-case-generator.md
 │   └── scenario-validator.md
 ├── hooks/
+│   ├── pipeline-kickoff.sh
+│   ├── pretool-heartbeat.sh
+│   ├── transcript-streamer.py
 │   ├── validate-pipeline-output.sh
 │   ├── preflight_scenario_recipes.py
 │   └── validators/
-├── adhoc/
-│   ├── .claude-plugin/
-│   ├── skills/generate-adhoc-tests/SKILL.md
-│   ├── commands/generate-adhoc-tests.md
-│   ├── agents/focused-test-case-generator.md
-│   └── hooks/
-│       ├── hooks.json
-│       ├── validate-pipeline-output.sh
-│       └── validators/
+│       └── evals/
 └── tests/
 ```
 