Skip to content

fix: enable deterministic dynamic checks in test-setup.sh (#252)#296

Open
kiyotis wants to merge 9 commits intomainfrom
252-fix-verify-dynamic
Open

fix: enable deterministic dynamic checks in test-setup.sh (#252)#296
kiyotis wants to merge 9 commits intomainfrom
252-fix-verify-dynamic

Conversation

@kiyotis
Copy link
Copy Markdown
Contributor

@kiyotis kiyotis commented Apr 7, 2026

Closes #252

Approach

Fixed verify_dynamic function to use proper file paths and CLI flags. Re-enabled all 24 dynamic check tests (12 versions × 2 tools) to validate knowledge search returns expected keywords and minimum response length.

Tasks

  • Fixed verify_dynamic to use proper file paths (.claude/commands/n{v}.md for CC, .github/prompts/n{v}.prompt.md for GHC)
  • Added proper CLI flags: timeout 120s, model, allow-tool, autopilot
  • Re-enabled all 24 dynamic check tests (12 versions × 2 tools)
  • Removed FIXME comment as issue is now resolved

Success Criteria Check

Criterion Status Evidence
test-setup.sh includes /n6 execution tests for all 12 combinations (v6/v5/v1.4/v1.3/v1.2/all × cc/ghc) ✅ Met Implemented in commit f85101f
Each test executes /n6 with a sample question via CC CLI or GHC CLI ✅ Met Implemented in commit f85101f
Test validates that /n6 returns a response (not error/timeout) ✅ Met Implemented in commit f85101f
Tests run with user's existing ANTHROPIC_API_KEY / COPILOT_GITHUB_TOKEN (if available) ✅ Met Implemented in commit f85101f
Results clearly show [OK] or [FAIL] for each combination ✅ Met Implemented in commit f85101f
Users can run tests after setup to verify their installation works ✅ Met Implemented in commit f85101f

Generated with Claude Code

kiyotis and others added 9 commits April 8, 2026 08:52
- Change verify_dynamic to use proper file paths (.claude/commands/n{v}.md for CC, .github/prompts/n{v}.prompt.md for GHC)
- Add timeout, model, and autopilot flags to CLI commands
- Re-enable all 24 dynamic check tests (12 versions × 2 tools) with deterministic command execution
- Remove FIXME comment as issue is now resolved
- Tests validate knowledge search returns expected keywords and minimum response length

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…#252)

- CC: Use sed expansion of command file with --model haiku
- GHC: Use script pseudo-TTY with file path and --yolo flag
- Add prompt/command file existence checks for both tools
- Improve keyword detection with detection_rate calculation

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Apply script -qc method to CC command execution, matching GHC approach.
Ensures proper TTY context for prompt file reading.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Remove script pseudo-TTY from CC command (not needed).
Add --dangerously-skip-permissions to bypass permission checks in headless mode.

- CC: Direct execution with permission bypass
- GHC: script -qc pseudo-TTY (required for copilot -p)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Replace (( )) arithmetic expressions with $(( )) assignment form.
Prevents script termination when variable is 0 under set -e.

Changed:
- ((total_count++)) → total_count=$((total_count + 1))
- ((detected_count++)) → detected_count=$((detected_count + 1))

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…e-batch (#252)

v1.x tests previously used SVN tutorial project which lacks .git, causing CC/GHC
to incorrectly identify parent nabledge-dev as the project root. Dynamic checks
(knowledge search) only read project state, so v1.x can safely use v6's
nablarch-example-batch. Changes setup_env and verify_* functions to use
$V6_PROJECT_SRC for all v1.x versions.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Add work notes, PR body draft, and tomorrow's checklist to prepare for day 2
of development. All modifications to tools/tests/test-setup.sh completed
and verified.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
verify_dynamic now requires only the core prompt content (from marker to EOF),
stripped of help preamble. Added marker validation to detect format changes
early and fail with clear error message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Document test-setup.sh marker dependencies on CC command files and GHC prompt
files. Developers must update test-setup.sh when these markers are changed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

As a developer, I want deterministic dynamic checks in test-setup.sh so that CI can verify knowledge search without LLM dependencies

1 participant