fix: enable deterministic dynamic checks in test-setup.sh (#252) #296
Open
- Change verify_dynamic to use proper file paths (.claude/commands/n{v}.md for CC, .github/prompts/n{v}.prompt.md for GHC)
- Add timeout, model, and autopilot flags to CLI commands
- Re-enable all 24 dynamic check tests (12 versions × 2 tools) with deterministic command execution
- Remove FIXME comment as issue is now resolved
- Tests validate knowledge search returns expected keywords and minimum response length
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…#252)
- CC: Use sed expansion of command file with --model haiku
- GHC: Use script pseudo-TTY with file path and --yolo flag
- Add prompt/command file existence checks for both tools
- Improve keyword detection with detection_rate calculation
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
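As a rough illustration of the detection_rate idea mentioned in this commit, the sketch below counts how many expected keywords appear in a response and reports the result as an integer percentage. The function signature, the case-insensitive fixed-string matching, and the percentage scale are assumptions made for this example; the actual calculation in test-setup.sh may differ.

```shell
# Hypothetical helper (not taken from test-setup.sh): print the percentage
# of expected keywords that occur in a response string.
# Usage: detection_rate "<response text>" keyword1 keyword2 ...
detection_rate() {
  local response=$1
  shift
  local total_count=0 detected_count=0 keyword
  for keyword in "$@"; do
    total_count=$((total_count + 1))
    # -q: quiet, -i: ignore case, -F: treat the keyword as a literal string
    if printf '%s\n' "$response" | grep -qiF -- "$keyword"; then
      detected_count=$((detected_count + 1))
    fi
  done
  # Avoid division by zero when no keywords were supplied.
  if [ "$total_count" -eq 0 ]; then
    echo 0
    return 0
  fi
  # Integer division, so the result is rounded down.
  echo $((detected_count * 100 / total_count))
}

# Example: two of the three keywords are present in the response.
detection_rate "Batch handlers process records in a transaction." batch handler queue
```

A test could then compare the printed value against a threshold of its choosing; no particular threshold is implied by the commit message.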
Apply script -qc method to CC command execution, matching GHC approach. Ensures proper TTY context for prompt file reading. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Remove script pseudo-TTY from CC command (not needed). Add --dangerously-skip-permissions to bypass permission checks in headless mode.
- CC: Direct execution with permission bypass
- GHC: script -qc pseudo-TTY (required for copilot -p)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Replace (( )) arithmetic expressions with $(( )) assignment form. Prevents script termination when variable is 0 under set -e.
Changed:
- ((total_count++)) → total_count=$((total_count + 1))
- ((detected_count++)) → detected_count=$((detected_count + 1))
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
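The failure mode behind this change can be reproduced in isolation. In bash, a `(( ))` command returns status 1 when its expression evaluates to 0, and a post-increment evaluates to the old value, so `((count++))` with `count=0` counts as a failed command under `set -e`. The minimal sketch below runs both forms in a child bash process; the helper name `run_under_errexit` is made up for this example.

```shell
# Hypothetical helper: run a snippet in a child bash with errexit enabled
# and print the child's exit status.
run_under_errexit() {
  bash -c "set -e; $1" >/dev/null 2>&1
  echo $?
}

# Post-increment yields the OLD value (0), so (( )) returns status 1 and
# errexit stops the child before `echo done` runs.
unsafe_status=$(run_under_errexit 'count=0; ((count++)); echo done')

# A plain assignment returns status 0 whatever value is computed.
safe_status=$(run_under_errexit 'count=0; count=$((count + 1)); echo done')

echo "((count++)) form:          exit status ${unsafe_status}"
echo "count=\$((count + 1)) form: exit status ${safe_status}"
```

The first form only fails on the increment from 0, which is why a counter that happens to start at a nonzero value can hide the problem.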
…e-batch (#252)
v1.x tests previously used the SVN tutorial project, which lacks a .git directory, causing CC/GHC to incorrectly identify the parent nabledge-dev as the project root. Dynamic checks (knowledge search) only read project state, so v1.x can safely use v6's nablarch-example-batch. Changes setup_env and the verify_* functions to use $V6_PROJECT_SRC for all v1.x versions.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Add work notes, PR body draft, and tomorrow's checklist to prepare for day 2 of development. All modifications to tools/tests/test-setup.sh completed and verified. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
verify_dynamic now requires only the core prompt content (from marker to EOF), stripped of help preamble. Added marker validation to detect format changes early and fail with clear error message. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
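One way to picture the marker-to-EOF extraction with early validation is the sketch below. The function name, the marker text used in the demo, and the choice to include the marker line itself in the output are all assumptions for illustration; the real markers live in the CC command files and GHC prompt files and are not shown here.

```shell
# Hypothetical helper (not taken from test-setup.sh): print a file from the
# first line containing the marker through EOF, or fail with a clear error
# if the marker is missing.
# Note: awk -v interprets backslash escapes, so this sketch assumes a
# marker without backslashes.
extract_prompt_body() {
  local file=$1 marker=$2
  if ! grep -qF -- "$marker" "$file"; then
    echo "ERROR: marker '$marker' not found in $file (format changed?)" >&2
    return 1
  fi
  # index() does a literal substring match; once the marker is seen,
  # every remaining line (including the marker line) is printed.
  awk -v m="$marker" 'index($0, m) { found = 1 } found' "$file"
}

# Demo with a throwaway file and a made-up marker.
demo_file=$(mktemp)
printf 'Usage: help preamble\n## Prompt\nSearch the knowledge base.\n' > "$demo_file"
extract_prompt_body "$demo_file" '## Prompt'
rm -f "$demo_file"
```

Validating the marker before extracting means a renamed or removed marker surfaces as an explicit error instead of an empty prompt being sent to the CLI.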
Document test-setup.sh marker dependencies on CC command files and GHC prompt files. Developers must update test-setup.sh when these markers are changed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes #252
Approach
Fixed verify_dynamic function to use proper file paths and CLI flags. Re-enabled all 24 dynamic check tests (12 versions × 2 tools) to validate knowledge search returns expected keywords and minimum response length.
Tasks
Success Criteria Check
- /n6 execution tests for all 12 combinations (v6/v5/v1.4/v1.3/v1.2/all × cc/ghc)
- /n6 with a sample question via CC CLI or GHC CLI
- /n6 returns a response (not error/timeout)

Generated with Claude Code