test(tdd): add failing security tests for SEC-01 through SEC-07 hardening by rysweet · Pull Request #3290 · rysweet/amplihack

rysweet · 2026-03-18T01:58:54Z

Summary

TDD test files specifying the security hardening contract from the #3266/#3282 design specification. All tests intentionally fail against current code — they define the implementation target for the follow-up security PRs.

86 tests collected | 81 fail | 4 pass (SEC-02 regression guards) | 1 skip

Test files added

File	Security items covered	Tests	Status
`tests/test_memory_auto_install_security.py`	SEC-02, SEC-04, SEC-07	14	3 pass, 10 fail, 1 skip
`tests/recipes/test_run_recipe_by_name_working_dir.py`	SEC-03, SEC-05	33	all fail
`tests/recipes/test_step_02b_output_validation.py`	SEC-01	39	all fail

What each set tests

SEC-01 (test_step_02b_output_validation.py) — validate_codebase_analysis() in new module src/amplihack/recipes/_validation.py:

Allowlist-only keys: {files, patterns, dependencies, entry_points, build_system, language, test_framework}
Max 64 KB JSON-serialised size
Max nesting depth 3
Rejects non-dict inputs, invalid JSON strings, empty dicts
Rejects XPiA probe keys (__system__, IMPORTANT)

SEC-03 (test_run_recipe_by_name_working_dir.py) — _validate_working_dir(path) helper:

Shell metacharacter rejection: ;, &, |, $, backtick, (, ), {, }, [, ], <, >, \\, ", '
Non-existent path rejection
File-instead-of-directory rejection
Returns pathlib.Path for valid directories
run_recipe_by_name calls validator before Rust runner

SEC-04 (test_memory_auto_install_security.py) — Hash verification:

--require-hashes flag in pip install command
src/amplihack/memory_auto_install_hashes.txt file existence and non-emptiness
Hash reference in pip command

SEC-05 (test_run_recipe_by_name_working_dir.py) — DeprecationWarning for unknown kwargs:

adapter="old" emits DeprecationWarning naming the kwarg
No warning for known params (user_context, dry_run, progress, etc.)

SEC-07 (test_memory_auto_install_security.py) — Error sanitization:

_sanitize_error() strips Unix and Windows absolute paths
_do_install calls _sanitize_error() on pip failure output

Test plan

Confirm all 81 tests fail on main (proving they are real TDD tests)
Implement SEC-07 (_sanitize_error) → 6 SEC-07 tests pass
Implement SEC-04 (hashes file + --require-hashes) → 4 SEC-04 tests pass
Implement SEC-03 (_validate_working_dir + integration into run_recipe_by_name) → 15 SEC-03 tests pass
Implement SEC-05 (DeprecationWarning in run_recipe_by_name) → 5 SEC-05 tests pass
Implement SEC-01 (validate_codebase_analysis in _validation.py) → 39 SEC-01 tests pass
All 86 tests green = implementation complete

🤖 Generated with Claude Code

…ning Adds three TDD test files specifying the contract for unimplemented security features from the #3266/#3282 design specification. Tests are intentionally failing against current code: tests/test_memory_auto_install_security.py - SEC-02: regression guard — sys.executable used in pip command (3 pass) - SEC-04: --require-hashes + hashes file + version pin (3 fail, 1 skip) - SEC-07: _sanitize_error() strips absolute paths from error output (6 fail) tests/recipes/test_run_recipe_by_name_working_dir.py - SEC-03: _validate_working_dir() helper — metachar/nonexistent/file rejection (11 fail) - SEC-05: DeprecationWarning emitted for unknown **kwargs (5 fail) tests/recipes/test_step_02b_output_validation.py - SEC-01: validate_codebase_analysis() — whitelist keys, 64KB limit, depth ≤ 3, type validation, empty dict rejection (39 fail) Total: 86 collected | 81 fail | 4 pass | 1 skip Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-03-18T01:59:33Z

🤖 Auto-fixed version bump

The version in pyproject.toml has been automatically bumped to the next patch version.

If you need a minor or major version bump instead, please update pyproject.toml manually and push the change.

github-actions · 2026-03-18T02:02:40Z

Repo Guardian - Passed

All 4 changed files were reviewed and no violations were found.

File	Assessment
`pyproject.toml`	Standard version bump — durable configuration file ✅
`tests/recipes/test_run_recipe_by_name_working_dir.py`	TDD security tests for SEC-03/SEC-05 — permanent test suite ✅
`tests/recipes/test_step_02b_output_validation.py`	TDD security tests for SEC-01 — permanent test suite ✅
`tests/test_memory_auto_install_security.py`	TDD security tests for SEC-02/SEC-04/SEC-07 — permanent test suite ✅

No ephemeral content, meeting notes, one-off scripts, or temporal snapshots detected. All additions are durable security specifications appropriate for the repository.

Generated by Repo Guardian for issue #3290 · ◷

github-actions · 2026-03-18T06:36:28Z

PR Triage Report

Risk: 🟡 Medium | Priority: High | Type: Test / Security (TDD foundation)

Summary

TDD test files specifying the security hardening contract for SEC-01 through SEC-07. 81 of 86 tests intentionally fail against current code — this is by design, establishing the implementation target for follow-up security PRs.

Coverage:

SEC-01: validate_codebase_analysis() — allowlist-only keys, size/depth limits, XPiA probe rejection
SEC-02/04/07: Hash verification (--require-hashes), error sanitization
SEC-03: _validate_working_dir() — shell metacharacter injection prevention
SEC-05: DeprecationWarning for unknown kwargs

CI Status

GitGuardian: ✅ Pass
Full CI suite: Not run (only GitGuardian)

Risk Factors

Intentionally failing tests: Merging will cause CI test runs to report failures until implementation PRs (smart-orchestrator/default-workflow stalls at step-02b-analyze-codebase #3266/smart-orchestrator launch broken in Copilot CLI (API mismatch + PEP 668 auto-install) #3282 follow-ups) land
Base SHA mismatch: Based on 7fd2fc190a77 — older than current main (c8608fe9) — branch needs rebase
Security-critical scope: SEC-03 (working dir injection prevention) and SEC-01 (XPiA defense) address real attack vectors

Recommendation

⚠️ Rebase required before merge (base is behind main). Consider merging together with or immediately before the implementation PRs to avoid a CI red period. The security contracts defined here are well-specified and valuable — mark for triage:security-review.

Generated by PR Triage Agent · ◷

Ubuntu and others added 2 commits March 18, 2026 01:58

[skip ci] chore: Auto-bump patch version

53daa28

github-actions bot added triage:complete triage:has-tests triage:high-priority triage:medium-risk triage:security-review labels Mar 18, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(tdd): add failing security tests for SEC-01 through SEC-07 hardening#3290

test(tdd): add failing security tests for SEC-01 through SEC-07 hardening#3290
rysweet wants to merge 2 commits intomainfrom
tdd/security-hardening-3266-3282

rysweet commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rysweet commented Mar 18, 2026

Summary

Test files added

What each set tests

Test plan

Uh oh!

github-actions bot commented Mar 18, 2026

Uh oh!

github-actions bot commented Mar 18, 2026

Repo Guardian - Passed

Uh oh!

github-actions bot commented Mar 18, 2026

PR Triage Report

Summary

CI Status

Risk Factors

Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant