test(e2e): make secrets guardrail case-insensitive (fix python-e2e flake) #277
Open
v1r3n wants to merge 2 commits into
…flake test_suite8_guardrails::test_agent_output_secrets_blocked flaked: the agent was asked to say "password", the model replied "Password security is crucial..." (capitalised, sentence-start), and the G3_NO_SECRETS guardrail's case-sensitive patterns (\bpassword\b) missed it — so the word reached the output and the test's own case-insensitive assertion failed. A secrets filter should match any casing, so add (?i) to the patterns. The guardrail runs locally via Python re.compile, so the inline flag is portable. Verified against the exact CI failure output: old patterns miss "Password", new patterns catch it -> deterministic. (No LLM in the assertion path; the flake was the guardrail config, not the test's check.)
…atterns test_plan_reflects_all_guardrails hard-codes G3's expected pattern strings; align them with the (?i) forms so it matches the compiled plan. test_agent_output_secrets_blocked already passes with the fix.
Summary
python-e2e → test_suite8_guardrails::test_agent_output_secrets_blocked was failing intermittently:
Root cause
The test asks the agent to include "password" in its reply and expects the G3_NO_SECRETS output guardrail to either escalate or scrub it. But the guardrail's patterns were case-sensitive:
The LLM nondeterministically replied "Password security is crucial…" (capitalised, sentence-start). \bpassword\b doesn't match "Password", so the guardrail passed it through → the word reached the output → the test's own case-insensitive assertion failed. Whether it flaked depended purely on how the model capitalised the word.
Fix
Make the patterns case-insensitive with an inline (?i) flag. A secrets filter should catch any casing, so this is also the correct behaviour. RegexGuardrail evaluates locally via Python re.compile, so the inline flag is portable.
Verification (CLAUDE.md make-it-fail)
Checked against the exact CI failure string:
The assertion path uses a deterministic regex (no LLM judging) — the flake was the guardrail config, not the check.
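The miss described above, reproduced as an added example (not the project's code or its RegexGuardrail class). Only the `\bpassword\b` and `(?i)` pattern forms and the quoted reply fragment come from the page; the function names are hypothetical, and the audit helper goes one step further by flagging any pattern that compiles without `re.IGNORECASE`.

```python
import re

# Pattern forms quoted on the page: before and after the fix.
OLD_PATTERNS = [r"\bpassword\b"]
NEW_PATTERNS = [r"(?i)\bpassword\b"]

# Reply fragment quoted in the PR description (the rest is elided there).
CI_REPLY = "Password security is crucial..."


def blocked(patterns, text):
    """Return the patterns that match `text`; an empty list means it passes through."""
    return [p for p in patterns if re.compile(p).search(text)]


def case_sensitive_patterns(patterns):
    """Audit helper: patterns whose compiled form lacks re.IGNORECASE.

    Inline flags such as (?i) are reflected in the compiled object's .flags,
    so this catches both a missing inline flag and a missing flags= argument.
    """
    return [p for p in patterns if not re.compile(p).flags & re.IGNORECASE]


def missed_casings(patterns, word):
    """Casings of `word` that the guardrail fails to block (constructed variants)."""
    base = word.lower()
    mixed = "".join(c.upper() if i % 2 else c for i, c in enumerate(base))
    variants = [base, base.capitalize(), base.upper(), mixed]
    return [v for v in variants if not blocked(patterns, v)]


if __name__ == "__main__":
    print("old blocks CI reply:", bool(blocked(OLD_PATTERNS, CI_REPLY)))
    print("new blocks CI reply:", bool(blocked(NEW_PATTERNS, CI_REPLY)))
    print("old misses:", missed_casings(OLD_PATTERNS, "password"))
    print("new misses:", missed_casings(NEW_PATTERNS, "password"))
    print("needs (?i):", case_sensitive_patterns(OLD_PATTERNS + NEW_PATTERNS))
```
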