Skip to content

fix(examples/hooks): bash_command_validator regex false negatives (#59441)#59508

Open
dhruba-datta wants to merge 1 commit into
anthropics:mainfrom
dhruba-datta:fix/bash-validator-regex-bugs
Open

fix(examples/hooks): bash_command_validator regex false negatives (#59441)#59508
dhruba-datta wants to merge 1 commit into
anthropics:mainfrom
dhruba-datta:fix/bash-validator-regex-bugs

Conversation

@dhruba-datta
Copy link
Copy Markdown

Fixes #59441.

The _VALIDATION_RULES in examples/hooks/bash_command_validator_example.py had two regex bugs that caused silent false negatives on common command shapes:

  1. grep: ^grep\b(?!.*\|) exempted grep even when it was the leading command of a pipeline (e.g. grep foo | wc -l), because the (?!.*\|) lookahead fails on any pipe anywhere in the string. The lookahead was apparently intended to exempt downstream uses like cat foo | grep bar, but the ^grep anchor already handles that — the lookahead was dead code that turned into a false-negative source. Dropped it.

  2. find: ^find\s+\S+\s+-name\b only matched the find PATH -name shape. The most common real-world form — find PATH -type f -name '*.log' — was missed because of the rigid \S+\s+-name adjacency. Changed to ^find\s+\S+.*\s-name\b to allow arbitrary predicates between the path and -name.

Case matrix (matches the issue body)

Command Before After
grep foo | wc -l not flagged flagged ✓
grep foo bar.txt | head -5 not flagged flagged ✓
find / -type f -name '*.log' not flagged flagged ✓
find . -type d -name node_modules not flagged flagged ✓
find . -maxdepth 2 -name '*.md' not flagged flagged ✓

Regression guards (still NOT flagged):

Command Status
cat foo | grep bar not flagged ✓ (downstream grep)
xargs find . -name foo not flagged ✓ (anchored to start)
find / -type f -newer foo.txt not flagged ✓ (no -name)
findstr foo not flagged ✓ (\b after find)

Considered switching to shlex-based parsing to also catch prefix forms (sudo grep, time grep) but that would expand the example's scope significantly; happy to do as a follow-up if the maintainers prefer that direction.

Fixes anthropics#59441.

Two regex bugs in _VALIDATION_RULES caused silent false negatives:

1. grep: ^grep\b(?!.*\|) exempted leading grep in pipelines
   (e.g. "grep foo | wc -l") because the (?!.*\|) lookahead
   fails on any pipe anywhere in the string. The ^grep anchor
   already excludes downstream uses like "cat foo | grep bar",
   so the lookahead was dead code creating false negatives.
   Dropped it.

2. find: ^find\s+\S+\s+-name\b only matched "find PATH -name"
   shape. The most common real form — "find PATH -type f -name
   '*.log'" — was missed because the regex required -name to
   be adjacent to the path token. Changed to
   ^find\s+\S+.*\s-name\b to allow arbitrary predicates between
   the path and -name.

Verified against the full case matrix from the issue plus
regression guards (downstream grep, xargs find, findstr, find
without -name).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] examples/hooks/bash_command_validator_example.py: regex bugs cause silent false negatives on common shapes

1 participant