
fix: tokenize expressions correctly in function/macro args #431

Open
toddr-bot wants to merge 1 commit into main from koan.toddr.bot/fix-tokenizer-expr-parsing

@toddr-bot

Contributor

What

Fix tokenizer so that expressions like show(x-2) and show(x/2) work correctly without spaces around operators.

Why

Fixes GH #315. The tokenizer misparses expressions inside function/macro arguments when there are no spaces around - and / operators:

  • show(x-2) silently passes x (value 5) instead of x-2 (value 3) — the -2 is consumed as a separate negative number argument
  • show(x/2) throws a parse error — /2 is consumed as a filename token

With spaces (show(x - 2), show(x / 2)) everything works fine. The bug is purely in the tokenizer's regex.
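To see why spacing matters, here is a minimal sketch of a regex-driven tokenizer. It is not the project's code: `TOKEN_RE` and `naive_tokens` are hypothetical names, and the pattern set is cut down to the signed-number pattern the commit message quotes (`-?\d+`) plus just enough to scan `x-2`.

```python
import re

# Hypothetical, heavily simplified pattern set. Only the NUMBER pattern
# (-?\d+) is taken from the PR; everything else is illustrative.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>-?\d+)          # signed integer: the minus is part of the literal
  | (?P<IDENT>[A-Za-z_]\w*)    # identifier
  | (?P<BINOP>[-+*/])          # binary operator
  | (?P<WS>\s+)                # whitespace, skipped
""", re.VERBOSE)


def naive_tokens(text):
    """Scan left to right, taking the first alternative that matches."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}: {text[pos:]!r}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(naive_tokens("x-2"))    # the minus is absorbed into the number
print(naive_tokens("x - 2"))  # the space keeps the minus a separate operator
```

Without spaces, the number pattern wins at the `-` and produces a negative literal, so no operator token is emitted. With spaces, `-?\d+` cannot match at `- 2`, and the minus falls through to the operator pattern.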

How

Two targeted changes in tokenise_directive():

  1. Negative number splitting: When a negative number token immediately follows a value-producing token (IDENT, NUMBER, LITERAL, ')' or ']'), split it into BINOP '-' + positive NUMBER. Negative numbers in any other position are left untouched, which preserves standalone negative literals like -3 and [-1, -2].

  2. Filename regex tightening: Changed \/\w+ to \/[a-zA-Z_]\w* so that /2 is not matched as a filename. Path segments after / or :: now require a leading letter or underscore, which matches real template filenames.
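The two changes can be sketched roughly as follows. This illustrates the technique and is not the code in `tokenise_directive()`: the `(kind, text)` token tuples, the `VALUE_TOKENS` set, and the names `split_negative_numbers`, `OLD_SEGMENT` and `NEW_SEGMENT` are all assumptions made for the example. Only the two regex fragments come from the PR description.

```python
import re

# Assumed token kinds after which a '-' must be a binary operator.
# Representing ')' and ']' as their own kinds is a simplification.
VALUE_TOKENS = {"IDENT", "NUMBER", "LITERAL", ")", "]"}


def split_negative_numbers(tokens):
    """Rewrite NUMBER '-n' as BINOP '-' + NUMBER 'n' when it follows a value."""
    fixed = []
    for kind, text in tokens:
        # Look at the previous *emitted* token so chains like x-2-3 work.
        prev_kind = fixed[-1][0] if fixed else None
        if kind == "NUMBER" and text.startswith("-") and prev_kind in VALUE_TOKENS:
            fixed.append(("BINOP", "-"))
            fixed.append(("NUMBER", text[1:]))
        else:
            fixed.append((kind, text))
    return fixed


# Path-segment fragments quoted in the PR: before and after the change.
OLD_SEGMENT = re.compile(r"/\w+")
NEW_SEGMENT = re.compile(r"/[a-zA-Z_]\w*")

print(split_negative_numbers([("IDENT", "x"), ("NUMBER", "-2")]))
print(bool(OLD_SEGMENT.fullmatch("/2")), bool(NEW_SEGMENT.fullmatch("/2")))
```

A leading `-3`, or a `-1` directly after `[` or `,`, has no value-producing token before it and passes through unchanged. Under the tightened pattern, `/2` no longer looks like a path segment, while a segment such as `/header` still matches.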

Testing

  • Added 3 new test cases to t/macro.t covering expressions in macro args, negative literals, and multi-arg expressions
  • Full test suite passes: 3176 tests across 116 files
  • Edge cases verified: negative literals, negative lists, chained subtraction, parenthesized expressions, division without spaces

🤖 Generated with Claude Code


Quality Report

Changes: 2 files changed, 37 insertions(+), 1 deletion(-)

Code scan: clean

Tests: skipped

Branch hygiene: clean

Generated by Kōan post-mission quality pipeline

The tokenizer had two bugs that caused expressions like `show(x-2)` and
`show(x/2)` to be misparsed when written without spaces:

1. The number regex `-?\d+` consumed the minus sign as part of a negative
   number literal, so `x-2` tokenized as IDENT:'x' NUMBER:'-2' instead
   of IDENT:'x' BINOP:'-' NUMBER:'2'. Fix: when a negative number follows
   a value-producing token (IDENT, NUMBER, LITERAL, ')' or ']'), split it
   into a BINOP '-' and a positive NUMBER.

2. The filename regex `\/\w+` matched `/2` as a filename, so `x/2`
   tokenized as FILENAME:'x/2'. Fix: require filename path segments to
   start with a letter or underscore, not a digit.

Standalone negative literals like `-3` and `[-1, -2]` continue to work.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@toddr-bot

Contributor Author

Recreated from #345 (auto-closed when the toddr-bot fork was removed). Original branch is now hosted on this repo.
