Fix extension grammar: multi-char operators (=>, ->, …) highlight as one token #31

Merged
assapir merged 3 commits into main from worktree-agent-a4f93e0aa5137af65
Jun 27, 2026
assapir commented Jun 27, 2026

Problem

In the Quilon VS Code grammar, multi-character operators risked being highlighted as two tokens: => could color the = in one scope and the > in another, and the same applied to ->, :=, |>, <-, ==, !=, <=, >=, &&, || and ::. TextMate applies, at each position, the first pattern in a rule list that matches there; ties at the same start position are decided by list order, not match length. A single-character operator rule ordered ahead of a multi-character one would therefore split the operator.
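
To make the ordering issue concrete, here is a minimal sketch of ordered first-match-wins tokenization. It is not the code in src/grammar.ts: the Rule and Token shapes, the tokenize helper, and both scope names are hypothetical, and it only tries rules anchored at the current position rather than modelling the full TextMate engine.

```typescript
// Hypothetical rule shape for illustration; the real rules live in
// syntaxes/quilon.tmLanguage.json.
interface Rule {
  scope: string;
  pattern: RegExp;
}

interface Token {
  text: string;
  scope: string;
}

// At each position, try the rules in list order and take the first one that
// matches there. Match length plays no part in the choice.
function tokenize(line: string, rules: Rule[]): Token[] {
  const tokens: Token[] = [];
  let pos = 0;
  while (pos < line.length) {
    let advanced = false;
    for (const rule of rules) {
      // The sticky flag anchors the match at exactly `pos`.
      const re = new RegExp(rule.pattern.source, "y");
      re.lastIndex = pos;
      const m = re.exec(line);
      if (m !== null && m[0].length > 0) {
        tokens.push({ text: m[0], scope: rule.scope });
        pos += m[0].length;
        advanced = true;
        break;
      }
    }
    if (!advanced) pos += 1; // character not covered by any operator rule
  }
  return tokens;
}

// Scope names below are made up for the example.
const singleChar: Rule = { scope: "keyword.operator.quilon", pattern: /[=<>|!]/ };
const arrow: Rule = { scope: "keyword.operator.arrow.quilon", pattern: /=>/ };

const buggyOrder: Rule[] = [singleChar, arrow]; // single-char rule first
const fixedOrder: Rule[] = [arrow, singleChar]; // multi-char rule first

const buggyTokens = tokenize("a=>b", buggyOrder); // two tokens: "=" and ">"
const fixedTokens = tokenize("a=>b", fixedOrder); // one token: "=>"
```

The only difference between the two runs is list order, which is the property the fix relies on.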

Fix

In syntaxes/quilon.tmLanguage.json, make every multi-character operator win as a single token:

  • Order all multi-char operator rules before the single-char rules, so each one precedes any single-char operator that is a prefix of it (|>/|| before |, <= before <, =>/== before =, etc.). Each multi-char operator keeps exactly one scope.
  • Split the old &&|\|\||! rule so the two-char &&/|| sit in the multi-char tier while the single-char ! stays in the single-char tier (keeps != correct).
  • << / >> remain single import/export tokens (unchanged module-line rules).
  • No regression to $ (unit), ~ comments, </> (comparison + block delimiters), strings, or numbers.
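
The ordering rule in the first bullet can be stated as an invariant: no operator may appear earlier in the list than a longer operator it is a prefix of. A small sketch of a checker for that invariant follows. The prefixOrderViolations helper and the flat string lists are assumptions for illustration; the actual grammar expresses operators as regex patterns in JSON, and the single-char list here only includes operators named in this PR.

```typescript
// Returns [prefix, longer] pairs where a shorter operator is ordered ahead of
// a longer operator that starts with it. An empty result means the order is safe.
function prefixOrderViolations(ordered: string[]): Array<[string, string]> {
  const violations: Array<[string, string]> = [];
  ordered.forEach((earlier, i) => {
    for (const later of ordered.slice(i + 1)) {
      if (later.length > earlier.length && later.startsWith(earlier)) {
        violations.push([earlier, later]);
      }
    }
  });
  return violations;
}

// Multi-char tier first, then the single-char tier (illustrative lists).
const multiCharTier = ["=>", "->", ":=", "|>", "<-", "::", "==", "!=", "<=", ">=", "&&", "||"];
const singleCharTier = ["=", "<", ">", "|", "!"];

const safeOrder = [...multiCharTier, ...singleCharTier];
const unsafeOrder = ["=", "<", "|", "=>", "==", "<=", "|>", "||"];

const safeViolations = prefixOrderViolations(safeOrder); // empty
const unsafeViolations = prefixOrderViolations(unsafeOrder); // includes ["=", "=>"]
```

A check like this could run over the grammar's operator list as a cheap guard, separate from the tokenization tests.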

Tests (mandatory)

Added src/grammar.test.ts, backed by a tiny dependency-free TextMate match-engine (src/grammar.ts) that reproduces the ordered first-match-wins rule (no native vscode-textmate/vscode-oniguruma dependency, so it runs under the existing node --test gate). It asserts:

  • each multi-char operator (=>, ->, :=, |>, <-, ::, ==, !=, <=, >=, &&, ||) tokenizes to a single, consistent scope — both spaced (a => b) and tight (a=>b);
  • <</>> stay single tokens; adjacent operators (a==b!=c<=d>=e) each stay one token; a representative lambda + arrow-type line highlights each operator once;
  • regression guards for </>, =, $, comments, strings, and numbers.
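
The spaced/tight assertion in the first bullet could look roughly like the sketch below. It is not the contents of src/grammar.test.ts: makeTokenizer is a stand-in that uses JavaScript regex alternation (which is also ordered) instead of the PR's match engine, and splitOperators is a hypothetical helper that returns the failing inputs rather than calling a test framework.

```typescript
type Tokenizer = (line: string) => string[];

// Stand-in tokenizer (assumption): builds one alternation from the operators
// in the given order. Alternation tries branches left to right, so branch
// order decides which operator wins at a position.
function makeTokenizer(operatorsInOrder: string[]): Tokenizer {
  const escaped = operatorsInOrder.map((op) => op.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const re = new RegExp(escaped.join("|"), "g");
  return (line) => line.match(re) ?? [];
}

// For each operator, check that both the spaced and the tight form produce
// exactly one operator token equal to the operator itself.
function splitOperators(tokenizer: Tokenizer, operators: string[]): string[] {
  const failures: string[] = [];
  for (const op of operators) {
    for (const line of [`a ${op} b`, `a${op}b`]) {
      const tokens = tokenizer(line);
      if (tokens.length !== 1 || tokens[0] !== op) failures.push(line);
    }
  }
  return failures;
}

const multiChar = ["=>", "->", ":=", "|>", "<-", "::", "==", "!=", "<=", ">=", "&&", "||"];
const singleChars = ["=", "<", ">", "|", "!"]; // illustrative subset

const withFix = splitOperators(makeTokenizer([...multiChar, ...singleChars]), multiChar);
const withoutFix = splitOperators(makeTokenizer([...singleChars, ...multiChar]), multiChar);
// withFix is empty; withoutFix lists inputs such as "a=>b" that were split.
```

Returning the failing inputs keeps the helper framework-agnostic; wrapping it in a node --test case would be a one-line assertion that the list is empty.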

Verified the test is meaningful: with the buggy (single-char-first) ordering, 19 of these assertions fail; with the fix they all pass.

Updated the extension README to document the single-token behavior and the new grammar tests.

Verification

pnpm install --frozen-lockfile, then pnpm run lint, pnpm run fmt:check, pnpm test (51 passing), and pnpm run package (.vsix builds) — all green. Scope is limited to the grammar JSON, the grammar tests, and the README; no package.json/lockfile/.npmrc/workflow/Rust changes.

🤖 Generated with Claude Code

assapir and others added 3 commits June 27, 2026 16:29
TextMate applies, at each position, the first pattern in a rule list that
matches there — ties at the same start are decided by list order, not match
length. Make every multi-character operator (=> -> := |> <- :: == != <= >=
&& ||) win over its single-character prefixes by ordering the multi-char
operator rules ahead of the single-char ones, and split the logical rule so
the two-char `&&`/`||` sit in the multi-char tier while single-char `!`
stays in the single-char tier. Each multi-char operator keeps exactly one
scope; `<<`/`>>` remain single import/export tokens.

Add grammar tokenization tests (src/grammar.test.ts) backed by a tiny
dependency-free TextMate match-engine (src/grammar.ts) that reproduces the
ordered first-match-wins rule, so this can't regress. The tests assert each
multi-char operator tokenizes to a single consistent scope (spaced and tight),
plus regression guards for `<`/`>`, `=`, `$`, comments, strings, and numbers.

Document the single-token behavior and the grammar tests in the extension
README.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Post-merge fixup after integrating the pnpm migration and CodeLens work:
the test list now names all three Node-testable modules (diagnostics,
entry-point detector, grammar tokenization) and uses `pnpm test`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
assapir merged commit 436cd50 into main Jun 27, 2026
4 checks passed
assapir deleted the worktree-agent-a4f93e0aa5137af65 branch June 27, 2026 13:34