docs(parser): numeric -t/--type filter silently matches by grammar-unstable kind_id #799

@dekobon

Description

Summary

Parser::filters silently treats any -t/--type value that parses as a u16
as a raw tree-sitter kind_id instead of a node-type name. This behaviour is
neither explained at the call site nor documented for CLI users.

Location

  • src/parser.rs:190-201

Evidence

_ => {
    if let Ok(n) = f.parse::<u16>() {
        res.push(Box::new(move |node: &Node| -> bool { node.kind_id() == n }));
    } else {
        // Exact match on `node.kind()` — the CLI documents
        // `find <NODE>` / `count <NODE_TYPE>` as searching
        // for a specific node type, not a substring (see
        // big-code-analysis-book/src/commands/nodes.md and
        // issue #293).
        let f = f.to_owned();
        res.push(Box::new(move |node: &Node| -> bool { node.kind() == f }));
    }
}

The inline comment documents only the string (exact-kind()) branch and
its #293 history. The numeric branch — which matches by integer kind_id
instead of by node-type name — has no rationale comment, and the user-facing
docs (big-code-analysis-book/src/commands/nodes.md) describe -t/--type
solely as "the node type to match", with no mention that a numeric argument
is interpreted as a kind_id.
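
To make the two branches easier to compare, here is a minimal, self-contained sketch of the same dispatch. It does not use the tree-sitter crate: `Node` below is a hypothetical stand-in that exposes only `kind_id()` and `kind()`, and `build_filter` is an illustrative name, not a function from src/parser.rs.

```rust
/// Hypothetical stand-in for `tree_sitter::Node`, reduced to the two
/// accessors the filter closures call. Illustration only.
pub struct Node {
    kind_id: u16,
    kind: &'static str,
}

impl Node {
    pub fn new(kind_id: u16, kind: &'static str) -> Self {
        Node { kind_id, kind }
    }
    pub fn kind_id(&self) -> u16 {
        self.kind_id
    }
    pub fn kind(&self) -> &str {
        self.kind
    }
}

pub type Filter = Box<dyn Fn(&Node) -> bool>;

/// Mirrors the fallback arm quoted in Evidence (assumed name).
pub fn build_filter(f: &str) -> Filter {
    if let Ok(n) = f.parse::<u16>() {
        // Numeric input: compare against the integer kind_id.
        Box::new(move |node: &Node| -> bool { node.kind_id() == n })
    } else {
        // Anything else: exact match on the node-type name.
        let f = f.to_owned();
        Box::new(move |node: &Node| -> bool { node.kind() == f })
    }
}
```

Under this sketch, `build_filter("7")` accepts any node whose kind_id is 7 regardless of its name, and it never matches a node merely because its name is the string "7".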

Consequences:

  • A user who passes a numeric -t value gets kind_id matching with no
    indication this differs from name matching. kind_id values are not
    stable across grammar versions, so a value that matched one node type
    today may match a different one after a grammar bump — a silent
    behaviour change for any script that relied on a numeric filter.
  • -t 0 matches tree-sitter's reserved kind_id 0 (the built-in end symbol;
    ERROR nodes use 65535) rather than producing an empty result, which is
    surprising and undocumented.
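
The grammar-bump hazard can be illustrated with a small sketch. The two tables below are invented for the example (they are not taken from any real grammar); the slice index plays the role of kind_id, and the second table assumes a release that inserted one new symbol.

```rust
/// Hypothetical symbol table for an older grammar release.
/// Index in the slice stands in for kind_id.
pub const GRAMMAR_V1: &[&str] =
    &["end", "identifier", "string_literal", "call_expression"];

/// Hypothetical table after a grammar bump that added `comment`,
/// shifting every later symbol up by one.
pub const GRAMMAR_V2: &[&str] =
    &["end", "identifier", "comment", "string_literal", "call_expression"];

/// Name that a numeric filter value would select under a given table.
pub fn kind_for_id<'a>(table: &[&'a str], id: u16) -> Option<&'a str> {
    table.get(usize::from(id)).copied()
}

/// Numeric id a script would need in order to keep selecting `kind`.
pub fn id_for_kind(table: &[&str], kind: &str) -> Option<u16> {
    // The cast is safe here because the example tables are tiny.
    table.iter().position(|k| *k == kind).map(|i| i as u16)
}
```

With these assumed tables, a script that hard-coded `-t 2` to select string literals would start selecting comments after the bump, with no error or warning, while a name-based filter would keep working.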

Expected Behavior

The numeric kind_id matching path should carry an inline rationale comment
(mirroring the documented string branch), and the nodes.md docs should
state that a numeric -t/--type value is treated as a raw kind_id (and
warn that kind_id values are grammar-version-specific).
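
As one possible shape for that rationale, the sketch below makes the interpretation rule explicit and attaches the warning to it as doc comments. `TypeFilter` and `classify` are hypothetical names introduced for illustration. They are not part of src/parser.rs, and the keyword arms that precede the fallback arm are left out.

```rust
/// How the fallback arm interprets a `-t/--type` argument
/// (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
pub enum TypeFilter {
    /// The argument parsed as a `u16` and is compared against
    /// `node.kind_id()`. These ids are specific to one grammar version,
    /// so the same number may select a different node type after a
    /// grammar upgrade.
    KindId(u16),
    /// Any other argument is compared for exact equality against
    /// `node.kind()`.
    KindName(String),
}

/// Applies the same rule as the quoted code: a successful `u16` parse
/// decides the branch.
pub fn classify(arg: &str) -> TypeFilter {
    match arg.parse::<u16>() {
        Ok(n) => TypeFilter::KindId(n),
        Err(_) => TypeFilter::KindName(arg.to_owned()),
    }
}
```

One detail the docs could also mention: because the test is a `u16` parse, a value such as 65536 overflows, fails to parse, and falls through to name matching.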

Actual Behavior

The numeric kind_id path is undocumented both in-source and in user docs;
the only documented filter semantics are the keyword set and exact kind()
name matching.

Impact

CLI users of find/count with numeric type arguments; maintainers reading
filters who must reverse-engineer why numeric input is special-cased.
Low severity — no incorrect computation, but a documentation/contract gap
around grammar-version-unstable kind_id values.


Resolution

Fixed in 7a41184: numeric -t/--type kind_id behavior documented in parser.rs and nodes.md as grammar-version-unstable.

Labels: documentation
