feat(parser): complete direct ANTLR migration and retire Instaparse #74
Open
munen wants to merge 27 commits into
77bc50f to 1825820
Migrate CLJ/CLJS parsing to direct ANTLR, remove Instaparse/EBNF runtime remnants, and keep CI plus test workflows green across lein, doo, CLI, and uberjar paths.
1825820 to 63b3087
Move shared parser API logic to parser.cljc and remove duplicated CLJ/CLJS wrapper files while keeping runtime-specific ANTLR interop split.
Rename antlr_parser_test to parser_start_rules_test and update namespace/test ids to describe parser behavior rather than backend implementation details.
Bring back richer table assertions in parser_test for the headlines_and_tables fixture using focused structural checks for org-table cells, formulas, and table.el lines without brittle full-tree equality.
Replace legacy EBNF parser wording with ANTLR-focused language and clarify that grammar remains EBNF-like while ANTLR provides stronger tooling and cross-runtime generation.
Document parser architecture and workflow, tighten parser API contracts, expand CLJ/CLJS start-rule parity tests, and extract shared AST post-processing to reduce runtime drift risk.
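The "focused structural checks" mentioned for the headlines_and_tables fixture can be sketched roughly as follows. This is only an illustration: the AST literal, its node tags (`:table-cell`, `:table-formula`, and so on), and the `nodes-of` helper are hypothetical and are not taken from the org-parser code base. The idea is to collect nodes by tag and assert on them, rather than comparing the whole parse tree.

```clojure
(ns example.table-checks
  (:require [clojure.test :refer [deftest is]]))

;; Hypothetical hiccup-style AST; the real parser output may use
;; different tags and nesting.
(def sample-ast
  [:S
   [:headline [:stars "*"] [:text "Budget"]]
   [:table
    [:table-org
     [:table-row [:table-cell "a"] [:table-cell "b"]]
     [:table-row [:table-cell "1"] [:table-cell "2"]]
     [:table-formula "$2=$1*2"]]]])

(defn nodes-of
  "Returns, in depth-first order, every vector node in `ast`
  whose first element is `tag`."
  [ast tag]
  (->> (tree-seq vector? rest ast)
       (filter #(and (vector? %) (= tag (first %))))))

(deftest table-structure
  ;; Assert only on the parts of the tree the test cares about.
  (is (= 4 (count (nodes-of sample-ast :table-cell))))
  (is (= ["a" "b" "1" "2"]
         (map second (nodes-of sample-ast :table-cell))))
  (is (= [[:table-formula "$2=$1*2"]]
         (nodes-of sample-ast :table-formula))))
```

Checks of this shape keep passing when unrelated parts of the tree change (for example, how headlines are represented), which is the brittleness that full-tree equality tends to introduce.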
0574f80 to 75f8830
Shift more line and timestamp parsing into the shared lexer/parser grammars so CLJ and CLJS rely on the same cross-platform parser behavior with less duplicated runtime logic.
Move tags, diary sexp, affiliated keywords, list items, tables, and text-styled start rules from custom regex parsing into shared grammar-backed parsing for CLJ and CLJS.
Replace manual link-format parsing with grammar-backed parsing and shared AST mapping while preserving existing escaped-bracket and link-target semantics across CLJ and CLJS.
Route link-format plus eol/word direct starts through grammar-backed parsing and keep AST behavior aligned across JVM and JS runtimes.
Shift text-sup and radio-target parsing into grammar-backed starts and wire text scanning through those ANTLR rules while preserving existing AST outputs across CLJ and CLJS.
Move noparse block parsing from custom regex logic into grammar-backed parsing and remove obsolete runtime parsing helpers while keeping CLJ/CLJS behavior aligned.
Use grammar-driven text and dynamic block parsing so the JVM and JS parsers stay aligned while removing the remaining custom text scanner.
Keep unmatched inline delimiters as plain text and reject URL-like file-link inputs so the ANTLR parser stays aligned across CLJ and CLJS. Remove the stray planning artifact and lock the behavior in regression tests.
Move the common parser AST-building and validation code into a shared cljc namespace so JVM and Node stay aligned while reducing duplicated maintenance work.
2aff51d to fc238e0
Summary
resources/org.ebnf, parser macros, and Instaparse dependency), while keeping CLI and uberjar behavior intact
lein test, lein doo node once, lein run, and lein uberjar smoke paths
:S fast path strategy and benchmark workflow in README.org
Benchmark Comparison
lein run -m org-parser.benchmark 2 4 (Instaparse era)
lein run -m org-parser.benchmark 5 20 (current ANTLR branch)