
Explore replacing ad-hoc parsing logic in datafusion-examples with a nom-based parser #20025

@cj-zhukov

Is your feature request related to a problem or challenge?

As discussed with @Jefffrey in #19750 (comment), I’d like to explore replacing the ad-hoc parsing logic that datafusion-examples uses to generate documentation with a nom-based parser.

Describe the solution you'd like

datafusion-examples currently relies on custom / ad-hoc parsing logic for generating documentation. While functional, the existing approach can be difficult to reason about, extend, and validate against edge cases.

Parser combinator libraries such as nom may offer clearer structure, improved error handling, and more robust parsing, but they also introduce additional complexity and a new dependency.

The goal of this issue is not to immediately replace the existing implementation, but to explore whether a nom-based parser:

  • Clearly simplifies the parsing logic
  • Improves correctness and robustness (especially around edge cases)
  • Makes future extensions easier to reason about and maintain

If these benefits are not clear, keeping the current approach is a valid outcome.
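
To make the comparison more concrete, here is a rough, dependency-free sketch of the combinator style that libraries like nom encourage. It intentionally does not use nom's actual API; the helpers `tag` and `take_until` only mirror the spirit of nom's combinators of the same name. The input format (a bullet line such as ``- `name`: description``) and the `Entry` type are assumptions made for illustration and are not taken from the current datafusion-examples code.

```rust
/// Result of a parser: the remaining input plus the parsed value,
/// or a human-readable error message.
type PResult<'a, T> = Result<(&'a str, T), String>;

/// Match a literal prefix and return it together with the remaining input.
fn tag<'a>(expected: &'static str) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| match input.strip_prefix(expected) {
        Some(rest) => Ok((rest, &input[..expected.len()])),
        None => Err(format!("expected {:?} at {:?}", expected, input)),
    }
}

/// Consume input up to (but not including) the first occurrence of `delim`.
fn take_until<'a>(delim: &'static str) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| match input.find(delim) {
        Some(idx) => Ok((&input[idx..], &input[..idx])),
        None => Err(format!("expected {:?} somewhere in {:?}", delim, input)),
    }
}

/// Hypothetical documentation entry: an example name and its description.
#[derive(Debug, PartialEq)]
struct Entry<'a> {
    name: &'a str,
    description: &'a str,
}

/// Parse one line of the (assumed) form "- `name`: description"
/// by chaining the small parsers above.
fn entry<'a>(input: &'a str) -> PResult<'a, Entry<'a>> {
    let (rest, _) = tag("- `")(input)?;
    let (rest, name) = take_until("`")(rest)?;
    let (rest, _) = tag("`: ")(rest)?;
    let description = rest.trim_end();
    if name.is_empty() || description.is_empty() {
        return Err(format!("empty name or description in {:?}", input));
    }
    // The whole line has been consumed, so no input remains.
    Ok(("", Entry { name, description }))
}
```

Each step either consumes a well-defined piece of input or returns an error that says what was expected, which is the kind of structure and error reporting this issue would like to evaluate. Whether a real nom-based version ends up simpler than the existing code is exactly the open question.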

Describe alternatives you've considered

No response

Additional context

No response

Labels

enhancement: New feature or request
