Add experimental/ast/printer package#657

Draft
emcfarlane wants to merge 42 commits into main from ed/printer2

Contributor

@emcfarlane emcfarlane commented Jan 22, 2026

This adds experimental/ast/printer, an AST printer for protobuf files. It does not yet implement formatting, but uses the dom library to produce correct indentation for synthetic edits. Formatting support will be added later.

The printer preserves comments and whitespace using a trivia index. Each comment is classified as either attached (bound to a specific token) or detached (bound to a positional slot between declarations). Attached comments travel with their token when declarations are moved; detached comments stay in place. At print time, the printer zips slot trivia with the children of each scope and looks up attached trivia when emitting individual tokens. This works identically for natural and synthetic tokens, with no special-casing needed. Synthetic tokens have no trivia and fall back to the declared gap. The same gap will later inform how trivia between tokens is laid out when formatting.
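As a rough illustration of the lookup described above, here is a minimal sketch of a trivia index. Every name in it (`token`, `triviaIndex`, `attached`, `detached`, `emitToken`, `printScope`) is hypothetical and chosen for this example; the real package may be structured quite differently. It shows a detached comment staying in its slot while an attached comment follows its token, and a synthetic declaration falling back to declared gaps.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical stand-in for an AST token. An id of 0 marks a
// synthetic token that never appeared in the source.
type token struct {
	id   int
	text string
	gap  string // declared gap, used when no trivia is recorded
}

// triviaIndex is an illustrative layout, not the package's real type.
type triviaIndex struct {
	attached map[int]string   // token id -> trivia emitted before that token
	detached map[int][]string // scope id -> slot trivia; slot i precedes child i
}

// emitToken writes the token's attached trivia, or its declared gap when
// the index has no entry for it (always the case for synthetic tokens).
func (idx triviaIndex) emitToken(b *strings.Builder, tok token) {
	if trivia, ok := idx.attached[tok.id]; ok {
		b.WriteString(trivia)
	} else {
		b.WriteString(tok.gap)
	}
	b.WriteString(tok.text)
}

// printScope zips slot trivia with the scope's declarations.
func (idx triviaIndex) printScope(scope int, decls [][]token) string {
	var b strings.Builder
	slots := idx.detached[scope]
	for i, decl := range decls {
		if i < len(slots) {
			b.WriteString(slots[i])
		}
		for _, tok := range decl {
			idx.emitToken(&b, tok)
		}
		b.WriteString("\n")
	}
	return b.String()
}

// demo places a synthetic import ahead of a natural one. The license
// comment is detached and stays in slot 0; "// about a" is attached to
// the natural import keyword and moves with it.
func demo() string {
	idx := triviaIndex{
		attached: map[int]string{1: "// about a\n", 2: " ", 3: ""},
		detached: map[int][]string{0: {"// license\n\n"}},
	}
	natural := []token{{1, "import", ""}, {2, `"a.proto"`, " "}, {3, ";", ""}}
	synthetic := []token{{0, "import", ""}, {0, `"b.proto"`, " "}, {0, ";", ""}}
	return idx.printScope(0, [][]token{synthetic, natural})
}

func main() {
	fmt.Print(demo())
}
```

In this sketch the synthetic import lands directly under the license comment, and the natural import keeps its own comment even though it is now the second declaration.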

Member

@doriable doriable left a comment

The API looks very reasonable to me; I just had a few nitpicks and discussion points to clarify a few things. Thank you!

@emcfarlane emcfarlane changed the title from "Add experimental/printer package" to "Add experimental/ast/printer package" Feb 6, 2026
@emcfarlane emcfarlane marked this pull request as ready for review February 17, 2026 16:52
)

// PrintFile renders an AST file to protobuf source text.
func PrintFile(file *ast.File, opts Options) string {
Member

You want this to take functional options, right?

Contributor Author

Do you mean `type Option func(...)`? I was following the dom and report options.
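For readers comparing the two styles discussed here, the following is a minimal sketch of a struct-based options parameter next to functional options. The `Options` field, `WithIndent`, and both print functions are hypothetical; the thread does not show what the real `Options` contains.

```go
package main

import (
	"fmt"
	"strings"
)

// Options is a hypothetical options struct; its field is invented for
// this example.
type Options struct {
	Indent int
}

// printWithStruct takes options as a struct, the same shape as the
// signature under review.
func printWithStruct(src string, opts Options) string {
	return strings.Repeat(" ", opts.Indent) + src
}

// Option is the functional-options alternative raised in the review.
type Option func(*Options)

// WithIndent is a hypothetical option constructor.
func WithIndent(n int) Option {
	return func(o *Options) { o.Indent = n }
}

// printWithFuncs applies each option to a zero-valued Options and then
// delegates to the struct-based function.
func printWithFuncs(src string, options ...Option) string {
	var opts Options
	for _, apply := range options {
		apply(&opts)
	}
	return printWithStruct(src, opts)
}

func main() {
	fmt.Printf("%q\n", printWithStruct("message Foo {}", Options{Indent: 2}))
	fmt.Printf("%q\n", printWithFuncs("message Foo {}", WithIndent(2)))
}
```

Both calls produce the same text. The struct form keeps every setting visible in one type, while the variadic form lets callers omit the argument entirely.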

Comment on lines +135 to +136
// the pending buffer so that adjacent pure-newline runs are combined into a
// single kindBreak dom tag, preventing the dom from merging them and
Member

Could this be dealt with by making kindBreak a bit more sophisticated?

@emcfarlane emcfarlane marked this pull request as draft March 11, 2026 19:23

Add TestBufFormat that reads buf's bufformat testdata (54 test cases)
and compares printer output against golden files. Currently 35/54 pass.

Fixes:
- Empty files produce empty output instead of a trailing newline
- New gapInline style keeps punctuation on same line as preceding
  comments: `value /* comment */;` instead of breaking to new line
- New gapGlue style for path separators glues comments without
  spaces: `header/*comment*/.v1`
- Preserve source blank lines after detached comments using actual
  newline count from pending trivia tokens
- Body declarations at file level only get blank lines when source
  had them, instead of unconditionally
- Compound string first element on new indented line after `=`
- Strip trailing whitespace from comments in format mode
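The gap styles named in this list can be pictured with a small sketch. The `gapStyle` type and the `joinWithComment` helper are assumptions made for illustration; only the names gapInline, gapGlue, and gapSpace and the quoted example outputs come from the commit messages in this pull request.

```go
package main

import "fmt"

// gapStyle is a hypothetical enum; the real printer's gap type may carry
// more states and different semantics.
type gapStyle int

const (
	gapSpace  gapStyle = iota // space on both sides of the comment
	gapInline                 // space before the comment, none before the next token
	gapGlue                   // no spaces at all
)

// joinWithComment places a block comment between two tokens according to
// the gap style.
func joinWithComment(prev, comment, next string, gap gapStyle) string {
	switch gap {
	case gapInline:
		return prev + " " + comment + next
	case gapGlue:
		return prev + comment + next
	default:
		return prev + " " + comment + " " + next
	}
}

func main() {
	fmt.Println(joinWithComment("value", "/* comment */", ";", gapInline))
	fmt.Println(joinWithComment("header", "/*comment*/", ".v1", gapGlue))
	fmt.Println(joinWithComment("-", "/* comment */", "32", gapSpace))
}
```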
Add test cases documenting 8 remaining formatting issues:
- Issue 1: trailing // before ] should become /* */ on single-line
- Issue 2: trailing comment after , should stay inline
- Issue 3: comment after [ opener should expand to multi-line
- Issue 5: comment before } in enum (already passes)
- Issue 6: EOF comment after blank line should preserve blank line
- Issue 7: block comments in RPC parens should not add extra spaces
- Issue 8: extension path comments should preserve spaces
- Issue 9: message literal with block comments should expand

Two formatting fixes:
- Extract inline trailing comments after commas in walkDecl so they
  stay on the same line as the comma instead of becoming leading
  trivia on the next token
- Preserve blank lines before EOF comments by checking
  trivia.blankBeforeClose in printFile

When the open bracket of compact options has trailing comments (e.g.,
[ // comment), force multi-line layout and emit the comments on their
own indented lines instead of inline with the bracket.

Trailing block comments are now properly handled: line comments always
get a space separator, block comments also get a space (matching buf
format for semicolon and brace contexts). Issue 7 (paren-context
gluing) remains open.

When a compact option collapses to a single line, any trailing //
comment would eat the closing bracket. Convert these to /* ... */
block comments to preserve the bracket.
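The conversion described above can be sketched as a small string helper. The function name `lineToBlock` is hypothetical, and the handling of a `*/` sequence inside the comment body is an assumption added so the result stays a single well-formed block comment; the commit message does not say how the printer treats that case.

```go
package main

import (
	"fmt"
	"strings"
)

// lineToBlock rewrites a // comment as a /* */ comment so that a token
// following it on the same line is not swallowed. Other input is returned
// unchanged.
func lineToBlock(comment string) string {
	if !strings.HasPrefix(comment, "//") {
		return comment
	}
	body := strings.TrimSpace(strings.TrimPrefix(comment, "//"))
	// Assumption: break up any "*/" so the block comment cannot end early.
	body = strings.ReplaceAll(body, "*/", "* /")
	if body == "" {
		return "/**/"
	}
	return "/* " + body + " */"
}

func main() {
	// Collapsing a compact option to one line keeps the closing bracket.
	fmt.Println("[deprecated = true " + lineToBlock("// old field") + "]")
}
```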
Clear the convertLineToBlock flag when entering nested scopes (body,
array, dict) so that // comments on their own lines inside expanded
structures are not incorrectly converted to /* */ block comments.

Extract close-bracket/brace leading comments and emit them inside the
indented block (like printBody does) so they get proper indentation.

When walkDecl reaches end of scope with leftover pending tokens (no ;
or } found), check if the rest tokens contain a blank line before
pushing them back. This sets blankBeforeClose correctly for close-
bracket/brace comments that need a blank line separator.
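One way to picture the blank-line check is the sketch below, which scans pending trivia tokens for a run of whitespace containing at least two newlines. The function name `hasBlankLine` and the representation of pending tokens as plain strings are assumptions; the real check operates on the printer's own token types.

```go
package main

import (
	"fmt"
	"strings"
)

// hasBlankLine reports whether the pending tokens contain a blank line,
// meaning two or more newlines with only spaces or tabs between them.
func hasBlankLine(pending []string) bool {
	newlines := 0
	for _, tok := range pending {
		if strings.TrimSpace(tok) != "" {
			newlines = 0 // a comment or other token ends the whitespace run
			continue
		}
		newlines += strings.Count(tok, "\n")
		if newlines >= 2 {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasBlankLine([]string{"// a", "\n", "// b"}))
	fmt.Println(hasBlankLine([]string{"// a", "\n", "  ", "\n", "// b"}))
}
```

In a printer shaped like this, the result would be what sets a flag such as blankBeforeClose before the leftover tokens are pushed back.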
Path comments use gapGlue which suppresses spaces around comments,
matching buf format behavior for package paths. Issue 8 (source-
dependent spacing) remains open for extension paths where the source
has spaces.

The first separator in a path (e.g., the leading '.' in '.google')
now uses the caller's gap instead of gapGlue, fixing 'extend. google'
to the correct 'extend .google'.

- Dict literals with attached comments (on open/close braces or
  interior tokens) now force multi-line expansion
- Compact options close-bracket leading comments are now emitted
  inside the indented block for proper indentation
- Update message_literals golden to accept inline trailing block
  comments on dict fields
- Detect blank lines in empty scope pending tokens to set
  blankBeforeClose correctly when two comment groups are separated
  by a blank line (trailing-on-open and leading-on-close)
- Fix emitCloseComments to use gapNewline for pending comments
  (first content in indent block) regardless of blankBeforeClose
- Dict literals with attached comments force multi-line expansion
  using scopeHasAttachedComments to check all tokens in the scope
- Compact options close-bracket comments are now properly indented
- emitCloseComments is now called when pending has comments even
  without close-token comments, fixing slot comments that would
  otherwise be emitted outside the indent block
- Detect blank lines in empty brace scopes between comment groups
  to set blankBeforeClose correctly
- group/empty and enum_value_trailing_comment now pass

Block comments that trail a semicolon at end of a body scope (e.g.,
enum Foo { VAL = 1; /* comment */ }) are pushed back to become scope
trivia instead of staying inline. This ensures they get their own
indented line in the formatted output.

Also fix close-brace pending comment flushing in printBody to handle
slot comments that appear after the last declaration.

When a negative prefix (-) has a block comment before its value,
use gapSpace instead of gapNone so the comment gets proper spacing
(e.g., "- /* comment */ 32" instead of "-/* comment */32").

…sion

- Negative prefix (-) with block comments uses gapSpace for proper
  spacing (e.g., "- /* comment */ 32")
- Revert compound string // to /* */ conversion attempt as it caused
  trailing comment conversion on the following semicolon

This ensures block comments in glued contexts (RPC parens, path
separators, generics) always get a space after them before the next word
token. Without comments, behavior is unchanged.

…rsion

Three fixes for comment preservation in the printer:

1. Generalize inline trailing comment extraction in walkDecl to all
   tokens, not just commas. A comment on the same line as any token
   (e.g., "bar: 2 // comment") is now correctly attached as trailing
   trivia on that token. Guarded by firstNewline < len(leading) to
   avoid reclassifying block comments between same-line tokens.

2. Add emitCommaTrivia to printDict so that comments attached to
   comma tokens (which are removed during message literal formatting)
   are never silently dropped.

3. Manage convertLineToBlock in printCompoundString: clear it for
   intermediate parts (// comments between string parts on their own
   lines are fine), restore the caller's value for the last part's
   trailing (a // there would eat the following ; or ]). Add
   withLineToBlock helper for scoped save/restore. Set it in
   printOption since ; follows the value inline.
…ia only

Set convertLineToBlock in printPath since path components are glued
inline (gapGlue) and a trailing // comment between components would
eat the next identifier.

Remove the convertLineToBlock check from emitTrivia (leading trivia).
After the generalized inline trailing extraction in walkDecl, all
comments remaining in leading trivia are on their own lines and never
eat following tokens. Only emitTrailing needs the conversion. This
prevents over-conversion of leading // comments that are safe as-is.

Categorize and explain the stylistic differences between our printer
output and the old buf format golden files. All remaining differences
are intentional formatting choices, not correctness issues.