feat(fmt): infer config from .editorconfig #34071
Adds an .editorconfig loader to `deno fmt`. When formatting a file, the nearest `.editorconfig` (walking up to a `root = true` boundary) is read and its properties are merged into the resolved fmt config, filling in only fields that were not set by `deno.json` or CLI flags. Precedence is CLI flags > deno.json > .editorconfig > defaults.

Mappings:
- indent_style -> useTabs
- indent_size -> indentWidth (falls back to tab_width when style=tab)
- tab_width -> indentWidth (when used as fallback above)
- max_line_length -> lineWidth (ignored when set to "off")
- end_of_line -> newLineKind (lf/crlf)

Glob patterns from `.editorconfig` section headers are translated to regex and matched against the file path. Parsed config files are cached per `EditorConfigCache` so repeated lookups within a fmt batch do not re-read or re-parse them.

Closes bartlomieju/orchid-inbox#77
Refs #14717
/// Resolve `.editorconfig` properties for `file_path`. Returns
/// `Default::default()` if no `.editorconfig` files apply.
pub fn resolve(&self, file_path: &Path) -> EditorConfigProperties {
  let abs_path = match canonicalize_path(file_path) {
This is a lot of extra work. Probably it's going to make deno fmt much slower on large directories.
Good call. Pushed 029a6b7, which:
- Pre-compiles each section's glob into a `Regex` once at parse time (it was being recompiled for every (file, section) pair).
- Memoizes the resolved chain (outermost → innermost) of `.editorconfig` files per starting directory, so files in the same directory share one walk and one set of mutex lookups.
So for a project with no .editorconfig, the per-file cost is now one canonicalize_path + one cached chain lookup that returns an empty Arc<Vec<_>> and bails immediately. For projects with a single root .editorconfig, the only per-file work is the regex is_match calls (no re-compilation, no re-walk).
Happy to add a benchmark or a directory-tree-wide short-circuit ("any .editorconfig anywhere?" check at fmt-batch start) if you want one — let me know.
…ection

Addresses review feedback that per-file `.editorconfig` lookup was too expensive on large trees:
- Pre-compile each section's glob into a `Regex` once at parse time instead of recompiling for every (file, section) pair.
- Store parsed files behind `Arc` and memoize the resolved chain (outermost -> innermost) per starting directory. Subsequent files under the same directory reuse the chain without re-walking parents or re-locking the file cache per ancestor.
… debug

Fold the per-file .editorconfig-resolved options into the incremental cache hash so editing .editorconfig invalidates the cached "already formatted" result even when the file body is unchanged. Previously the cache keyed only on file content plus batch-level options, so a --check could pass on a stale entry after an .editorconfig edit.

Also log at debug level (once per discovered file) when an .editorconfig is found and used, and add spec coverage for the debug log and for the nested walk-up where a nearer non-root .editorconfig overrides a farther one.
Make the .editorconfig glob translator degrade gracefully instead of
crashing on malformed or adversarial section headers:
- Bound numeric range {n..m} expansion so a huge span like
{1..1000000000} no longer builds a giant regex (memory/CPU blowup);
oversized ranges degrade to a literal that simply does not match.
- Cap brace-nesting recursion depth so deeply nested alternations like
{a,{a,{a,...}}} cannot overflow the stack.
- Parse indent_size/tab_width/max_line_length with saturating integer
conversion so out-of-range values clamp rather than being silently
dropped.
- Escape '[' ']' '{' '}' in literal output so degraded/unbalanced
patterns still compile to a valid regex.
Adds unit tests for each case.
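The saturating conversion described in this commit message could look roughly like the sketch below. The function name `parse_saturating_u8` and the `u8` target type are assumptions made for illustration; the actual parser in the PR may use different types and names.

```rust
/// Hypothetical helper: parse a non-negative integer property such as
/// `indent_size` and clamp it into `u8` range instead of dropping it.
/// Returns `None` only when the text is not a plain decimal number.
fn parse_saturating_u8(value: &str) -> Option<u8> {
    let trimmed = value.trim();
    if trimmed.is_empty() || !trimmed.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // All characters are digits, so a parse failure can only mean the
    // number is too large for `u64`; treat that as "very large" too.
    let wide = trimmed.parse::<u64>().unwrap_or(u64::MAX);
    // Clamp to the largest representable value before narrowing.
    Some(wide.min(u8::MAX as u64) as u8)
}
```

With this shape, `indent_size = 300` clamps to 255, while a non-numeric value such as `off` yields `None` and is left for the caller to handle.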
resolve() canonicalized every file path on every fmt run, paying a realpath syscall per file even when no .editorconfig exists anywhere in the tree. Walk the literal absolute path instead (fmt's collected paths are already absolute) and short-circuit to defaults via the memoized per-directory chain lookup before any filesystem work. No .editorconfig present now costs a single cached HashMap lookup per file with zero syscalls; discovery happens once per directory rather than once per file. Symlinks are no longer resolved during the walk, matching the editorconfig reference implementation.
bartlomieju left a comment
Reviewed the editorconfig loader and fmt integration. Solid, well-tested, and the incremental-cache folding + pathological-input hardening are genuinely careful work. Posting a few non-blocking nits inline. (Separately, worth deciding before merge: reading .editorconfig default-on is a silent behavior change for existing projects that have one but no deno.json fmt block — they'd get reformatted with no opt-out. You already flagged this in the description; just calling it out as the main product decision. CI also hasn't run here yet.)
// Per the editorconfig spec, when indent_style is "tab" and
// indent_size is not set, indent_size defaults to tab_width.
// For "space" or unset indent_style, indent_size is taken as-is.
let indent = self.indent_size.or(
nit: per the editorconfig spec, indent_size = tab should resolve to tab_width regardless of indent_style. Here the tab_width fallback only fires when indent_style == Tab, so a section with indent_size = tab + tab_width = 4 but no indent_style falls through to the default instead of 4. Rare config, but a spec deviation.
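One way to read the rule described in this nit is sketched below. The `IndentSize` and `IndentStyle` enums and the function `effective_indent_width` are hypothetical names, not the PR's types; the sketch only illustrates letting `indent_size = tab` defer to `tab_width` independently of `indent_style`.

```rust
/// Hypothetical parsed form of the `indent_size` property.
#[derive(Clone, Copy, PartialEq)]
enum IndentSize {
    Tab,
    Columns(u8),
}

/// Hypothetical parsed form of the `indent_style` property.
#[derive(Clone, Copy, PartialEq)]
enum IndentStyle {
    Tab,
    Space,
}

/// Resolve the effective indent width for one merged set of properties.
fn effective_indent_width(
    indent_style: Option<IndentStyle>,
    indent_size: Option<IndentSize>,
    tab_width: Option<u8>,
) -> Option<u8> {
    match indent_size {
        // An explicit number is taken as-is.
        Some(IndentSize::Columns(n)) => Some(n),
        // `indent_size = tab` defers to `tab_width`, whatever the style is.
        Some(IndentSize::Tab) => tab_width,
        // Unset `indent_size`: only tab indentation falls back to `tab_width`.
        None if indent_style == Some(IndentStyle::Tab) => tab_width,
        None => None,
    }
}
```

Under this reading, a section with `indent_size = tab` and `tab_width = 4` but no `indent_style` resolves to 4 rather than falling through to the default.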
match c {
  '*' => {
    if i + 1 < bytes.len() && bytes[i + 1] == '*' {
      // Treat `**/` as zero or more path components so that
nit: treating **/foo.ts as also matching foo.ts at the root is gitignore semantics, which deviates slightly from strict editorconfig **. It's documented and tested, so fine to keep — just worth a sanity check against the editorconfig reference test suite if exact parity matters.
  out.push_str("(?:.*/)?");
}
let pattern = pattern.strip_prefix('/').unwrap_or(pattern);
let bytes: Vec<char> = pattern.chars().collect();
nit: bytes holds chars, not bytes — slightly misleading name (chars would read better).
@@ -0,0 +1,48 @@
{
  "tempDir": true,
  "tests": {
nit: the incremental-cache invalidation is the subtlest logic in the PR but only covered implicitly. Consider a spec test that runs fmt --check, edits .editorconfig, and re-runs --check to assert it re-evaluates rather than returning a stale pass — that would lock in the behavior incremental_cache_text was built for.
Summary
`deno fmt` now reads `.editorconfig` files and uses their settings to fill in fmt config fields that aren't otherwise set. This is the long-standing ask in #14717.
Precedence (highest to lowest):
1. CLI flags (`--indent-width`, `--use-tabs`, ...)
2. `deno.json` `fmt` block
3. `.editorconfig`
4. Defaults

So `.editorconfig` only fills in fields the user hasn't already configured. Mappings:

| `.editorconfig` | fmt config field |
| --- | --- |
| `indent_style` | `useTabs` |
| `indent_size` | `indentWidth` |
| `tab_width` | `indentWidth` (fallback when `indent_style = tab` and `indent_size` is unset) |
| `max_line_length` | `lineWidth` (ignored when `off`) |
| `end_of_line` | `newLineKind` (`lf`/`crlf`) |

`.editorconfig` resolution walks up from the file being formatted, parsing each file it finds, and stops at the first one with `root = true` (or at the filesystem root). Sections farther from the file are applied first so nearer files override them, matching the editorconfig spec.
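The "fill in only unset fields" rule above can be sketched with optional fields merged layer by layer. `FmtOptions` and `resolve` here are simplified, hypothetical stand-ins for the real config types, and the concrete values in the usage note are illustrative only.

```rust
/// Hypothetical, simplified view of the fmt options discussed above.
/// `None` means "not set at this layer".
#[derive(Debug, Default, Clone, PartialEq)]
struct FmtOptions {
    use_tabs: Option<bool>,
    indent_width: Option<u8>,
    line_width: Option<u32>,
}

impl FmtOptions {
    /// Keep every field that is already set; otherwise take the
    /// value from the lower-precedence `fallback` layer.
    fn or(self, fallback: &FmtOptions) -> FmtOptions {
        FmtOptions {
            use_tabs: self.use_tabs.or(fallback.use_tabs),
            indent_width: self.indent_width.or(fallback.indent_width),
            line_width: self.line_width.or(fallback.line_width),
        }
    }
}

/// Apply the layers from highest to lowest precedence.
fn resolve(
    cli: FmtOptions,
    deno_json: &FmtOptions,
    editorconfig: &FmtOptions,
    defaults: &FmtOptions,
) -> FmtOptions {
    cli.or(deno_json).or(editorconfig).or(defaults)
}
```

For example, if the CLI sets only `indent_width`, `deno.json` sets only `line_width`, and `.editorconfig` sets all three fields, the result takes `indent_width` from the CLI, `line_width` from `deno.json`, and `use_tabs` from `.editorconfig`.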
A small glob-to-regex translator handles the section header patterns: `*`, `**`, `**/`, `?`, `[abc]`, `[!abc]`, `{a,b,c}`, and `{n..m}`. The `**/foo.ts` form is treated as matching `foo.ts` at any depth including the root, matching gitignore-style user expectations.
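To make the translation concrete, here is a minimal sketch that covers only `*`, `**`, `**/`, and `?` and emits a regex string. It is not the PR's translator: character classes, brace alternation, numeric ranges, and the hardening described elsewhere are left out, and the function name `glob_to_regex` is an assumption. The `(?:.*/)?` fragment for `**/` is the one visible in the quoted diff context.

```rust
/// Hypothetical, reduced translator: turn a glob using `*`, `**`, `**/`
/// and `?` into an anchored regex string. Everything else is literal.
fn glob_to_regex(pattern: &str) -> String {
    let chars: Vec<char> = pattern.chars().collect();
    let mut out = String::from("^");
    let mut i = 0;
    while i < chars.len() {
        match chars[i] {
            '*' if chars.get(i + 1) == Some(&'*') => {
                if chars.get(i + 2) == Some(&'/') {
                    // `**/` matches zero or more whole path components.
                    out.push_str("(?:.*/)?");
                    i += 3;
                } else {
                    // A bare `**` matches anything, including `/`.
                    out.push_str(".*");
                    i += 2;
                }
                continue;
            }
            // `*` stays within a single path component.
            '*' => out.push_str("[^/]*"),
            // `?` matches one character other than `/`.
            '?' => out.push_str("[^/]"),
            c => {
                // Escape regex metacharacters so they match literally.
                if r"\.+()|[]{}^$".contains(c) {
                    out.push('\\');
                }
                out.push(c);
            }
        }
        i += 1;
    }
    out.push('$');
    out
}
```

Because `**/` becomes an optional group, `**/foo.ts` translates to a pattern that accepts both `foo.ts` and `a/b/foo.ts`, which is the gitignore-style behavior the description mentions.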
Parsed `.editorconfig` files are cached per-fmt-run via `EditorConfigCache`, so repeated lookups within a batch do not re-read or re-parse them. When a file is found, fmt logs `Found .editorconfig at <path> and using it` at `debug` level (visible with `-L debug`), once per discovered file.
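A rough sketch of per-run caching, here shown for the per-directory chain memoization mentioned in the review thread. `ChainCache` and `chain_for` are hypothetical names, and the `discover` callback stands in for the actual parent walk; the real `EditorConfigCache` may be structured differently.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Hypothetical cache: one resolved chain of `.editorconfig` paths
/// (outermost to innermost) per starting directory.
#[derive(Default)]
struct ChainCache {
    chains: Mutex<HashMap<PathBuf, Arc<Vec<PathBuf>>>>,
}

impl ChainCache {
    /// Return the chain for `dir`, running `discover` only the first
    /// time this directory is requested.
    fn chain_for(
        &self,
        dir: &Path,
        discover: impl FnOnce(&Path) -> Vec<PathBuf>,
    ) -> Arc<Vec<PathBuf>> {
        // Simplification: the lock is held while `discover` runs.
        let mut chains = self.chains.lock().unwrap();
        if let Some(chain) = chains.get(dir) {
            return Arc::clone(chain);
        }
        let chain = Arc::new(discover(dir));
        chains.insert(dir.to_path_buf(), Arc::clone(&chain));
        chain
    }
}
```

Files in the same directory then share one `Arc` to the same chain, and a directory with no applicable `.editorconfig` simply caches an empty vector.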
Incremental cache
The fmt incremental cache is keyed on file content plus the batch-level fmt options. Because `.editorconfig` resolves per-file options that the batch-level key does not capture, those resolved options are folded into the cached hash for each file. Editing an `.editorconfig` value therefore invalidates the cached "already formatted" result even when the file body is unchanged, so a subsequent `--check` re-evaluates the file rather than returning a stale pass. Files not governed by any `.editorconfig` keep their existing cache entries and hash the file text as-is (no extra allocation).
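The idea of folding per-file options into the cache key can be illustrated as follows. This is a sketch under assumptions: `ResolvedOptions` and `cache_key` are hypothetical, and the standard library's `DefaultHasher` is used only for illustration, not because the fmt incremental cache uses it.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical per-file options resolved from `.editorconfig`.
#[derive(Hash, Default, PartialEq)]
struct ResolvedOptions {
    use_tabs: Option<bool>,
    indent_width: Option<u8>,
    line_width: Option<u32>,
}

/// Hash the file text and, when `.editorconfig` set anything for this
/// file, the resolved options as well.
fn cache_key(text: &str, options: &ResolvedOptions) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    // Files with no resolved options hash exactly like plain text,
    // so their existing cache entries stay valid.
    if *options != ResolvedOptions::default() {
        options.hash(&mut hasher);
    }
    hasher.finish()
}
```

With this scheme, changing `indent_size` from 4 to 2 in `.editorconfig` produces a different key for an unchanged file body, so the cached "already formatted" result is no longer found.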
Test coverage
- `cli/tools/fmt_editorconfig.rs`: 16 unit tests covering the parser, glob translation, and `apply_to`/precedence behavior.
- `tests/specs/fmt/editorconfig`: 7 spec tests:
  - `infers_indent_size`: `.editorconfig` sets `indent_size = 4` and `fmt --check` passes on a 4-space file.
  - `infers_indent_size_negative`: `--indent-width=2` on the CLI overrides `.editorconfig`, so check fails for the same file.
  - `infers_use_tabs`: `indent_style = tab` in a subdirectory.
  - `infers_max_line_length`: `max_line_length = 200` lets a long line through that would otherwise wrap.
  - `deno_json_takes_precedence`: explicit `indentWidth: 2` in `deno.json` is not overridden by `.editorconfig`'s `indent_size = 4`.
  - `nested_overrides_parent`: a nearer non-root `.editorconfig` overrides a farther `root = true` one (exercises the walk-up and nearer-overrides-farther merge order).
  - `logs_at_debug`: `-L debug` prints the "found and using it" notice.

Robustness
The `.editorconfig` glob translator and value parser degrade gracefully on malformed or adversarial input rather than crashing:
- Numeric range `{n..m}` expansion is bounded, so a pathological span such as `{1..1000000000}` does not build a giant regex; oversized ranges fall back to a literal that simply will not match.
- Brace-nesting recursion depth is capped, so deeply nested alternations like `{a,{a,{a,...}}}` cannot overflow the stack.
- `indent_size`/`tab_width`/`max_line_length` parse with saturating integer conversion, clamping out-of-range values instead of silently dropping them.
- `[`, `]`, `{`, and `}` are escaped in literal output, so degraded/unbalanced patterns still compile to a valid regex (a section that fails to compile is simply inert).
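As an illustration of the bounded `{n..m}` expansion, the sketch below refuses to expand spans above a fixed cap. The cap value, the constant `MAX_RANGE_ITEMS`, and the function `expand_numeric_range` are assumptions; the PR's actual limit and fallback handling are not shown in this thread.

```rust
/// Hypothetical cap on how many alternatives one `{n..m}` may produce.
const MAX_RANGE_ITEMS: i64 = 1_000;

/// Expand `{n..m}` into a regex alternation, or return `None` when the
/// span is too large so the caller can fall back to a literal.
fn expand_numeric_range(start: i64, end: i64) -> Option<String> {
    let (lo, hi) = if start <= end { (start, end) } else { (end, start) };
    // `checked_sub` guards against overflow for extreme bounds.
    let span = hi.checked_sub(lo)?;
    if span >= MAX_RANGE_ITEMS {
        return None;
    }
    let items: Vec<String> = (lo..=hi).map(|n| n.to_string()).collect();
    Some(format!("(?:{})", items.join("|")))
}
```

A small range such as `{1..3}` expands to a three-way alternation, while `{1..1000000000}` returns `None` without allocating anything proportional to the span.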
Notes
`.editorconfig` is currently always read when present. Prettier matches this behavior; if maintainers prefer an opt-in (e.g. a fmt config flag or an unstable gate) I'll add it.
Closes bartlomieju/orchid-inbox#77
Closes #14717