Skip to content

Simplify formatter architecture#1165

Open
sferik wants to merge 11 commits intomainfrom
simplify-formatter-architecture
Open

Simplify formatter architecture#1165
sferik wants to merge 11 commits intomainfrom
simplify-formatter-architecture

Conversation

@sferik
Copy link
Copy Markdown
Collaborator

@sferik sferik commented Apr 5, 2026

This PR builds on #1164 to separate data from presentation in the coverage report by making the HTML formatter a client-side JavaScript (TypeScript) app that reads structured JSON data, rather than a server-side Ruby ERB renderer.

Both formatters share a single entry point for data serialization: JSONFormatter.build_hash(result), which converts a SimpleCov::Result into a Ruby hash with no side effects. The JSON formatter calls build_hash and writes the result to coverage.json. The HTML formatter calls the same method, wraps the result as window.SIMPLECOV_DATA = {...}; in a JavaScript file, and copies pre-compiled static assets (index.html, application.js, application.css, and favicons) into the output directory. There is no ERB and no Ruby-side rendering.

The JavaScript app reads window.SIMPLECOV_DATA on page load and renders everything client-side. Source files are rendered on demand when the user clicks into them.

The ERB templates, Ruby view and coverage helpers, ResultExporter, and SourceFileFormatter have been deleted, removing about 2,700 lines of code. In their place, a static index.html and client-side rendering logic in TypeScript add about 1,500 lines, for a net reduction of roughly 1,200 lines.

The JSON output has been enriched with source code, coverage counts, branch display metadata, group file membership, and result metadata. The formatter class hierarchy has been flattened from 5 classes to 3: HTMLFormatter, JSONFormatter, and ResultHashFormatter.

This gives a clean separation of data and presentation. The HTML report can be regenerated from a saved coverage.json without re-running tests via HTMLFormatter#format_from_json. The coverage.json format becomes a self-contained data API that other tools and third-party formatters can consume directly. The static HTML, JS, and CSS assets are pre-compiled once via rake assets:compile and ship with the gem. No runtime template rendering is needed.

This comment was marked as resolved.

@sferik sferik force-pushed the simplify-formatter-architecture branch 4 times, most recently from 1478ee1 to 85a78d1 Compare April 7, 2026 02:36
Comment thread spec/fixtures/json/sample_with_branch.json
Comment thread lib/simplecov/formatter/json_formatter.rb Outdated
Replace server-side ERB template rendering with a static single-page
JavaScript app. The HTML formatter now writes coverage_data.js
(a thin JS wrapper around the JSON coverage data) and copies
pre-compiled static assets. All rendering — file lists, source code
views, coverage bars, group tabs — happens in the browser.

This cleanly separates data (JSON) from presentation (JS/HTML/CSS):

* Delete ERB templates, Ruby view helpers, and coverage helpers
* Move all rendering logic into TypeScript (app.ts)
* Add static index.html as a pre-compiled asset via rake assets:compile
* Flatten formatter class hierarchy (5 classes → 3)
* Introduce JSONFormatter.build_hash for shared, side-effect-free
  serialization used by both formatters
* Eliminate double JSONFormatter execution when both formatters
  are configured
@sferik sferik force-pushed the simplify-formatter-architecture branch from 85a78d1 to 827afd3 Compare April 17, 2026 00:56
sferik added 2 commits April 16, 2026 18:29
Per-file coverage keys, group file lists, and minimum_coverage_by_file
error keys now use `project_filename` (root-relative) instead of
`filename` (absolute). Consumers already have `meta.root` if they
need to reconstruct an absolute path.

The frontend's `shortenFilename` simplifies to stripping the leading
`/` — no more root parameter threaded through the render pipeline.
Record the time each JSONFormatter process loads as `PROCESS_START_TIME`
and, before writing `coverage.json`, compare the existing file's
`meta.timestamp` against it. If the file on disk was written *after*
this process started, another test run likely produced it — warn that
the overwrite will lose their data and point at `SimpleCov::ResultMerger`.

The check is zero-cost for the common case (no existing file, or an
older one from a prior sequential run) and has no state beyond the
existing `meta.timestamp` field.

Sequential re-runs do not warn: the prior file's timestamp predates
the new process's start. parallel_tests-style setups, where workers
share `PROCESS_START_TIME` but finish at different times, do warn
once the later writers reach format time.
@sferik sferik force-pushed the simplify-formatter-architecture branch from 7b7f797 to f953a80 Compare April 17, 2026 01:33
@sferik sferik requested a review from brynary April 17, 2026 14:22
@brynary
Copy link
Copy Markdown
Contributor

brynary commented Apr 17, 2026

I had left two comments but with the new GH UI I can no longer find them:

  1. I'm unsure if the shortenFilename function may need to account for backslashes on Windows. (Not my jam, just caught my eye.)

  2. I would strongly recommend that the relative paths not be prefixed with a forward slash /. Consuming tools will likely interpret a forward-slash prefixed path as absolute and do the wrong thing.

The paths returned by `SourceFile#project_filename` are described as
"relative to the projects directory," but they began with a `/` because
stripping `SimpleCov.root` left the separator behind. Consumers of
`coverage.json` (and the `minimum_coverage_by_file` error line) can
reasonably interpret a `/`-prefixed path as absolute and resolve it
incorrectly.

Strip the leading `/` or `\` in `project_filename`, adjust the built-in
regex profiles (`hidden_filter`, `test_frameworks`, `rails`) and
`StringFilter#segment_pattern` to match segment boundaries without
requiring a leading slash, refresh the JSON fixtures and specs, and
remove the now-unnecessary `shortenFilename` helper from the HTML
frontend (which also sidesteps any Windows backslash concern).

Anchored user-supplied RegexFilters that relied on a leading `/`, like
`%r{^/lib/}` need to be rewritten as `%r{\Alib/}`. Noted in CHANGELOG.
@sferik
Copy link
Copy Markdown
Collaborator Author

sferik commented Apr 17, 2026

@brynary thanks for the second round for feedback.

SourceFile#project_filename now strips the leading separator so it returns a truly relative path (lib/foo.rb instead of /lib/foo.rb). That flows through to the coverage.json keys, the group file lists, the errors.minimum_coverage_by_file.* keys, and the CLI "coverage by file … is below … in " message.

Since paths are now genuinely relative, I removed the shortenFilename helper in the frontend entirely. This sidesteps the Windows backslash concern you flagged. The regex in project_filename strips / or \, so the frontend just uses the JSON values as-is.

Users with their own anchored RegexFilters like %r{^/lib/} will need to rewrite them as %r{\Alib/}. I called this out in the CHANGELOG. This could break a lot of apps, but it’s an easy fix and I agree that it’s more correct and now (ahead of the 1.0 release) is the time to make that change.

The concurrent-overwrite warning compares the timestamp in an existing
`coverage.json` against `PROCESS_START_TIME`. `Time.now` has sub-second
precision, but `iso8601` (no arg) truncates to whole seconds on the way
out, so a sibling writer landing inside the same wall-clock second as
when we started could slip past the check.

Switch `meta.timestamp` to `iso8601(3)` so the round-trip keeps
millisecond fidelity, and update the JSON fixtures and the
`json_formatter_spec` assertion to match.
The default HTMLFormatter already writes coverage.json, so the env-var
trigger that pushed JSONFormatter onto the formatter list is redundant,
it just caused a duplicate write of the same file. With the special case
gone, `SimpleCov::Formatter.from_env` has no reason to exist, so the
default_formatter module and its spec are deleted and defaults.rb sets
HTMLFormatter directly.
Copy link
Copy Markdown
Collaborator

@PragTob PragTob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, I want to thank you for your work you put into this. 💚

That said, I'm still not a fan of the approach of letting the web frontend read the JSON again:

This moves a lot of core logic of simplecov to TypeScript, which for a Ruby project makes it less friendly to contribute to and fix (while many Ruby folks know JS/TS, not all do and often not as well).

I see the point of rendering on just data/a known data exchange format. But, since we already have the data loaded in Ruby with helpers et. al. it's just as fine to render it from Ruby in my opinion - there are no extra steps necessary (we're not talking to a REST API or something, but dealing with data we just gathered in our own process).

Rendering this out on the SimpleCov side may be faster than the old HTML renderer. That said, I liked that it was done then and not much rendering had to be done on page load. Call me old-fashioned but for a mostly static report I appreciate most of it being rendered once.

I was gonna say "but I can't argue with the reduction in loc", but I think the following claim is misleading:

The ERB templates, Ruby view and coverage helpers, ResultExporter, and SourceFileFormatter have been deleted, removing about 2,700 lines of code. In their place, a static index.html and client-side rendering logic in TypeScript add about 1,500 lines, for a net reduction of roughly 1,200 lines.

It makes it seem like the new approach is much more efficient in terms of line of code, that is not true.
The reduction comes from deleting the tests, not the source files. A net of 1700+ of the deletions is the test files for the removed ruby code, the net of the removed ruby code being ~400 loc (less than the TS code added, although granted a lot of that is also the md5 function so I suppose we can call it equal).

FOLDER                         ADDS       DELS
html_frontend                  678        57
Rakefile                       4          1
CHANGELOG.md                   3          0
spec                           290        81
features                       44         78
lib                            240        649
README.md                      0          3
test                           182        1926

The JS code also no equivalent tests, so for an argument on lines deleted/added we could have deleted just the ruby tests and had an equivalent reduction.

I didn't write nor review many of the html formatter tests, but this also means less tests to rely on/more reliance on the feature tests.

I'd prefer to keep the rendering logic on the ruby side, mostly to make it easier for people to debug and contribute - while having unit tests for it is also nice (I know we could also write some for the TS side).

I'll repeat what I said before, that's my opinion but imo I haven't done enough work on SimpleCov recently to push on that.

Other comments are mostly optional, although I'd be interested on the time stamp one.

Image

Comment thread features/support/env.rb
Comment thread html_frontend/src/app.ts Outdated
Comment thread html_frontend/src/app.ts Outdated
Comment thread html_frontend/src/app.ts
Comment thread html_frontend/src/app.ts
Comment thread lib/simplecov/formatter/json_formatter.rb Outdated
Comment thread CHANGELOG.md Outdated
sferik added 5 commits April 19, 2026 15:39
The frontend only needs a stable, fixed-length, URL/HTML-safe ID per
filename, not a cryptographic digest. The 85-line inline MD5 routine
existed solely to reproduce Digest::MD5.hexdigest from the old Ruby-side
renderer.

Drop it in favour of crypto.subtle.digest('SHA-1') in its own module,
truncated to 8 hex characters. Web Crypto is async, so resolve every
file's id once at startup (precomputeFileIds) and let the rendering
pipeline keep its synchronous fileId() lookups.
Previously the frontend recomputed group/all-files totals by summing
per-file fields, and computed each per-file total_lines as
covered_lines + missed_lines. The JSON formatter already exposed full
stats for groups and the overall total — the duplication invited the
frontend and the formatter to drift on rounding or definitions.

Add total_lines to per-file output so it's no longer derived, and pass
the precomputed total/group StatGroups into renderFileList so the
totals row reads them directly. The formatter is now the single
source of truth for these metrics.
Was a JSONFormatter constant initialized at class-load time, which
could be much later than when coverage actually started (e.g., lazy
require) and lived on a formatter rather than on SimpleCov itself.

Move it to a SimpleCov.process_start_time accessor set in
SimpleCov.start, so it represents what it claims to. JSONFormatter
reads it from there; if start was never called, we skip the
concurrent-overwrite check rather than warn against a fake reference.
@sferik
Copy link
Copy Markdown
Collaborator Author

sferik commented Apr 19, 2026

@PragTob Thanks for the thoughtful review. Your inline comments were very helpful and I’ve addressed all of them with code. Your broader points were also good, so let me try to address those…

On contributor-friendliness: just because SimpleCov is a tool for the Ruby community, I don’t think 100% of the code needs to be in Ruby/ERB. In fact, this has never been the case; there has always been some amount of JavaScript in SimpleCov. The difference is that now the Ruby/JS boundary corresponds to a real architectural seam rather than arbitrary file boundaries. Anyone fixing how coverage is measured or attributed still works exclusively in Ruby. And thanks to your feedback, fffeaf5 moves the last bits of frontend recomputation back into ResultHashFormatter, so the formatter is the single source of truth for every derived metric.

On the render-once-vs-on-page-load preference: you’re right that the TS does run at page load, and I don't want to pretend that’s not a trade. But I think the “render once” intuition is sized for the old ERB renderer, where rendering meant computing all the metrics. The TypeScript code isn’t doing that. The JSON formatter already has, so the TS just walks the data and builds DOM. Even on large reports it’s milliseconds. What we get in exchange is HTMLFormatter#format_from_json, which can rebuild the HTML view from any existing coverage.json. The same thing benefits any third-party tool: services, custom formatters, dashboards now get the same payload the HTML view does, instead of having to re-derive the numbers themselves. Previously, the output of JSONFormatter and the HTML report had drifted in shape, this architecture ensures that won’t happen again in the future.

You were also right that the LOC framing in the description was misleading. I shouldn’t have leaned on it. Honestly, the TS is roughly comparable in size to the deleted ERB + view helpers, but the report itself does quite a bit more (sortable/filterable tables, keyboard nav, in-page filename filter, dark mode, on-demand source rendering). The actual win isn't fewer lines, it’s one data path for many consumers.

The test-coverage point is fair too. The data layer is still fully unit-tested in Ruby, and the HTML formatter tests now assert against the parsed JSON payload, but you're right that there’s no direct unit coverage for the TS modules. Cucumber/Capybara is the only thing exercising the rendered DOM. If that’s a gate for you before 1.0, I'm happy to add Vitest specs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants