diff --git a/.agents/prompts/cg-htmlcss-feature.md b/.agents/prompts/cg-htmlcss-feature.md
new file mode 100644
index 000000000..b67bd08e6
--- /dev/null
+++ b/.agents/prompts/cg-htmlcss-feature.md
@@ -0,0 +1,316 @@
+# cg-htmlcss — feature loop prompt
+
+**What this is.** A pastable prompt template for driving a single CSS
+feature forward in the cg htmlcss renderer. Paste the template at the
+bottom into a new task; the reference above it is context an agent
+can read to follow the loop honestly.
+
+**Why this is a prompt and not a skill.** The 5-phase loop is
+deliberately heavy — audit + ground + fixture + implement + verify.
+It's overkill for small fixes, and it's already a conductor over
+`/research`, `/fixtures`, `/cg-reftest`, which auto-trigger correctly
+on their own. Opt-in invocation is right: paste it when you want the
+full cycle; skip it for paper-cuts.
+
+**Lifecycle.** Expect this file to grow as new divergence patterns
+surface. It will likely go stale in parts once htmlcss hits
+Chromium-parity on L0/L1; treat the _phase structure_ as durable and
+the _property-specific callouts_ as advisory.
+
+---
+
+## The five phases
+
+```text
+┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐
+│ 1. AUDIT │→ │2. GROUND │→ │3. FIXTURE│→ │ 4. IMPL  │→ │5. VERIFY │
+└──────────┘  └──────────┘  └──────────┘  └──────────┘  └──────────┘
+    │                                                         │
+    └───── ← ─── loop ← ─── score < floor ← ─── diff ← ───────┘
+```
+
+Each phase has a **question it answers**, a **deliverable**, and an
+**exit criterion**. Don't skip forward; don't linger past the exit
+criterion. The loop closes at verify — if the score is below the
+gate, return to phase 3 or 4 with a specific hypothesis, not a vibe.
+
+### 1. Audit — "what's the actual state of this feature?"
+
+**Question.** Where is the feature on the cg side today? What
+renders wrong, what doesn't render at all, what renders
+coincidentally-correctly but by the wrong path?
+
+**Actions.**
+
+- Scan `crates/grida-canvas/src/htmlcss/` for the property name in
+  stylo enum mapping, paint emit, layout feed. A property can be
+  parsed-but-dropped, emitted-but-wrong, or unhandled — each has a
+  different fix shape.
+- Enumerate existing fixtures that touch the feature
+  (`fixtures/test-html/L0/`). Run them under `L0.coverage` and
+  record current similarity per fixture. This is the before-number.
+- Check `docs/wg/feat-2d/htmlcss.md` and any related design notes
+  for a prior decision or deliberate gap.
+- List sibling properties likely to break the same way (e.g. a bug
+  in `border-radius` %-values suggests the same bug in
+  `border-image-slice` %-values).
+
+**Deliverable.** A short audit note inside the task prompt or the
+PR draft:
+
+- _Current support level_: not-parsed / parsed-but-dropped /
+  partial / Chromium-parity-except-X.
+- _Fixtures touching it_: list with current similarity scores.
+- _Priority bucket_: easy-and-important / easy-low-value /
+  hard-important / hard-low-value. Default to easy-and-important;
+  only take on hard-important when explicitly called out.
+
+**Exit when.** You can state the feature's current renderer state
+in one paragraph with file references. If you can't, you don't
+know enough yet — read more, don't guess.
+
+### 2. Ground — "how do real engines solve this?"
+
+**Question.** What's the canonical implementation strategy for
+this feature in a mature engine? We are not inventing; we are
+adapting.
+
+**Actions.** Invoke `/research`. Three engines are the usual
+references:
+
+- **Servo + stylo** — Rust, most readable. Especially useful for
+  parsing, cascade, inheritance, computed-value rules.
+- **Chromium / Blink** — C++. Authoritative for layout and paint
+  divergence calls. The renderer we diff against.
+- **WebKit** — C++. Third voice; useful when Blink has
+  controversial behavior (Safari-only bugs / features).
+
+For a new property, read the **spec first** (CSS Backgrounds,
+CSS Display, CSS Values 4, etc.). Then look up:
+
+- How stylo represents the property's computed value.
+- How Blink paints or lays out against that representation.
+- What WPT section exercises it (for free fixtures later).
+
+**Deliverable.** A research note — either inline in the PR
+description or under `docs/wg/feat-2d/` if substantial — with:
+
+- The spec section(s) that govern behavior.
+- A 3–6 line summary of how stylo/Blink structure the solution.
+- The explicit deviation, if any, and why.
+
+**Exit when.** You can defend the implementation shape by pointing
+at prior art, not just "it compiles and the fixture passes." If
+the only justification is the fixture, you've over-fit.
+
+### 3. Fixture — "what's the smallest test that proves it?"
+
+**Question.** What HTML/CSS input demonstrates the feature
+unambiguously, and what does the ideal rendered output look like?
+
+**Actions.** Invoke `/fixtures` for authoring rules; `/cg-reftest`
+for the suite manifest. In short:
+
+- One concept per file. `paint-<property>-<variant>.html` naming.
+- Probe-friendly palette (≤3 colors, round coordinates) when the
+  feature is pixel-precision rather than paint-rich.
+- **Paint vs. layout decision.** Paint fixtures fix body size to
+  the preset (via `min-height`); layout fixtures let content size
+  itself and carry an explicit `viewport` in the suite entry.
+  See `fixtures/test-html/README.md`.
+- Inject `hide-text.css` via `extra_css` when text is incidental
+  (labels for humans, not the subject under test). This is the
+  single biggest lever against noise.
+- WPT fixtures are fair game — prefer pulling an established WPT
+  test into the suite over authoring one from scratch when the
+  section is mature.
+
+**Deliverable.**
+
+- One or more fixtures under `fixtures/test-html/L0/`.
+- Entries in `fixtures/test-html/suites/L0.coverage.json`. Only
+  put in `L0.exact.json` after verify phase confirms 100.00%.
+- For layout fixtures: the measured `viewport.height` from the cg
+  natural cull.
+
+**Exit when.** The fixture runs through both producers and
+produces PNGs of identical dimensions. Dimension mismatch → stop;
+the suite config is wrong and the score will be zero.
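+
+The dimension check above can be scripted before any diff runs. A
+minimal sketch, assuming standard PNG files: `pngSize` and
+`sameDimensions` are hypothetical helper names, not part of
+`@grida/reftest`. It reads width and height straight from the IHDR
+chunk, which the PNG format stores at byte offsets 16 and 20 as
+big-endian 32-bit integers.
+
+```typescript
+// Hypothetical helpers (not part of @grida/reftest): compare PNG dimensions
+// by reading the IHDR chunk directly, so no image-decoding library is needed.
+
+const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
+
+interface Size {
+  width: number;
+  height: number;
+}
+
+function pngSize(bytes: Uint8Array): Size {
+  if (bytes.length < 24) throw new Error("too short to be a PNG");
+  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
+    if (bytes[i] !== PNG_SIGNATURE[i]) throw new Error("not a PNG file");
+  }
+  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
+  // IHDR data starts at offset 16: width, then height, both big-endian u32.
+  return { width: view.getUint32(16), height: view.getUint32(20) };
+}
+
+// True when both producers emitted images of identical width and height.
+function sameDimensions(actual: Uint8Array, expected: Uint8Array): boolean {
+  const a = pngSize(actual);
+  const e = pngSize(expected);
+  return a.width === e.width && a.height === e.height;
+}
+```
+
+In Node, a `Buffer` from `fs.readFileSync` is a `Uint8Array`, so it
+can be passed to `pngSize` directly.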
+
+### 4. Implement — "what code change realizes the behavior?"
+
+**Question.** What is the minimum set of edits in
+`crates/grida-canvas/src/htmlcss/` to make the fixture match?
+
+**Actions.**
+
+- Touch the smallest surface that can possibly work. Avoid
+  "refactor + feature" in one commit; the reftest cannot tell you
+  which change caused which delta.
+- Trace the pipeline end-to-end for the property:
+  parse → compute → layout feed → paint. A feature can fail at
+  any stage; diagnose before editing.
+- Add unit tests where behavior is data-assertable (computed
+  value, resolved length, layout position). Data tests are free
+  and catch regressions the reftest can't (e.g. "this resolves
+  to `12px` in _both_ Chromium and us, for the right reason").
+- When in doubt, mirror the Blink / stylo structure. Deviations
+  cost reviewer attention; prior-art parity is free.
+
+**Deliverable.**
+
+- Code change scoped to the feature.
+- Any new data tests for the computed-value surface.
+- A one-line entry in the PR description for each user-facing
+  behavior change, written in spec terms, not implementation
+  terms.
+
+**Exit when.** `cargo check -p cg` is clean, existing tests pass,
+and the fixture renders through `golden_htmlcss --suite` without
+error. Similarity score is measured in phase 5 — do not gate on
+it here.
+
+### 5. Verify — "does it actually match Chromium?"
+
+**Question.** Is the rendered output Chromium-parity at the
+fixture's tolerance gate?
+
+**Actions.** This is `/cg-reftest`'s core loop. For each fixture
+in the change:
+
+1. Render expecteds (Playwright Chromium) into
+   `target/refbrowser/<suite>/expected`.
+2. Render actuals
+   (`cargo run -p cg --example golden_htmlcss -- --suite …`).
+3. Diff with `@grida/reftest`, threshold 0 (the strict default).
+4. Read similarity against the suite's `gate.floor`.
+
+**Don't trust the score naively** — see "Reading the score" in the
+cg-reftest skill. A 96% score on a sparse fixture can mask a
+completely broken subject. Eyeball the diff PNG every time. A
+single round of verification without visual inspection is not
+verification.
+
+**Close the loop:**
+
+- Score ≥ `gate.floor`? Promote the fixture to `L0.exact.json`
+  if it reached 100.00%; otherwise leave in coverage and document
+  the residual delta in the PR description.
+- Score < floor? Return to phase 3 (fixture too noisy / wrong
+  subject) or phase 4 (renderer bug) with a specific hypothesis.
+  Do _not_ lower the gate to fit the result; the gate exists so
+  regressions are loud.
+
+**Deliverable.** The PR description, written honestly:
+
+- Before/after similarity numbers for every affected fixture.
+- Diff PNGs attached or linked for any score < 1.0.
+- The specific divergence surface (rounding, AA, layout math,
+  etc.) if below 100.00%. "Renderer choice differs from Blink at
+  <specific pixel class>" beats "close enough."
+
+**Exit when.** Someone who has never seen the code can read the PR
+description and know exactly what's now supported, what's still
+broken, and what the score proves.
+
+---
+
+## Handoffs and artifacts
+
+The phases are designed so an agent can stop, a second agent can
+pick up, and no context is lost. The durable artifacts:
+
+| Phase     | Artifact                                                 | Location                                               |
+| --------- | -------------------------------------------------------- | ------------------------------------------------------ |
+| Audit     | Current-state note, priority bucket                      | PR description / task prompt                           |
+| Ground    | Research note (spec + engine cross-ref)                  | PR description or `docs/wg/feat-2d/`                   |
+| Fixture   | `.html` fixture(s), suite entries, viewport measurement  | `fixtures/test-html/L0/`, `fixtures/test-html/suites/` |
+| Implement | Code change, data tests, behavior summary                | `crates/grida-canvas/src/htmlcss/`                     |
+| Verify    | Before/after scores, diff PNG review, divergence surface | PR description                                         |
+
+If a phase's artifact is missing, the phase isn't done — even if
+the code "works."
+
+---
+
+## Gate policy — the part that makes automation safe
+
+The only reason this loop can be automated is that phase 5 has a
+**numeric, unambiguous, byte-exact** pass condition. Everything
+upstream is advisory; verify is the truth.
+
+- `L0.exact.json`: `gate.floor = 1.0`, `threshold = 0`, `aa = off`.
+  Any regression means our renderer now behaves differently from
+  Blink. No tolerance inflation, ever.
+- `L0.coverage.json`: informational scores, no gate. Landing a
+  fixture here is "we know about this case and intend to fix it."
+  Promoting to exact is "we now match Blink."
+
+Automation rules downstream of this prompt (CI gating, auto-merge,
+etc.) must assert on the `report.json` emitted by `@grida/reftest`
+and **not** on free-text agent assertions. The agent's job is to
+drive the loop; the report is the contract.
+
+### What "destructive" means here
+
+A change is destructive if it:
+
+- Lowers `gate.floor` in `L0.exact.json`.
+- Removes an entry from `L0.exact.json` without a corresponding
+  `coverage` entry (or documented reason).
+- Increases `--threshold` or enables `--aa` to absorb real
+  divergence.
+- Suppresses a fixture to dodge a failing score.
+
+None of these are acceptable without explicit human approval. The
+loop fails loudly instead.
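+
+Part of this list can be checked mechanically by comparing the exact
+suite before and after a change. A minimal sketch, assuming the suite
+JSON shape used by `L0.exact.json` (`gate` plus `fixtures[].path`);
+`destructiveChanges` is a hypothetical helper, not an existing tool.
+It only inspects suite-level gate fields and removed entries. It
+cannot see CLI flags passed at diff time, a "documented reason", or
+other ways of suppressing a fixture, so human review still applies.
+
+```typescript
+// Hypothetical check (not an existing tool): flag destructive edits to the
+// exact suite by comparing its JSON before and after a change.
+
+interface Gate {
+  threshold: number;
+  aa: boolean;
+  floor: number;
+}
+
+interface Suite {
+  gate: Gate;
+  fixtures: { path: string }[];
+}
+
+function destructiveChanges(
+  before: Suite,
+  after: Suite,
+  coveragePaths: string[], // fixture paths currently listed in the coverage suite
+): string[] {
+  const problems: string[] = [];
+  if (after.gate.floor < before.gate.floor) problems.push("gate.floor lowered");
+  if (after.gate.threshold > before.gate.threshold) problems.push("gate.threshold raised");
+  if (after.gate.aa && !before.gate.aa) problems.push("gate.aa enabled");
+
+  const kept = new Set(after.fixtures.map((f) => f.path));
+  const covered = new Set(coveragePaths);
+  for (const f of before.fixtures) {
+    // A removed exact entry is acceptable only if coverage still tracks it.
+    if (!kept.has(f.path) && !covered.has(f.path)) {
+      problems.push(`removed without coverage entry: ${f.path}`);
+    }
+  }
+  return problems;
+}
+```
+
+An empty result means none of the checked conditions fired; any
+entry in the list is a reason to stop and ask for human approval.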
+
+---
+
+## Anti-patterns
+
+| Anti-pattern                                    | Why it fails                                                                                | Instead                                                            |
+| ----------------------------------------------- | ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ |
+| Skipping audit, starting with "fix this bug"    | The bug is a symptom; the broken pipeline stage may be a different property.                | Trace parse→compute→layout→paint first. Name the stage.            |
+| Skipping ground, implementing from intuition    | CSS is full of non-obvious spec requirements. "Looks right" to a human ≠ spec-correct.      | Read the spec. Cross-ref one real engine.                          |
+| Combining refactor + feature in one PR          | Reftest deltas can't be attributed.                                                         | Land the refactor alone first (score must not drop).               |
+| Raising threshold to "just pass"                | Hides real bugs. Turns the harness into a rubber stamp.                                     | Fix the divergence. If out of scope, document + leave in coverage. |
+| Using text-heavy fixtures to test non-text feat | Font shaping noise dominates the score; you're measuring the wrong thing.                   | Inject `hide-text.css`. Or use probe-friendly fixtures.            |
+| Promoting to `exact` at 99.xx%                  | The exact suite is a byte-exact contract. Near-passes belong in coverage with a delta note. | Wait for 100.00%. Or fix the residual.                             |
+| Claiming "verified" without reading the diff    | A similarity score is a coarse index; the diff image is the truth.                          | Eyeball every sub-100 diff. Record the specific divergence.        |
+| Inventing new fixtures when WPT covers it       | Duplicates work; WPT has reviewed spec-intent pass criteria.                                | Import the WPT fixture; cite it in the suite entry.                |
+
+---
+
+## The template — paste this to kick off a cycle
+
+Fill in the brackets. The agent you hand it to should produce all
+five artifacts before declaring done. Expect to run the loop in
+passes (audit+ground+fixture → implement → verify), with a
+checkpoint at each pass that future-you or a reviewer can read
+without the conversation.
+
+```text
+Drive the htmlcss feature loop for: <property or behavior>.
+Follow .agents/prompts/cg-htmlcss-feature.md.
+
+Scope:
+- Feature:    <e.g. `border-radius` percentage values>
+- Hypothesis: <e.g. cg parses but drops %-values in the paint stage>
+- Expected:   <e.g. promote paint-border-radius.html to L0.exact>
+
+Produce, in order:
+
+1. Audit note:  current support level, file references, before-scores.
+2. Ground note: spec section(s), stylo/Blink strategy summary.
+3. Fixture(s): `.html` + suite entries. Paint or layout? Declare it.
+4. Implementation: minimal diff. Data tests where assertable.
+5. Verify report: before/after similarity per fixture, diff PNG
+   review for any sub-1.0 score, promoted fixtures listed.
+
+Gate: L0.exact must stay at floor 1.0, threshold 0, aa off. Do not
+relax the gate. If the feature doesn't reach 100.00%, leave it in
+coverage with a specific divergence-surface note.
+
+Use /research for phase 2, /fixtures for phase 3, /cg-reftest for
+phases 3 and 5.
+```
diff --git a/.agents/skills/cg-reftest/SKILL.md b/.agents/skills/cg-reftest/SKILL.md
index 998683440..5ecd4b19d 100644
--- a/.agents/skills/cg-reftest/SKILL.md
+++ b/.agents/skills/cg-reftest/SKILL.md
@@ -28,25 +28,26 @@ How to design, name, and review visual rendering tests in this repo.
 
 Use these terms precisely. Misusing them erodes trust in test results.
 
-| Term                               | Definition                                                                                                                                                                                                                                                            |
-| ---------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| **Reftest**                        | A test that compares renderer output against an **independent reference** (oracle) whose correctness is established outside this project — e.g. a W3C-provided PNG for an SVG test case. The oracle is the source of truth; a mismatch means our renderer is wrong.   |
-| **Independent reference / oracle** | A rendering produced by a separate, trusted implementation or defined by a specification. We do not control its content.                                                                                                                                              |
-| **Golden test**                    | A test that compares renderer output against a **previously accepted snapshot** produced by our own renderer. There is no external truth — the golden file _is_ the expected output because a human reviewed and approved it. Also called a snapshot test.            |
-| **Snapshot test**                  | Synonym for golden test. The snapshot is a frozen output that we assert has not changed.                                                                                                                                                                              |
-| **Render regression test**         | Any test whose purpose is to detect _unintended changes_ in rendering output. Golden tests are regression tests. Reftests are correctness tests.                                                                                                                      |
-| **Pixel diff**                     | Byte-level comparison of two raster images. A single differing channel value is a failure (at zero tolerance).                                                                                                                                                        |
-| **Perceptual diff**                | Comparison in a perceptual color space (e.g. YIQ via the `dify` crate). Weights differences by human visual sensitivity. More forgiving than raw pixel diff but still quantifiable.                                                                                   |
-| **rendiff**                        | Rust crate (`rendiff` v0.2) for histogram-based pixel diffing. Computes a per-channel difference histogram; thresholds are expressed as `[(max_diff, max_count), ...]` pairs. Used in `flatten_rendiff.rs` for equivalence tests. Dep in `crates/grida-canvas/`.      |
-| **dify**                           | Rust crate for perceptual image comparison in YIQ color space. Used by `grida-dev reftest` for SVG reftests. Supports `--threshold` and `--aa` (anti-aliasing detection) flags.                                                                                       |
-| **pixelmatch**                     | Pure-JS perceptual image comparison library. YIQ-based, AA-aware. Used by `@grida/reftest`. Zero native deps; same conceptual model as dify, slightly different threshold semantics — see parity notes below.                                                         |
-| **`@grida/reftest`**               | General-purpose, language-agnostic TS reftest CLI + library at `packages/grida-reftest/`. Takes two directories of PNGs, diffs, scores, writes the same bucket layout and JSON report as the Rust `grida-dev reftest`. Does NOT render anything — producers upstream. |
-| **`grida-dev reftest`**            | Rust reftest runner at `crates/grida-dev/src/reftest/`. SVG-specific: renders SVG via our own cg pipeline, then diffs against a reference PNG. Canonical for SVG. For non-SVG formats, use `@grida/reftest` with an upstream renderer.                                |
-| **refig**                          | Short for "Figma reftest." Fixture suites under `fixtures/local/refig/` containing `.fig` + `document.json` + `images/` + `exports/` (oracle PNGs from Figma's Images API). Consumed by a TS render step + `@grida/reftest`. See `fixtures/local/refig/README.md`.    |
-| **Tolerance / fuzz**               | A configured threshold below which pixel differences are ignored. Expressed as a histogram threshold (rendiff) or a YIQ distance (dify / pixelmatch). Required when rasterization is non-deterministic across platforms.                                              |
-| **Data test**                      | A test that asserts on the scene graph or computed values directly — no rendering needed. E.g. bounding box, resolved transform matrix, computed style. The cheapest possible assertion.                                                                              |
-| **Probe test**                     | A test that asserts correctness by reading pixel values at specific coordinates in the rendered output. Requires a purpose-built fixture with a minimal color palette and documented probe points. No full-image comparison needed.                                   |
-| **Probe-friendly fixture**         | A fixture explicitly designed for probe testing: minimal colors, no decorative elements, shapes at known coordinates. Often accompanied by a `.probe.json` file declaring expected pixel values at specific points.                                                   |
+| Term                               | Definition                                                                                                                                                                                                                                                                |
+| ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Reftest**                        | A test that compares renderer output against an **independent reference** (oracle) whose correctness is established outside this project — e.g. a W3C-provided PNG for an SVG test case. The oracle is the source of truth; a mismatch means our renderer is wrong.       |
+| **Independent reference / oracle** | A rendering produced by a separate, trusted implementation or defined by a specification. We do not control its content.                                                                                                                                                  |
+| **Golden test**                    | A test that compares renderer output against a **previously accepted snapshot** produced by our own renderer. There is no external truth — the golden file _is_ the expected output because a human reviewed and approved it. Also called a snapshot test.                |
+| **Snapshot test**                  | Synonym for golden test. The snapshot is a frozen output that we assert has not changed.                                                                                                                                                                                  |
+| **Render regression test**         | Any test whose purpose is to detect _unintended changes_ in rendering output. Golden tests are regression tests. Reftests are correctness tests.                                                                                                                          |
+| **Pixel diff**                     | Byte-level comparison of two raster images. A single differing channel value is a failure (at zero tolerance).                                                                                                                                                            |
+| **Perceptual diff**                | Comparison in a perceptual color space (e.g. YIQ via the `dify` crate). Weights differences by human visual sensitivity. More forgiving than raw pixel diff but still quantifiable.                                                                                       |
+| **rendiff**                        | Rust crate (`rendiff` v0.2) for histogram-based pixel diffing. Computes a per-channel difference histogram; thresholds are expressed as `[(max_diff, max_count), ...]` pairs. Used in `flatten_rendiff.rs` for equivalence tests. Dep in `crates/grida-canvas/`.          |
+| **dify**                           | Rust crate for perceptual image comparison in YIQ color space. Used by `grida-dev reftest` for SVG reftests. Supports `--threshold` and `--aa` (anti-aliasing detection) flags.                                                                                           |
+| **pixelmatch**                     | Pure-JS perceptual image comparison library. YIQ-based, AA-aware. Used by `@grida/reftest`. Zero native deps; same conceptual model as dify, slightly different threshold semantics — see parity notes below.                                                             |
+| **`@grida/reftest`**               | General-purpose, language-agnostic TS reftest CLI + library at `packages/grida-reftest/`. Takes two directories of PNGs, diffs, scores, writes the same bucket layout and JSON report as the Rust `grida-dev reftest`. Does NOT render anything — producers upstream.     |
+| **`grida-dev reftest`**            | Rust reftest runner at `crates/grida-dev/src/reftest/`. SVG-specific: renders SVG via our own cg pipeline, then diffs against a reference PNG. Canonical for SVG. For non-SVG formats, use `@grida/reftest` with an upstream renderer.                                    |
+| **refig**                          | Short for "Figma reftest." Fixture suites under `fixtures/local/refig/` containing `.fig` + `document.json` + `images/` + `exports/` (oracle PNGs from Figma's Images API). Consumed by a TS render step + `@grida/reftest`. See `fixtures/local/refig/README.md`.        |
+| **refbrowser**                     | Short for "headless-browser reftest." HTML/CSS fixtures under `fixtures/test-html/L0/` rendered by Playwright Chromium as the oracle vs. our `cg` htmlcss renderer. Producer script: `.agents/skills/cg-reftest/scripts/refbrowser_render.ts`; diff via `@grida/reftest`. |
+| **Tolerance / fuzz**               | A configured threshold below which pixel differences are ignored. Expressed as a histogram threshold (rendiff) or a YIQ distance (dify / pixelmatch). Required when rasterization is non-deterministic across platforms.                                                  |
+| **Data test**                      | A test that asserts on the scene graph or computed values directly — no rendering needed. E.g. bounding box, resolved transform matrix, computed style. The cheapest possible assertion.                                                                                  |
+| **Probe test**                     | A test that asserts correctness by reading pixel values at specific coordinates in the rendered output. Requires a purpose-built fixture with a minimal color palette and documented probe points. No full-image comparison needed.                                       |
+| **Probe-friendly fixture**         | A fixture explicitly designed for probe testing: minimal colors, no decorative elements, shapes at known coordinates. Often accompanied by a `.probe.json` file declaring expected pixel values at specific points.                                                       |
 
 ---
 
@@ -266,6 +267,301 @@ is the threshold where the renderer needs attention.
   `document.json` — not from a suite-wide viewport. The render step
   must honor each node's preset.
 
+### HTML/CSS — the refbrowser reftest pipeline
+
+HTML/CSS fixtures have **no pre-baked oracle** — the oracle is a real
+browser engine. Like refig, refbrowser renders the same fixture in two
+places and diffs the PNGs; unlike refig, both renders are reproducible
+locally (no cloud round-trip).
+
+```text
+fixtures/test-html/
+├── L0/<name>.html                 ── fixtures
+├── _reftest/hide-text.css         ── shared helper stylesheets
+└── suites/
+    ├── L0.exact.json              ── must pass 100.00%; CI gate
+    └── L0.coverage.json           ── aspirational scope; tracks progress
+
+        │
+        ├── cargo run -p cg --example golden_htmlcss -- --suite <suite>
+        │       └─► $TMPDIR/grida-htmlcss-goldens/<name>.png   (cg actual)
+        │
+        └── refbrowser_render.ts --suite <suite>
+                └─► target/refbrowser/<suite>/expected/<name>.png   (Chromium oracle)
+
+                        ▼
+                reftest --actual-dir … --expected-dir … --threshold 0
+                        └─► target/reftests/<suite>/report.json + buckets
+```
+
+**Oracle**: headless Chromium via Playwright. We treat Chromium's Blink
+as the reference for CSS behavior; divergence from Blink is a gap in
+our `cg` htmlcss pipeline (or a known difference documented in
+`docs/wg/feat-2d/htmlcss.md`).
+
+> **See also: web-platform-tests (WPT).** The W3C's
+> [wpt.live](https://wpt.live) suite is the standards-body reftest
+> harness — same concept as refbrowser, but cross-engine (Blink,
+> WebKit, Gecko) and backed by spec-author-written fixtures with
+> explicit pass criteria. Consider pulling WPT fixtures into
+> `fixtures/test-html/` when a CSS feature has a mature WPT section
+> and you want spec-conformance signal rather than just "matches
+> Chromium." Out of scope for this skill today; refbrowser is the
+> faster local loop.
+
+#### Suites: `L0.exact` vs `L0.coverage`
+
+Everything is driven by **suite JSON files** at
+`fixtures/test-html/suites/`. A suite enumerates fixtures, their
+per-fixture render config, and the gate policy.
+
+| Suite              | What it contains                                                                              | Gate                       |
+| ------------------ | --------------------------------------------------------------------------------------------- | -------------------------- |
+| `L0.exact.json`    | Fixtures currently at 100.00% byte-exact parity with Chromium. Any drop is a real regression. | `floor: 1.0`, strict diff. |
+| `L0.coverage.json` | All aspirational L0 fixtures — the full backlog. Scores land wherever they land.              | Informational only.        |
+
+**Promoting a fixture to `exact`** — once a fixture reaches 100.00%
+against the current suite config, move its entry from `coverage` →
+`exact`. Do **not** lower the exact suite's floor to fit new entries;
+the bar exists so regressions are loud.
+
+Per-fixture `.reftest.json` sidecars **do not exist** anymore. All
+config lives in the suite file.
+
+#### Suite JSON shape
+
+```json
+{
+  "name": "L0.exact",
+  "description": "Byte-exact fixtures; any drop = regression.",
+  "gate": { "threshold": 0, "aa": false, "floor": 1.0 },
+  "defaults": {
+    "wait_for": ["fonts", "networkidle"],
+    "extra_css": ["../_reftest/hide-text.css"],
+    "full_page": true
+  },
+  "fixtures": [
+    {
+      "path": "../L0/box-dimensions.html",
+      "viewport": { "width": 600, "height": 522 }
+    }
+  ]
+}
+```
+
+- `defaults` — applied to every fixture. Each fixture entry can override any field.
+- `fixtures[].path` and every `extra_css[]` path resolve **relative to the suite file**.
+- `viewport.height` must match cg's cull height for the diff to succeed; render cg once and read `WxH` to calibrate.
+- `gate.threshold` / `gate.aa` are inputs to the pixelmatch diff; `gate.floor` is the aggregate pass bar on similarity.
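+
+The resolution rules above can be sketched as a small function. This
+is an illustration only, not the code used by `refbrowser_render.ts`
+or `golden_htmlcss`: `resolveSuite` is a hypothetical name, and it
+assumes that an override replaces the whole default field (a shallow
+merge) and that suite paths use forward slashes.
+
+```typescript
+import * as path from "node:path";
+
+interface Viewport {
+  width: number;
+  height: number;
+}
+
+interface FixtureConfig {
+  wait_for?: string[];
+  extra_css?: string[];
+  full_page?: boolean;
+  viewport?: Viewport;
+}
+
+interface Suite {
+  defaults?: FixtureConfig;
+  fixtures: (FixtureConfig & { path: string })[];
+}
+
+// Hypothetical helper: apply `defaults`, then resolve every path against
+// the directory that contains the suite file.
+function resolveSuite(suiteFile: string, suite: Suite) {
+  const base = path.posix.dirname(suiteFile);
+  return suite.fixtures.map((entry) => {
+    // Assumption: a field set on the fixture entry replaces the default.
+    const merged = { ...suite.defaults, ...entry };
+    return {
+      ...merged,
+      path: path.posix.join(base, entry.path),
+      extra_css: (merged.extra_css ?? []).map((css) => path.posix.join(base, css)),
+    };
+  });
+}
+```
+
+With the suite file at `fixtures/test-html/suites/L0.exact.json`, the
+entry `../L0/box-dimensions.html` resolves to
+`fixtures/test-html/L0/box-dimensions.html`.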
+
+#### The three-step pipeline
+
+**1. Render expecteds (browser oracle)**
+
+```sh
+# one-time: install Chromium for Playwright
+pnpm --filter @grida/reftest exec playwright install chromium
+
+# render the whole suite
+pnpm --filter @grida/reftest exec tsx \
+  .agents/skills/cg-reftest/scripts/refbrowser_render.ts \
+  --suite   fixtures/test-html/suites/L0.exact.json \
+  --out-dir target/refbrowser/L0.exact/expected
+```
+
+Ad-hoc single-file render (no suite, defaults only) — useful while authoring a fixture:
+
+```sh
+pnpm --filter @grida/reftest exec tsx \
+  .agents/skills/cg-reftest/scripts/refbrowser_render.ts \
+  --fixture fixtures/test-html/L0/paint-background-solid.html \
+  --out-dir /tmp/refbrowser-verify
+```
+
+**2. Render actuals (our pipeline)** — the `golden_htmlcss` example
+reads the same suite JSON, resolves `extra_css` relative to the suite
+file, and applies each stylesheet via
+`htmlcss::with_extra_stylesheets` before rendering, so the cascade is
+symmetric with Chromium.
+
+```sh
+cargo run -p cg --example golden_htmlcss -- \
+  --suite fixtures/test-html/suites/L0.exact.json
+
+mkdir -p target/refbrowser/L0.exact/actual
+cp "${TMPDIR:-/tmp}/grida-htmlcss-goldens/"*.png target/refbrowser/L0.exact/actual/
+```
+
+**3. Diff via `@grida/reftest`** — format-agnostic, same bucket layout
+and `report.json` schema as the Rust and refig runners.
+
+Default refbrowser diff: **`--threshold 0`** (pixelmatch strictest,
+AA off). Check each fixture's similarity against the suite's
+`gate.floor`; for `L0.exact`, that's `1.0` (100.00% byte-exact).
+
+```sh
+pnpm --filter @grida/reftest exec reftest \
+  --actual-dir   target/refbrowser/L0.exact/actual \
+  --expected-dir target/refbrowser/L0.exact/expected \
+  --output-dir   target/reftests/L0.exact \
+  --bg white \
+  --threshold 0
+```
+
+> **Gate enforcement is not yet wired into the CLI.** Today, read
+> `report.json` and assert every
+> `tests[].similarity_score ≥ gate.floor` in a wrapper script or CI
+> step. A `--suite` flag on `@grida/reftest` that does this
+> automatically is a pending follow-up.
+
+Output: `S99/S95/S90/S75/err/` bucket directories + `report.json`.
+Pass bar: the suite's `gate.floor`. For `L0.exact`, anything below
+100.00% is a real divergence from Blink (rounding policy, layout
+math, AA emission, etc.) — not noise. See "Reading the score" below.
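+
+Until the CLI enforces the gate, the manual check can be sketched as a
+small wrapper. This is a minimal sketch, not the project's tooling: it
+assumes `report.json` parses to an object with a `tests[]` array whose
+entries carry a numeric `similarity_score`; the `name` field and the
+`failingTests` / `gatePasses` helpers are hypothetical.
+
+```typescript
+// Assumed shape: only `tests[].similarity_score` is named in this doc;
+// `name` is a hypothetical label used for error messages.
+type ReportTest = { name?: string; similarity_score: number };
+type Report = { tests: ReportTest[] };
+
+// Tests whose similarity is below the floor. The negated comparison
+// also treats NaN scores as failures.
+function failingTests(report: Report, floor: number): ReportTest[] {
+  return report.tests.filter((t) => !(t.similarity_score >= floor));
+}
+
+// Logs each failure and returns true only when every test passes.
+function gatePasses(report: Report, floor: number): boolean {
+  const failed = failingTests(report, floor);
+  for (const t of failed) {
+    console.error(
+      `below floor ${floor}: ${t.name ?? "(unnamed)"} = ${t.similarity_score}`
+    );
+  }
+  return failed.length === 0;
+}
+```
+
+A CI step would parse `report.json` and the suite file, call
+`gatePasses(report, suite.gate.floor)`, and exit non-zero on `false`.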
+
+### Reading the score — do not trust it naively
+
+The similarity score is `1 - diff_pixels / scoring_pixels`, where
+`scoring_pixels ≈ width × height` of the screenshot. **The denominator
+is the whole canvas, not the subject under test.**
+
+This has two consequences you must internalize before reading any
+report:
+
+1. **Background dominates the score.** A fixture that paints a
+   100×100 subject on a 600×800 canvas is ~98% background. A renderer
+   that emits _nothing_ for the subject still scores ~98%. A
+   renderer that paints the subject at 50% accuracy scores ~99%.
+   Neither number means what it naively looks like.
+2. **Small fixtures inflate. Full-bleed fixtures are honest.** A
+   card-in-corner composition will always look "good" on the score
+   even when broken; a composition that fills the viewport gives
+   numeric feedback proportional to real error.
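+
+The arithmetic behind both points can be checked directly. A minimal
+sketch, assuming the score is exactly
+`1 - diff_pixels / (width × height)` as described above; the
+`similarity` helper is hypothetical and not part of `@grida/reftest`.
+
+```typescript
+// Whole-canvas similarity: the denominator is every pixel in the
+// screenshot, not just the subject.
+function similarity(diffPixels: number, width: number, height: number): number {
+  return 1 - diffPixels / (width * height);
+}
+
+// 100×100 subject on a 600×800 canvas, subject missing entirely.
+const sparseMissing = similarity(100 * 100, 600, 800);
+// Same subject, half of its pixels wrong.
+const sparseHalf = similarity((100 * 100) / 2, 600, 800);
+// Full-bleed subject, half of its pixels wrong.
+const fullBleedHalf = similarity((600 * 800) / 2, 600, 800);
+
+// prints "0.9792 0.9896 0.5000"
+console.log(
+  sparseMissing.toFixed(4),
+  sparseHalf.toFixed(4),
+  fullBleedHalf.toFixed(4)
+);
+```
+
+The sparse fixture stays near 0.98 even with the subject missing,
+while the full-bleed fixture drops to 0.5 for the same proportional
+error.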
+
+**Fixture-authoring rule:** size the fixture so the subject under
+test fills as much of the canvas as practical. Viewport height
+tuned to the subject's bounding box (via the suite entry's
+`viewport.height`) is the usual lever. Padding/margins around the
+subject are scoring dead weight — use them only when the test is
+_about_ spacing.
+
+**Reviewing rule:** never report a similarity number without
+eyeballing the diff PNG. A 96% score on a sparse fixture and a 96%
+score on a full-bleed fixture are _orders of magnitude_ apart in
+severity. The diff image is the source of truth; the score is a
+coarse index.
+
+For a true "fraction of the subject that matches," author a
+probe-friendly fixture (see the probe test section) and assert on
+specific pixels, or mask the background to transparent so
+`mask: alpha` counts only subject pixels. Plain refbrowser scores
+cannot give you that signal.
+
+**Per-fixture fields inside a suite entry** — all optional,
+defaults shown; any field set on an entry overrides `defaults`.
+
+```json
+{
+  "path": "../L0/<name>.html",
+  "viewport": { "width": 600, "height": 800 },
+  "wait_for": ["fonts", "networkidle"],
+  "extra_css": [],
+  "full_page": true
+}
+```
+
+- `viewport` — Chromium viewport (px). Set height to match cg's cull
+  height; mismatched dims score 0.0 at diff time (`@grida/reftest`
+  requires identical dimensions).
+- `wait_for` — `"fonts"` awaits `document.fonts.ready`, `"networkidle"`
+  awaits 500ms of no-network-activity.
+- `extra_css` — CSS files to inject into **both** sides. Paths resolve
+  relative to the suite file. Playwright applies them via `addStyleTag`;
+  cg applies them via `htmlcss::with_extra_stylesheets` before rendering,
+  so the cascade is symmetric. Fields only meaningful to Chromium
+  (`wait_for`, `full_page`) are ignored by cg; `viewport` is read by
+  both sides.
+- `full_page` — capture full scrollable area (default) vs. viewport.
+
+**Pre-built helper stylesheets** under `fixtures/test-html/_reftest/`:
+
+| File            | Effect                                                                                                                         |
+| --------------- | ------------------------------------------------------------------------------------------------------------------------------ |
+| `hide-text.css` | `color: transparent` + `line-height: 1`. Zeros glyph coverage and pins line-box height. Use when a fixture isn't testing text. |
+
+Add more helpers here as divergence patterns emerge. Keep each one
+scoped to a single concern (hide text, normalize scrollbars, force
+web fonts, etc.) so suites can compose them.
+
+**When to reach for `hide-text.css`** — any fixture whose subject is
+paint, layout, box model, flex, grid, or positioning. The text in
+those fixtures is typically decorative labels; its glyph rendering
+and `line-height: normal` metrics diverge between Blink and Skia and
+will dominate the diff otherwise.
+
+**When NOT to use it** — fixtures whose subject IS text:
+`text-decoration`, `text-shadow`, `text-align`, bidi, `writing-mode`,
+font features. For these, leave `extra_css` empty and accept a
+below-100 score; the reftest's value there is human review of the
+diff image, not the numeric score.
+
+**Authoring workflow** for a new fixture:
+
+1. Write the `.html` fixture under `fixtures/test-html/L0/`.
+2. Add an entry to `suites/L0.coverage.json` with at least
+   `{ "path": "../L0/<name>.html" }`.
+3. Render it once via `--suite L0.coverage.json` on the cg side; note
+   the reported `WxH` in the log.
+4. Set `viewport.height = H` on the entry; add `extra_css` helpers if
+   relevant (e.g. `hide-text.css` for non-text fixtures). `defaults`
+   in the suite likely already cover the common case.
+5. Run the refbrowser producer + diff against the same suite. Review
+   the diff PNG — if the diff is dominated by a known divergence zone
+   (see below), record it in the PR description, don't suppress it.
+6. If the fixture reaches 100.00% byte-exact, move its entry from
+   `L0.coverage.json` to `L0.exact.json`.
+
+**Known divergence surfaces** — areas where cg is not yet Blink-exact.
+These are **backlog items, not tolerance excuses**. Do not tune
+thresholds to suppress them. Document the specific divergence in the
+PR description; let the score carry the truth.
+
+- **Alpha compositing rounding** — `rgba()` backgrounds, `opacity`.
+  cg and Blink choose different rounding rules (half-up vs banker's,
+  premul vs straight, operand order), producing 1-unit channel
+  deltas. Small-delta territory but still a real policy divergence.
+- **Layout math under non-uniform padding / intrinsic sizing** —
+  block widths resolving 1-3 px off when computed through flex
+  children, asymmetric padding, or `width: auto` on transparent
+  content. Shows up as diff brackets at box edges.
+- **Text** — glyph rasterization (shaper version, subpixel positioning,
+  hinting) and line-box metrics (`line-height: normal` ascent/descent)
+  diverge. For non-text fixtures inject `hide-text.css`. For
+  text-subject fixtures, accept a below-100 score and rely on the
+  diff image for review.
+- **Antialiasing on curves** — rounded corners, circles, ellipses,
+  stroke ends. cg's path flattener emits different coverage values
+  than Blink's for the same geometry.
+- **Percentage border-radius** — `border-radius: 50%` and the `H / V`
+  two-value form currently render as square in cg. Fixed-length radii
+  (`12px`, `9999px`) work.
+- **Gradients** — linear, radial, conic, repeating. Color-stop
+  interpolation and color-space handling differ; banding and
+  transition boundaries don't match.
+- **Filters and shadows** — `filter: blur`, `backdrop-filter`,
+  `box-shadow` with large blur radii. Kernel and sampling divergence
+  dominates scores.
+- **`<img>` fallbacks** — our `ImageProvider` renders a placeholder
+  rect; Chromium renders broken-image chrome. Prefer fixtures with
+  real image fills or none.
+- **System-font fallback** — bundle fonts with `@font-face` + local
+  paths when the fixture specifically tests font rendering.
+- **Scrollbar width** — default `full_page: true` captures document
+  height and sidesteps scrollbar chrome; flip only when testing
+  scrollbar geometry.
+- **Dimension drift** — changing a fixture's layout invalidates its
+  `viewport.height` in the suite entry. Re-run `golden_htmlcss` with
+  `--suite`, update the entry's `viewport.height`, re-run refbrowser.
+
 **Oracle type summary:**
 
 | Input format            | Oracle source            | Test type   |
@@ -274,8 +570,135 @@ is the threshold where the renderer needs attention.
 | SVG (arbitrary, no PNG) | resvg-rendered PNG       | Reftest     |
 | SVG (Grida extensions)  | Our own prior output     | Golden test |
 | Figma REST / .fig       | Figma-exported PNG       | Reftest     |
+| HTML / CSS (embed)      | Playwright Chromium PNG  | Reftest     |
 | `.grida` native         | Our own prior output     | Golden test |
 
+---
+
+## Heuristic techniques (future work)
+
+Two techniques that scale reftesting beyond "fixture in, score out."
+Both are format-agnostic — they apply anywhere we control the oracle
+pipeline (refbrowser, refsvg-via-resvg), and both are **unimplemented
+today**. They're documented here so the design is shared before
+anyone starts building.
+
+### Subtree bisection — diff attribution
+
+> Aliases: _diff attribution_, _culprit isolation_. Delta debugging
+> applied to rendering.
+>
+> **TODO — tooling not ready.** Manual application only today.
+
+A reftest gives you a single similarity score and a diff PNG. For a
+minimal fixture that's enough — you eyeball the diff and the culprit
+is obvious. As fixtures scale (multi-element compositions, nested
+layout, overlapping subtrees), you know _that_ there's a divergence
+but not _which_ element owns it.
+
+**The technique** narrows "something in this fixture diverges" to
+"this specific element's rendering is wrong," in two modes:
+
+1. **Region → element (fast path).** Extract the bbox of high-delta
+   regions from the diff PNG (connected-components or simple
+   threshold pass). Match each bbox against element bounds in the
+   fixture — confidently possible when elements are absolutely
+   positioned or when the layout tree has dumped bounds available.
+   One-shot lookup; names the culprit directly.
+
+2. **Isolation bisection (slow path).** When region→element is
+   ambiguous (overlapping elements, pure flow layout), generate
+   temporary scoped-down fixtures by injecting override CSS that
+   hides all siblings / cousins of a candidate subtree
+   (`display: none` on the rest, or `visibility: hidden` if
+   layout must be preserved). Re-run the reftest on each isolated
+   view. Iterate through the element tree to produce per-subtree
+   scores and converge on the offending node.
+
+The two-path split matters because mode (1) is O(1) in reftest runs
+and mode (2) is O(log n) at best — prefer (1) whenever bbox→element
+is unambiguous.
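+
+The fast path could look roughly like the following. This is a sketch
+under stated assumptions, since no such tooling exists yet: the diff
+is assumed to be available as a row-major, one-byte-per-pixel delta
+map, element bounds are assumed to come from a layout dump, and it
+computes a single bounding box rather than connected components. All
+names (`highDeltaBBox`, `candidates`, `ElementBounds`) are
+hypothetical.
+
+```typescript
+type Rect = { x: number; y: number; width: number; height: number };
+type ElementBounds = { selector: string; bounds: Rect };
+
+// Bounding box of every pixel whose delta exceeds `threshold`,
+// or null when no pixel does.
+function highDeltaBBox(
+  delta: Uint8Array,
+  width: number,
+  height: number,
+  threshold: number
+): Rect | null {
+  let minX = width;
+  let minY = height;
+  let maxX = -1;
+  let maxY = -1;
+  for (let y = 0; y < height; y++) {
+    for (let x = 0; x < width; x++) {
+      if (delta[y * width + x] > threshold) {
+        if (x < minX) minX = x;
+        if (x > maxX) maxX = x;
+        if (y < minY) minY = y;
+        if (y > maxY) maxY = y;
+      }
+    }
+  }
+  if (maxX < 0) return null;
+  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
+}
+
+// True when the two rectangles share at least one pixel.
+function intersects(a: Rect, b: Rect): boolean {
+  return (
+    a.x < b.x + b.width &&
+    b.x < a.x + a.width &&
+    a.y < b.y + b.height &&
+    b.y < a.y + a.height
+  );
+}
+
+// Selectors whose bounds overlap the diff region. Exactly one hit
+// names the culprit; zero or several means falling back to isolation.
+function candidates(bbox: Rect, elements: ElementBounds[]): string[] {
+  return elements
+    .filter((e) => intersects(bbox, e.bounds))
+    .map((e) => e.selector);
+}
+```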
+
+**Applicability.**
+
+| Reftest        | Oracle controllable? | Subtree bisection viable? |
+| -------------- | -------------------- | ------------------------- |
+| refbrowser     | Yes (Playwright)     | ✅ Yes                    |
+| refsvg (resvg) | Yes (local CLI)      | ✅ Yes                    |
+| W3C SVG suite  | No (pre-baked PNG)   | ❌ No                     |
+| refig (Figma)  | No (manual export)   | ❌ No                     |
+
+Figma is explicitly out: isolating a subtree would require
+re-exporting from the Figma app, which is an upstream human step.
+
+**Tooling shape (when built).** A script that:
+
+1. Reads a reftest's diff PNG.
+2. Extracts high-delta bounding boxes.
+3. Attempts region→element match against a parsed fixture tree.
+4. On ambiguity, writes override CSS for each candidate subtree,
+   re-runs the producer + diff, accumulates per-subtree scores.
+5. Outputs a JSON report keyed by element selector, with a score
+   and a small preview diff per subtree.
+
+Not unique to htmlcss — the same pattern works for any tree-structured
+oracle with controllable input (SVG `<g>` subtrees, scene graph nodes
+in .grida, etc.).
+
+### Viewport sweep — width-matrix for layout fixtures
+
+> Aliases: _width sweep_, _responsive sweep_, _width matrix_.
+>
+> **TODO — tooling not ready.** Single-width runs only today.
+
+A single-viewport reftest catches a layout bug at that one width. It
+misses bugs that only manifest at a different width — which for CSS
+layout is most bugs (flex basis resolution, wrap points, grid
+`auto-fill`, `min-content` / `max-content` interaction,
+percentage-sized children against unusual parent widths).
+
+**The technique.** Render the same layout fixture at a list of
+viewport widths and diff each independently. A typical sweep:
+
+```text
+widths: [320, 600, 768, 1024, 1280]  // mobile → desktop span
+```
+
+Produces N PNG pairs per fixture and N similarity scores. A fixture
+passes only if _every_ width passes.
+
+**Why width, not height.** CSS content flows vertically as a function
+of the containing block's width; height is mostly an output, not an
+input. Width variance exercises most layout regimes. Height variance
+is relevant only for `min-height`/`max-height`/vh-based cases, which
+are narrower and better covered by dedicated single-width fixtures.
+
+**Applicability.** Layout-category fixtures only. Paint fixtures
+(color, opacity, shadow, gradient, border-radius) render a fixed-size
+subject inside a fixed canvas — sweeping widths adds no signal and
+just multiplies work.
+
+**Tooling shape (when built).** Suite schema grows a `widths` array
+on layout entries:
+
+```json
+{
+  "path": "../L0/box-dimensions.html",
+  "widths": [320, 600, 1024]
+}
+```
+
+Producers loop over `widths`, emitting PNGs named
+`<stem>@<width>.png`. `@grida/reftest` treats each as a separate
+test. No per-width `viewport.height` — let each width produce its
+natural cull height (the measurement _is_ the output).
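+
+The producer-side expansion could be sketched as below. This is an
+illustration of the proposed schema only, since `widths` is not
+implemented; `SweepEntry`, `RenderJob`, and `expandWidths` are
+hypothetical names, and the 600 px fallback mirrors the single-width
+default used elsewhere in this document.
+
+```typescript
+type SweepEntry = { path: string; widths?: number[] };
+type RenderJob = { path: string; width: number; outFile: string };
+
+// Assumed fallback when an entry has no `widths` array.
+const FALLBACK_WIDTH = 600;
+
+// Expands one suite entry into one render job per width, naming the
+// output `<stem>@<width>.png`. Entries without `widths` keep the
+// plain `<stem>.png` name so single-width suites are unaffected.
+function expandWidths(entry: SweepEntry): RenderJob[] {
+  const base = entry.path.split("/").pop() ?? entry.path;
+  const stem = base.replace(/\.html?$/i, "");
+  if (!entry.widths || entry.widths.length === 0) {
+    return [{ path: entry.path, width: FALLBACK_WIDTH, outFile: `${stem}.png` }];
+  }
+  return entry.widths.map((width) => ({
+    path: entry.path,
+    width,
+    outFile: `${stem}@${width}.png`,
+  }));
+}
+```
+
+Each job leaves the height unset on purpose, so every width yields its
+own natural cull height.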
+
+This technique is also format-agnostic — responsive SVG, responsive
+refbrowser, and responsive .grida scenes all benefit from the same
+width-sweep harness.
+
+---
+
 ### Golden tests — native/proprietary/internal formats
 
 Use when **no external truth exists**:
@@ -582,6 +1005,38 @@ pnpm --filter @grida/reftest exec reftest \
 In a PR: _"refig(refig-standard): auto-layout row spacing fix, average
 similarity 0.81 → 0.94, 612 tests S75→S95."_
 
+### True reftest — HTML/CSS refbrowser against Playwright Chromium
+
+```bash
+# Pre-requisite: Chromium installed for Playwright
+pnpm --filter @grida/reftest exec playwright install chromium
+
+# 1. Render expecteds via Playwright Chromium
+pnpm --filter @grida/reftest exec tsx .agents/skills/cg-reftest/scripts/refbrowser_render.ts \
+  --suite   fixtures/test-html/suites/L0.exact.json \
+  --out-dir target/refbrowser/expected
+
+# 2. Render actuals via our cg pipeline
+cargo run -p cg --example golden_htmlcss -- \
+  --suite fixtures/test-html/suites/L0.exact.json
+mkdir -p target/refbrowser/actual
+cp "${TMPDIR:-/tmp}/grida-htmlcss-goldens/"*.png target/refbrowser/actual/
+
+# 3. Diff actuals against Chromium oracle, write bucketed report
+pnpm --filter @grida/reftest exec reftest \
+  --actual-dir   target/refbrowser/actual \
+  --expected-dir target/refbrowser/expected \
+  --output-dir   target/reftests/htmlcss \
+  --bg white \
+  --threshold 0
+
+# Result: target/reftests/htmlcss/report.json + S99/S95/S90/S75/err/ buckets.
+# A score < 1.0 means our htmlcss renderer diverges from Chromium.
+# This is a genuine reftest — Playwright Chromium is the oracle.
+```
+
+In a PR: _"refbrowser(htmlcss): background-repeat space/round landed,
+average similarity 0.72 → 0.91 across 6 repeat fixtures."_
+
 ### Golden/regression test — custom effect
 
 ```bash
diff --git a/.agents/skills/cg-reftest/scripts/refbrowser_render.ts b/.agents/skills/cg-reftest/scripts/refbrowser_render.ts
new file mode 100644
index 000000000..46b88178e
--- /dev/null
+++ b/.agents/skills/cg-reftest/scripts/refbrowser_render.ts
@@ -0,0 +1,314 @@
+#!/usr/bin/env -S pnpm dlx tsx
+/**
+ * refbrowser_render.ts — headless Chromium oracle for HTML/CSS reftests.
+ *
+ * Renders each fixture in a suite through Playwright's Chromium and
+ * writes a PNG per fixture to `--out-dir`. The output is the reference
+ * oracle for cg's htmlcss renderer — Chromium is the ground truth.
+ *
+ *  ┌────────────────┐   ┌─────────────────────┐   ┌──────────────────┐
+ *  │ <suite>.json   │ → │ Playwright Chromium │ → │ expected/<n>.png │
+ *  │ + helper CSS   │   │ (full-page screen)  │   │                  │
+ *  └────────────────┘   └─────────────────────┘   └──────────────────┘
+ *
+ * Pair with `cargo run -p cg --example golden_htmlcss --suite` on the
+ * actual side, then diff via `@grida/reftest`.
+ *
+ * ## Usage
+ *
+ * ```sh
+ * pnpm --filter @grida/reftest exec tsx \
+ *   .agents/skills/cg-reftest/scripts/refbrowser_render.ts \
+ *   --suite   fixtures/test-html/suites/L0.exact.json \
+ *   --out-dir target/refbrowser/L0.exact/expected
+ * ```
+ *
+ * Ad-hoc single-file render (no suite, defaults only):
+ *
+ * ```sh
+ * pnpm --filter @grida/reftest exec tsx \
+ *   .agents/skills/cg-reftest/scripts/refbrowser_render.ts \
+ *   --fixture fixtures/test-html/L0/paint-background-solid.html \
+ *   --out-dir /tmp/refbrowser-verify
+ * ```
+ *
+ * ## Dependencies
+ *
+ * - Node 20+.
+ * - `@playwright/test` (devDependency of `@grida/reftest`).
+ * - Chromium binary (one-time):
+ *   `pnpm --filter @grida/reftest exec playwright install chromium`
+ *
+ * ## Suite JSON shape
+ *
+ * ```json
+ * {
+ *   "name": "L0.exact",
+ *   "gate":     { "threshold": 0, "aa": false, "floor": 1.0 },
+ *   "defaults": {
+ *     "viewport":  { "width": 600, "height": 800 },
+ *     "wait_for":  ["fonts", "networkidle"],
+ *     "extra_css": ["../_reftest/hide-text.css"],
+ *     "full_page": true
+ *   },
+ *   "fixtures": [
+ *     { "path": "../L0/box-dimensions.html",
+ *       "viewport": { "width": 600, "height": 522 } }
+ *   ]
+ * }
+ * ```
+ *
+ * Per-fixture entries inherit and override `defaults` field-by-field.
+ * All paths (`fixtures[].path`, `extra_css[]`) resolve **relative to
+ * the suite file**. `gate` is consumed by the diff step, not here.
+ *
+ * ## Caveats
+ *
+ * - `document.fonts.ready` waits for `<link>`/inline `@font-face` loads;
+ *   system-font fallbacks still differ from Skia. Inject
+ *   `_reftest/hide-text.css` for non-text fixtures.
+ * - Each fixture is rendered in a fresh incognito context.
+ */
+import { promises as fs } from "node:fs";
+import * as path from "node:path";
+import { fileURLToPath, pathToFileURL } from "node:url";
+// `@playwright/test` re-exports `chromium` from `playwright-core` and is
+// the package actually installed in this repo (via `editor`). Using it
+// here avoids depending on a separate `playwright` package.
+import { chromium, type Browser, type BrowserContext } from "@playwright/test";
+
+type FixtureConfig = {
+  viewport?: { width?: number; height?: number };
+  wait_for?: Array<"fonts" | "networkidle">;
+  extra_css?: string[];
+  full_page?: boolean;
+};
+
+type FixtureEntry = FixtureConfig & { path: string };
+
+type SuiteFile = {
+  name?: string;
+  description?: string;
+  gate?: unknown; // consumed by the diff step, not here
+  defaults?: FixtureConfig;
+  fixtures: FixtureEntry[];
+};
+
+type ResolvedConfig = {
+  viewport: { width: number; height: number };
+  wait_for: Array<"fonts" | "networkidle">;
+  extra_css: string[];
+  full_page: boolean;
+};
+
+const DEFAULTS: ResolvedConfig = {
+  viewport: { width: 600, height: 800 },
+  wait_for: ["fonts", "networkidle"],
+  extra_css: [],
+  full_page: true,
+};
+
+function mergeConfig(
+  defaults: FixtureConfig | undefined,
+  entry: FixtureConfig
+): ResolvedConfig {
+  const pick = <K extends keyof ResolvedConfig>(key: K): ResolvedConfig[K] => {
+    const a = entry[key] as ResolvedConfig[K] | undefined;
+    const b = defaults?.[key] as ResolvedConfig[K] | undefined;
+    return (a ?? b ?? DEFAULTS[key]) as ResolvedConfig[K];
+  };
+  const vp = entry.viewport ?? defaults?.viewport ?? DEFAULTS.viewport;
+  return {
+    viewport: {
+      width: vp?.width ?? DEFAULTS.viewport.width,
+      height: vp?.height ?? DEFAULTS.viewport.height,
+    },
+    wait_for: pick("wait_for"),
+    extra_css: pick("extra_css"),
+    full_page: pick("full_page"),
+  };
+}
+
+type Resolved = {
+  htmlPath: string;
+  stem: string;
+  config: ResolvedConfig;
+};
+
+async function resolveSuite(suitePath: string): Promise<Resolved[]> {
+  const raw = await fs.readFile(suitePath, "utf8");
+  const suite = JSON.parse(raw) as SuiteFile;
+  if (!Array.isArray(suite.fixtures)) {
+    throw new Error(`suite ${suitePath}: missing fixtures[]`);
+  }
+  const suiteDir = path.dirname(path.resolve(suitePath));
+  return suite.fixtures.map((entry) => {
+    const htmlPath = path.resolve(suiteDir, entry.path);
+    const merged = mergeConfig(suite.defaults, entry);
+    // Resolve extra_css paths relative to the suite file.
+    const extra_css = merged.extra_css.map((rel) =>
+      path.resolve(suiteDir, rel)
+    );
+    const stem = path.basename(entry.path).replace(/\.html?$/i, "");
+    return { htmlPath, stem, config: { ...merged, extra_css } };
+  });
+}
+
+async function loadCssCached(
+  cache: Map<string, string>,
+  abs: string
+): Promise<string | null> {
+  const hit = cache.get(abs);
+  if (hit !== undefined) return hit;
+  try {
+    const content = await fs.readFile(abs, "utf8");
+    cache.set(abs, content);
+    return content;
+  } catch (e) {
+    console.error(`  warn: failed to read ${abs}: ${(e as Error).message}`);
+    return null;
+  }
+}
+
+async function renderOne(
+  ctx: BrowserContext,
+  r: Resolved,
+  outDir: string,
+  cssCache: Map<string, string>
+): Promise<{ file: string; cssCount: number }> {
+  const { htmlPath, stem, config } = r;
+
+  const page = await ctx.newPage();
+  await page.setViewportSize(config.viewport);
+
+  // `file://` URL so relative resources resolve from the fixture's dir.
+  // `pathToFileURL` handles Windows drive letters and percent-encodes
+  // spaces/non-ASCII, which plain string concatenation does not.
+  const fileUrl = pathToFileURL(path.resolve(htmlPath)).href;
+  await page.goto(fileUrl, { waitUntil: "load" });
+
+  if (config.wait_for.includes("networkidle")) {
+    await page.waitForLoadState("networkidle");
+  }
+  if (config.wait_for.includes("fonts")) {
+    // Await inside the page context; `document.fonts.ready` resolves
+    // to a `FontFaceSet` which Playwright cannot serialize across the
+    // boundary. Return void so the wait is effective and typed.
+    await page.evaluate(async () => {
+      await document.fonts.ready;
+    });
+  }
+
+  let cssCount = 0;
+  for (const abs of config.extra_css) {
+    const content = await loadCssCached(cssCache, abs);
+    if (content !== null) {
+      await page.addStyleTag({ content });
+      cssCount++;
+    }
+  }
+
+  const outPath = path.join(outDir, `${stem}.png`);
+  await fs.mkdir(outDir, { recursive: true });
+
+  await page.screenshot({
+    path: outPath,
+    fullPage: config.full_page,
+    animations: "disabled",
+    caret: "hide",
+  });
+
+  await page.close();
+  return { file: outPath, cssCount };
+}
+
+function parseArgs(argv: string[]): {
+  suite?: string;
+  fixture?: string;
+  outDir: string;
+} {
+  const args: Record<string, string> = {};
+  for (let i = 0; i < argv.length; i += 2) {
+    const key = argv[i]?.replace(/^--/, "");
+    const val = argv[i + 1];
+    if (key && val) args[key] = val;
+  }
+  if (!args["out-dir"]) {
+    throw new Error("--out-dir is required");
+  }
+  if (!args["suite"] && !args["fixture"]) {
+    throw new Error("must pass --suite <path> or --fixture <html>");
+  }
+  return {
+    suite: args["suite"],
+    fixture: args["fixture"],
+    outDir: path.resolve(args["out-dir"]),
+  };
+}
+
+async function main() {
+  const args = parseArgs(process.argv.slice(2));
+  let resolved: Resolved[];
+  if (args.suite) {
+    resolved = await resolveSuite(args.suite);
+    console.log(
+      `refbrowser: rendering ${resolved.length} fixture(s) from ${args.suite}`
+    );
+  } else {
+    const htmlPath = path.resolve(args.fixture!);
+    const stem = path.basename(htmlPath).replace(/\.html?$/i, "");
+    resolved = [{ htmlPath, stem, config: DEFAULTS }];
+    console.log(`refbrowser: rendering 1 fixture (ad-hoc, defaults only)`);
+  }
+  console.log(`  out-dir: ${args.outDir}`);
+
+  let browser: Browser | null = null;
+  const cssCache = new Map<string, string>();
+  try {
+    browser = await chromium.launch();
+
+    for (const r of resolved) {
+      const rel = path.relative(process.cwd(), r.htmlPath);
+      // Fresh incognito context per fixture — no cookie/storage/SW
+      // leakage between fixtures, so order can't mask real renderer
+      // changes.
+      let ctx: BrowserContext | null = null;
+      try {
+        ctx = await browser.newContext({
+          // Deterministic: force light color-scheme, standard locale/timezone.
+          colorScheme: "light",
+          locale: "en-US",
+          timezoneId: "UTC",
+          reducedMotion: "reduce",
+        });
+        const { file, cssCount } = await renderOne(
+          ctx,
+          r,
+          args.outDir,
+          cssCache
+        );
+        const hint = cssCount > 0 ? ` [+${cssCount} css]` : "";
+        console.log(`  ${rel} → ${file}${hint}`);
+      } catch (e) {
+        console.error(`  ${rel}: FAILED`);
+        console.error(e);
+        process.exitCode = 1;
+      } finally {
+        await ctx?.close();
+      }
+    }
+  } finally {
+    await browser?.close();
+  }
+}
+
+// Only run when invoked directly (not when imported).
+const invoked =
+  process.argv[1] &&
+  fileURLToPath(import.meta.url) === path.resolve(process.argv[1]);
+if (invoked) {
+  main().catch((e) => {
+    console.error(e);
+    process.exit(1);
+  });
+}
diff --git a/.agents/skills/fixtures/SKILL.md b/.agents/skills/fixtures/SKILL.md
index be1fcee83..953c090a5 100644
--- a/.agents/skills/fixtures/SKILL.md
+++ b/.agents/skills/fixtures/SKILL.md
@@ -43,6 +43,16 @@ edge case** that the codebase supports or intends to support. This includes:
   filename alone should tell you what's being tested.
 - **Labeled specimens.** Within a fixture, label each test case with the
   value being exercised so both humans and heuristics can identify regions.
+- **Match the fixture's subject to the viewport policy.** For refbrowser
+  fixtures under `fixtures/test-html/`, **paint / visual-property**
+  fixtures should size their root to a preset viewport (via `min-height`)
+  so cg's cull and Chromium's screenshot have identical dimensions.
+  **Layout** fixtures (box-model, flex, grid, intrinsic sizing) must
+  NOT force a body size — the output dimensions _are_ what the test
+  measures; a `min-height` hack contaminates the result. See
+  [`fixtures/test-html/README.md`](../../../fixtures/test-html/README.md)
+  for the preset list, the paint-vs-layout rule, and the per-fixture
+  `viewport` workflow for layout tests.
 - **Don't duplicate.** Before adding a fixture, check if an existing one
   already covers the behavior. Extend or split rather than duplicate.
 
diff --git a/crates/grida-canvas/examples/golden_htmlcss.rs b/crates/grida-canvas/examples/golden_htmlcss.rs
index 38da7915b..81ada4db3 100644
--- a/crates/grida-canvas/examples/golden_htmlcss.rs
+++ b/crates/grida-canvas/examples/golden_htmlcss.rs
@@ -4,28 +4,127 @@
 /// temporary directory (printed to stderr) so generated images don't
 /// bloat the repository.
 ///
-/// Usage:
+/// ## Usage
+///
+///   cargo run -p cg --example golden_htmlcss -- \
+///     --suite fixtures/test-html/suites/L0.exact.json
+///
 ///   cargo run -p cg --example golden_htmlcss -- [FILE_OR_DIR...]
 ///
-/// If no arguments given, renders built-in test fixtures.
-/// If a directory is given, renders all .html/.htm files in it.
+/// If no arguments are given, renders built-in L0 fixtures.
+/// If FILE_OR_DIR is given, renders ad-hoc (no suite config).
+///
+/// ## Suite JSON shape
+///
+///   {
+///     "defaults": {
+///       "viewport": { "width": 600, "height": 800 },
+///       "extra_css": ["../_reftest/hide-text.css"]
+///     },
+///     "fixtures": [
+///       { "path": "../L0/box-dimensions.html",
+///         "viewport": { "width": 600, "height": 522 } }
+///     ]
+///   }
+///
+/// Per-fixture entries inherit and override `defaults`. All paths
+/// (`fixtures[].path`, `extra_css[]`) resolve **relative to the suite
+/// file**. `gate` and other fields unknown to this tool are ignored.
 use cg::htmlcss;
 use cg::resources::ByteStore;
 use cg::runtime::font_repository::FontRepository;
+use serde::Deserialize;
 use skia_safe::{surfaces, Color};
+use std::collections::HashMap;
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, Mutex};
 
-fn fonts() -> FontRepository {
+fn build_fonts() -> FontRepository {
     let mut repo = FontRepository::new(Arc::new(Mutex::new(ByteStore::new())));
     repo.enable_system_fallback();
     repo
 }
 
-fn render_to_png(html: &str, width: f32, name: &str, out_dir: &Path) {
-    let fonts = fonts();
+#[derive(Debug, Default, Clone, Copy, Deserialize)]
+#[serde(default)]
+struct Viewport {
+    width: Option<f32>,
+    height: Option<f32>,
+}
+
+#[derive(Debug, Default, Deserialize)]
+#[serde(default)]
+struct FixtureConfig {
+    extra_css: Vec<String>,
+    viewport: Viewport,
+}
+
+#[derive(Debug, Deserialize)]
+struct SuiteEntry {
+    path: String,
+    #[serde(default)]
+    extra_css: Option<Vec<String>>,
+    #[serde(default)]
+    viewport: Option<Viewport>,
+}
+
+#[derive(Debug, Default, Deserialize)]
+#[serde(default)]
+struct SuiteFile {
+    defaults: FixtureConfig,
+    fixtures: Vec<SuiteEntry>,
+}
+
+const DEFAULT_WIDTH: f32 = 600.0;
+const DEFAULT_HEIGHT: f32 = 600.0;
+
+/// Resolve a fixture entry against suite defaults. Suite-relative
+/// paths are anchored at `suite_dir`. Viewport width/height inherit
+/// from `defaults` and fall back to the built-in defaults.
+fn resolve_entry(
+    entry: &SuiteEntry,
+    defaults: &FixtureConfig,
+    suite_dir: &Path,
+) -> (PathBuf, Vec<PathBuf>, f32, f32) {
+    let html = suite_dir.join(&entry.path);
+    let css_rel: &[String] = entry.extra_css.as_deref().unwrap_or(&defaults.extra_css);
+    let css_abs: Vec<PathBuf> = css_rel.iter().map(|r| suite_dir.join(r)).collect();
+    let vp = entry.viewport.unwrap_or(defaults.viewport);
+    let width = vp
+        .width
+        .or(defaults.viewport.width)
+        .unwrap_or(DEFAULT_WIDTH);
+    let height = vp
+        .height
+        .or(defaults.viewport.height)
+        .unwrap_or(DEFAULT_HEIGHT);
+    (html, css_abs, width, height)
+}
+
+/// Populate `cache[abs]` if absent. Missing files warn; absent keys
+/// are treated as a no-op at injection time.
+fn ensure_css_cached(cache: &mut HashMap<PathBuf, String>, abs: &Path) {
+    if cache.contains_key(abs) {
+        return;
+    }
+    match std::fs::read_to_string(abs) {
+        Ok(s) => {
+            cache.insert(abs.to_path_buf(), s);
+        }
+        Err(e) => eprintln!("  warn: failed to read {}: {e}", abs.display()),
+    }
+}
+
+fn render_to_png(
+    html: &str,
+    width: f32,
+    height: f32,
+    name: &str,
+    out_dir: &Path,
+    fonts: &FontRepository,
+) {
     let picture =
-        htmlcss::render(html, width, 600.0, &fonts, &htmlcss::NoImages).expect("render failed");
+        htmlcss::render(html, width, height, fonts, &htmlcss::NoImages).expect("render failed");
     let cull = picture.cull_rect();
     let w = cull.width().max(1.0) as i32;
     let h = cull.height().max(1.0) as i32;
@@ -44,42 +143,177 @@ fn render_to_png(html: &str, width: f32, name: &str, out_dir: &Path) {
     eprintln!("  {name}: {w}x{h} → {}", path.display());
 }
 
-fn render_html_file(path: &Path, out_dir: &Path) {
-    let html = std::fs::read_to_string(path).expect("failed to read HTML file");
-    let name = path
+fn render_with_extras(
+    html_path: &Path,
+    extras_abs: &[PathBuf],
+    width: f32,
+    height: f32,
+    out_dir: &Path,
+    fonts: &FontRepository,
+    css_cache: &mut HashMap<PathBuf, String>,
+) {
+    let html = match std::fs::read_to_string(html_path) {
+        Ok(s) => s,
+        Err(e) => {
+            eprintln!("  warn: failed to read {}: {e}", html_path.display());
+            return;
+        }
+    };
+    let name = html_path
         .file_stem()
         .map(|s| s.to_string_lossy().to_string())
         .unwrap_or_else(|| "unknown".to_string());
-    render_to_png(&html, 600.0, &name, out_dir);
+
+    for abs in extras_abs {
+        ensure_css_cached(css_cache, abs);
+    }
+    let extras: Vec<&str> = extras_abs
+        .iter()
+        .filter_map(|p| css_cache.get(p).map(String::as_str))
+        .collect();
+    let html = if extras.is_empty() {
+        html
+    } else {
+        htmlcss::with_extra_stylesheets(&html, &extras)
+    };
+
+    render_to_png(&html, width, height, &name, out_dir, fonts);
+}
+
+fn render_suite(suite_path: &Path, out_dir: &Path, fonts: &FontRepository) {
+    let raw = std::fs::read_to_string(suite_path)
+        .unwrap_or_else(|e| panic!("failed to read {}: {e}", suite_path.display()));
+    let suite: SuiteFile = serde_json::from_str(&raw)
+        .unwrap_or_else(|e| panic!("failed to parse {}: {e}", suite_path.display()));
+    let suite_dir = suite_path.parent().unwrap_or(Path::new("."));
+
+    eprintln!(
+        "Rendering {} fixture(s) from suite {}",
+        suite.fixtures.len(),
+        suite_path.display()
+    );
+    let mut css_cache: HashMap<PathBuf, String> = HashMap::new();
+    for entry in &suite.fixtures {
+        let (html_path, extras_abs, width, height) =
+            resolve_entry(entry, &suite.defaults, suite_dir);
+        render_with_extras(
+            &html_path,
+            &extras_abs,
+            width,
+            height,
+            out_dir,
+            fonts,
+            &mut css_cache,
+        );
+    }
+}
+
+fn render_directory(dir: &Path, out_dir: &Path, fonts: &FontRepository) {
+    let mut entries: Vec<PathBuf> = std::fs::read_dir(dir)
+        .expect("failed to read directory")
+        .filter_map(|e| e.ok().map(|e| e.path()))
+        .filter(|p| {
+            p.extension()
+                .map(|ext| ext == "html" || ext == "htm")
+                .unwrap_or(false)
+        })
+        .collect();
+    entries.sort();
+
+    eprintln!(
+        "Rendering {} HTML files from {}",
+        entries.len(),
+        dir.display()
+    );
+    let mut css_cache: HashMap<PathBuf, String> = HashMap::new();
+    for path in &entries {
+        render_with_extras(
+            path,
+            &[],
+            DEFAULT_WIDTH,
+            DEFAULT_HEIGHT,
+            out_dir,
+            fonts,
+            &mut css_cache,
+        );
+    }
+}
+
+/// Parse `argv` into (`suite_path`, positional args). `--suite P` and
+/// unknown `--flag [value]` tokens are kept out of the positional list.
+fn parse_args(argv: &[String]) -> (Option<String>, Vec<String>) {
+    let mut suite: Option<String> = None;
+    let mut positional: Vec<String> = Vec::new();
+    let mut i = 0;
+    while i < argv.len() {
+        let a = &argv[i];
+        if a == "--suite" {
+            let v = argv
+                .get(i + 1)
+                .unwrap_or_else(|| panic!("--suite requires a path argument"));
+            suite = Some(v.clone());
+            i += 2;
+        } else if a.starts_with("--") {
+            // Unknown long flag. If the next token looks like a value
+            // (doesn't start with `-`) swallow it too, so `--foo bar`
+            // doesn't leak `bar` into the positional stream and get
+            // treated as a file path.
+            match argv.get(i + 1) {
+                Some(next) if !next.starts_with('-') => i += 2,
+                _ => i += 1,
+            }
+        } else {
+            positional.push(a.clone());
+            i += 1;
+        }
+    }
+    (suite, positional)
 }
 
 fn main() {
-    let args: Vec<String> = std::env::args().skip(1).collect();
+    let argv: Vec<String> = std::env::args().skip(1).collect();
+    let (suite, positional) = parse_args(&argv);
 
     // Output to system temp directory
     let out_dir = std::env::temp_dir().join("grida-htmlcss-goldens");
     std::fs::create_dir_all(&out_dir).expect("failed to create output directory");
     eprintln!("Output: {}", out_dir.display());
 
-    if args.is_empty() {
-        // Render built-in test fixtures from fixtures/test-html/L0/
+    let fonts = build_fonts();
+
+    if let Some(suite_path) = suite {
+        render_suite(Path::new(&suite_path), &out_dir, &fonts);
+        eprintln!("Done. Files in: {}", out_dir.display());
+        return;
+    }
+
+    if positional.is_empty() {
         let fixture_dir = PathBuf::from(concat!(
             env!("CARGO_MANIFEST_DIR"),
             "/../../fixtures/test-html/L0"
         ));
         if fixture_dir.is_dir() {
-            render_directory(&fixture_dir, &out_dir);
+            render_directory(&fixture_dir, &out_dir, &fonts);
         } else {
             eprintln!("No fixture directory found at {}", fixture_dir.display());
-            eprintln!("Pass HTML files as arguments instead.");
+            eprintln!("Pass --suite <path> or HTML files as arguments.");
         }
     } else {
-        for arg in &args {
+        let mut css_cache: HashMap<PathBuf, String> = HashMap::new();
+        for arg in &positional {
             let path = PathBuf::from(arg);
             if path.is_dir() {
-                render_directory(&path, &out_dir);
+                render_directory(&path, &out_dir, &fonts);
             } else if path.is_file() {
-                render_html_file(&path, &out_dir);
+                render_with_extras(
+                    &path,
+                    &[],
+                    DEFAULT_WIDTH,
+                    DEFAULT_HEIGHT,
+                    &out_dir,
+                    &fonts,
+                    &mut css_cache,
+                );
             } else {
                 eprintln!("Skipping {}: not a file or directory", path.display());
             }
@@ -88,25 +322,3 @@ fn main() {
 
     eprintln!("Done. Files in: {}", out_dir.display());
 }
-
-fn render_directory(dir: &Path, out_dir: &Path) {
-    let mut entries: Vec<PathBuf> = std::fs::read_dir(dir)
-        .expect("failed to read directory")
-        .filter_map(|e| e.ok().map(|e| e.path()))
-        .filter(|p| {
-            p.extension()
-                .map(|ext| ext == "html" || ext == "htm")
-                .unwrap_or(false)
-        })
-        .collect();
-    entries.sort();
-
-    eprintln!(
-        "Rendering {} HTML files from {}",
-        entries.len(),
-        dir.display()
-    );
-    for path in &entries {
-        render_html_file(path, out_dir);
-    }
-}
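Review note: the viewport precedence in `resolve_entry` is easier to check in isolation. The sketch below mirrors that logic with serde stripped out so it compiles on its own; `resolve_viewport` is a hypothetical helper name used only for illustration, not a function in this diff.

```rust
// Sketch only: mirrors the per-axis precedence used by `resolve_entry`.
// `resolve_viewport` is a hypothetical name; serde derives are omitted.

#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Viewport {
    width: Option<f32>,
    height: Option<f32>,
}

const DEFAULT_WIDTH: f32 = 600.0;
const DEFAULT_HEIGHT: f32 = 600.0;

/// The entry viewport wins per axis; an axis the entry leaves unset
/// inherits from the suite defaults; an axis unset in both places falls
/// back to the built-in constant.
fn resolve_viewport(entry: Option<Viewport>, defaults: Viewport) -> (f32, f32) {
    let vp = entry.unwrap_or(defaults);
    let width = vp.width.or(defaults.width).unwrap_or(DEFAULT_WIDTH);
    let height = vp.height.or(defaults.height).unwrap_or(DEFAULT_HEIGHT);
    (width, height)
}
```

Under these assumptions, an entry that sets only `height` still inherits `width` from the suite defaults, and a suite with no `defaults.viewport` renders at 600 × 600.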
diff --git a/crates/grida-canvas/src/htmlcss/mod.rs b/crates/grida-canvas/src/htmlcss/mod.rs
index 8ce3d9042..a69adc5b1 100644
--- a/crates/grida-canvas/src/htmlcss/mod.rs
+++ b/crates/grida-canvas/src/htmlcss/mod.rs
@@ -128,6 +128,45 @@ impl ImageProvider for PreloadedImages {
     }
 }
 
+/// Inject one or more author stylesheets into an HTML document string.
+///
+/// Concatenates `css_bodies` into a single `<style>` block inserted
+/// before `</head>`, or prepended to the document when no `</head>` is
+/// present. Before `</head>`, the block follows every `<head>` sheet and
+/// wins source-order ties against them; when prepended it comes first, so
+/// mark rules `!important` to override non-important fixture rules.
+///
+/// # Example
+///
+/// ```ignore
+/// let css = std::fs::read_to_string("theme.css")?;
+/// let patched = htmlcss::with_extra_stylesheets(&html, &[css]);
+/// let picture = htmlcss::render(&patched, width, height, &fonts, &images)?;
+/// ```
+pub fn with_extra_stylesheets<S: AsRef<str>>(html: &str, css_bodies: &[S]) -> String {
+    if css_bodies.is_empty() {
+        return html.to_string();
+    }
+    let mut combined = String::new();
+    combined.push_str("<style>\n");
+    for body in css_bodies {
+        combined.push_str(body.as_ref());
+        combined.push('\n');
+    }
+    combined.push_str("</style>");
+
+    // HTML tag names are ASCII, so lowercasing the haystack is safe.
+    // One pass, correct for any casing of `</head>`.
+    if let Some(idx) = html.to_ascii_lowercase().find("</head>") {
+        let mut out = String::with_capacity(html.len() + combined.len());
+        out.push_str(&html[..idx]);
+        out.push_str(&combined);
+        out.push_str(&html[idx..]);
+        return out;
+    }
+    format!("{combined}{html}")
+}
+
 /// Render HTML+CSS to a Skia Picture.
 ///
 /// Images referenced by `<img src>` or `background-image: url()` are
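Review note: a standalone sketch of the injection logic added in `with_extra_stylesheets`, useful for checking the two placement cases and the byte-offset argument. `inject_styles` is a hypothetical name for this copy, and it takes `&[&str]` rather than the generic `AsRef<str>` slice to keep the example short.

```rust
// Sketch only: a self-contained copy of the injection logic for
// experimentation. `inject_styles` is a hypothetical name.
fn inject_styles(html: &str, css_bodies: &[&str]) -> String {
    if css_bodies.is_empty() {
        return html.to_string();
    }
    let mut block = String::from("<style>\n");
    for body in css_bodies {
        block.push_str(body);
        block.push('\n');
    }
    block.push_str("</style>");

    // `to_ascii_lowercase` rewrites only ASCII bytes, so the lowercased
    // copy has the same byte length and char boundaries as `html`; an
    // index found in it is a valid slice point in the original.
    match html.to_ascii_lowercase().find("</head>") {
        Some(idx) => format!("{}{}{}", &html[..idx], block, &html[idx..]),
        // No `</head>`: the block is prepended, so it precedes any
        // `<style>` elements that appear later in the document.
        None => format!("{}{}", block, html),
    }
}
```

The prepend branch is the case where source order no longer favors the injected rules, which is why helper sheets such as `hide-text.css` mark their declarations `!important`.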
diff --git a/fixtures/test-html/L0/paint-background-solid.html b/fixtures/test-html/L0/paint-background-solid.html
index 8cbc4d3ba..e63bd3f7c 100644
--- a/fixtures/test-html/L0/paint-background-solid.html
+++ b/fixtures/test-html/L0/paint-background-solid.html
@@ -4,6 +4,11 @@
     <meta charset="utf-8" />
     <title>Paint: Solid Background</title>
     <style>
+      html,
+      body {
+        min-height: 800px;
+        box-sizing: border-box;
+      }
       body {
         margin: 0;
         padding: 24px;
diff --git a/fixtures/test-html/L0/paint-border-radius.html b/fixtures/test-html/L0/paint-border-radius.html
index 2c765a025..3abdd8bee 100644
--- a/fixtures/test-html/L0/paint-border-radius.html
+++ b/fixtures/test-html/L0/paint-border-radius.html
@@ -4,6 +4,11 @@
     <meta charset="utf-8" />
     <title>Paint: Border Radius</title>
     <style>
+      html,
+      body {
+        min-height: 800px;
+        box-sizing: border-box;
+      }
       body {
         margin: 0;
         padding: 24px;
diff --git a/fixtures/test-html/L0/paint-opacity.html b/fixtures/test-html/L0/paint-opacity.html
index a3ad70e19..e89d7f285 100644
--- a/fixtures/test-html/L0/paint-opacity.html
+++ b/fixtures/test-html/L0/paint-opacity.html
@@ -4,6 +4,11 @@
     <meta charset="utf-8" />
     <title>Paint: Opacity</title>
     <style>
+      html,
+      body {
+        min-height: 800px;
+        box-sizing: border-box;
+      }
       body {
         margin: 0;
         padding: 24px;
diff --git a/fixtures/test-html/README.md b/fixtures/test-html/README.md
new file mode 100644
index 000000000..c99c1371f
--- /dev/null
+++ b/fixtures/test-html/README.md
@@ -0,0 +1,160 @@
+# `fixtures/test-html/`
+
+HTML+CSS fixtures for the `cg` htmlcss renderer and the refbrowser
+reftest pipeline.
+
+## Layout
+
+```
+fixtures/test-html/
+├── L0/                # fixtures (source of truth, one concept per file)
+├── _reftest/          # shared helper stylesheets (hide-text.css, …)
+└── suites/            # suite manifests consumed by both producers
+    ├── L0.exact.json      # must pass 100.00% byte-exact; CI gate
+    └── L0.coverage.json   # aspirational scope; tracks progress
+```
+
+See [`.agents/skills/cg-reftest/SKILL.md`](../../.agents/skills/cg-reftest/SKILL.md)
+for the reftest pipeline and suite schema. See
+[`.agents/skills/fixtures/SKILL.md`](../../.agents/skills/fixtures/SKILL.md)
+for general fixture authoring guidance (naming, one-concept-per-file,
+checked-in vs `fixtures/local/`, etc.). This README only covers the
+viewport convention specific to refbrowser.
+
+## Viewport presets
+
+Fixtures should target a **well-known viewport size** so the PNG
+output of both producers is at a predictable dimension. This
+eliminates the brittle "tune `viewport.height` per fixture to cg's
+cull" dance and makes diffs cleaner (subject fills the canvas,
+background doesn't inflate the score — see "Reading the score" in
+the cg-reftest skill).
+
+Recommended presets, ordered by typical use:
+
+| Preset      | Width × Height | When to use                                                                 |
+| ----------- | -------------- | --------------------------------------------------------------------------- |
+| `canvas-md` | 600 × 800      | **Default.** Paint, layout, box-model, flex, grid, positioning tests.       |
+| `canvas-sm` | 400 × 400      | Minimal single-feature demos (one shape, one swatch, probe-style).          |
+| `mobile`    | 390 × 844      | Responsive/mobile-specific behavior (media queries, vh, env(safe-area-\*)). |
+| `tablet`    | 768 × 1024     | Layout transitions at tablet-width breakpoints.                             |
+| `desktop`   | 1280 × 720     | Wide compositions that don't fit in `canvas-md`.                            |
+
+These are conventions, not enforced types. Pick the smallest preset
+that exercises the behavior — small renders diff faster and fit on
+review screens.
+
+## Paint vs. layout fixtures — two authoring rules
+
+Whether a fixture should force its own size depends on **what the
+fixture is testing**. Get this wrong and your test measures the wrong
+thing.
+
+### Paint / visual-property fixtures — force the preset size
+
+These fixtures test color, opacity, border-radius, shadow, gradient,
+background, transform, or any non-sizing visual property. The canvas
+dimensions are **not** the subject; a known canvas just gives the
+painted pixels room to sit.
+
+**Pattern** — add to every such fixture's `<style>`:
+
+```css
+html,
+body {
+  min-height: 800px; /* match the suite's viewport.height */
+  box-sizing: border-box;
+}
+```
+
+cg then culls to exactly the preset height and Chromium produces the
+same size under `full_page: true`. Both sides match without
+per-fixture `viewport` in the suite.
+
+**Why not `100vh`?** — cg's htmlcss engine currently resolves `vh`
+against an internal viewport that may not match your target (observed:
+100vh → 720px cull). Explicit pixel heights are deterministic across
+both producers.
+
+**Why `box-sizing: border-box`?** — without it, `min-height: 800px` plus
+the fixtures' 24px body padding yields 848px, overflowing the target.
+
+### Layout fixtures — never force a body size
+
+These fixtures test box-model math, padding/margin resolution,
+flex/grid sizing, intrinsic sizing, `width`/`height`/`min-*`/`max-*`
+on elements, positioning, aspect ratio, or anything whose **output
+IS the rendered dimensions**. For these, the body's size IS the
+measurement.
+
+Forcing `min-height: 800px` on a layout fixture contaminates the test:
+you've added a containing block constraint that the fixture's own
+CSS now has to negotiate with. Whatever the fixture was really
+testing becomes entangled with your min-height hack.
+
+**Pattern** — no `html`/`body` min-height. Let the content size
+itself naturally:
+
+```css
+body {
+  margin: 0;
+  padding: 24px;
+  background: #fff;
+  /* ...fixture-specific rules */
+}
+```
+
+The suite entry then carries an explicit `viewport` matching cg's
+natural cull:
+
+```json
+{
+  "path": "../L0/box-dimensions.html",
+  "viewport": { "width": 600, "height": 522 }
+}
+```
+
+To find the right height, render the fixture once with
+`golden_htmlcss --suite` and read the reported `WxH`. Update the
+suite entry's `viewport.height` to match. Re-render refbrowser; both
+sides should now be at identical dimensions.
+
+**Dimension drift** — any change to a layout fixture's internal
+layout changes its natural cull, invalidating `viewport.height` in
+the suite. Re-measure and update.
+
+## Adding a new fixture
+
+1. **Name** — `<domain>-<property>[-<variant>].html`. The filename is
+   the test id.
+2. **Classify** — is this a **paint** fixture (color/border/shadow/
+   opacity/etc.) or a **layout** fixture (sizing/padding/flex/grid)?
+   - **Paint:** add `html, body { min-height: <preset>; box-sizing: border-box; }` so cg cull = preset height.
+   - **Layout:** do **not** set any body `min-height`; let content size itself.
+3. **Keep it minimal** — one property or behavior per file. See the
+   fixtures skill for the full authoring checklist.
+4. **Register it** — add an entry to `suites/L0.coverage.json`:
+   - Paint fixtures: `{ "path": "../L0/<your-file>.html" }` (inherits `defaults.viewport`).
+   - Layout fixtures: run `cargo run -p cg --example golden_htmlcss -- --suite …`, read the reported `WxH`, then
+     ```json
+     { "path": "../L0/<your-file>.html",
+       "viewport": { "width": 600, "height": <natural cull height> } }
+     ```
+5. **Verify** — run both producers against the suite. The cg cull
+   should equal the viewport in the suite, and the refbrowser
+   screenshot should match pixel-for-pixel (or land somewhere in the
+   coverage spectrum, with the diff image showing the real
+   divergence).
+6. **Promote** — once the fixture reaches 100.00% byte-exact parity
+   with Chromium, move its entry from `L0.coverage.json` to
+   `L0.exact.json`.
+
+## Known constraints of the current pipeline
+
+- Per-fixture `.reftest.json` sidecars are **gone**. All per-fixture
+  config lives in the suite manifest.
+- cg htmlcss does not currently resolve `vh` against the refbrowser
+  viewport. Use explicit pixel heights.
+- The refbrowser diff default is `--threshold 0` (pixelmatch
+  strictest). See the cg-reftest skill for rationale and for the list
+  of known Blink-vs-Skia divergence surfaces.
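Review note: the README's `box-sizing: border-box` rationale comes down to simple arithmetic. The sketch below is an illustration under stated assumptions (no border, symmetric vertical padding, content shorter than the min-height); `outer_height` is a hypothetical helper, not part of the renderer.

```rust
/// Outer (border-box) height of a borderless box whose content is
/// shorter than `min_height`, with `padding` applied to top and bottom.
/// Hypothetical helper for illustration only.
fn outer_height(min_height: f32, padding: f32, border_box: bool) -> f32 {
    if border_box {
        // `min-height` constrains the border box, which already includes
        // padding; the box can never be smaller than the padding itself.
        min_height.max(2.0 * padding)
    } else {
        // `min-height` constrains the content box; padding is added on top.
        min_height + 2.0 * padding
    }
}
```

With the fixtures' `padding: 24px` and `min-height: 800px`, the content-box case gives 848px while the border-box case gives exactly 800px, which is the height the suite's default viewport expects.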
diff --git a/fixtures/test-html/_reftest/hide-text.css b/fixtures/test-html/_reftest/hide-text.css
new file mode 100644
index 000000000..f3126117b
--- /dev/null
+++ b/fixtures/test-html/_reftest/hide-text.css
@@ -0,0 +1,23 @@
+/*
+ * hide-text.css — zero glyph coverage, pin line-box height.
+ *
+ * Inject via a suite manifest's `extra_css` on both sides of a refbrowser
+ * reftest for fixtures that aren't testing text rendering:
+ *
+ *   { "extra_css": ["../_reftest/hide-text.css"] }
+ *
+ * `color: transparent` drops glyph fill; `line-height: 1` pins line-box
+ * height to exactly font-size, removing Blink-vs-Skia metric drift.
+ * Layout is otherwise preserved (advance widths, line wrapping, block
+ * flow). Do not use for fixtures whose subject IS text.
+ */
+
+*,
+*::before,
+*::after {
+  color: transparent !important;
+  text-shadow: none !important;
+  -webkit-text-stroke: 0 !important;
+  caret-color: transparent !important;
+  line-height: 1 !important;
+}
diff --git a/fixtures/test-html/suites/L0.coverage.json b/fixtures/test-html/suites/L0.coverage.json
new file mode 100644
index 000000000..688fbce4c
--- /dev/null
+++ b/fixtures/test-html/suites/L0.coverage.json
@@ -0,0 +1,23 @@
+{
+  "name": "L0.coverage",
+  "description": "Aspirational L0 coverage. Fixtures here are tracked for progress; they are NOT required to be at 100%. Promote a fixture to L0.exact once it reaches 100.00% byte-exact parity with Chromium.",
+  "defaults": {
+    "viewport": { "width": 600, "height": 800 },
+    "wait_for": ["fonts", "networkidle"],
+    "extra_css": ["../_reftest/hide-text.css"],
+    "full_page": true
+  },
+  "fixtures": [
+    {
+      "path": "../L0/box-dimensions.html",
+      "viewport": { "width": 600, "height": 522 }
+    },
+    {
+      "path": "../L0/box-padding.html",
+      "viewport": { "width": 600, "height": 222 }
+    },
+    { "path": "../L0/paint-background-solid.html" },
+    { "path": "../L0/paint-opacity.html" },
+    { "path": "../L0/paint-border-radius.html" }
+  ]
+}
diff --git a/fixtures/test-html/suites/L0.exact.json b/fixtures/test-html/suites/L0.exact.json
new file mode 100644
index 000000000..eed77d830
--- /dev/null
+++ b/fixtures/test-html/suites/L0.exact.json
@@ -0,0 +1,21 @@
+{
+  "name": "L0.exact",
+  "description": "Byte-exact L0 fixtures. Every fixture here MUST stay at 100.00% similarity against the Chromium oracle. Any drop = real regression.",
+  "gate": {
+    "threshold": 0,
+    "aa": false,
+    "floor": 1.0
+  },
+  "defaults": {
+    "viewport": { "width": 600, "height": 800 },
+    "wait_for": ["fonts", "networkidle"],
+    "extra_css": ["../_reftest/hide-text.css"],
+    "full_page": true
+  },
+  "fixtures": [
+    {
+      "path": "../L0/box-dimensions.html",
+      "viewport": { "width": 600, "height": 522 }
+    }
+  ]
+}
diff --git a/packages/grida-reftest/package.json b/packages/grida-reftest/package.json
index 1b5376a42..0b2853094 100644
--- a/packages/grida-reftest/package.json
+++ b/packages/grida-reftest/package.json
@@ -32,10 +32,12 @@
     "smol-toml": "^1.6.1"
   },
   "devDependencies": {
+    "@playwright/test": "^1.52.0",
     "@types/node": "^24",
     "@types/pixelmatch": "^5.2.6",
     "@types/pngjs": "^6.0.5",
     "tsdown": "^0.21.9",
+    "tsx": "^4.19.0",
     "typescript": "^6",
     "vitest": "^4"
   },
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index 4ef388a2e..1ab7eb179 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -1308,6 +1308,9 @@ importers:
         specifier: ^1.6.1
         version: 1.6.1
     devDependencies:
+      '@playwright/test':
+        specifier: ^1.52.0
+        version: 1.58.2
       '@types/node':
         specifier: ^24
         version: 24.12.2
@@ -1320,6 +1323,9 @@ importers:
       tsdown:
         specifier: ^0.21.9
         version: 0.21.9(typescript@6.0.3)
+      tsx:
+        specifier: 4.21.0
+        version: 4.21.0
       typescript:
         specifier: 6.0.3
         version: 6.0.3
@@ -5061,6 +5067,7 @@ packages:
   '@react-email/components@0.0.38':
     resolution: {integrity: sha512-2cjMBZsSPjD1Iyur/MzGrgW/n5A6ONOJQ97pNaVOClxz/EaqNZTo1lFmKdH7p54P7LG9ZxRXxoTe2075VCCGQA==}
     engines: {node: '>=18.0.0'}
+    deprecated: Package no longer supported. Contact Support at https://www.npmjs.com/support for more info.
     peerDependencies:
       react: 19.2.5
 
@@ -6379,6 +6386,7 @@ packages:
 
   '@types/dompurify@3.2.0':
     resolution: {integrity: sha512-Fgg31wv9QbLDA0SpTOXO3MaxySc4DKGLi8sna4/Utjo4r3ZRPdCt4UQee8BWr+Q5z21yifghREPJGYaEOEIACg==}
+    deprecated: This is a stub types definition. dompurify provides its own type definitions, so you do not need this installed.
 
   '@types/draco3d@1.4.10':
     resolution: {integrity: sha512-AX22jp8Y7wwaBgAixaSvkoG4M/+PlAcm3Qs4OW8yT9DM4xUpWKeFhLueTAyZF39pviAdcDdeJoACapiAceqNcw==}
@@ -9201,9 +9209,6 @@ packages:
     resolution: {integrity: sha512-kVCxPF3vQM/N0B1PmoqVUqgHP+EeVjmZSQn+1oCRPxd2P21P2F19lIgbR3HBosbB1PUhOAoctJnfEn2GbN2eZA==}
     engines: {node: '>=18'}
 
-  get-tsconfig@4.10.0:
-    resolution: {integrity: sha512-kGzZ3LWWQcGIAmg6iWvXn0ei6WDtV26wzHRMwDSzmAbcXrTEXxHy6IehI6/4eT6VRKyMP1eF1VqwrVUmE/LR7A==}
-
   get-tsconfig@4.14.0:
     resolution: {integrity: sha512-yTb+8DXzDREzgvYmh6s9vHsSVCHeC0G3PI5bEXNBHtmshPnO+S5O7qgLEOn0I5QvMy6kpZN8K1NKGyilLb93wA==}
 
@@ -9239,6 +9244,7 @@ packages:
 
   glob@10.5.0:
     resolution: {integrity: sha512-DfXN8DfhJ7NH3Oe7cFmu3NCu1wKbkReJ8TorzSAFbSKrlNaQSKfIzqYqVY8zlbs2NLBbWpRiU52GX2PbaBVNkg==}
+    deprecated: Old versions of glob are not supported, and contain widely publicized security vulnerabilities, which have been fixed in the current version. Please update. Support for old versions may be purchased (at exorbitant rates) by contacting i@izs.me
     hasBin: true
 
   glob@7.2.3:
@@ -9389,6 +9395,7 @@ packages:
 
   hast@1.0.0:
     resolution: {integrity: sha512-vFUqlRV5C+xqP76Wwq2SrM0kipnmpxJm7OfvVXpB35Fp+Fn4MV+ozr+JZr5qFvyR1q/U+Foim2x+3P+x9S1PLA==}
+    deprecated: Renamed to rehype
 
   hastscript@9.0.1:
     resolution: {integrity: sha512-g7df9rMFX/SPi34tyGCyUBREQoKkapwdY/T04Qn9TDWfHhAYt4/I0gMVirzK5wEzeUqIjEB+LXC/ypb7Aqno5w==}
@@ -10986,6 +10993,7 @@ packages:
   node-domexception@1.0.0:
     resolution: {integrity: sha512-/jKZoMpw0F8GRwl4/eLROPA3cfcXtLApP0QzLmUT/HuPCZWyB7IY9ZrMeKw2O/nFIqPQB3PVM9aYm0F312AXDQ==}
     engines: {node: '>=10.5.0'}
+    deprecated: Use your platform's native DOMException instead
 
   node-emoji@2.2.0:
     resolution: {integrity: sha512-Z3lTE9pLaJF47NyMhd4ww1yFTAP8YhYI8SleJiHzM46Fgpm5cnNzSl9XfzFNqbaz+VlJrIj3fXQ4DeN1Rjm6cw==}
@@ -11882,6 +11890,7 @@ packages:
   prebuild-install@7.1.3:
     resolution: {integrity: sha512-8Mf2cbV7x1cXPUILADGI3wuhfqWvtiLA1iclTDbFRZkgRQS0NqsPZphna9V+HyTEadheuPmjaJMsbzKQFOzLug==}
     engines: {node: '>=10'}
+    deprecated: No longer maintained. Please contact the author of the relevant native addon; alternatives are available.
     hasBin: true
 
   prettier@2.8.8:
@@ -23714,10 +23723,6 @@ snapshots:
       '@sec-ant/readable-stream': 0.4.1
       is-stream: 4.0.1
 
-  get-tsconfig@4.10.0:
-    dependencies:
-      resolve-pkg-maps: 1.0.0
-
   get-tsconfig@4.14.0:
     dependencies:
       resolve-pkg-maps: 1.0.0
@@ -29020,7 +29025,7 @@ snapshots:
   tsx@4.21.0:
     dependencies:
       esbuild: 0.27.3
-      get-tsconfig: 4.10.0
+      get-tsconfig: 4.14.0
     optionalDependencies:
       fsevents: 2.3.3