diff --git a/.agents/skills/cg-perf/SKILL.md b/.agents/skills/cg-perf/SKILL.md
index bfca27c447..14f3ace93c 100644
--- a/.agents/skills/cg-perf/SKILL.md
+++ b/.agents/skills/cg-perf/SKILL.md
@@ -123,19 +123,19 @@ reports `min/p50/p95/p99/MAX` plus per-stage breakdown and settle cost.
**Scenario types in the expanded matrix:**
-| Kind | Scenarios | What it tests |
-| ----------------- | --------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
-| `pan` | slow/fast × fit/zoomed | Linear back-and-forth panning |
-| `circle_pan` | small/large radius × fit/zoomed | Circular trackpad gesture (unpredictable edges) |
-| `zigzag` | fast (continuous) / slow (with pauses) × fit/zoomed | Diagonal reading pattern with direction changes |
-| `zoom` | slow/fast × around-fit/high | Zoom oscillation at different levels |
-| `pan_with_settle` | slow/fast × fit/zoomed | Pan with settle frames interleaved every 12 frames |
-| `zoom_with_settle`| slow/fast × fit/high | Zoom with settle frames interleaved every 12 frames — captures cache-cold spike after settle nukes zoom cache |
-| `zoom_forced_stable` | slow/fast × fit/high (BUG prefix) | Forces `stable=true` on every zoom frame — reproduces the `redraw()` bug for A/B comparison |
-| `realtime` | fast/slow × fit/zoomed | **Real-time event loop simulation** with sleep, 240Hz tick thread, and settle countdown matching the native viewer |
-| `frameloop` | 16/50/80/120/200/300/500ms interval | **Real FrameLoop path** — the only bench that captures stable-frame jank during panning (see below) |
-| `frameloop_zoom` | 16/50/80/120/200/500ms interval | **Real FrameLoop path for zoom** — captures stable-frame intrusion during zoom gestures |
-| `resize` | alternating viewport sizes | `--resize` flag. Measures `resize()` + `redraw()` cost per cycle (layout rebuild + cache invalidation + repaint) |
+| Kind | Scenarios | What it tests |
+| -------------------- | --------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
+| `pan` | slow/fast × fit/zoomed | Linear back-and-forth panning |
+| `circle_pan` | small/large radius × fit/zoomed | Circular trackpad gesture (unpredictable edges) |
+| `zigzag` | fast (continuous) / slow (with pauses) × fit/zoomed | Diagonal reading pattern with direction changes |
+| `zoom` | slow/fast × around-fit/high | Zoom oscillation at different levels |
+| `pan_with_settle` | slow/fast × fit/zoomed | Pan with settle frames interleaved every 12 frames |
+| `zoom_with_settle` | slow/fast × fit/high | Zoom with settle frames interleaved every 12 frames — captures cache-cold spike after settle nukes zoom cache |
+| `zoom_forced_stable` | slow/fast × fit/high (BUG prefix) | Forces `stable=true` on every zoom frame — reproduces the `redraw()` bug for A/B comparison |
+| `realtime` | fast/slow × fit/zoomed | **Real-time event loop simulation** with sleep, 240Hz tick thread, and settle countdown matching the native viewer |
+| `frameloop` | 16/50/80/120/200/300/500ms interval | **Real FrameLoop path** — the only bench that captures stable-frame jank during panning (see below) |
+| `frameloop_zoom` | 16/50/80/120/200/500ms interval | **Real FrameLoop path for zoom** — captures stable-frame intrusion during zoom gestures |
+| `resize` | alternating viewport sizes | `--resize` flag. Measures `resize()` + `redraw()` cost per cycle (layout rebuild + cache invalidation + repaint) |
**SurfaceUI overlay measurement (`--overlay`):**
@@ -157,7 +157,7 @@ cargo run -p grida-dev --release -- bench-report ./fixtures/ --frames 100 --over
The overlay cost is opt-in because it is a devtools feature, not user
content. Overlay cost scales with visible labeled nodes — viewport
culling skips off-screen labels, so zoomed-in views are nearly free.
-At fit-zoom on large scenes (yrr-main, 437 labels visible), overlay
+At fit-zoom on large scenes (e.g. 500 labels visible), overlay
adds ~1.8ms per frame (paragraph layout dominates). At typical editing
zoom, the cost drops to ~190µs or less.
@@ -552,7 +552,7 @@ after content `flush()` and requires a second GPU flush. The overlay
cost is dominated by Skia paragraph creation (one per visible label) —
viewport culling skips off-screen labels, and style objects are hoisted
out of the per-label loop. On scenes with many labeled nodes at
-fit-zoom (e.g. yrr-main with 437 labels), the overlay adds ~1.8ms per
+fit-zoom (e.g. 500 visible labels), the overlay adds ~1.8ms per
frame. At typical editing zoom, most labels are culled and cost drops
to ~190µs. Standard benchmarks exclude overlay by default — use
`--overlay` to include it. If the app feels slower after adding new
@@ -627,7 +627,7 @@ WASM-on-Node benchmark:
# Build WASM first
just --justfile crates/grida-canvas-wasm/justfile build
-# Run benchmark (requires fixtures/local/perf/local/yrr-main.grida for 136k test)
+# Run benchmark (requires a large .grida fixture in fixtures/local/perf/local/)
cd crates/grida-canvas-wasm && npx vitest run __test__/bench-load-scene.test.ts
```
diff --git a/crates/grida-canvas/Cargo.toml b/crates/grida-canvas/Cargo.toml
index 40d5e1e313..2c582b26dd 100644
--- a/crates/grida-canvas/Cargo.toml
+++ b/crates/grida-canvas/Cargo.toml
@@ -117,6 +117,16 @@ name = "skia_bench_primitives"
path = "examples/skia_bench/skia_bench_primitives.rs"
required-features = ["native-gl-context"]
+[[example]]
+name = "skia_bench_rrect_vs_rect"
+path = "examples/skia_bench/skia_bench_rrect_vs_rect.rs"
+required-features = ["native-gl-context"]
+
+[[example]]
+name = "skia_bench_text_lod"
+path = "examples/skia_bench/skia_bench_text_lod.rs"
+required-features = ["native-gl-context"]
+
[[example]]
name = "skia_bench_effects"
path = "examples/skia_bench/skia_bench_effects.rs"
diff --git a/crates/grida-canvas/examples/skia_bench/skia_bench_rrect_vs_rect.rs b/crates/grida-canvas/examples/skia_bench/skia_bench_rrect_vs_rect.rs
new file mode 100644
index 0000000000..b10230a226
--- /dev/null
+++ b/crates/grida-canvas/examples/skia_bench/skia_bench_rrect_vs_rect.rs
@@ -0,0 +1,346 @@
+//! Skia RRect vs Rect — device-space cost measurement.
+//!
+//! Question: does `drawRRect(r)` cost more than `drawRect` on GPU, and
+//! does that cost remain present at tiny (sub-pixel) device radii?
+//!
+//! This directly answers whether a zoom-aware LOD policy that collapses
+//! `rrect → rect` when `radius · camera_zoom < 0.5 px` would be
+//! complementary to Skia's internal behavior or redundant.
+//!
+//! Skia's own auto-collapse (`SkRRect::isRect()`) only triggers on
+//! EXACTLY-zero radii. Our theory: non-zero sub-pixel radii still take
+//! the rrect shader path. This bench verifies that claim.
+//!
+//! ```bash
+//! cargo run -p cg --example skia_bench_rrect_vs_rect --features native-gl-context --release
+//! ```
+
+#[cfg(feature = "native-gl-context")]
+use cg::window::headless::HeadlessGpu;
+use std::time::Instant;
+
+#[cfg(not(feature = "native-gl-context"))]
+fn main() {
+ eprintln!("This example requires --features native-gl-context");
+}
+
+#[cfg(feature = "native-gl-context")]
+fn flush_gpu(surface: &mut skia_safe::Surface) {
+ if let Some(mut ctx) = surface.recording_context() {
+ if let Some(mut direct) = ctx.as_direct_context() {
+ direct.flush_and_submit();
+ }
+ }
+}
+
+#[cfg(feature = "native-gl-context")]
+fn main() {
+ let mut gpu = HeadlessGpu::new(1000, 1000).expect("GPU init");
+ gpu.print_gl_info();
+ println!();
+
+ let surface = &mut gpu.surface;
+ let n_iter = 300;
+
+ println!("=== Rect vs RRect — device-space cost ===");
+ println!("5000 shapes/frame, non-overlapping 100×50 grid.");
+ println!("Each shape is 8×8 device px. All coordinates device-space.");
+ println!("Corner radius varied from 0 → 8 px (Skia clamps radii to 4 px on 8×8 shapes).");
+ println!();
+
+ let count = 5000usize;
+
+ // Warmup (compile shaders, prime GPU)
+ for _ in 0..30 {
+ flush_gpu(surface);
+ bench_rects_device(surface, count, 1);
+ bench_rrects_device(surface, count, 1.0, 1);
+ flush_gpu(surface);
+ }
+
+ // Baseline: drawRect
+ let rect_us = bench_rects_device(surface, count, n_iter);
+ println!(
+ " drawRect (baseline): {:>8} us | {:.3} us/shape",
+ rect_us,
+ rect_us as f64 / count as f64
+ );
+ println!();
+
+ println!(
+ "{:>12} {:>12} {:>12} {:>12} {:>14}",
+ "radius(dev-px)", "us/frame", "us/shape", "Δ vs rect", "rrect/rect"
+ );
+ println!("{}", "─".repeat(76));
+
+ // Sub-pixel radii
+ for &radius in &[0.0_f32, 0.05, 0.1, 0.25, 0.49] {
+ let us = bench_rrects_device(surface, count, radius, n_iter);
+ let delta = us as i64 - rect_us as i64;
+ let ratio = us as f64 / rect_us as f64;
+ let note = if radius == 0.0 {
+ " (r=0 auto-fast-path)"
+ } else {
+ " ← subpixel"
+ };
+ println!(
+ "{:>14.3} {:>12} {:>12.3} {:>+12} {:>13.2}x{}",
+ radius,
+ us,
+ us as f64 / count as f64,
+ delta,
+ ratio,
+ note
+ );
+ }
+
+ println!();
+ // Near-pixel radii (rrect shader engaged)
+ for &radius in &[0.5, 1.0, 2.0, 4.0, 8.0] {
+ let us = bench_rrects_device(surface, count, radius, n_iter);
+ let delta = us as i64 - rect_us as i64;
+ let ratio = us as f64 / rect_us as f64;
+ println!(
+ "{:>14.3} {:>12} {:>12.3} {:>+12} {:>13.2}x",
+ radius,
+ us,
+ us as f64 / count as f64,
+ delta,
+ ratio
+ );
+ }
+
+ println!();
+
+ // Repeat with larger 32x32 shapes to see if shape size changes the pattern (note: at a 34 px step only the first ~30 rows fit the 1000 px surface)
+ println!("=== Larger 32×32 shapes (different GPU path?) ===");
+ let rect_us32 = bench_rects_device_sized(surface, count, 32.0, n_iter);
+ println!(" drawRect(32×32): {:>8} us", rect_us32);
+ for &radius in &[0.0_f32, 0.25, 0.5, 1.0, 4.0, 16.0] {
+ let us = bench_rrects_device_sized(surface, count, 32.0, radius, n_iter);
+ let delta = us as i64 - rect_us32 as i64;
+ let ratio = us as f64 / rect_us32 as f64;
+ println!(
+ " drawRRect(32×32, r={:>5.2}):{:>8} us Δ={:>+6} ({:.2}x)",
+ radius, us, delta, ratio
+ );
+ }
+ println!();
+
+ // === Part 2: Application-level projected-radius scenario ===
+ println!("=== Application-level projected-radius scenario ===");
+ println!("World radius=4.0, scale varies. Projected radius = 4·scale.");
+ println!("Measures what happens when an app DOES NOT collapse rrect→rect:");
+ println!();
+ println!(
+ "{:>8} {:>14} {:>12} {:>12}",
+ "scale", "projected r (px)", "rrect(us)", "rect(us)"
+ );
+ println!("{}", "─".repeat(52));
+ for &scale in &[1.0_f32, 0.5, 0.25, 0.1, 0.05, 0.02] {
+ let rrect_us = bench_rrects_scaled(surface, count, scale, 4.0, n_iter);
+ let rect_us = bench_rects_scaled(surface, count, scale, n_iter);
+ println!(
+ "{:>8.3} {:>14.3} {:>12} {:>12}",
+ scale,
+ 4.0 * scale,
+ rrect_us,
+ rect_us
+ );
+ }
+ println!();
+
+ // === Part 3: Skia auto-collapse verification (sanity check) ===
+ println!("=== Skia auto-collapse verification ===");
+ let rrect_zero = bench_rrects_device(surface, count, 0.0, n_iter);
+ let rect = bench_rects_device(surface, count, n_iter);
+ println!(" drawRect: {:>6} us", rect);
+ println!(
+ " drawRRect(r=0): {:>6} us (SkRRect::isRect() == true)",
+ rrect_zero
+ );
+ println!(
+ " overhead at r=0: {:>+6} us ← fast-path kicks in",
+ rrect_zero as i64 - rect as i64
+ );
+ println!();
+ println!("This is the ONLY zoom-independent collapse Skia does.");
+ println!("At r=0.01 (still near-zero but not exactly 0), the rrect shader runs:");
+ let rrect_tiny = bench_rrects_device(surface, count, 0.01, n_iter);
+ println!(
+ " drawRRect(r=0.01): {:>6} us ← dispatches rrect shader!",
+ rrect_tiny
+ );
+ println!(
+ " Δ vs r=0: {:>+6} us = cost of invoking rrect pipeline for invisible radius",
+ rrect_tiny as i64 - rrect_zero as i64
+ );
+}
+
+#[cfg(feature = "native-gl-context")]
+fn bench_rects_device(surface: &mut skia_safe::Surface, count: usize, n_iter: usize) -> u128 {
+ flush_gpu(surface);
+ let start = Instant::now();
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ let mut paint = skia_safe::Paint::default();
+ paint.set_anti_alias(true);
+ paint.set_color(skia_safe::Color::from_argb(255, 100, 150, 200));
+ for i in 0..count {
+ let x = (i % 100) as f32 * 10.0;
+ let y = (i / 100) as f32 * 10.0;
+ canvas.draw_rect(skia_safe::Rect::from_xywh(x, y, 8.0, 8.0), &paint);
+ }
+ flush_gpu(surface);
+ }
+ (start.elapsed() / n_iter as u32).as_micros()
+}
+
+#[cfg(feature = "native-gl-context")]
+fn bench_rrects_device(
+ surface: &mut skia_safe::Surface,
+ count: usize,
+ radius: f32,
+ n_iter: usize,
+) -> u128 {
+ flush_gpu(surface);
+ let start = Instant::now();
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ let mut paint = skia_safe::Paint::default();
+ paint.set_anti_alias(true);
+ paint.set_color(skia_safe::Color::from_argb(255, 100, 150, 200));
+ for i in 0..count {
+ let x = (i % 100) as f32 * 10.0;
+ let y = (i / 100) as f32 * 10.0;
+ let r = skia_safe::Rect::from_xywh(x, y, 8.0, 8.0);
+ let rrect = skia_safe::RRect::new_rect_xy(r, radius, radius);
+ canvas.draw_rrect(rrect, &paint);
+ }
+ flush_gpu(surface);
+ }
+ (start.elapsed() / n_iter as u32).as_micros()
+}
+
+#[cfg(feature = "native-gl-context")]
+fn bench_rects_device_sized(
+ surface: &mut skia_safe::Surface,
+ count: usize,
+ size: f32,
+ n_iter: usize,
+) -> u128 {
+ flush_gpu(surface);
+ let start = Instant::now();
+ let step = (size + 2.0).max(10.0);
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ let mut paint = skia_safe::Paint::default();
+ paint.set_anti_alias(true);
+ paint.set_color(skia_safe::Color::from_argb(255, 100, 150, 200));
+ let cols = (1000.0 / step) as usize;
+ for i in 0..count {
+ let x = (i % cols) as f32 * step;
+ let y = (i / cols) as f32 * step;
+ canvas.draw_rect(skia_safe::Rect::from_xywh(x, y, size, size), &paint);
+ }
+ flush_gpu(surface);
+ }
+ (start.elapsed() / n_iter as u32).as_micros()
+}
+
+#[cfg(feature = "native-gl-context")]
+fn bench_rrects_device_sized(
+ surface: &mut skia_safe::Surface,
+ count: usize,
+ size: f32,
+ radius: f32,
+ n_iter: usize,
+) -> u128 {
+ flush_gpu(surface);
+ let start = Instant::now();
+ let step = (size + 2.0).max(10.0);
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ let mut paint = skia_safe::Paint::default();
+ paint.set_anti_alias(true);
+ paint.set_color(skia_safe::Color::from_argb(255, 100, 150, 200));
+ let cols = (1000.0 / step) as usize;
+ for i in 0..count {
+ let x = (i % cols) as f32 * step;
+ let y = (i / cols) as f32 * step;
+ let r = skia_safe::Rect::from_xywh(x, y, size, size);
+ let rrect = skia_safe::RRect::new_rect_xy(r, radius, radius);
+ canvas.draw_rrect(rrect, &paint);
+ }
+ flush_gpu(surface);
+ }
+ (start.elapsed() / n_iter as u32).as_micros()
+}
+
+#[cfg(feature = "native-gl-context")]
+fn bench_rrects_scaled(
+ surface: &mut skia_safe::Surface,
+ count: usize,
+ scale: f32,
+ world_radius: f32,
+ n_iter: usize,
+) -> u128 {
+ // Keep shapes non-overlapping at every scale: step in world-space = 10/scale
+ let step = 10.0 / scale;
+ let size = 8.0 / scale;
+ flush_gpu(surface);
+ let start = Instant::now();
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ canvas.save();
+ canvas.scale((scale, scale));
+ let mut paint = skia_safe::Paint::default();
+ paint.set_anti_alias(true);
+ paint.set_color(skia_safe::Color::from_argb(255, 100, 150, 200));
+ for i in 0..count {
+ let x = (i % 100) as f32 * step;
+ let y = (i / 100) as f32 * step;
+ let r = skia_safe::Rect::from_xywh(x, y, size, size);
+ let rrect = skia_safe::RRect::new_rect_xy(r, world_radius, world_radius);
+ canvas.draw_rrect(rrect, &paint);
+ }
+ canvas.restore();
+ flush_gpu(surface);
+ }
+ (start.elapsed() / n_iter as u32).as_micros()
+}
+
+#[cfg(feature = "native-gl-context")]
+fn bench_rects_scaled(
+ surface: &mut skia_safe::Surface,
+ count: usize,
+ scale: f32,
+ n_iter: usize,
+) -> u128 {
+ let step = 10.0 / scale;
+ let size = 8.0 / scale;
+ flush_gpu(surface);
+ let start = Instant::now();
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ canvas.save();
+ canvas.scale((scale, scale));
+ let mut paint = skia_safe::Paint::default();
+ paint.set_anti_alias(true);
+ paint.set_color(skia_safe::Color::from_argb(255, 100, 150, 200));
+ for i in 0..count {
+ let x = (i % 100) as f32 * step;
+ let y = (i / 100) as f32 * step;
+ canvas.draw_rect(skia_safe::Rect::from_xywh(x, y, size, size), &paint);
+ }
+ canvas.restore();
+ flush_gpu(surface);
+ }
+ (start.elapsed() / n_iter as u32).as_micros()
+}
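The policy this probe evaluates (B1, `radius · camera_zoom < 0.5 px`) can be sketched without Skia as a small decision function. This is only an illustrative sketch: `RectPrimitive`, `choose_primitive`, and the `collapse_enabled` flag are hypothetical names that do not appear in the codebase, and the 0.5 px threshold is the example value from the module comment, not a tuned setting.

```rust
/// Hypothetical primitive choice for a rounded-rectangle node.
#[derive(Debug, PartialEq)]
pub enum RectPrimitive {
    Rect,
    RRect { radius: f32 },
}

/// Decide which primitive to emit for a node with `world_radius` at `zoom`.
///
/// `collapse_enabled` stands in for a per-backend switch (an assumption):
/// the bench exists because the collapse may not pay off on every backend.
pub fn choose_primitive(
    world_radius: f32,
    zoom: f32,
    eps_px: f32,
    collapse_enabled: bool,
) -> RectPrimitive {
    // An exactly-zero radius is a plain rect regardless of zoom,
    // mirroring the r=0 fast path the bench checks in Part 3.
    if world_radius == 0.0 {
        return RectPrimitive::Rect;
    }
    // Projected radius in device pixels: r * z.
    let projected = world_radius * zoom;
    if collapse_enabled && projected < eps_px {
        RectPrimitive::Rect
    } else {
        RectPrimitive::RRect { radius: world_radius }
    }
}
```

With a world radius of 4.0, the scales used in Part 2 of the bench cross the 0.5 px threshold between scale 0.25 (projected 1.0 px) and scale 0.1 (projected 0.4 px). Keeping the switch separate from the threshold lets a backend where the rrect path is already cheap opt out entirely.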
diff --git a/crates/grida-canvas/examples/skia_bench/skia_bench_text_lod.rs b/crates/grida-canvas/examples/skia_bench/skia_bench_text_lod.rs
new file mode 100644
index 0000000000..a811878905
--- /dev/null
+++ b/crates/grida-canvas/examples/skia_bench/skia_bench_text_lod.rs
@@ -0,0 +1,175 @@
+//! Skia Text LOD — paragraph-paint vs greek-rect cost comparison.
+//!
+//! Measures whether replacing a paragraph paint with a single drawRect
+//! ("greeking") is worth doing at low zoom.
+//!
+//! Scenario: N text nodes on a GPU surface. For each configuration, we
+//! pre-shape a paragraph once (mirroring ParagraphCache behaviour), then
+//! measure the per-frame cost of:
+//! - `paragraph.paint()` — current path
+//! - `drawRect` — greeking candidate
+//! - `skip` — cull candidate
+//!
+//! The test varies:
+//! - font size (in device pixels after projection)
+//! - number of glyphs per paragraph
+//! - number of paragraphs per frame
+//!
+//! ```bash
+//! cargo run -p cg --example skia_bench_text_lod --features native-gl-context --release
+//! ```
+
+#[cfg(feature = "native-gl-context")]
+use cg::window::headless::HeadlessGpu;
+use std::time::Instant;
+
+#[cfg(not(feature = "native-gl-context"))]
+fn main() {
+ eprintln!("This example requires --features native-gl-context");
+}
+
+#[cfg(feature = "native-gl-context")]
+fn flush_gpu(surface: &mut skia_safe::Surface) {
+ if let Some(mut ctx) = surface.recording_context() {
+ if let Some(mut direct) = ctx.as_direct_context() {
+ direct.flush_and_submit();
+ }
+ }
+}
+
+#[cfg(feature = "native-gl-context")]
+fn make_font_collection() -> skia_safe::textlayout::FontCollection {
+ use skia_safe::FontMgr;
+ let mut fc = skia_safe::textlayout::FontCollection::new();
+ fc.set_default_font_manager(FontMgr::new(), None);
+ fc
+}
+
+#[cfg(feature = "native-gl-context")]
+fn build_paragraph(
+ fc: &skia_safe::textlayout::FontCollection,
+ text: &str,
+ font_size: f32,
+ max_width: f32,
+) -> skia_safe::textlayout::Paragraph {
+ use skia_safe::textlayout;
+ let mut ps = textlayout::ParagraphStyle::new();
+ let mut ts = textlayout::TextStyle::new();
+ ts.set_font_size(font_size);
+ ts.set_color(skia_safe::Color::BLACK);
+ ps.set_text_style(&ts);
+ let mut builder = textlayout::ParagraphBuilder::new(&ps, fc);
+ builder.add_text(text);
+ let mut para = builder.build();
+ para.layout(max_width);
+ para
+}
+
+#[cfg(feature = "native-gl-context")]
+fn main() {
+ let mut gpu = HeadlessGpu::new(1000, 1000).expect("GPU init");
+ gpu.print_gl_info();
+ println!();
+
+ let surface = &mut gpu.surface;
+ let fc = make_font_collection();
+ let n_iter = 200;
+
+ // Pre-shape paragraphs at each test font size. In real use the engine
+ // caches these in ParagraphCache, so re-shaping cost is NOT part of
+ // the per-frame measurement.
+ let sample_text = "The quick brown fox jumps over the lazy dog";
+
+ println!("=== Text paragraph.paint() vs drawRect vs skip ===");
+ println!("Each test: 1000 paragraphs per frame on a 40-column, 20 px grid (neighbours may overlap at larger font sizes).");
+ println!("Paragraphs pre-shaped (paint-only cost measured).");
+ println!();
+
+ let count = 1000usize;
+ let font_sizes: &[f32] = &[0.25, 0.5, 1.0, 2.0, 4.0, 6.0, 8.0, 12.0, 16.0, 24.0, 48.0];
+
+ // Warmup: run a big paragraph paint to compile shaders + prime atlas
+ {
+ let para = build_paragraph(&fc, sample_text, 16.0, 300.0);
+ for _ in 0..20 {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ para.paint(canvas, (10.0, 10.0));
+ flush_gpu(surface);
+ }
+ }
+
+ println!(
+ "{:>10} {:>12} {:>12} {:>12} {:>12} {:>10}",
+ "font(px)", "paint(us)", "rect(us)", "skip(us)", "paint/rect", "per-node"
+ );
+ println!("{}", "─".repeat(78));
+
+ for &font_size in font_sizes {
+ // Build the paragraph once per font size so that only paint() cost is measured.
+ let para = build_paragraph(&fc, sample_text, font_size, 300.0);
+ let para_h = para.height();
+ let para_w = para.max_width();
+
+ // Test 1: N × paragraph.paint()
+ flush_gpu(surface);
+ let start = Instant::now();
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ for i in 0..count {
+ let x = (i % 40) as f32 * 20.0;
+ let y = (i / 40) as f32 * 20.0;
+ para.paint(canvas, (x, y));
+ }
+ flush_gpu(surface);
+ }
+ let paint_us = (start.elapsed() / n_iter as u32).as_micros();
+
+ // Test 2: N × drawRect (greek)
+ let mut paint_obj = skia_safe::Paint::default();
+ paint_obj.set_color(skia_safe::Color::from_argb(180, 80, 80, 80));
+ paint_obj.set_anti_alias(false); // greeking doesn't need AA
+ flush_gpu(surface);
+ let start = Instant::now();
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ for i in 0..count {
+ let x = (i % 40) as f32 * 20.0;
+ let y = (i / 40) as f32 * 20.0;
+ canvas.draw_rect(skia_safe::Rect::from_xywh(x, y, para_w, para_h), &paint_obj);
+ }
+ flush_gpu(surface);
+ }
+ let rect_us = (start.elapsed() / n_iter as u32).as_micros();
+
+ // Test 3: skip (just clear, measure clear overhead alone)
+ flush_gpu(surface);
+ let start = Instant::now();
+ for _ in 0..n_iter {
+ let canvas = surface.canvas();
+ canvas.clear(skia_safe::Color::WHITE);
+ flush_gpu(surface);
+ }
+ let skip_us = (start.elapsed() / n_iter as u32).as_micros();
+
+ let paint_net = paint_us as i64 - skip_us as i64;
+ let rect_net = rect_us as i64 - skip_us as i64;
+ let ratio = paint_net.max(0) as f64 / rect_net.max(1) as f64;
+ let per_node_us = paint_net as f64 / count as f64;
+
+ println!(
+ "{:>10.2} {:>12} {:>12} {:>12} {:>12.2} {:>8.3}µs",
+ font_size, paint_us, rect_us, skip_us, ratio, per_node_us
+ );
+ }
+
+ println!();
+ println!("Notes:");
+ println!("- paint(us) = clear + 1000 × paragraph.paint() + flush");
+ println!("- rect(us) = clear + 1000 × drawRect + flush");
+ println!("- skip(us) = clear + flush (baseline overhead)");
+ println!("- per-node = (paint - skip) / 1000 (per-node text cost)");
+ println!("- paint/rect = (paint - skip) / (rect - skip); how much cheaper greeking is (higher = bigger win)");
+}
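The three per-frame paths measured above (paint, greek rect, skip) correspond to a tiered decision on projected font size, and the `paint/rect` column is a ratio of costs net of the clear-and-flush baseline. A minimal sketch of both, assuming hypothetical names (`TextLod`, `text_lod`, `net_ratio`) and caller-supplied thresholds; no threshold values here come from the codebase:

```rust
/// Hypothetical text LOD tiers matching the three measured paths.
#[derive(Debug, PartialEq)]
pub enum TextLod {
    /// Skip the text entirely.
    Cull,
    /// Draw a plain rectangle in place of the paragraph.
    Greek,
    /// Full paragraph paint.
    Paint,
}

/// Pick a tier from the projected font size (font_size * zoom, in device px).
/// Assumes eps_cull <= eps_greek.
pub fn text_lod(font_size: f32, zoom: f32, eps_cull: f32, eps_greek: f32) -> TextLod {
    let projected = font_size * zoom;
    if projected < eps_cull {
        TextLod::Cull
    } else if projected < eps_greek {
        TextLod::Greek
    } else {
        TextLod::Paint
    }
}

/// Ratio of paint cost to rect cost after subtracting the skip baseline,
/// using the same clamping as the bench (numerator >= 0, denominator >= 1).
pub fn net_ratio(paint_us: u128, rect_us: u128, skip_us: u128) -> f64 {
    let paint_net = paint_us as i64 - skip_us as i64;
    let rect_net = rect_us as i64 - skip_us as i64;
    paint_net.max(0) as f64 / rect_net.max(1) as f64
}
```

Subtracting the baseline matters because clear and flush cost is paid on every path; a raw `paint_us / rect_us` would understate the relative saving from greeking.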
diff --git a/crates/grida-canvas/src/runtime/scene.rs b/crates/grida-canvas/src/runtime/scene.rs
index 160d1e9e39..de6daec811 100644
--- a/crates/grida-canvas/src/runtime/scene.rs
+++ b/crates/grida-canvas/src/runtime/scene.rs
@@ -466,7 +466,7 @@ impl Renderer {
// and stable frames — the first frame records the picture,
// and the settle frame finds it immediately.
//
- // On yrr-main (135K nodes, 0 effects), this eliminates ~800 us
+ // On a 135K-node scene with 0 effects, this eliminates ~800 us
// of LayerEntry clones + SkPicture recordings on every settle.
let effective_key = if can_unify && entry.layer.effects_empty() {
0
diff --git a/docs/wg/feat-2d/lod-properties.md b/docs/wg/feat-2d/lod-properties.md
new file mode 100644
index 0000000000..51de6326e9
--- /dev/null
+++ b/docs/wg/feat-2d/lod-properties.md
@@ -0,0 +1,211 @@
+---
+title: LOD Properties — Reference Sheet
+format: md
+tags:
+ - internal
+ - wg
+ - canvas
+ - performance
+ - rendering
+ - lod
+---
+
+# LOD Properties — Reference Sheet
+
+A catalog of node and subtree properties where a zoom-aware Level-of-Detail
+(LOD) decision can reduce per-frame work. Pairs with
+**item 51 (Subpixel LOD Culling)** in `optimization.md`, which drops
+entire leaves whose projected bounds collapse below a threshold.
+
+This document defines **what is LOD-able**, **what the decision metric
+is**, and **where in the pipeline the decision applies**. It does NOT
+prescribe specific thresholds or promise specific wins — both require
+empirical verification per backend.
+
+## Principles
+
+1. **LOD decisions are camera-zoom-indexed.** A node's visual
+ significance depends on how it projects to device pixels.
+2. **Two kinds of LOD:**
+ - **Skip work** — eliminate draw dispatches entirely (safe, portable)
+ - **Replace with cheaper primitive** — swap a complex draw for a
+ simpler one (requires per-backend validation; modern GPUs may
+ already short-circuit)
+3. **The only trustworthy reason to implement a rule is a measured
+ win.** Categories that "look" cheap on paper may already be handled
+ by the underlying graphics backend.
+4. **Threshold policy is pluggable, not hard-coded.** Per-property
+ thresholds live in a runtime config so they can be tuned per
+ backend / per fixture / per workload.
+
+## Notation
+
+- `z` — camera zoom (device pixels per world unit)
+- `px(x) = x · z` — project a world-space length to device pixels
+- A property is "subpixel" when its projection falls below a threshold
+ matching the backend's AA resolution (typically 0.5 px for coverage,
+ 1.0 px for structural features)
+
+## Pipeline Stages
+
+Each LOD decision applies at one of three stages:
+
+| Stage | Work avoided | Constraint |
+| ------------------ | ----------------------------------------------- | ----------------------------------------------- |
+| **Frame plan** | skip node / subtree entirely | needs zoom + bounds at plan time |
+| **Picture record** | emit cheaper primitives into cached SkPicture | per-node pictures must become zoom-variant |
+| **Draw time** | dynamic per-frame decision against current zoom | cheap decision; compatible with cached pictures |
+
+---
+
+## Catalog
+
+### A. Geometric node / bounds
+
+| ID | Property | Metric | Action |
+| --- | ------------------------ | ----------------------- | ---------------------------- |
+| A1 | render bounds | both axes projected < ε | cull leaf ✅ item 51 |
+| A2 | render bounds area | area·z² < ε² | cull leaf |
+| A3 | render bounds diagonal | diag·z < ε | cull leaf |
+| A4 | stroke-only contribution | stroke_w·z < ε | drop stroke paint, keep fill |
+| A5 | subtree cumulative area | Σ child area·z² < ε² | cull subtree |
+
+### B. Corner & rounding
+
+| ID | Property | Metric | Action |
+| --- | -------------------- | ----------- | --------------------------- |
+| B1 | corner radius (rect) | r·z < ε | RRect → Rect |
+| B2 | corner radius (path) | r·z < ε | drop corner arcs → polyline |
+| B3 | stroke join miter | miter·z < ε | force bevel fallback |
+
+### C. Stroke & outline
+
+| ID | Property | Metric | Action |
+| --- | ------------------------ | -------------- | ------------------------------ |
+| C1 | stroke width (thin) | width·z < ε | skip stroke draw |
+| C2 | stroke width (hairline) | width·z ≈ 1 px | clamp to width=0 hairline path |
+| C3 | dash segment length | dash·z < ε | replace with solid stroke |
+| C4 | dash gap length | gap·z < ε | replace with solid stroke |
+| C5 | variable-width amplitude | amp·z < ε | collapse to constant stroke |
+| C6 | marker size | marker·z < ε | omit marker |
+
+### D. Path / vector complexity
+
+| ID | Property | Metric | Action |
+| --- | ---------------------- | --------------- | ----------------------------------- |
+| D1 | segment chord length | chord·z < ε | drop consecutive near-coincident pt |
+| D2 | bezier flattening tol | tolerance = 1/z | coarser curve tessellation |
+| D3 | sub-path bbox area | bbox·z² < ε² | drop sub-path |
+| D4 | near-coincident points | d·z < ε | merge points |
+
+### E. Effects (save_layer / filter avoidance)
+
+| ID | Property | Metric | Action |
+| --- | ----------------------- | ---------------- | -------------------- |
+| E1 | drop-shadow blur radius | r·z < ε | skip shadow |
+| E2 | drop-shadow offset | \|offset\|·z < ε | fold color into fill |
+| E3 | inner-shadow radius | r·z < ε | skip |
+| E4 | layer blur sigma | σ·z < ε | skip blur |
+| E5 | backdrop blur sigma | σ·z < ε | skip backdrop blur |
+| E6 | glass displacement | d·z < ε | skip |
+| E7 | noise grain scale | grain·z < ε | skip |
+
+### F. Opacity & blend
+
+| ID | Property | Metric | Action |
+| --- | --------------------- | -------------------- | ------------------ |
+| F1 | alpha near zero | opacity < 1/255 | cull node |
+| F2 | opacity × area | α·w·h·z² < ε | cull node |
+| F3 | blend on tiny subtree | subtree area·z² < ε² | force Normal blend |
+
+### G. Fills
+
+| ID | Property | Metric | Action |
+| --- | ----------------------- | ------------------ | -------------------- |
+| G1 | gradient projected span | span·z < ε | averaged solid |
+| G2 | gradient stop density | stops > pixel span | collapse to average |
+| G3 | image fill size | img_display_px < ε | center-pixel solid |
+| G4 | pattern tile size | tile·z < ε | tile-averaged solid |
+| G5 | occluded paint | opaque paint above | skip occluded paints |
+
+### H. Text
+
+| ID | Property | Metric | Action |
+| --- | -------------------- | ------------------------- | ------------------------------ |
+| H1 | font size (cull) | font·z < ε_cull | skip text entirely ✅ item 52 |
+| H2 | font size (greek) | ε_cull ≤ font·z < ε_greek | render as SkRect(s) ✅ item 52 |
+| H3 | line height | lh·z < ε | collapse to thin rect |
+| H4 | glyph advance | adv·z < ε | merge adjacent glyphs |
+| H5 | attributed run span | run·z < ε | merge runs |
+| H6 | decoration thickness | thickness·z < ε | skip decoration |
+| H7 | text-shadow blur | r·z < ε | skip |
+
+### I. Clip & mask
+
+| ID | Property | Metric | Action |
+| --- | --------------- | -------------------- | ---------------------- |
+| I1 | clip path area | bbox·z² < ε² | drop clipped subtree |
+| I2 | clip complexity | many segments, low z | replace with bbox clip |
+| I3 | mask area | bbox·z² < ε² | drop masked subtree |
+
+### J. Container / subtree
+
+| ID | Property | Metric | Action |
+| --- | ---------------------------- | -------------------- | -------------------------- |
+| J1 | subtree cumulative area | Σ children·z² < ε² | rasterize once as snapshot |
+| J2 | container vs sparse children | children « container | skip container paint |
+| J3 | nested container depth | depth > N at low z | flatten subtree to image |
+
+### K. Render-surface backing
+
+| ID | Property | Metric | Action |
+| --- | -------------------------- | ------------------ | ---------------------------- |
+| K1 | surface backing resolution | bounds·z | allocate at projected size |
+| K2 | filter quality | surface_px small | nearest sampling |
+| K3 | compositor promotion | cost estimate at z | don't promote if blit ≥ live |
+
+### L. Devtools overlays
+
+| ID | Property | Metric | Action |
+| --- | ----------------- | ------------------ | -------------- |
+| L1 | frame title label | node_w·z < label_w | hide label |
+| L2 | selection handles | node_area·z² < ε² | hide handles |
+| L3 | hit badges | density at z | cluster badges |
+
+---
+
+## Verification
+
+Each property must be verified before implementation. Two checks:
+
+1. **Skia cost probe** — measure the raw per-primitive cost of the
+ operation to be avoided OR of the replacement primitive. If the
+ backend already short-circuits the condition, the LOD rule is moot
+ or regressive. See `examples/skia_bench/*` for the probe pattern.
+2. **Scene-level bench-report diff** — run with/without the LOD rule
+ across a diverse fixture set, compare per-stage timings.
+
+Two independent sources of possible redundancy:
+
+- Skia's existing fast paths (e.g. `SkRRect::isRect()` for r=0)
+- GPU driver's analytic-coverage shaders that early-exit on sub-pixel
+ inputs (varies per backend — Metal, Ganesh GL, Graphite, WebGL, …)
+
+Rules that **skip work entirely** (A, E, H1/H2, F1, G5) are generally
+safe to implement without per-backend validation: they remove draw
+dispatches the backend would otherwise execute.
+
+Rules that **replace with a cheaper primitive** (B, C, D, G1–G4) need
+per-backend measurement because modern analytic-AA shaders may already
+handle the sub-pixel case efficiently.
+
+## Applied Findings
+
+Findings are tracked inline in `optimization.md` (numbered items) and
+in per-property verification notes alongside their benchmarks.
+
+- **Item 51 (A1)** — implemented. Subpixel leaf-bounds culling.
+- **Item 52 (H1)** — implemented. Text font-size-below-threshold cull.
+- **B1 (RRect → Rect)** — measured via `skia_bench_rrect_vs_rect`.
+ Needs per-backend decision; on some backends the analytic rrect
+ shader is already cheaper than `drawRect` at sub-pixel radii.
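The notation `px(x) = x · z` and the A1 rule (both projected axes below ε) translate directly into code. The sketch below also illustrates the "pluggable thresholds" principle with a small config struct. `LodThresholds`, its field names, and `should_cull_leaf` are hypothetical and chosen for illustration; the 0.5 px defaults are the example values from the Notation section, and the `zoom < 1.0` gate follows the item description in `optimization.md`.

```rust
/// Hypothetical runtime thresholds, in device pixels.
#[derive(Debug, Clone, Copy)]
pub struct LodThresholds {
    /// Epsilon for leaf-bounds culling (A1).
    pub cull_px: f32,
    /// Epsilon for corner-radius collapse (B1).
    pub corner_px: f32,
}

impl Default for LodThresholds {
    fn default() -> Self {
        // Example values only; real thresholds need per-backend tuning.
        Self { cull_px: 0.5, corner_px: 0.5 }
    }
}

/// px(x) = x * z: project a world-space length to device pixels.
pub fn px(world_len: f32, zoom: f32) -> f32 {
    world_len * zoom
}

/// A1: cull a leaf only when BOTH projected axes fall below epsilon,
/// so thin shapes that are long in one axis survive.
pub fn should_cull_leaf(w: f32, h: f32, zoom: f32, t: &LodThresholds) -> bool {
    zoom < 1.0 && px(w, zoom) < t.cull_px && px(h, zoom) < t.cull_px
}
```

At zoom 0.02, a 10×10 world-unit leaf projects to 0.2 px on each axis and is culled, while a 10×1000 leaf projects to 0.2×20 px and is kept. Passing the thresholds by reference keeps the rule itself free of hard-coded constants.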
diff --git a/docs/wg/feat-2d/optimization.md b/docs/wg/feat-2d/optimization.md
index 2a394d0f2c..72caac7f85 100644
--- a/docs/wg/feat-2d/optimization.md
+++ b/docs/wg/feat-2d/optimization.md
@@ -842,7 +842,7 @@ missing the cheapest possible camera-change path.
when the camera is actively changing, otherwise every interaction
frame nukes the zoom cache and forces a full O(N) draw.
- **Measured impact (yrr-main.grida, 136K nodes, 100 frames):**
+ **Measured impact (136K-node scene, 100 frames):**
| Scenario | Before µs (fps) | After µs (fps) | Speedup |
| -------------------- | --------------- | -------------- | --------- |
@@ -1327,6 +1327,75 @@ expensive full redraws.
bounded by the largest bucket ratio (~±12.5%) and is imperceptible
on in-flight gestures for static content.
+## LOD (Level-of-Detail) at Low Zoom
+
+The following items describe zoom-aware LOD strategies for reducing
+per-frame work when the camera is zoomed out. They are **designed and
+measured** but not yet shipped — see `docs/wg/feat-2d/lod-properties.md`
+for the full property catalog across all node types, and the Skia cost
+probes in `examples/skia_bench/` for per-primitive validation data.
+
+### Key validation findings
+
+- **RRect → Rect collapse (B1):** On Apple M2 Metal, Skia's analytic
+ rrect shader is **faster** than `drawRect` at sub-pixel radii
+ (0.72–0.84× rect cost). The replacement would regress performance.
+ Needs per-backend re-measurement before implementing.
+ Probe: `examples/skia_bench/skia_bench_rrect_vs_rect.rs`.
+
+- **Text paragraph.paint() cost:** 2.4–6× more expensive than a single
+ `drawRect` across all font sizes (0.8 µs/node at 0.25–12 px,
+ 2.1 µs/node at 48 px). Greeking or culling text is a clear win.
+ Probe: `examples/skia_bench/skia_bench_text_lod.rs`.
+
+- **Principle: "skip work" LOD rules are safe; "replace with cheaper
+ primitive" rules need per-backend validation.** Modern analytic-AA
+ GPU shaders may already handle sub-pixel inputs efficiently.
+
+52. **Subpixel LOD Culling** (A1)
+
+ Drop leaf nodes from the frame plan when both projected dimensions
+ fall below a threshold (e.g. 0.5 px). At fit-zoom (zoom 0.02) on
+ the 136K-node fixture, ~38% of visible leaves project to less than
+ 0.5 px on both axes. Culling them reduces `draw_us` by 6–18% and GPU
+ `mid_flush_us` by up to 24%.
+
+ Decision: `w·z < ε && h·z < ε` — both axes must be subpixel.
+ Thin shapes (large in one axis) survive. Gated by `zoom < 1.0`.
+
+ Mirrors Chromium's `MinimumContentsScale`
+ (`cc/layers/picture_layer_impl.cc`).
+
+ Design: filter `indices` in `Renderer::frame()` after R-tree
+ query, using per-layer bounds stored in a parallel
+ `Vec