diff --git a/.agents/skills/cg-perf/SKILL.md b/.agents/skills/cg-perf/SKILL.md index 711d8de4f1..bfca27c447 100644 --- a/.agents/skills/cg-perf/SKILL.md +++ b/.agents/skills/cg-perf/SKILL.md @@ -130,8 +130,11 @@ reports `min/p50/p95/p99/MAX` plus per-stage breakdown and settle cost. | `zigzag` | fast (continuous) / slow (with pauses) × fit/zoomed | Diagonal reading pattern with direction changes | | `zoom` | slow/fast × around-fit/high | Zoom oscillation at different levels | | `pan_with_settle` | slow/fast × fit/zoomed | Pan with settle frames interleaved every 12 frames | +| `zoom_with_settle`| slow/fast × fit/high | Zoom with settle frames interleaved every 12 frames — captures cache-cold spike after settle nukes zoom cache | +| `zoom_forced_stable` | slow/fast × fit/high (BUG prefix) | Forces `stable=true` on every zoom frame — reproduces the `redraw()` bug for A/B comparison | | `realtime` | fast/slow × fit/zoomed | **Real-time event loop simulation** with sleep, 240Hz tick thread, and settle countdown matching the native viewer | | `frameloop` | 16/50/80/120/200/300/500ms interval | **Real FrameLoop path** — the only bench that captures stable-frame jank during panning (see below) | +| `frameloop_zoom` | 16/50/80/120/200/500ms interval | **Real FrameLoop path for zoom** — captures stable-frame intrusion during zoom gestures | | `resize` | alternating viewport sizes | `--resize` flag. Measures `resize()` + `redraw()` cost per cycle (layout rebuild + cache invalidation + repaint) | **SurfaceUI overlay measurement (`--overlay`):** @@ -169,15 +172,15 @@ and simulate the native viewer's 240Hz tick thread + settle countdown. These produce frame timings that match what users actually see, including settle-induced frame drops at their natural frequency. -The `frameloop` scenarios go through the actual `FrameLoop.poll()` / -`complete()` path — the same code path as `Application::frame()`. 
All -other pan/zoom scenarios bypass `FrameLoop` and call `queue_unstable()` -directly, which means they never produce stable frames mid-interaction. -The `frameloop` scenarios sweep scroll intervals from 16ms (fast flick) -to 500ms (discrete clicks) and reveal how `FrameLoop`'s stable-frame -decisions affect the frame time distribution at each speed. Use these -when investigating panning jank, adaptive timing, or pan/zoom image -cache behavior. +The `frameloop` and `frameloop_zoom` scenarios go through the actual +`FrameLoop.poll()` / `complete()` path — the same code path as +`Application::frame()`. All other pan/zoom scenarios bypass `FrameLoop` +and call `queue_unstable()` directly, which means they never produce +stable frames mid-interaction. The `frameloop` scenarios sweep event +intervals from 16ms (fast flick) to 500ms (discrete clicks) and reveal +how `FrameLoop`'s stable-frame decisions affect the frame time +distribution at each speed. Use these when investigating panning or +zooming jank, adaptive timing, or pan/zoom image cache behavior. **Choosing scenes:** Use `--list-scenes` to see what's available. Pick scenes that stress the subsystem you're optimizing. For effects/caching @@ -527,10 +530,11 @@ catch settle-induced spikes. All pan/zoom/circle/zigzag scenarios call `queue_unstable()` directly — they never go through `FrameLoop.poll()`. This means they never produce stable frames mid-interaction and cannot capture the jank -pattern where a stable frame interrupts slow panning. Only the -`frameloop` scenarios use the real `FrameLoop` decision path. When -investigating panning smoothness or adaptive timing, always use the -`frameloop` scenarios. +pattern where a stable frame interrupts slow panning or zooming. Only +the `frameloop` (pan) and `frameloop_zoom` scenarios use the real +`FrameLoop` decision path. When investigating panning or zooming +smoothness or adaptive timing, always use the `frameloop` / +`frameloop_zoom` scenarios. 
### Stable frames must recapture caches diff --git a/crates/grida-canvas/src/runtime/scene.rs b/crates/grida-canvas/src/runtime/scene.rs index 094364ec66..d92990ca67 100644 --- a/crates/grida-canvas/src/runtime/scene.rs +++ b/crates/grida-canvas/src/runtime/scene.rs @@ -273,13 +273,11 @@ const COMPOSITOR_SURFACE_SIZE: i32 = 4096; // No threshold-based invalidation — the settle frame handles full-quality // rendering after the gesture ends. -/// Maximum zoom ratio (cached_zoom / current_zoom or inverse) before the -/// zoom image cache is invalidated. A ratio of 4.0 means we tolerate up to -/// 4× scale in either direction before the scaled texture becomes too blurry -/// or aliased. This is aggressive but acceptable during active interaction -/// (unstable frames) — the stable frame after interaction ends always -/// produces a full-quality render at the correct zoom level. -const ZOOM_IMAGE_CACHE_MAX_RATIO: f32 = 4.0; +// NOTE: The zoom image cache no longer has a hard eviction ratio. During +// active interaction, the cached texture is stretched at any zoom ratio — +// blurry content is acceptable and avoids catastrophic full-draw spikes. +// The settle frame always produces a full-quality render at the correct +// zoom level. See optimization.md item 21/22. /// Cached GPU snapshot of the composited frame for pan-only fast path. /// @@ -309,8 +307,6 @@ struct PanImageCache { struct ZoomImageCache { /// GPU texture snapshot of the composited frame. image: Image, - /// Zoom level at capture time. - zoom: f32, /// Full view matrix at capture time (includes translation + zoom). 
view_matrix: math2::transform::AffineTransform, } @@ -1100,11 +1096,18 @@ impl Renderer { } // --- Try zoom image cache fast path (no plan needed) --- - if !stable - && self.backend.is_gpu() - && self.zoom_image_cache.is_some() - && camera_change.zoom_changed() - { + // + // Use the zoom cache for: + // - Zoom-change frames (the primary use case during active zooming) + // - No-change frames when a zoom cache exists (zoom steps may + // quantize to identical values at gesture bounds, producing a + // no-change frame — blitting the existing cache is correct and + // avoids a catastrophic full draw on large scenes) + // + // Exclude pan-only frames — pan has its own faster blit cache. + let use_zoom_cache = self.zoom_image_cache.is_some() + && (camera_change.zoom_changed() || camera_change == CameraChangeKind::None); + if !stable && self.backend.is_gpu() && use_zoom_cache { let zoom_cache_hit = self.try_zoom_cache_blit( surface, scene, @@ -1243,20 +1246,14 @@ impl Renderer { // // Triggers on: // - Zoom-change frames (the primary use case during active zooming) - // - No-change frames during zoom gestures (zoom steps may quantize - // to identical values at bounds, but we still have a valid cache) + // - No-change frames when a zoom cache exists (zoom steps may + // quantize to identical values at gesture bounds, but we still + // have a valid cache — blitting it avoids catastrophic full draws) // - // The image is rendered at a stale zoom level, so text and fine - // details are slightly blurry — acceptable during active interaction. - // The stable frame after interaction ends always does a full redraw. - // Don't use zoom cache for pan-only or no-change frames — pan has its - // own faster cache, and no-change frames (e.g. scene mutations without - // camera movement) must not replay a stale zoom snapshot. 
- if !plan.stable - && self.backend.is_gpu() - && self.zoom_image_cache.is_some() - && plan.camera_change.zoom_changed() - { + // Excludes pan-only frames — those use the dedicated pan cache. + let zoom_cache_usable = self.zoom_image_cache.is_some() + && (plan.camera_change.zoom_changed() || plan.camera_change == CameraChangeKind::None); + if !plan.stable && self.backend.is_gpu() && zoom_cache_usable { let zoom_cache_hit = self.try_zoom_cache_blit(surface, scene, &plan); if let Some((mid_flush_duration, frame_duration)) = zoom_cache_hit { return FrameFlushStats { @@ -1351,7 +1348,6 @@ impl Renderer { // re-drawing. self.zoom_image_cache = Some(ZoomImageCache { image, - zoom: self.camera.get_zoom(), view_matrix: vm, }); } @@ -1398,16 +1394,17 @@ impl Renderer { _plan: &FramePlan, ) -> Option<(Duration, Duration)> { let cache = self.zoom_image_cache.as_ref()?; - let current_zoom = self.camera.get_zoom(); - let zoom_ratio = current_zoom / cache.zoom; - // Only use cache if zoom ratio is within acceptable range. - if !((1.0 / ZOOM_IMAGE_CACHE_MAX_RATIO)..=ZOOM_IMAGE_CACHE_MAX_RATIO).contains(&zoom_ratio) - { - // Too extreme — invalidate and fall through. - self.zoom_image_cache = None; - return None; - } + // Never evict the zoom cache during active interaction — even at + // extreme ratios the scaled blit is O(1) and avoids catastrophic + // frame spikes (50-60 ms full draws on large scenes). The settle + // frame always produces a full-quality render at the correct zoom. + // + // At ratios beyond ZOOM_IMAGE_CACHE_SOFT_RATIO the stretched + // texture is visibly blurry, but this is acceptable during fast + // interaction. Chromium's compositor uses the same strategy: + // stale tiles are stretched during pinch-zoom and re-rasterized + // asynchronously after the gesture ends. 
let inv_cached = cache.view_matrix.inverse()?; let cur_vm = self.camera.view_matrix(); @@ -1614,12 +1611,14 @@ impl Renderer { let can_defer = !stable && self.backend.is_gpu() && ( - // No content or camera change — overlay-only (marquee, hover) - (!camera_change.any_changed() && self.pan_image_cache.is_some()) - // Pan cache will likely hit - || (camera_change == CameraChangeKind::PanOnly && self.pan_image_cache.is_some()) - // Zoom cache will likely hit + // Pan cache will likely hit (pan-only or overlay-only with pan cache) + (camera_change == CameraChangeKind::PanOnly && self.pan_image_cache.is_some()) + // Zoom cache will likely hit (zoom change or no-change with zoom cache) || (camera_change.zoom_changed() && self.zoom_image_cache.is_some()) + // No-change: prefer zoom cache (covers mid-zoom zero-delta frames), + // fall back to pan cache (covers overlay-only marquee/hover frames) + || (camera_change == CameraChangeKind::None + && (self.zoom_image_cache.is_some() || self.pan_image_cache.is_some())) ); if can_defer { diff --git a/crates/grida-canvas/src/window/application.rs b/crates/grida-canvas/src/window/application.rs index 8175e2ddc6..ea2f8ad59b 100644 --- a/crates/grida-canvas/src/window/application.rs +++ b/crates/grida-canvas/src/window/application.rs @@ -1450,9 +1450,15 @@ impl UnknownTargetApplication { // flushing so the picture cache is invalidated and the new content // is re-recorded. In the frame() path this happens automatically; // in the redraw() path (native host) we must do it explicitly. + // + // Use unstable (stable=false) when a camera change is active — this + // preserves the zoom/pan image caches and allows the fast blit paths. + // Without this, every zoom frame would nuke the zoom cache and force + // a full O(N) draw, causing ~3 FPS on large scenes. 
{ let camera_change = self.renderer.camera.change_kind(); - self.renderer.apply_changes(camera_change, true); + let stable = !camera_change.any_changed(); + self.renderer.apply_changes(camera_change, stable); } let __frame_start = std::time::Instant::now(); diff --git a/crates/grida-dev/src/bench/runner.rs b/crates/grida-dev/src/bench/runner.rs index f204f3f3ac..68f331670e 100644 --- a/crates/grida-dev/src/bench/runner.rs +++ b/crates/grida-dev/src/bench/runner.rs @@ -997,6 +997,319 @@ fn run_pan_with_settle_pass( ) } +/// Run a zoom pass that interleaves settle (stable) frames at a fixed interval, +/// simulating what happens in real usage: zoom → pause → settle fires → zoom again. +/// +/// This is the zoom equivalent of `run_pan_with_settle_pass`. It captures: +/// - The expensive stable frame cost (full-quality redraw + cache invalidation) +/// - The cache-cold first frame after settle (zoom cache was nuked) +/// - The overall frame time distribution including settle spikes +/// +/// `settle_interval` = number of zoom frames between each settle frame. +/// settle_interval=12 matches the native viewer's 12-tick countdown at 240Hz (~50ms). 
+fn run_zoom_with_settle_pass( + renderer: &mut cg::runtime::scene::Renderer, + mut overlay: Option, + frames: u32, + step: f32, + z_min: f32, + z_max: f32, + settle_interval: u32, +) -> PassStats { + let start_z = (z_min + z_max) / 2.0; + renderer.camera.set_zoom(start_z); + renderer.queue_stable(); + let _ = renderer.flush(); + + let wall_start = Instant::now(); + let mut frame_times = + Vec::with_capacity(frames as usize + frames as usize / settle_interval as usize); + let mut queue_us_acc = Vec::new(); + let mut draw_us_acc = Vec::new(); + let mut mid_flush_us_acc = Vec::new(); + let mut compositor_us_acc = Vec::new(); + let mut flush_us_acc = Vec::new(); + let mut settle_times = Vec::new(); + + let mut z = start_z; + let mut zdir: i32 = 1; + let mut since_settle = 0u32; + + for i in 0..frames { + let next_z = z + zdir as f32 * step; + if next_z > z_max { + zdir = -1; + z = z_max; + } else if next_z < z_min { + zdir = 1; + z = z_min; + } else { + z = next_z; + } + renderer.camera.set_zoom(z); + + // Interaction frame (unstable) + if let Some((total, q, d, mf, c, f)) = measure_frame(renderer, false, overlay.as_mut()) { + frame_times.push(total); + queue_us_acc.push(q); + draw_us_acc.push(d); + mid_flush_us_acc.push(mf); + compositor_us_acc.push(c); + flush_us_acc.push(f); + } + + since_settle += 1; + + // Insert settle frame at interval (simulates native viewer countdown) + if since_settle >= settle_interval && i < frames - 1 { + since_settle = 0; + if let Some((total, q, d, mf, c, f)) = measure_frame(renderer, true, overlay.as_mut()) { + settle_times.push(total); + // Include in overall stats — this IS a real frame the user sees + frame_times.push(total); + queue_us_acc.push(q); + draw_us_acc.push(d); + mid_flush_us_acc.push(mf); + compositor_us_acc.push(c); + flush_us_acc.push(f); + } + } + } + + let wall = wall_start.elapsed(); + let avg_settle = if settle_times.is_empty() { + 0 + } else { + settle_times.iter().sum::<u64>() / settle_times.len() as u64 + }; + + 
compute_pass_stats( + &frame_times, + &queue_us_acc, + &draw_us_acc, + &mid_flush_us_acc, + &compositor_us_acc, + &flush_us_acc, + wall, + avg_settle, + ) +} + +/// Real-time FrameLoop zoom pass. +/// +/// Reproduces **exactly** what happens in `Application::frame()` during +/// zooming — same FrameLoop, same apply_changes/build_plan/flush_with_plan +/// path, same GPU backend, with real `thread::sleep()` between ticks so +/// the GPU pipeline sees realistic idle gaps. +/// +/// This is the zoom equivalent of `run_frameloop_pan_pass`. It captures +/// the actual user-facing bottleneck: stable frames interrupting zoom +/// interactions. The zoom cache blit fast path, compositor re-rasterization, +/// and GPU flush stalls are all exercised on the real GPU backend. +/// +/// # How it works +/// +/// Runs a 60fps RAF loop (real 16ms sleeps). Zoom events inject camera +/// zoom changes at `event_interval_ms` intervals. `FrameLoop` decides +/// whether each tick produces a frame and at what quality. Stable frames +/// nuke the zoom image cache, forcing the next unstable frame into a full +/// draw — this is the spike users feel as "3 FPS during zoom". 
+fn run_frameloop_zoom_pass( + renderer: &mut cg::runtime::scene::Renderer, + event_interval_ms: f64, + step: f32, + z_min: f32, + z_max: f32, + duration_ms: f64, +) -> PassStats { + let raf_interval_us: u64 = 16_000; // 60fps host cadence + let t_origin = Instant::now(); + + let mut frame_loop = FrameLoop::new(); + + let mut frame_times = Vec::new(); + let mut queue_us_acc = Vec::new(); + let mut draw_us_acc = Vec::new(); + let mut mid_flush_us_acc = Vec::new(); + let mut compositor_us_acc = Vec::new(); + let mut flush_us_acc = Vec::new(); + let mut stable_count = 0u32; + let mut unstable_count = 0u32; + + let mut next_event_ms = 0.0f64; + let mut zoom_events_fired = 0u32; + let mut z = (z_min + z_max) / 2.0; + let mut zdir: i32 = 1; + + loop { + let now_ms = t_origin.elapsed().as_secs_f64() * 1000.0; + if now_ms >= duration_ms { + break; + } + + // --- Inject zoom event if due --- + if now_ms >= next_event_ms { + let next_z = z + zdir as f32 * step; + if next_z > z_max { + zdir = -1; + z = z_max; + } else if next_z < z_min { + zdir = 1; + z = z_min; + } else { + z = next_z; + } + renderer.camera.set_zoom(z); + frame_loop.invalidate(now_ms); + next_event_ms += event_interval_ms; + zoom_events_fired += 1; + } + + // --- Application::frame() equivalent --- + if let Some(quality) = frame_loop.poll(now_ms) { + let camera_change = renderer.camera.change_kind(); + let stable = quality == FrameQuality::Stable || !camera_change.any_changed(); + + // apply_changes (central invalidation dispatch) + renderer.apply_changes(camera_change, stable); + + // warm camera cache + renderer.camera.warm_cache(); + + // build frame plan + let rect = renderer.camera.rect(); + let zoom = renderer.camera.get_zoom(); + let plan = renderer.build_frame_plan(rect, zoom, stable, camera_change); + + // consume camera change + renderer.camera.consume_change(); + + // flush (draw + GPU submit) — MEASURED + let t0 = Instant::now(); + let stats_opt = renderer.flush_with_plan(plan); + let wall_time = 
t0.elapsed().as_micros() as u64; + + // complete frame + frame_loop.complete(quality); + + if quality == FrameQuality::Stable { + stable_count += 1; + } else { + unstable_count += 1; + } + + if let Some(stats) = stats_opt { + frame_times.push(wall_time); + queue_us_acc.push(0); + draw_us_acc.push(stats.draw.painter_duration.as_micros() as u64); + mid_flush_us_acc.push(stats.mid_flush_duration.as_micros() as u64); + compositor_us_acc.push(stats.compositor_duration.as_micros() as u64); + flush_us_acc.push(stats.flush_duration.as_micros() as u64); + } + } + + // --- Real sleep to next RAF tick --- + let elapsed_us = t_origin.elapsed().as_micros() as u64; + let next_tick_us = (elapsed_us / raf_interval_us + 1) * raf_interval_us; + let sleep_us = next_tick_us.saturating_sub(t_origin.elapsed().as_micros() as u64); + if sleep_us > 500 { + std::thread::sleep(std::time::Duration::from_micros(sleep_us)); + } + } + + let wall = t_origin.elapsed(); + + eprintln!( + " [frameloop-zoom] events every {event_interval_ms:.0}ms | \ + {zoom_events_fired} events | \ + {} frames ({unstable_count} unstable, {stable_count} stable) | \ + wall: {:.0}ms", + frame_times.len(), + wall.as_millis(), + ); + + compute_pass_stats( + &frame_times, + &queue_us_acc, + &draw_us_acc, + &mid_flush_us_acc, + &compositor_us_acc, + &flush_us_acc, + wall, + 0, // settle is implicit — stable frames are in frame_times + ) +} + +/// Run a zoom pass where EVERY frame is forced stable (stable=true). +/// +/// This reproduces the bug in `Application::redraw()` which passes +/// `apply_changes(camera_change, true)` on every frame — even during +/// active zoom interaction. 
The result: +/// - Zoom image cache blit fast path is never used (gated on `!plan.stable`) +/// - Zoom cache is nuked every frame (`invalidate_zoom` fires when stable + camera changed) +/// - Every frame does a full draw (R-tree query + sort + paint all nodes) +/// +/// Compare against `run_zoom_pass_at` (which uses stable=false) to see the +/// performance impact of the bug. +fn run_zoom_pass_forced_stable( + renderer: &mut cg::runtime::scene::Renderer, + frames: u32, + step: f32, + z_min: f32, + z_max: f32, + mut overlay: Option, +) -> PassStats { + let start_z = (z_min + z_max) / 2.0; + renderer.camera.set_zoom(start_z); + renderer.queue_stable(); + let _ = renderer.flush(); + + let wall_start = Instant::now(); + let mut frame_times = Vec::with_capacity(frames as usize); + let mut queue_us_acc = Vec::with_capacity(frames as usize); + let mut draw_us_acc = Vec::with_capacity(frames as usize); + let mut mid_flush_us_acc = Vec::with_capacity(frames as usize); + let mut compositor_us_acc = Vec::with_capacity(frames as usize); + let mut flush_us_acc = Vec::with_capacity(frames as usize); + let mut z = start_z; + let mut zdir: i32 = 1; + + for _ in 0..frames { + let next_z = z + zdir as f32 * step; + if next_z > z_max { + zdir = -1; + z = z_max; + } else if next_z < z_min { + zdir = 1; + z = z_min; + } else { + z = next_z; + } + renderer.camera.set_zoom(z); + // BUG REPRODUCTION: always pass stable=true, same as redraw() does + if let Some((total, q, d, mf, c, f)) = measure_frame(renderer, true, overlay.as_mut()) { + frame_times.push(total); + queue_us_acc.push(q); + draw_us_acc.push(d); + mid_flush_us_acc.push(mf); + compositor_us_acc.push(c); + flush_us_acc.push(f); + } + } + let wall = wall_start.elapsed(); + + compute_pass_stats( + &frame_times, + &queue_us_acc, + &draw_us_acc, + &mid_flush_us_acc, + &compositor_us_acc, + &flush_us_acc, + wall, + 0, + ) +} + /// Diagnostic: pan with settle frames interleaved, printing per-frame timing. 
/// Shows the settle cost and whether the cache recapture works (the frame /// AFTER settle should be fast if the cache was recaptured). @@ -1427,6 +1740,147 @@ fn run_scenarios( }); } + // Forced-stable zoom scenarios: reproduce the redraw() bug where every frame + // passes stable=true, defeating the zoom cache blit fast path. + // Compare these against the regular zoom_* scenarios to see the impact. + struct ForcedStableZoomScenario { + name: &'static str, + step: f32, + z_min: f32, + z_max: f32, + } + + let fs_lo = (fit_zoom * 0.5).max(0.01); + let fs_hi = fit_zoom * 2.0; + let fs_zoomed_in = (fit_zoom * 4.0).min(10.0); + + let forced_stable_scenarios = vec![ + ForcedStableZoomScenario { + name: "BUG_zoom_stable_slow_fit", + step: 0.005, + z_min: fs_lo, + z_max: fs_hi, + }, + ForcedStableZoomScenario { + name: "BUG_zoom_stable_fast_fit", + step: 0.05, + z_min: fs_lo, + z_max: fs_hi, + }, + ForcedStableZoomScenario { + name: "BUG_zoom_stable_slow_high", + step: 0.01, + z_min: fs_zoomed_in * 0.5, + z_max: fs_zoomed_in, + }, + ForcedStableZoomScenario { + name: "BUG_zoom_stable_fast_high", + step: 0.1, + z_min: fs_zoomed_in * 0.5, + z_max: fs_zoomed_in, + }, + ]; + + for fss in &forced_stable_scenarios { + renderer.camera.set_zoom((fss.z_min + fss.z_max) / 2.0); + renderer.queue_stable(); + let _ = renderer.flush(); + + let stats = + run_zoom_pass_forced_stable(renderer, frames, fss.step, fss.z_min, fss.z_max, ov()); + results.push(ScenarioResult { + name: fss.name.to_string(), + kind: "zoom_forced_stable".to_string(), + params: ScenarioParams { + speed: Some(fss.step), + zoom: None, + zoom_min: Some(fss.z_min), + zoom_max: Some(fss.z_max), + }, + stats, + }); + } + + // Settle-interleaved zoom scenarios: simulate native viewer's settle countdown + // during zoom interactions. This is the zoom equivalent of the pan settle + // scenarios — the key missing piece that captures the real UX bottleneck. 
+ // + // settle_interval=12 matches the native viewer's 12-tick countdown at 240Hz (~50ms). + let szs_zoomed_in = (fit_zoom * 4.0).min(10.0); + let szs_lo = (fit_zoom * 0.5).max(0.01); + let szs_hi = fit_zoom * 2.0; + + struct SettleZoomScenario { + name: &'static str, + step: f32, + z_min: f32, + z_max: f32, + settle_interval: u32, + } + + let settle_zoom_scenarios = vec![ + SettleZoomScenario { + name: "zoom_settle_slow_fit", + step: 0.005, + z_min: szs_lo, + z_max: szs_hi, + settle_interval: 12, + }, + SettleZoomScenario { + name: "zoom_settle_fast_fit", + step: 0.05, + z_min: szs_lo, + z_max: szs_hi, + settle_interval: 12, + }, + SettleZoomScenario { + name: "zoom_settle_slow_high", + step: 0.01, + z_min: szs_zoomed_in * 0.5, + z_max: szs_zoomed_in, + settle_interval: 12, + }, + SettleZoomScenario { + name: "zoom_settle_fast_high", + step: 0.1, + z_min: szs_zoomed_in * 0.5, + z_max: szs_zoomed_in, + settle_interval: 12, + }, + ]; + + for szs in &settle_zoom_scenarios { + renderer.camera.set_zoom((szs.z_min + szs.z_max) / 2.0); + renderer.queue_stable(); + let _ = renderer.flush(); + for _ in 0..5 { + renderer.camera.translate(1.0, 0.0); + renderer.queue_unstable(); + let _ = renderer.flush(); + } + + let stats = run_zoom_with_settle_pass( + renderer, + ov(), + frames, + szs.step, + szs.z_min, + szs.z_max, + szs.settle_interval, + ); + results.push(ScenarioResult { + name: szs.name.to_string(), + kind: "zoom_with_settle".to_string(), + params: ScenarioParams { + speed: Some(szs.step), + zoom: None, + zoom_min: Some(szs.z_min), + zoom_max: Some(szs.z_max), + }, + stats, + }); + } + // Realtime event loop simulation scenarios. // These use real sleep() and simulate the native viewer's 240Hz tick // thread + settle countdown, producing timings that match actual UX. @@ -1596,6 +2050,102 @@ fn run_scenarios( }); } + // FrameLoop-based zoom scenarios: the real FrameLoop decision path for zoom. 
+ // Unlike the plain zoom scenarios, these go through FrameLoop.poll() which + // decides Stable vs Unstable based on adaptive delay. This captures: + // - Zoom image cache hit rate (GPU-only: unstable zoom = cache blit) + // - Stable frame intrusion frequency (the settle frame that nukes zoom cache) + // - Cache-cold first frame cost after settle (the "3 FPS" spike) + // - Compositor re-rasterization budget impact + let flz_lo = (fit_zoom * 0.5).max(0.01); + let flz_hi = fit_zoom * 2.0; + + struct FrameLoopZoomScenario { + name: &'static str, + event_interval_ms: f64, + step: f32, + z_min: f32, + z_max: f32, + } + + let frameloop_zoom_scenarios = vec![ + // Continuous fast pinch — baseline, no stable frames should fire + FrameLoopZoomScenario { + name: "flz_16ms", + event_interval_ms: 16.0, + step: 0.01, + z_min: flz_lo, + z_max: flz_hi, + }, + // Moderate pinch — gaps start approaching old 50ms debounce + FrameLoopZoomScenario { + name: "flz_50ms", + event_interval_ms: 50.0, + step: 0.02, + z_min: flz_lo, + z_max: flz_hi, + }, + // Slow pinch — exceeds old 50ms debounce, adaptive should extend + FrameLoopZoomScenario { + name: "flz_80ms", + event_interval_ms: 80.0, + step: 0.02, + z_min: flz_lo, + z_max: flz_hi, + }, + // Slower — common slow trackpad pinch speed + FrameLoopZoomScenario { + name: "flz_120ms", + event_interval_ms: 120.0, + step: 0.03, + z_min: flz_lo, + z_max: flz_hi, + }, + // Very slow — deliberate, careful zooming + FrameLoopZoomScenario { + name: "flz_200ms", + event_interval_ms: 200.0, + step: 0.05, + z_min: flz_lo, + z_max: flz_hi, + }, + // Discrete scroll-wheel clicks — clearly separate events, stable fires between + FrameLoopZoomScenario { + name: "flz_500ms", + event_interval_ms: 500.0, + step: 0.1, + z_min: flz_lo, + z_max: flz_hi, + }, + ]; + + for flz in &frameloop_zoom_scenarios { + renderer.camera.set_zoom((flz.z_min + flz.z_max) / 2.0); + renderer.queue_stable(); + let _ = renderer.flush(); + warmup(renderer); + + let stats = 
run_frameloop_zoom_pass( + renderer, + flz.event_interval_ms, + flz.step, + flz.z_min, + flz.z_max, + 2000.0, // 2 second session + ); + results.push(ScenarioResult { + name: flz.name.to_string(), + kind: "frameloop_zoom".to_string(), + params: ScenarioParams { + speed: Some(flz.step), + zoom: None, + zoom_min: Some(flz.z_min), + zoom_max: Some(flz.z_max), + }, + stats, + }); + } + results } diff --git a/docs/wg/feat-2d/optimization.md b/docs/wg/feat-2d/optimization.md index 9d50c96edb..e70bb3d5b5 100644 --- a/docs/wg/feat-2d/optimization.md +++ b/docs/wg/feat-2d/optimization.md @@ -837,6 +837,10 @@ missing the cheapest possible camera-change path. - Preserved across: zoom changes, no-change frames, pan+zoom - First zoom frame after invalidation: full draw + capture (cache miss) - All subsequent zoom frames within ratio: cache hit (single blit) + - **Critical:** the `apply_changes()` `stable` parameter controls zoom + cache invalidation. The `redraw()` path must pass `stable=false` + when the camera is actively changing, otherwise every interaction + frame nukes the zoom cache and forces a full O(N) draw. **Measured impact (yrr-main.grida, 136K nodes, 100 frames):** @@ -860,6 +864,24 @@ missing the cheapest possible camera-change path. Border strip rasterization remains a future refinement for the settle phase (see items 25–27 on progressive refinement). + **Update (item 21 hardening):** Two improvements to the zoom image cache: + 1. **Removed hard ratio eviction** — the cache is never evicted during + active interaction, regardless of zoom ratio. Previously, exceeding + 4× ratio caused full-draw spikes (50-60 ms on 135K-node scenes). + Now the stretched texture is blitted at any ratio; the settle frame + handles quality. + 2. **No-change frame coverage** — zero-delta zoom frames (which produce + `CameraChangeKind::None` because `set_zoom(z)` doesn't change the + matrix) now use the zoom image cache instead of falling through to + a full draw. 
This eliminates spikes at gesture bounds where the zoom + value quantizes to min/max. + + Measured on 01-135k.perf.grida (135K nodes): + | Scenario | Before p95 | After p95 | Before MAX | After MAX | + |---|---|---|---|---| + | zoom_slow_around_fit | 54,062 µs | 6 µs | 60,282 µs | 119 µs | + | zoom_slow_high | 6 µs | 5 µs | 3,848 µs | 44 µs | + 23. **Settle & Refine (shared)** After the gesture ends (~50 ms idle), the frame loop fires a stable @@ -1147,9 +1169,11 @@ expensive full redraws. current camera position (not just any position), or (b) use a `last_had_data_changes` flag that is reliably set in BOTH the `frame()` and legacy `redraw()` code paths. - - The legacy `redraw()` path does not call `apply_changes()`, so any - flag set there is stale. Migrating all hosts to `frame()` would - eliminate this dual-path problem. + - The legacy `redraw()` path now calls `apply_changes()` but + historically passed `stable=true` unconditionally, defeating the + zoom cache blit fast path during interaction. This was fixed by + deriving `stable` from `!camera_change.any_changed()`. Migrating + all hosts to `frame()` would still be preferable long-term. - The `queue()` stable promotion (non-camera events → stable quality) interacts badly with clamped zoom at min/max zoom limits — the zoom doesn't actually change, so `camera_change == None`, causing