feat(vdd): blend driver-exported hardware cursor onto VDD frames #660
qiin2333 wants to merge 2 commits
VDD with HardwareCursor=true routes pointer shape/position to an IddCx overlay channel, so the cursor is absent from the shared swap-chain texture that display_vdd_vram_t copies into the encoder pipeline. Windows clients still see a cursor via the local OS, but Moonlight Android (and any other client that does not draw its own cursor) sees none at all.
To fix this without forcing the driver back to software cursor, the companion change in ZakoVDD publishes the IddCx cursor state into Global\ZakoVDD_CursorMeta_<i> / Global\ZakoVDD_CursorReady_<i>. This change consumes that SHM and feeds Sunshine's existing DXGI cursor-blend pipeline (cursor_vs / cursor_ps / cursor_alpha / cursor_xor) so the same shader path used by Desktop Duplication composites the cursor on top of VDD frames.
To avoid duplicating that pipeline, the shared cursor blend state, shaders, gpu_cursor_t pair, and the draw helper are lifted from display_ddup_vram_t into the display_vram_t base class as protected members (init_cursor_pipeline / blend_cursor). display_ddup_vram_t now calls those helpers; display_amd_vram_t keeps its own scheme (cursor_info constant buffer + GetCursorInfo) untouched.
vdd_capture_t now optionally attaches the cursor SHM in init() and exposes a non-blocking poll_cursor() that returns a snapshot with shape and position diffs. display_vdd_vram_t::snapshot consumes one snapshot per captured frame and re-uploads the cursor texture only when ShapeId changes.
Cursor SHM is best-effort: a missing mapping (older driver builds) silently falls back to no overlay, preserving previous behaviour.
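The "shape and position diffs" mentioned above can be pictured as a comparison of the driver's counters against the last values the consumer saw. The following is a minimal sketch under assumptions: the type names (cursor_ids_t, cursor_diff_t, cursor_diff_tracker_t) are hypothetical and not taken from the PR, and the 32-bit counter width is a guess.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; the PR's real snapshot struct is not shown here.
struct cursor_ids_t {
  std::uint32_t shape_id;     // assumed: bumped by the driver when the shape changes
  std::uint32_t position_id;  // assumed: bumped by the driver when the position changes
};

struct cursor_diff_t {
  bool shape_updated;
  bool position_updated;
};

class cursor_diff_tracker_t {
public:
  // Compare the driver's counters with the last ones consumed.
  // The very first call reports both as updated so the texture gets uploaded once.
  cursor_diff_t update(const cursor_ids_t &now) {
    cursor_diff_t diff {
      !seen_ || now.shape_id != last_.shape_id,
      !seen_ || now.position_id != last_.position_id,
    };
    last_ = now;
    seen_ = true;
    return diff;
  }

private:
  cursor_ids_t last_ {};
  bool seen_ = false;
};
```

With a tracker like this, the capture loop would re-upload the cursor texture only when shape_updated is set, while the position could be applied on every frame.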
Summary by CodeRabbit
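Walkthrough
Adds cursor compositing to the Windows VDD direct-capture backend: a cursor snapshot structure backed by the driver's shared memory and poll_cursor(), which reads snapshots without tearing; new init_cursor_pipeline() and blend_cursor() helpers in display_vram_t; and polling plus compositing of the cursor onto the frame in the VDD snapshot.
Changes: cursor compositing pipeline feature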
Sequence diagram
sequenceDiagram
participant Client as Screen capture client
participant VDDSnap as display_vdd_vram_t::snapshot()
participant SharedMem as CursorSharedMetadata (driver shared memory)
participant Poll as vdd_capture_t::poll_cursor()
participant Texture as make_cursor_alpha/xor_image() / set_cursor_texture()
participant Render as display_vram_t::blend_cursor()
participant Frame as Captured frame RTV
Client->>VDDSnap: Request screen capture
VDDSnap->>Poll: Call vdd_capture_t::poll_cursor()
Poll->>SharedMem: Read and validate metadata
Poll-->>VDDSnap: Return cursor_snapshot
VDDSnap->>Texture: Build/upload cursor texture (if shape_updated)
alt cursor_snapshot.visible && texture exists
VDDSnap->>Render: Call blend_cursor(RTV*)
Render->>Frame: Draw alpha/xor cursor onto the frame
end
VDDSnap-->>Client: Return composited frame
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/platform/windows/display_vdd.cpp (1)
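53-80: ⚡ Quick win: use static_assert to lock down the shared-memory ABI.
This code relies entirely on comments to guarantee binary compatibility with the driver-side CursorSharedMetadata. If the field order, types, or compiler options drift later, the SHM will be silently misread. Consider adding compile-time assertions for at least sizeof(CursorSharedMetadata) and a few key offsetof(...) values.
🤖 Prompt for AI Agents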
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/platform/windows/display_vdd.cpp` around lines 53 - 80, Add compile-time assertions to lock the shared-memory ABI for CursorSharedMetadata: use static_assert to check sizeof(CursorSharedMetadata) matches the expected size, assert VDD_CURSOR_MAGIC and VDD_CURSOR_VERSION as needed, and add static_asserts for offsetof(CursorSharedMetadata, Magic), offsetof(..., Version), offsetof(..., PositionX), offsetof(..., ShapeBufferSize) (pick a few key fields) to ensure field layout and alignment don't drift; place these next to the struct definition so build fails if the size or offsets change.
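To illustrate the nitpick above, here is a minimal sketch of pinning a shared-memory layout at compile time. The struct below is a hypothetical stand-in: the real CursorSharedMetadata field order, types, and size are not shown in this review, so every offset and the total size are assumptions chosen only to make the example self-consistent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// HYPOTHETICAL layout for illustration only; not the driver's actual struct.
struct CursorSharedMetadataSketch {
  std::uint32_t Magic;
  std::uint32_t Version;
  std::uint32_t ShapeId;
  std::uint32_t PositionId;
  std::int32_t PositionX;
  std::int32_t PositionY;
  std::uint32_t Visible;
  std::uint32_t ShapeType;
  std::uint32_t Width;
  std::uint32_t Height;
  std::uint32_t Pitch;
  std::uint32_t ShapeBufferSize;
};

// offsetof is only well defined for standard-layout types, so check that first.
static_assert(std::is_standard_layout_v<CursorSharedMetadataSketch>,
              "shared-memory header must be standard layout");
static_assert(std::is_trivially_copyable_v<CursorSharedMetadataSketch>,
              "shared-memory header must be trivially copyable");

// Lock the total size and a few key offsets; the build fails if they drift.
static_assert(sizeof(CursorSharedMetadataSketch) == 48, "header size changed");
static_assert(offsetof(CursorSharedMetadataSketch, Magic) == 0, "Magic moved");
static_assert(offsetof(CursorSharedMetadataSketch, Version) == 4, "Version moved");
static_assert(offsetof(CursorSharedMetadataSketch, PositionX) == 16, "PositionX moved");
static_assert(offsetof(CursorSharedMetadataSketch, ShapeBufferSize) == 44,
              "ShapeBufferSize moved");
```

In the real code the expected numbers would come from the driver-side definition, and the assertions would sit next to the struct so both sides fail loudly when either one changes.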
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/platform/windows/display_vdd.cpp`:
- Around line 354-389: The code only checks ShapeBufferSize but not that
Width/Height/Pitch/ShapeType are consistent with that size, which can lead to
out-of-bounds reads in make_cursor_alpha_image()/make_cursor_xor_image(); before
copying payload compute a safe expected_bytes based on meta->Height and
meta->Pitch (use safe integer arithmetic to detect overflow) and also branch on
meta->ShapeType if different formats have different per-row sizes; if
expected_bytes == 0 or expected_bytes > shape_buffer_size or expected_bytes >
VDD_CURSOR_MAX_BYTES then treat as torn/invalid (do not copy, clear
out.shape_buffer and set out.shape_updated = false), otherwise copy only
expected_bytes from payload; keep existing updates to
m_lastSeenCursorShapeId/m_lastSeenCursorPositionId only when out.shape_updated
remains true.
In `@src/platform/windows/display_vram.cpp`:
- Around line 3526-3529: The calls to set_cursor_texture (after
make_cursor_alpha_image/make_cursor_xor_image) ignore its return value, so
failures can leave old texture/viewport state while input_res is cleared and
later code (set_pos, blend_cursor) will draw using stale or empty SRVs; update
the call site to check the boolean result of set_cursor_texture for both
cursor_alpha and cursor_xor, and on any failure either immediately abort further
cursor work for this frame (skip set_pos/blend_cursor) or explicitly clear
cursor_alpha/cursor_xor textures/SRVs and ensure input_res remains cleared
before returning, mirroring the Desktop Duplication path behavior.
---
Nitpick comments:
In `@src/platform/windows/display_vdd.cpp`:
- Around line 53-80: Add compile-time assertions to lock the shared-memory ABI
for CursorSharedMetadata: use static_assert to check
sizeof(CursorSharedMetadata) matches the expected size, assert VDD_CURSOR_MAGIC
and VDD_CURSOR_VERSION as needed, and add static_asserts for
offsetof(CursorSharedMetadata, Magic), offsetof(..., Version), offsetof(...,
PositionX), offsetof(..., ShapeBufferSize) (pick a few key fields) to ensure
field layout and alignment don't drift; place these next to the struct
definition so build fails if the size or offsets change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a4a71ec8-1a8a-4f8e-b022-d423953327bc
📒 Files selected for processing (3)
- src/platform/windows/display.h
- src/platform/windows/display_vdd.cpp
- src/platform/windows/display_vram.cpp
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Windows
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.{cpp,c,h}
⚙️ CodeRabbit configuration file
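src/**/*.{cpp,c,h}: Sunshine core C++ source code, a self-hosted game streaming server. Review focus: memory safety, thread safety, RAII resource management, and security vulnerabilities. Pay attention to platform-specific code controlled by preprocessor macros.
Files:
- src/platform/windows/display_vdd.cpp
- src/platform/windows/display.h
- src/platform/windows/display_vram.cpp
src/platform/**
⚙️ CodeRabbit configuration file
src/platform/**: Platform abstraction layer code (Windows/Linux/macOS). Ensure the implementations are consistent across platforms, and pay attention to error handling and resource release for Windows API calls.
Files:
- src/platform/windows/display_vdd.cpp
- src/platform/windows/display.h
- src/platform/windows/display_vram.cpp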
🔇 Additional comments (1)
src/platform/windows/display.h (1)
350-385: LGTM! Also applies to: 612-669
1. poll_cursor: validate cursor geometry vs payload size before
publishing (display_vdd.cpp). Previously only ShapeBufferSize <=
VDD_CURSOR_MAX_BYTES was checked; Width/Height/Pitch/ShapeType
could still be inconsistent on a torn header read, and downstream
make_cursor_alpha_image/make_cursor_xor_image do pointer arithmetic
into shape_buffer based on Pitch * Height. A bogus header could
therefore trigger out-of-bounds reads. Now reject snapshots where:
- shape_type > 2 (unknown enum)
- width/height == 0 or > 512 (HW cursor cap incl. monochrome stack)
- monochrome height is odd or pitch < ceil(width/8)
- color/masked-color pitch < width*4
- declared payload size < pitch*height
2. display_vdd_vram_t::snapshot: check set_cursor_texture() return
value (display_vram.cpp). On upload failure clear both cursor_alpha
and cursor_xor so blend_cursor() doesn't sample a stale or empty SRV;
the cursor will simply be omitted from this frame and re-uploaded
when the next shape arrives.
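The rejection rules in item 1 above can be sketched as a single predicate. This is an illustration under assumptions, not the code from the commit: the struct and function names are hypothetical, the 256 KiB cap is taken from the PR description, and shape type 0 is assumed to mean monochrome because the PR maps types 0/1/2 onto the DXGI values 1/2/4.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kMaxCursorBytes = 256 * 1024;  // 256 KiB payload cap
constexpr std::uint32_t kMaxCursorDim = 512;           // HW cursor cap from the commit message

// Hypothetical header snapshot; field names are illustrative.
struct cursor_header_t {
  std::uint32_t shape_type, width, height, pitch, shape_buffer_size;
};

// Returns true only if the geometry is consistent with the declared payload size.
bool cursor_geometry_valid(const cursor_header_t &h) {
  if (h.shape_type > 2) return false;                         // unknown enum value
  if (h.width == 0 || h.width > kMaxCursorDim) return false;
  if (h.height == 0 || h.height > kMaxCursorDim) return false;
  if (h.shape_buffer_size > kMaxCursorBytes) return false;

  if (h.shape_type == 0) {
    // Assumed monochrome: AND mask stacked on XOR mask, 1 bit per pixel.
    if (h.height % 2 != 0) return false;
    if (h.pitch < (h.width + 7) / 8) return false;            // ceil(width / 8)
  } else {
    // Color and masked color: 4 bytes per pixel.
    if (h.pitch < h.width * 4) return false;
  }

  // 64-bit product so a bogus pitch cannot wrap around.
  const std::uint64_t needed = std::uint64_t {h.pitch} * h.height;
  return needed <= h.shape_buffer_size;
}
```

A caller would run this check on the copied header before touching the payload, and treat a false result the same way as a torn read: drop the shape and wait for the next one.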
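Background
When ZakoVDD runs with HardwareCursor=true, IddCx sends the cursor through an out-of-band (OOB) overlay channel and does not write it into the swap-chain framebuffer. What display_vdd_vram_t obtains through OpenSharedResourceByName is therefore a clean desktop, and after encoding the pointer is invisible to every client. Capturing the 4K framebuffer and sampling its pixels confirmed that the framebuffer contains no cursor.
Changes
1. Reuse the existing cursor compositing pipeline (refactor)
The cursor blend resources that previously belonged only to display_ddup_vram_t (sampler_linear/point, blend_alpha/invert/disable, cursor_ps/vs, and the two gpu_cursor_t objects cursor_alpha/xor) are lifted into the base class as protected members of display_vram_t, and two helpers are extracted:
- int init_cursor_pipeline(const ::video::config_t &config): creates the shaders, blend states, samplers, and rotation CB.
- void blend_cursor(ID3D11RenderTargetView *capture_rt): issues one draw call, then restores blend_disable and clears the RTV/SRV slots.
display_ddup_vram_t::init/snapshot now call the base-class helpers, and the local lambda is removed. The result is functionally equivalent, with unchanged hot-path behavior.
The AMD path (display_amd_vram_t) uses a different scheme (cursor_info constant buffer + GetCursorInfo) and is left untouched by this PR.
2. VDD cursor SHM consumption
In init, vdd_capture_t opens, on a best-effort basis, the objects newly added on the driver side:
- Global\ZakoVDD_CursorMeta_<i>: a CursorSharedMetadata header plus up to 256 KiB of shape pixels.
- Global\ZakoVDD_CursorReady_<i>: a diagnostic event.
A new lock-free poll_cursor(cursor_snapshot &) snapshots the header fields, copies out the shape buffer on demand, and diffs against the monotonic ShapeId / PositionId counters. After the read it checks ShapeId a second time; if the shape turns out to be torn, it is discarded and left for the next frame.
3. display_vdd_vram_t::snapshot integration
poll_cursor is called after CopyResource. When shape_updated is set, the existing make_cursor_alpha_image / make_cursor_xor_image / set_cursor_texture path produces the two cursor textures. cursor_alpha/xor.set_pos is called every frame, and finally blend_cursor(d3d_img->capture_rt.get()) draws onto the freshly copied capture texture.
IDDCX_CURSOR_SHAPE_TYPE 0/1/2 maps directly to DXGI_OUTDUPL_POINTER_SHAPE_TYPE 1/2/4.
PositionX/Y already use DXGI semantics on the driver side (top-left corner of the cursor image, with the hot spot already subtracted), so they are passed straight to set_pos without subtracting it again.
Compatibility
When poll_cursor returns false, display_vdd_vram_t::snapshot takes the original path, and behavior is exactly the same as before this PR.
Companion driver
ZakoVDD PR: qiin2333/zako-vdd#9.
Verification
ninja sunshine builds cleanly (no new warnings).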
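The ShapeId re-check described under change 2 can be sketched as a read, copy, re-read sequence. This is an assumption-laden illustration rather than the PR's code: shared_cursor_t and read_shape are hypothetical names, the real header layout is not shown here, and how well a single counter detects tearing depends on when the driver bumps it relative to writing the pixels.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the driver-owned mapping.
struct shared_cursor_t {
  std::atomic<std::uint32_t> shape_id {0};  // assumed: bumped by the writer per new shape
  std::uint32_t shape_size {0};             // valid bytes in pixels[]
  std::uint8_t pixels[256] {};
};

// Copies the shape out and reports success only if shape_id did not change
// while the copy was in flight. On failure `out` is cleared so the caller
// can simply try again on the next frame.
bool read_shape(const shared_cursor_t &shm, std::vector<std::uint8_t> &out) {
  const std::uint32_t before = shm.shape_id.load(std::memory_order_acquire);

  const std::uint32_t size = shm.shape_size;
  if (size == 0 || size > sizeof(shm.pixels)) {
    out.clear();
    return false;  // implausible size, treat as invalid
  }

  out.resize(size);
  std::memcpy(out.data(), shm.pixels, size);

  // Keep the payload reads ordered before the second counter read.
  std::atomic_thread_fence(std::memory_order_acquire);
  const std::uint32_t after = shm.shape_id.load(std::memory_order_relaxed);

  if (before != after) {
    out.clear();
    return false;  // shape changed mid-copy, discard
  }
  return true;
}
```

The reader never blocks and never takes a lock; the cost of a torn read is one frame without a refreshed shape.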