
feat(vdd): blend driver-exported hardware cursor onto VDD frames #660

Open
qiin2333 wants to merge 2 commits into master from feat/vdd-cursor-blend
@qiin2333
Collaborator

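Background

With ZakoVDD's HardwareCursor=true, IddCx routes the cursor through an out-of-band (OOB) overlay channel and does not write it into the swap chain framebuffer. The texture that display_vdd_vram_t obtains via OpenSharedResourceByName is therefore the bare desktop, so no client sees a pointer in the encoded stream.

  • Windows / macOS clients overlay a "fake cursor" using the local OS cursor, so things normally look fine.
  • Moonlight Android, Steam Link, and other clients without a local cursor see no pointer at all.

Capturing a 4K framebuffer and sampling its pixels confirmed that the framebuffer contains no cursor.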

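Changes

1. Reuse the existing cursor compositing pipeline (refactor)

The cursor blend resources that used to belong only to display_ddup_vram_t (sampler_linear/point, blend_alpha/invert/disable, cursor_ps/vs, and the two gpu_cursor_t objects cursor_alpha/xor) are lifted into the base class as protected members of display_vram_t, and two helpers are extracted:

  • int init_cursor_pipeline(const ::video::config_t &config): creates the shaders, blend states, samplers, and the rotation CB.
  • void blend_cursor(ID3D11RenderTargetView *capture_rt): a single draw call; afterwards it restores blend_disable and clears the RTV/SRV slots.

display_ddup_vram_t::init / snapshot now call the base-class helpers, and the local lambda is removed. The result is functionally equivalent, with unchanged hot-path behavior.

The AMD path (display_amd_vram_t) uses a different scheme (cursor_info constant buffer + GetCursorInfo) and is left untouched by this PR.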

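2. Consuming the VDD cursor SHM

vdd_capture_t::init opens, on a best-effort basis, the objects newly added on the driver side:

  • Global\ZakoVDD_CursorMeta_<i>: a CursorSharedMetadata header plus up to 256 KiB of shape pixels.
  • Global\ZakoVDD_CursorReady_<i>: a diagnostic event.

A new lock-free poll_cursor(cursor_snapshot &) snapshots the header fields, copies out the shape buffer on demand, and diffs against the monotonic ShapeId / PositionId counters. After reading, it checks ShapeId a second time; if the shape read turns out to be torn, it is discarded and left for the next frame.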

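3. Integration into display_vdd_vram_t::snapshot

After CopyResource, snapshot calls poll_cursor. When shape_updated is set, it goes through the existing make_cursor_alpha_image / make_cursor_xor_image / set_cursor_texture to produce the two cursor textures; every frame it calls cursor_alpha/xor.set_pos, and finally blend_cursor(d3d_img->capture_rt.get()) draws onto the capture texture that was just copied.

IDDCX_CURSOR_SHAPE_TYPE 0/1/2 maps directly to DXGI_OUTDUPL_POINTER_SHAPE_TYPE 1/2/4. The driver already reports PositionX/Y with DXGI semantics (top-left corner of the cursor image, hot spot already subtracted), so the values are passed straight to set_pos without subtracting the hot spot again.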
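
As an illustration of the 0/1/2 to 1/2/4 mapping, a small sketch follows. The helper name `to_dxgi_shape_type` and the local constants are hypothetical; the constants mirror the documented DXGI_OUTDUPL_POINTER_SHAPE_TYPE values so the snippet compiles without the Windows headers, and the PR's actual code may express the mapping differently.

```cpp
#include <cstdint>

// Values of DXGI_OUTDUPL_POINTER_SHAPE_TYPE, mirrored locally so the
// sketch builds without <dxgi1_2.h>.
constexpr std::uint32_t kDxgiMonochrome = 0x1;
constexpr std::uint32_t kDxgiColor = 0x2;
constexpr std::uint32_t kDxgiMaskedColor = 0x4;

// Hypothetical helper: converts the shape type read from shared memory
// (0/1/2) into the DXGI value (1/2/4). Returns false for unknown values
// and leaves `out` untouched in that case.
inline bool to_dxgi_shape_type(std::uint32_t vdd_type, std::uint32_t &out) {
  switch (vdd_type) {
    case 0:
      out = kDxgiMonochrome;
      return true;
    case 1:
      out = kDxgiColor;
      return true;
    case 2:
      out = kDxgiMaskedColor;
      return true;
    default:
      return false;
  }
}
```

Rejecting unknown values explicitly, rather than computing `1u << vdd_type`, keeps a corrupted header from selecting an undefined shape type downstream.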

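Compatibility

  • When the cursor SHM does not exist (older ZakoVDD builds), poll_cursor returns false and display_vdd_vram_t::snapshot takes the original path, so behavior is identical to before this PR.
  • The cursor resources in the base class have no side effects on the WGC / AMD subclasses (they continue to shadow them with their own members).
  • Desktop Duplication path behavior is unchanged.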

Companion driver

ZakoVDD PR: qiin2333/zako-vdd#9

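Verification

  • Build: MSYS2 UCRT64 GCC 15.2.0, ninja sunshine passes cleanly (no new warnings).
  • Pending end-to-end test: after a Moonlight Android client connects to the VDD stream, it should show the native hardware cursor, with shape changes (default / I-beam / resize) updating in real time.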

VDD with HardwareCursor=true routes pointer shape/position to an IddCx
overlay channel, so the cursor is absent from the shared swap-chain
texture that display_vdd_vram_t copies into the encoder pipeline.
Windows clients still see a cursor via the local OS, but Moonlight
Android (and any other client that does not draw its own cursor) sees
none at all.

To fix this without forcing the driver back to software cursor, the
companion change in ZakoVDD publishes the IddCx cursor state into
Global\\ZakoVDD_CursorMeta_<i> / Global\\ZakoVDD_CursorReady_<i>.
This change consumes that SHM and feeds Sunshine's existing
DXGI cursor-blend pipeline (cursor_vs / cursor_ps / cursor_alpha /
cursor_xor) so the same shader path used by Desktop Duplication
composites the cursor on top of VDD frames.

To avoid duplicating that pipeline, the shared cursor blend state,
shaders, gpu_cursor_t pair, and the draw helper are lifted from
display_ddup_vram_t into the display_vram_t base class as protected
members (init_cursor_pipeline / blend_cursor). display_ddup_vram_t
now calls those helpers; display_amd_vram_t keeps its own scheme
(cursor_info constant buffer + GetCursorInfo) untouched.

vdd_capture_t now optionally attaches the cursor SHM in init(), and
exposes a non-blocking poll_cursor() that returns a snapshot with
shape and position diffs. display_vdd_vram_t::snapshot consumes one
snapshot per captured frame and only re-uploads the cursor texture
when ShapeId changes.

Cursor SHM is best-effort: missing mapping (older driver builds)
silently falls back to no overlay, preserving previous behaviour.
@coderabbitai

coderabbitai Bot commented May 18, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: db2fac51-d776-4def-ae17-f07bfa1b18d5

📥 Commits

Reviewing files that changed from the base of the PR and between 4b5f0ac and 73617a9.

📒 Files selected for processing (2)
  • src/platform/windows/display_vdd.cpp
  • src/platform/windows/display_vram.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/platform/windows/display_vdd.cpp
  • src/platform/windows/display_vram.cpp

Summary by CodeRabbit

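  • New features

    • Added support for hardware cursor sharing and snapshot polling, so the backend can fetch cursor state without blocking
    • The cursor is composited onto the copied frame during capture, so non-Windows clients also see the hardware cursor correctly
    • Introduced an initializable cursor rendering/compositing pipeline to improve compositing consistency and efficiency
  • Fixes

    • Improved safe reading of cursor metadata and tear protection, reducing cursor display glitches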

Walkthrough

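Adds cursor compositing to the Windows VDD direct-capture backend: introduces a cursor snapshot structure for the driver's shared memory together with poll_cursor(), which reads snapshots without tearing; adds init_cursor_pipeline() and blend_cursor() to display_vram_t; and polls and composites the cursor onto the frame in the VDD snapshot.

Changes

Cursor compositing pipeline feature

| Layer / File(s) | Summary |
|---|---|
| Cursor data structures and pipeline interface (src/platform/windows/display.h) | display_vram_t gains the protected methods init_cursor_pipeline(config) and blend_cursor(capture_rt) for initializing and running cursor compositing; vdd_capture_t gains the cursor_snapshot struct and a poll_cursor() method for non-blocking polling of the shared-memory cursor state; new private members manage the cursor shared-memory mapping and dedup state. |
| Shared-memory cursor polling with tear-safe reads (src/platform/windows/display_vdd.cpp) | Defines a binary-layout mirror of CursorSharedMetadata and version constants; releases the cursor mapping in close(); attaches the cursor metadata on a best-effort basis in init(); implements poll_cursor(), which reads the shared-memory cursor snapshot safely via magic/version validation and a before/after ShapeId check that guards against torn texture reads. |
| D3D11 cursor rendering pipeline (src/platform/windows/display_vram.cpp) | Implements init_cursor_pipeline(), which compiles the cursor shaders and creates the samplers, blend states, and constant buffers, with HDR white-point correction; implements blend_cursor(), which sets pipeline state and picks the blend strategy by cursor texture type (alpha/xor) when drawing to the target RTV; replaces the former inline lambda, and display_ddup_vram_t::init() now calls the new pipeline initialization. |
| VDD backend cursor integration (src/platform/windows/display_vram.cpp) | display_vdd_vram_t::init() calls the pipeline initialization; snapshot() polls shared memory, builds the alpha/xor cursor textures according to shape_type, sets position and visibility, and calls the compositing method to overlay the cursor on the captured frame. |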

Sequence diagram

sequenceDiagram
  participant Client as Screen capture client
  participant VDDSnap as display_vdd_vram_t::snapshot()
  participant SharedMem as CursorSharedMetadata (driver shared memory)
  participant Poll as vdd_capture_t::poll_cursor()
  participant Texture as make_cursor_alpha/xor_image() / set_cursor_texture()
  participant Render as display_vram_t::blend_cursor()
  participant Frame as Captured frame RTV

  Client->>VDDSnap: Request a capture
  VDDSnap->>Poll: Call vdd_capture_t::poll_cursor()
  Poll->>SharedMem: Read and validate metadata
  Poll-->>VDDSnap: Return cursor_snapshot
  VDDSnap->>Texture: Build/upload cursor textures (if shape_updated)
  alt cursor_snapshot.visible && texture exists
    VDDSnap->>Render: Call blend_cursor(RTV*)
    Render->>Frame: Draw alpha/xor cursor onto the frame
  end
  VDDSnap-->>Client: Return the composited frame

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
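| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title accurately summarizes the main change, blending the driver-exported hardware cursor onto VDD frames, and is a concise description of the changeset's core feature. |
| Description check | ✅ Passed | The PR description covers the background, the specific changes, and compatibility in detail, and is closely related to all major parts of the changeset (cursor pipeline refactor, VDD SHM consumption, integration logic). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |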


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
src/platform/windows/display_vdd.cpp (1)

53-80: ⚡ Quick win

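Lock down the shared-memory ABI with static_assert.

Binary compatibility with the driver-side CursorSharedMetadata currently rests entirely on comments; if field order, types, or compiler options ever drift, the SHM will silently be misread. Consider adding compile-time assertions at least on sizeof(CursorSharedMetadata) and a few key offsetof(...) values.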
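
A sketch of what such assertions could look like is given below. `ExampleCursorHeader` is a hypothetical stand-in: the field names are taken from this review thread, but their order, types, and the resulting sizes are assumptions, and the real CursorSharedMetadata layout is defined by the ZakoVDD driver.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical mirror of a driver-side header. Field order and widths are
// assumed for illustration only; they do not describe the real ABI.
struct ExampleCursorHeader {
  std::uint32_t Magic;
  std::uint32_t Version;
  std::uint32_t ShapeId;
  std::uint32_t PositionId;
  std::int32_t PositionX;
  std::int32_t PositionY;
  std::uint32_t Width;
  std::uint32_t Height;
  std::uint32_t Pitch;
  std::uint32_t ShapeType;
  std::uint32_t ShapeBufferSize;
};

// offsetof is only unconditionally supported on standard-layout types.
static_assert(std::is_standard_layout<ExampleCursorHeader>::value,
              "header must be standard layout");
static_assert(std::is_trivially_copyable<ExampleCursorHeader>::value,
              "header must be safe to copy byte-wise out of shared memory");

// Eleven 4-byte fields with no padding: the build breaks if a field is
// added, removed, widened, or reordered.
static_assert(sizeof(ExampleCursorHeader) == 44, "unexpected header size");
static_assert(offsetof(ExampleCursorHeader, Magic) == 0, "Magic moved");
static_assert(offsetof(ExampleCursorHeader, Version) == 4, "Version moved");
static_assert(offsetof(ExampleCursorHeader, PositionX) == 16, "PositionX moved");
static_assert(offsetof(ExampleCursorHeader, ShapeBufferSize) == 40,
              "ShapeBufferSize moved");
```

Placing the assertions directly beneath the struct keeps the expected numbers next to the definition they guard, so a layout change and its ABI consequence show up in the same diff.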

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/platform/windows/display_vdd.cpp` around lines 53 - 80, Add compile-time
assertions to lock the shared-memory ABI for CursorSharedMetadata: use
static_assert to check sizeof(CursorSharedMetadata) matches the expected size,
assert VDD_CURSOR_MAGIC and VDD_CURSOR_VERSION as needed, and add static_asserts
for offsetof(CursorSharedMetadata, Magic), offsetof(..., Version), offsetof(...,
PositionX), offsetof(..., ShapeBufferSize) (pick a few key fields) to ensure
field layout and alignment don't drift; place these next to the struct
definition so build fails if the size or offsets change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/platform/windows/display_vdd.cpp`:
- Around line 354-389: The code only checks ShapeBufferSize but not that
Width/Height/Pitch/ShapeType are consistent with that size, which can lead to
out-of-bounds reads in make_cursor_alpha_image()/make_cursor_xor_image(); before
copying payload compute a safe expected_bytes based on meta->Height and
meta->Pitch (use safe integer arithmetic to detect overflow) and also branch on
meta->ShapeType if different formats have different per-row sizes; if
expected_bytes == 0 or expected_bytes > shape_buffer_size or expected_bytes >
VDD_CURSOR_MAX_BYTES then treat as torn/invalid (do not copy, clear
out.shape_buffer and set out.shape_updated = false), otherwise copy only
expected_bytes from payload; keep existing updates to
m_lastSeenCursorShapeId/m_lastSeenCursorPositionId only when out.shape_updated
remains true.

In `@src/platform/windows/display_vram.cpp`:
- Around line 3526-3529: The calls to set_cursor_texture (after
make_cursor_alpha_image/make_cursor_xor_image) ignore its return value, so
failures can leave old texture/viewport state while input_res is cleared and
later code (set_pos, blend_cursor) will draw using stale or empty SRVs; update
the call site to check the boolean result of set_cursor_texture for both
cursor_alpha and cursor_xor, and on any failure either immediately abort further
cursor work for this frame (skip set_pos/blend_cursor) or explicitly clear
cursor_alpha/cursor_xor textures/SRVs and ensure input_res remains cleared
before returning, mirroring the Desktop Duplication path behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a4a71ec8-1a8a-4f8e-b022-d423953327bc

📥 Commits

Reviewing files that changed from the base of the PR and between 3d75ab0 and 4b5f0ac.

📒 Files selected for processing (3)
  • src/platform/windows/display.h
  • src/platform/windows/display_vdd.cpp
  • src/platform/windows/display_vram.cpp
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Windows
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.{cpp,c,h}

⚙️ CodeRabbit configuration file

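src/**/*.{cpp,c,h}: Sunshine core C++ source code, a self-hosted game streaming server. Review focus: memory safety, thread safety, RAII resource management, and security vulnerabilities. Watch for platform-specific code controlled by preprocessor macros.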

Files:

  • src/platform/windows/display_vdd.cpp
  • src/platform/windows/display.h
  • src/platform/windows/display_vram.cpp
src/platform/**

⚙️ CodeRabbit configuration file

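src/platform/**: Platform abstraction layer code (Windows/Linux/macOS). Ensure the per-platform implementations are consistent, and pay attention to error handling and resource release around Windows API calls.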

Files:

  • src/platform/windows/display_vdd.cpp
  • src/platform/windows/display.h
  • src/platform/windows/display_vram.cpp
🔇 Additional comments (1)
src/platform/windows/display.h (1)

350-385: LGTM!

Also applies to: 612-669

Comment thread src/platform/windows/display_vdd.cpp
Comment thread src/platform/windows/display_vram.cpp Outdated
1. poll_cursor: validate cursor geometry vs payload size before
   publishing (display_vdd.cpp). Previously only ShapeBufferSize <=
   VDD_CURSOR_MAX_BYTES was checked; Width/Height/Pitch/ShapeType
   could still be inconsistent on a torn header read, and downstream
   make_cursor_alpha_image/make_cursor_xor_image do pointer arithmetic
   into shape_buffer based on Pitch * Height. A bogus header could
   therefore trigger out-of-bounds reads. Now reject snapshots where:
     - shape_type > 2 (unknown enum)
     - width/height == 0 or > 512 (HW cursor cap incl. monochrome stack)
     - monochrome height is odd or pitch < ceil(width/8)
     - color/masked-color pitch < width*4
     - declared payload size < pitch*height
2. display_vdd_vram_t::snapshot: check set_cursor_texture() return
   value (display_vram.cpp). On upload failure clear both cursor_alpha
   and cursor_xor so blend_cursor() doesn't sample a stale or empty SRV;
   the cursor will simply be omitted from this frame and re-uploaded
   when the next shape arrives.
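
The rejection rules listed in item 1 can be sketched as a standalone predicate. This is an illustration under stated assumptions rather than the committed code: `CursorGeometry`, `cursor_geometry_ok`, and `kMaxCursorDim` are hypothetical names, and shape type 0 is taken to be monochrome, following the 0/1/2 to DXGI 1/2/4 mapping in the PR description.

```cpp
#include <cstdint>

// Hypothetical view of the header fields that matter for validation.
struct CursorGeometry {
  std::uint32_t shape_type;    // assumed: 0 = monochrome, 1 = color, 2 = masked color
  std::uint32_t width;
  std::uint32_t height;
  std::uint32_t pitch;         // bytes per row
  std::uint32_t payload_size;  // declared shape buffer size in bytes
};

constexpr std::uint32_t kMaxCursorDim = 512;

// Returns true when the geometry is self-consistent and the declared
// payload is large enough for pitch * height bytes.
inline bool cursor_geometry_ok(const CursorGeometry &g) {
  if (g.shape_type > 2) {
    return false;  // unknown enum value
  }
  if (g.width == 0 || g.height == 0 ||
      g.width > kMaxCursorDim || g.height > kMaxCursorDim) {
    return false;
  }
  if (g.shape_type == 0) {
    // Monochrome stacks two masks vertically at 1 bit per pixel.
    if (g.height % 2 != 0 || g.pitch < (g.width + 7) / 8) {
      return false;
    }
  } else if (g.pitch < g.width * 4) {
    // 32 bits per pixel; width <= 512 so the product cannot overflow.
    return false;
  }
  // Widen before multiplying so a bogus pitch cannot wrap around.
  const std::uint64_t needed = std::uint64_t {g.pitch} * g.height;
  return g.payload_size >= needed;
}
```

Doing the pitch * height product in 64 bits matters because pitch has no upper bound in these rules, so a torn header could otherwise wrap the result to a small value that passes the size check.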