Skip to content

vhost-user: add frontend, backend, and protocol crates#3126

Open
jstarks wants to merge 12 commits intomicrosoft:mainfrom
jstarks:vhost_user_async
Open

vhost-user: add frontend, backend, and protocol crates#3126
jstarks wants to merge 12 commits intomicrosoft:mainfrom
jstarks:vhost_user_async

Conversation

@jstarks
Copy link
Copy Markdown
Member

@jstarks jstarks commented Mar 25, 2026

Add vhost-user support to OpenVMM, enabling virtio device backends to run in external processes connected via Unix domain sockets.

  • vhost_user_protocol — shared wire types (headers, payloads, feature flags) and async socket I/O with SCM_RIGHTS fd passing.

  • vhost_user_frontend — a VirtioDevice implementation that proxies device operations to an external vhost-user backend. Handles the full protocol handshake, memory table export, queue lifecycle (including event-backed and function-backed interrupt proxying), config read/write, and save/restore.

  • vhost_user_backend — a protocol server that wraps any VirtioDevice and serves the vhost-user protocol. Handles SET_MEM_TABLE with mmap-backed GuestMemory construction, queue start/stop orchestration, and hot memory table replacement (stopping and restarting active queues).

  • openvmm_vhost — a standalone binary hosting virtio-blk over vhost-user, primarily as a test vehicle.

  • --vhost-user <socket>,type=blk|net|... CLI argument on Linux

  • Resource resolver for VhostUserDeviceHandle registered in openvmm_resources

  • VhostUserDeviceHandle resource type in virtio_resources

Copilot AI review requested due to automatic review settings March 25, 2026 00:23
@github-actions
Copy link
Copy Markdown

⚠️ Unsafe Code Detected

This PR modifies files containing unsafe Rust code. Extra scrutiny is required during review.

For more on why we check whole files, instead of just diffs, check out the Rustonomicon

@github-actions github-actions bot added the unsafe Related to unsafe code label Mar 25, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds vhost-user support to OpenVMM so virtio devices can be hosted in an external process over Unix domain sockets (plus a standalone openvmm-vhost backend binary).

Changes:

  • Introduces vhost_user_protocol, vhost_user_frontend, and vhost_user_backend crates implementing vhost-user wire types, async socket I/O with fd passing, and frontend/backend adapters.
  • Adds --vhost-user ... (Linux) CLI plumbing and a VhostUserDeviceHandle resource so OpenVMM can instantiate vhost-user-backed virtio devices.
  • Adjusts virtio feature advertisement so devices explicitly advertise ring features (event_idx/indirect/packed), with transports no longer force-enabling them.

Reviewed changes

Copilot reviewed 29 out of 30 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
vm/devices/virtio/virtiofs/src/virtio.rs Advertises additional virtio ring features for virtio-fs.
vm/devices/virtio/virtio_rng/src/lib.rs Advertises additional virtio ring features for virtio-rng.
vm/devices/virtio/virtio_resources/src/lib.rs Adds VhostUserDeviceHandle resource type (unix-only).
vm/devices/virtio/virtio_pmem/src/lib.rs Advertises additional virtio ring features for virtio-pmem.
vm/devices/virtio/virtio_p9/src/lib.rs Advertises additional virtio ring features for virtio-9p (including device-specific bit).
vm/devices/virtio/virtio_net/src/lib.rs Updates virtio-net feature advertisement to include ring features/packed ring.
vm/devices/virtio/virtio_console/src/lib.rs Updates virtio-console feature advertisement to include ring features/packed ring.
vm/devices/virtio/virtio_blk/src/lib.rs Updates virtio-blk feature advertisement to include ring features/packed ring.
vm/devices/virtio/virtio/src/transport/pci.rs Stops force-enabling ring_event_idx/indirect/packed in transport; only forces VERSION_1.
vm/devices/virtio/virtio/src/transport/mmio.rs Same as PCI: transport only forces VERSION_1.
vm/devices/virtio/vhost_user_protocol/src/socket.rs Implements async recvmsg/sendmsg with SCM_RIGHTS fd passing.
vm/devices/virtio/vhost_user_protocol/src/protocol.rs Defines vhost-user wire structs/enums/constants + layout tests.
vm/devices/virtio/vhost_user_protocol/src/lib.rs Crate entry for shared protocol/socket modules (unix-only).
vm/devices/virtio/vhost_user_protocol/Cargo.toml New crate manifest for protocol/socket implementation.
vm/devices/virtio/vhost_user_frontend/src/resolver.rs Registers a resource resolver that turns VhostUserDeviceHandle into a virtio device.
vm/devices/virtio/vhost_user_frontend/src/lib.rs Implements VirtioDevice frontend proxy (handshake, mem table export, queue lifecycle, config, save/restore).
vm/devices/virtio/vhost_user_frontend/Cargo.toml New crate manifest for frontend implementation + tests.
openvmm_vhost/vhost_user_backend/src/queue_setup.rs Backend per-queue readiness/activation state machine.
openvmm_vhost/vhost_user_backend/src/memory.rs Builds GuestMemory from SET_MEM_TABLE region fds via sparse mapping + VA→GPA translation.
openvmm_vhost/vhost_user_backend/src/lib.rs Implements backend server translating vhost-user messages into VirtioDevice calls.
openvmm_vhost/vhost_user_backend/Cargo.toml New crate manifest for backend server implementation + tests.
openvmm_vhost/openvmm_vhost/src/main.rs Adds openvmm-vhost binary (currently hosts virtio-blk) as a test vehicle.
openvmm_vhost/openvmm_vhost/Cargo.toml New binary crate manifest.
openvmm/openvmm_resources/src/lib.rs Registers vhost-user frontend resolver on Linux builds.
openvmm/openvmm_resources/Cargo.toml Adds Linux-only dependency on vhost_user_frontend.
openvmm/openvmm_entry/src/lib.rs Wires --vhost-user into VM config creation by connecting to the socket and adding the device.
openvmm/openvmm_entry/src/cli_args.rs Adds --vhost-user argument parsing and a VhostUserCli parser.
openvmm/openvmm_entry/Cargo.toml Adds virtio dependency used by the new CLI parsing.
Cargo.toml Adds new crates/binary to the workspace and workspace dependencies.
Cargo.lock Updates lockfile for the newly added workspace crates.

@jstarks jstarks changed the title vhost-user: add frontend, backend, and protocol crates wip: vhost-user: add frontend, backend, and protocol crates Mar 25, 2026
@jstarks jstarks marked this pull request as ready for review March 25, 2026 02:48
@jstarks jstarks requested review from a team as code owners March 25, 2026 02:48
Copilot AI review requested due to automatic review settings March 25, 2026 02:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 42 out of 43 changed files in this pull request and generated 6 comments.

@github-actions github-actions bot added the Guide label Mar 25, 2026
Copilot AI review requested due to automatic review settings March 25, 2026 05:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 5 comments.

Copilot AI review requested due to automatic review settings March 25, 2026 06:47
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 45 out of 46 changed files in this pull request and generated 7 comments.

Copilot AI review requested due to automatic review settings March 25, 2026 07:22
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 45 out of 46 changed files in this pull request and generated 2 comments.

@jstarks jstarks changed the title wip: vhost-user: add frontend, backend, and protocol crates vhost-user: add frontend, backend, and protocol crates Mar 25, 2026
jstarks added 7 commits March 25, 2026 21:49
Add vhost-user support to OpenVMM, enabling virtio device backends to
run in external processes connected via Unix domain sockets.

- `vhost_user_protocol` — shared wire types (headers, payloads,
  feature flags) and async socket I/O with SCM_RIGHTS fd passing.

- `vhost_user_frontend` — a `VirtioDevice` implementation that proxies
  device operations to an external vhost-user backend. Handles the full
  protocol handshake, memory table export, queue lifecycle (including
  event-backed and function-backed interrupt proxying), config
  read/write, and save/restore.

- `vhost_user_backend` — a protocol server that wraps any
  `VirtioDevice` and serves the vhost-user protocol. Handles
  SET_MEM_TABLE with mmap-backed GuestMemory construction, queue
  start/stop orchestration, and hot memory table replacement (stopping
  and restarting active queues).

- `openvmm_vhost` — a standalone binary hosting virtio-blk over
  vhost-user, primarily as a test vehicle.

- `--vhost-user <socket>,type=blk|net|...` CLI argument on Linux
- Resource resolver for `VhostUserDeviceHandle` registered in
  `openvmm_resources`
- `VhostUserDeviceHandle` resource type in `virtio_resources`
- Refactor frontend reply handling into send_and_recv helper with
  reply code validation
- Add nregions limit (VHOST_USER_MAX_FDS) in backend SET_MEM_TABLE
- Fix packed ring save/restore: GET_VRING_BASE now packs both avail
  and used state; frontend stop_queue extracts both from reply
- Save queue state across SET_VRING_ENABLE(0) for GET_VRING_BASE
- Backend serves single connection then exits (matches virtiofsd)
- Fix RESET_DEVICE comment to match actual behavior
- Clarify mem_table_sent lifecycle (connection-scoped, not reset)
- Add end-to-end VMM test (vhost_user_blk_device) in multiarch
- Add flowey build_openvmm_vhost node and CI pipeline wiring
- Add OPENVMM_VHOST_NATIVE artifact declaration and resolver
- Clamp fd_count to VHOST_USER_MAX_FDS in try_recv (both truncation
  and normal paths) to prevent OOB on malformed cmsg_len
- Assert fds.len() <= VHOST_USER_MAX_FDS in try_send to prevent UB
- Validate regions count in send_set_mem_table before sending
- Update run() doc comment to match single-connection behavior
- Add --vhost-user documentation to Guide CLI reference
- Default openvmm_vhost logging to info level via EnvFilter
- Pipe backend stdout/stderr to petri log_task, await drain on exit
- Add ChildGuard drop guard to kill backend on early test exit
- Use async PolledTimer instead of thread::sleep in socket wait loop
- Expose log_source() on PetriVmBuilder for test log file creation
- Accept driver as test function parameter instead of accessor
- Gate vhost_user_protocol, vhost_user_frontend, vhost_user_backend on
  target_os=linux (not unix) to fix macOS build (MSG_CMSG_CLOEXEC)
- Gate openvmm_vhost binary with cfg(target_os=linux) for Windows/macOS
- Add missing doc comment on aarch64 OPENVMM_VHOST_NATIVE constant
- Move IntoResource import into cfg(target_os=linux) test function
Remove intra-doc links to VirtioDevice in frontend and backend crate-level
docs. These links fail on Windows where the crates are empty due to
cfg(target_os = linux).
jstarks added 3 commits March 25, 2026 21:49
- openvmm_vhost main(): replace .unwrap() with clean error
  handling (eprintln + exit(1)) to avoid panic backtraces
- socket: remove dead SocketError::Truncated and
  SocketError::InvalidAncillary variants that were never produced
- socket: retry sendmsg/recvmsg on EINTR instead of propagating
  the interrupted error
These are nonblocking calls inside poll_io, so EINTR cannot occur.
@github-actions
Copy link
Copy Markdown

// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

//! openvmm_vhost: a vhost-user backend binary that hosts OpenVMM virtio
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a separate binary that communicates with the main openvmm worker, that itself hosts the vhost user device? or am i misunderstanding?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. This is just a test vehicle in practice, because I don't know of other VMMs that would want to use OpenVMM's virtio devices in this way, and we would probably use mesh if we wanted to run our own virtio devices out of proc (so that we could get inspect, etc.).

Copilot AI review requested due to automatic review settings March 27, 2026 01:44
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 35 out of 36 changed files in this pull request and generated 5 comments.

Comment on lines +98 to +100
let max_addr = regions.last().map(|r| r.guest_phys_addr + r.size).unwrap();
let mapping = SparseMapping::new(max_addr as usize).map_err(MemoryError::Reserve)?;

Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SparseMapping::new(max_addr as usize) truncates max_addr on 32-bit (or if a malicious frontend sends very large GPAs), potentially creating an undersized reservation and causing incorrect mappings or unexpected failures. Add an explicit check that max_addr fits in usize and return a structured error if it doesn’t.

Copilot uses AI. Check for mistakes.
Comment on lines +269 to +273
// the backend doesn't support. Always include PROTOCOL_FEATURES
// if it was negotiated — backends (e.g., virtiofsd) treat its
// absence in SET_FEATURES as de-negotiation.
let masked_bits =
(guest_bits & self.device_features_raw) | VHOST_USER_F_PROTOCOL_FEATURES;
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SET_FEATURES always ORs VHOST_USER_F_PROTOCOL_FEATURES into the feature bits sent to the backend. This can cause protocol negotiation to be reported as enabled even when the backend didn’t advertise that bit in GET_FEATURES, which may break interoperability with non-OpenVMM backends. Only set the PROTOCOL_FEATURES bit in SET_FEATURES when the backend advertised it (and ideally keep it separate from the virtio feature set you derive for internal use).

Suggested change
// the backend doesn't support. Always include PROTOCOL_FEATURES
// if it was negotiated — backends (e.g., virtiofsd) treat its
// absence in SET_FEATURES as de-negotiation.
let masked_bits =
(guest_bits & self.device_features_raw) | VHOST_USER_F_PROTOCOL_FEATURES;
// the backend doesn't support. Only include PROTOCOL_FEATURES
// if the backend advertised it — otherwise, enabling it here
// would report protocol negotiation that never occurred.
let mut masked_bits = guest_bits & self.device_features_raw;
if (self.device_features_raw & VHOST_USER_F_PROTOCOL_FEATURES) != 0 {
masked_bits |= VHOST_USER_F_PROTOCOL_FEATURES;
}

Copilot uses AI. Check for mistakes.
// bits 16-31: used state (index + wrap counter)
// For split ring, only the low 16 bits matter (avail index),
// and used_index is read from the guest-visible used ring.
let vring_base = send_get_vring_base(&self.socket, idx).await.ok()?;
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stop_queue returns early on GET_VRING_BASE failure (.ok()?) without clearing q.active / q.params / proxy state. That leaves the frontend’s per-queue state inconsistent (queue may already be disabled, but still marked active), which can cascade into later stop/reset behavior. Consider logging the error and marking the queue inactive (and tearing down any proxy) before returning None.

Suggested change
let vring_base = send_get_vring_base(&self.socket, idx).await.ok()?;
let vring_base = match send_get_vring_base(&self.socket, idx).await {
Ok(base) => base,
Err(e) => {
tracelimit::warn_ratelimited!(
error = &*e as &dyn std::error::Error,
idx,
"GET_VRING_BASE failed during stop_queue; marking queue inactive"
);
// The backend has likely already stopped the queue; ensure
// the frontend's per-queue state reflects that.
q.active = false;
q.params = None;
return None;
}
};

Copilot uses AI. Check for mistakes.
// VHOST_F_LOG_ALL may be toggled). We don't currently support
// any features that require restarting queues on change, so
// just storing the new value is correct.
state.negotiated_features = features_from_u64(msg.value);
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backend stores the raw SET_FEATURES value into state.negotiated_features via features_from_u64(msg.value). The vhost-user VHOST_USER_F_PROTOCOL_FEATURES bit shares the same 64-bit feature space but is not a virtio feature; passing it through to VirtioDevice::start_queue can incorrectly surface an unrelated bit to devices. Mask out VHOST_USER_F_PROTOCOL_FEATURES (and any other vhost-user-only bits) before converting to VirtioDeviceFeatures.

Suggested change
state.negotiated_features = features_from_u64(msg.value);
// Mask out vhost-user-only bits (e.g., VHOST_USER_F_PROTOCOL_FEATURES)
// before converting into virtio device features.
let virtio_features = msg.value & !VHOST_USER_F_PROTOCOL_FEATURES;
state.negotiated_features = features_from_u64(virtio_features);

Copilot uses AI. Check for mistakes.
Comment on lines +222 to +229
let mut config_data = vec![0u8; config_hdr.size as usize];
// Read device config registers 4 bytes at a time.
let offset = config_hdr.offset;
let size = config_hdr.size;
let mut pos = 0u32;
while pos < size {
let reg_offset = offset + pos;
let val = self.device.read_registers_u32(reg_offset as u16).await;
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GET_CONFIG trusts config_hdr.size/offset from the wire: it allocates vec![0u8; config_hdr.size as usize] and then casts reg_offset to u16 for read_registers_u32. A malformed/malicious frontend can trigger very large allocations (DoS) and/or offset truncation. Validate that offset + size is within the device’s device_register_length (and within u16::MAX) before allocating/reading, and return an error if it’s out of range or the request payload is malformed.

Suggested change
let mut config_data = vec![0u8; config_hdr.size as usize];
// Read device config registers 4 bytes at a time.
let offset = config_hdr.offset;
let size = config_hdr.size;
let mut pos = 0u32;
while pos < size {
let reg_offset = offset + pos;
let val = self.device.read_registers_u32(reg_offset as u16).await;
// Validate offset/size from the wire before allocating or reading registers.
let offset = u32::from(config_hdr.offset);
let size = u32::from(config_hdr.size);
// Compute end = offset + size with overflow check.
let end = offset.checked_add(size).ok_or(SocketError::InvalidPayload)?;
// The device should expose how many bytes of config space are valid.
let dev_len = u32::from(self.device.device_register_length());
// Ensure the requested range fits within the device config space and u16.
if end > dev_len || end > u16::MAX as u32 {
return Err(SocketError::InvalidPayload);
}
let mut config_data = vec![0u8; size as usize];
// Read device config registers 4 bytes at a time.
let mut pos = 0u32;
while pos < size {
let reg_offset = offset + pos;
// Safe due to the end <= u16::MAX check above.
let reg_offset_u16 = reg_offset as u16;
let val = self.device.read_registers_u32(reg_offset_u16).await;

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Guide unsafe Related to unsafe code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants