feat(sec-254): vsock CID allocation and end-to-end tests #6
Draft
jasonhernandez wants to merge 14 commits into aljoscha:main
Add vsock device support across both Firecracker (Linux) and AVF (macOS) backends, enabling structured host↔guest communication over a Unix domain socket instead of SSH polling.

- CLI: `ember vm create myvm --image base --vsock`
- YAML config: `vsock: true`
- UDS created at: `<state_dir>/vms/<name>/vsock.sock`

Linux (Firecracker):
- New `PUT /vsock` API call with guest CID and UDS path
- Firecracker natively creates the UDS and bridges to guest AF_VSOCK

macOS (AVF):
- VZVirtioSocketDeviceConfiguration added to VM config
- ember-vz implements a UDS bridge: it accepts host connections on the UDS and proxies them to guest vsock port 1024, and it accepts guest-initiated connections on port 1024 and bridges them back to the UDS

Both platforms expose the same UDS interface, so Thermite's code path is identical regardless of the underlying hypervisor.

Co-Authored-By: Claude <noreply@anthropic.com>
- `ember vm stop --all`: stop all running VMs
- `ember vm stop --all --force`: SIGKILL all running VMs
- `ember vm delete --all --force`: stop + delete every VM

Useful for cleanup and for ending all VMs (including non-pool control agent VMs that `pool destroy` doesn't touch).

Co-Authored-By: Claude <noreply@anthropic.com>
Lightweight Rust daemon that runs inside Ember VMs and serves the JSON-lines protocol expected by Thermite's EmberdClient. Listens on vsock port 1024 (Linux) or a Unix domain socket (`--uds`, for testing).

Operations: ping, exec, read_file, write_file, agent_status.

- New `emberd/` workspace member with minimal dependencies
- 15 unit + integration tests (all via UDS on any platform)
- Makefile targets: `make emberd`, `make emberd-release`
- Workspace fmt/check/clippy/test updated to include emberd

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Protocol reference, build instructions, architecture diagram, and image integration guide.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add emberd binary and systemd service to both ubuntu-dev and ubuntu-slim Dockerfiles. The binary is pre-built on the host with `make emberd-image` and staged at images/emberd for COPY.

- images/emberd.service: systemd unit (Type=simple, Restart=always)
- Dockerfile.ubuntu-dev: COPY emberd + enable service
- Dockerfile.ubuntu-slim: COPY emberd + enable service
- Makefile: `make emberd-image` target (native on Linux, cross-compile on macOS)
- .gitignore: exclude staged binary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Show build, pull, and list commands instead of only suggesting pull. Most custom images (ubuntu-dev, ubuntu-slim) need to be built from a Dockerfile, not pulled from a registry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix Backlog type in listen_vsock (nix 0.29 on Linux requires `Backlog::new()` instead of a raw integer)
- Makefile emberd-image: use Docker (rust:latest) for Linux builds on macOS instead of requiring a cross-compilation toolchain
- Dockerfiles: clean up emberd COPY comments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
If `ember vm create` succeeds but the subsequent start fails (e.g., ember-vz crash, missing binary), delete the created VM instead of leaving orphaned state behind. Previously, the start rollback only cleaned up network/process but left the VM metadata and disk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two bugs in the ember-vz vsock UDS bridge:

1. `VZVirtioSocketDevice.connect(toPort:)` was called from a background queue, but AVF requires it on the main queue. The completion handler never fired, so host→guest connections silently failed.
2. VZVirtioSocketConnection was not retained during `bridgeConnection()`, so ARC could deallocate it and close the fd mid-transfer.

Fixes:
- Dispatch `connect(toPort:)` to DispatchQueue.main
- Hold strong ref to VZVirtioSocketConnection via DispatchGroup
- Log ember-vz stderr to `<vm_dir>/ember-vz.log` for debugging
- Add diagnostic logging throughout the bridge

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
`ember vm create` now waits up to 90s (configurable via `--wait`) for SSH to become reachable before reporting success. This means `ember exec` works immediately after create, with no manual polling needed.

Also add a `--wait` flag to `ember exec` for configuring the SSH connect timeout (default: 30s, can be increased for heavy images).

If the wait times out, the VM is still running; only SSH is slow. A hint is printed suggesting `ember exec --wait`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
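The bounded wait described in this commit can be sketched as a deadline loop. This is a minimal illustration, not the PR's code: `wait_for_tcp` is a hypothetical name, and a plain TCP connect stands in for "SSH is reachable" (the real check may perform an SSH handshake).

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: poll `addr` until a TCP connect succeeds or `wait` elapses.
/// Returns true if the port became reachable before the deadline.
fn wait_for_tcp(addr: SocketAddr, wait: Duration) -> bool {
    let deadline = Instant::now() + wait;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            // Deadline passed; the caller decides what to report (the VM may still be up).
            return false;
        }
        // Cap each attempt so a single slow connect cannot overshoot the deadline by much.
        let attempt = remaining.min(Duration::from_secs(2));
        if TcpStream::connect_timeout(&addr, attempt).is_ok() {
            return true;
        }
        // Short pause between attempts, never sleeping past the deadline.
        let pause = deadline.saturating_duration_since(Instant::now());
        sleep(pause.min(Duration::from_millis(250)));
    }
}
```

Under these assumptions, `vm create` would call the helper with a 90s budget and `exec` with 30s, and a `false` result would trigger the hint rather than an error.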
When `ember exec vm -- "echo hi | tee /tmp/out"` has one argument after `--`, pass it directly to the SSH channel without quoting. The remote shell interprets pipes and redirects correctly. Previously, shell_escape_join would single-quote arguments containing `|` or `>`, preventing the remote shell from interpreting them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
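The single-argument rule can be illustrated with a small sketch. Only the name `shell_escape_join` comes from the commit message; its body here, the `shell_escape` helper, and `remote_command` are assumptions made for illustration.

```rust
/// Assumed quoting rule: leave plain words bare, single-quote everything else.
fn shell_escape(arg: &str) -> String {
    let safe = !arg.is_empty()
        && arg
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || "-_./=:,@%+".contains(c));
    if safe {
        arg.to_string()
    } else {
        // Close the quote, emit an escaped quote, reopen: ' -> '\''
        format!("'{}'", arg.replace('\'', r"'\''"))
    }
}

/// Join several arguments into one command line, quoting each as needed.
fn shell_escape_join(args: &[&str]) -> String {
    args.iter()
        .map(|a| shell_escape(a))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Hypothetical entry point: a lone argument is treated as a shell snippet and
/// passed through untouched; multiple arguments are quoted individually.
fn remote_command(args: &[&str]) -> String {
    match args {
        [single] => single.to_string(),
        _ => shell_escape_join(args),
    }
}
```

With this shape, `remote_command(&["echo hi | tee /tmp/out"])` reaches the remote shell with the pipe intact, while `remote_command(&["echo", "a|b"])` still protects the `|` inside a single argument.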
`ember exec` now tries vsock (emberd) first, falling back to SSH:

- Connects to the VM's vsock UDS and sends a JSON-lines exec request
- No SSH dependency: works immediately after boot (emberd starts fast)
- Falls back to SSH automatically if vsock fails
- `--ssh` flag to force the SSH path

`ember vm list` now shows IP address and vsock status:

| NAME | STATUS | IP | VSOCK | CPUS | MEM | DISK |
|------|--------|----|-------|------|-----|------|
| val-smoke | running | 192.168.64.2 | ✓ | 1 | 16 GiB | 8 GiB |

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- `ember vm create --format json` returns VM metadata as JSON on stdout
- All progress messages (Cloning, Growing, Injecting, Starting, Waiting) now go to stderr so stdout is clean for JSON piping
- `ember exec` also reformatted by cargo fmt

This makes ember scriptable: `ember vm create foo --image bar --format json | jq .` outputs clean JSON while progress is visible on stderr.

201 tests pass (186 ember + 15 emberd), clippy clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tests

Replace hardcoded guest_cid=3 with a proper CID allocator that assigns unique CIDs (starting at 3) per VM, persisted in vsock/cids.json. CIDs are freed on VM delete and reused lowest-first, following the same pattern as IP allocation in network/ip.rs.

- state/vsock.rs: `allocate()`/`release()` with flock-based locking (6 tests)
- cli/vm.rs: create and fork use the CID allocator; delete releases CIDs
- cli/vm.rs: `validate_uds_path()` rejects paths >= 104 bytes (macOS sun_path limit) with an actionable error message (3 tests)
- error.rs: `Error::Vsock` variant for CID allocation failures
- state/store.rs: `vsock_allocations_path()`, vsock/ dir in `init()`
- tests/vsock.rs: 6 integration tests (CID uniqueness, reuse after delete, inspect JSON/table output, vm list checkmark, end-to-end UDS connectivity on macOS and Linux)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
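The lowest-first allocation policy can be sketched in memory as follows. This is an illustration under stated assumptions, not the contents of state/vsock.rs: the `CidTable` type and its fields are hypothetical, returning the existing CID on a repeated `allocate` is a choice made for this sketch, and persistence to vsock/cids.json and flock-based locking are deliberately left out.

```rust
use std::collections::BTreeMap;

/// First CID handed to a guest. CIDs 0, 1 and 2 are reserved by the vsock
/// address family (hypervisor, local loopback, host).
const FIRST_GUEST_CID: u32 = 3;

/// Hypothetical in-memory allocation table: VM name -> guest CID.
#[derive(Debug, Default)]
struct CidTable {
    by_vm: BTreeMap<String, u32>,
}

impl CidTable {
    /// Return the VM's existing CID, or assign the lowest free CID >= 3.
    /// Returns None when no usable CID is left.
    fn allocate(&mut self, vm: &str) -> Option<u32> {
        if let Some(&cid) = self.by_vm.get(vm) {
            return Some(cid);
        }
        let mut used: Vec<u32> = self.by_vm.values().copied().collect();
        used.sort_unstable();
        // Walk the sorted CIDs and stop at the first gap.
        let mut candidate = FIRST_GUEST_CID;
        for cid in used {
            if cid == candidate {
                candidate = candidate.checked_add(1)?;
            } else if cid > candidate {
                break;
            }
        }
        // u32::MAX is VMADDR_CID_ANY and must never be assigned to a guest.
        if candidate == u32::MAX {
            return None;
        }
        self.by_vm.insert(vm.to_string(), candidate);
        Some(candidate)
    }

    /// Free the VM's CID so a later allocation can reuse it.
    fn release(&mut self, vm: &str) -> Option<u32> {
        self.by_vm.remove(vm)
    }
}
```

For example, after allocating 3, 4 and 5 to three VMs and releasing the one holding 4, the next new VM receives 4 rather than 6, which matches the "reused lowest-first" behaviour the commit describes.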
Force-pushed from 276757c to 4afb65f.
Summary
Stacked on #5.
- Replace hardcoded `guest_cid=3` with a proper CID allocator (`state/vsock.rs`) that assigns unique CIDs per VM, persisted in `vsock/cids.json`, freed on delete, reused lowest-first
- Validate the UDS path against the macOS `sun_path` limit (104 bytes) with an actionable error before allocating resources
- `vm create --format json` with progress to stderr

Files changed (key additions over #5)

- `crates/ember-core/src/state/vsock.rs` (`vsock/cids.json`)
- `crates/ember-core/src/state/store.rs` (`vsock_allocations_path()` method)
- `emberd/`
- `src/cli/exec.rs`
- `src/cli/vm.rs`
- `ember-vz/Sources/EmberVZ/Start.swift`
- `tests/vsock.rs`
- `images/`

Rebased onto main after the ember-core/ember-linux/ember-macos workspace restructuring.
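The `sun_path` length check could look roughly like the sketch below. Only the function name `validate_uds_path` and the 104-byte threshold come from the commit message; the signature, the `String` error type, and the wording of the message are assumptions.

```rust
use std::path::Path;

/// Size of `sockaddr_un.sun_path` on macOS, including the trailing NUL byte.
const SUN_PATH_MAX: usize = 104;

/// Reject socket paths that cannot fit in `sun_path`.
/// A path of 104 bytes or more leaves no room for the NUL terminator.
fn validate_uds_path(path: &Path) -> Result<(), String> {
    // On Unix the OS-string length is the byte length of the path.
    let len = path.as_os_str().len();
    if len >= SUN_PATH_MAX {
        return Err(format!(
            "vsock socket path is {} bytes but must be shorter than {}: {}. \
             Use a shorter state directory or VM name.",
            len,
            SUN_PATH_MAX,
            path.display()
        ));
    }
    Ok(())
}
```

Running the check before any CID, disk, or network allocation means a too-long state directory fails fast and leaves nothing to roll back.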
Test plan
- `cargo test --workspace`: 9 CID allocator + 3 UDS validation
- `cargo build` clean, `cargo clippy --workspace` clean, `cargo fmt` clean
- `cargo test --test vsock -- --ignored` on macOS with ember-vz built
- `cargo test --test vsock -- --ignored` on Linux with Firecracker + KVM

🤖 Generated with Claude Code