refactor signal infrastructure and support SIGINT (ctrl-c) by CvvT · Pull Request #690 · microsoft/litebox

CvvT · 2026-02-27T02:07:59Z

This PR supports async signal SIGINT.

On Linux, the workflow is the following:

when boot, register SIGINT handler --> Ctrl-C arrives --> record signal in TLS and wake up a thread if it is being blocked
about to switch to guest (i.e., prepare_to_run_guest) --> check TLS and queue signals --> process signal
any blocking operations
3.1. before commit to wait, check_for_interrupt would check TLS and queue signals --> process signal
3.2. during wait, wake up and go to 3.1
3.3. after waiting, got to 3.1

For shim to be able to retrieve pending signals from platform, add a new SignalProvider with the interface fn take_pending_signals(&self, mut f: impl FnMut(litebox::shim::Signal)).

One tricky problem is that when a signal arrives, the running thread may be about to be blocked or blocked already (i.e., case 3.1 and 3.2), and we didn't interrupt a thread if it is running inside LiteBox code. In which case, we never have the chance to process that signal. It might be possible to examine the RIP of the thread in the signal handler and divert the execution to skip mutex wait call it if RIP is close to it (e.g., use assembly code to build a critical region), but this may not work on Windows (where we don't call syscall instruction directly). The current solution I have is based on the observation that for all blocking operations (e.g., futex wait, sleep) boil down to RawMutex::block_or_timeout with the per-thread WaitState and the expected value ThreadState::WAITING. WaitState stores the thread status like RUNNING_IN_HOST, WAITING, WOKEN, and etc. Thus, in the signal handler, besides recording the signal, we can update the thread state from WAITING to WOKEN and then calls futex wake. If the thread is about to call futex wait, futex call would return immediately because of the change of the value. If the thread is being blocked, it would wake up and return. To access WaitState in the signal handler, I added two new interfaces on_interruptible_wait_start and on_interruptible_wait_end to RawMutexProvider so that platform can use Waker to wake up the blocking thread.

The signal handlers for SIGINT can be called on any threads. However, we don't want non-guest threads (e.g., background network worker, additional test thread spawned by cargo test) to be called with the signal. For background network worker, we can block those signals. For test thread, the signal handler would check if it has our custom TLS. If not, it blocks those signals and raise them again.

On Windows, the implementation is almost the same except that signal handler is called on a kernel thread (and thus cannot access litebox thread's TLS easily). I have to use a global variable to record all active threads.

I added some end-to-end C tests and Rust unit tests. To make Rust unit test work, I also introduced a new interface fn run_test_thread<R>(f: impl FnOnce() -> R) -> R to ThreadProvider. This is because spawning a guest thread (via sys_clone) and a regular thread in test (via std::thread::spawn) have different routes and require different setup. For example, while running in test, there is no guest code, and thus we don't need to do fs/gs swap (hence no setup for one of fs/gs). To make the code work for both guest thread and non-guest thread, we can set gs == fs before running the test thread.

wdcui

Overall, the code change looks good to me. I left some comments below.

wdcui · 2026-02-27T03:35:05Z

litebox/src/platform/mod.rs

+    ///
+    /// The default implementation simply calls `f()` with no additional setup.
+    /// Platforms that require explicit thread registration should override this.
+    fn run_test_thread<R>(f: impl FnOnce() -> R) -> R {


Should we scope this function to tests only?

I thought about it. Using cfg(test) does not work here because the tests reside in a different crate (litebox_shim_linux). Test functions cannot be used by a different crate. One possible solution is adding a new feature, but that adds complexity.

Actually, we can use cfg(debug_assertions).

wdcui · 2026-02-27T03:48:58Z

litebox_platform_linux_userland/src/lib.rs

        thread.interrupt();
    }
+
+    fn run_test_thread<R>(f: impl FnOnce() -> R) -> R {


Limit this to test only?

Added cfg(debug_assertions)

wdcui · 2026-02-27T04:09:00Z

litebox_platform_windows_userland/src/lib.rs

+    /// Runs `f`, ensuring that [`CURRENT_THREAD_HANDLE`] is set while in the call to `f`.
+    fn run_with_handle<R>(tls: &TlsState, f: impl FnOnce() -> R) -> R {
+        // Safety: `tls_state` lives for the duration of this call.
+        unsafe { install_tls(tls) };


Why do we install tls on demand?

All threads should install tls. Previously install_tls was called in the caller function, I moved it here because I want to reuse this function for test threads. So now threads including guest threads and test threads would install tls properly.

wdcui · 2026-02-27T04:10:00Z

litebox_platform_windows_userland/src/lib.rs

+                .unwrap()
+                .retain(|h| !Arc::ptr_eq(&h.0, &current.0));
            *current.0.lock().unwrap() = None;
+            let tls_index = TLS_INDEX.load(Ordering::Relaxed);


Should we create a function to uninstall_tls?

Yes, updated.

wdcui · 2026-02-27T04:11:48Z

litebox_platform_windows_userland/src/lib.rs

+                // Safety: the TLS pointer is valid as long as the thread is
+                // alive, and we hold the thread handle lock.
+                let tls = unsafe { &*inner.tls.0 };
+                tls.pending_host_signals.fetch_or(bit, Ordering::SeqCst);


What's the effect of this fetch_or?

Performs a bitwise “or” operation on the current value and the argument val, and sets the new value to the result.

Returns the previous value.

wdcui

LGTM. I left some comments below.

github-actions · 2026-02-28T00:23:01Z

🤖 SemverChecks 🤖 No breaking API changes detected

Note: this does not mean API is unchanged, or even that there are no breaking changes; simply, none of the detections triggered.

jaybosamiya-ms · 2026-03-02T21:13:18Z

litebox/src/shim.rs

+/// Signal numbers are 1-based and must be in the range 1–63.
+#[derive(Copy, Clone, Debug, Eq, PartialEq)]
+pub struct Signal(u32);


Nit: store as a u8?

jaybosamiya-ms · 2026-03-02T21:14:48Z

litebox/src/shim.rs

+/// A set of [`Signal`]s, stored as a 64-bit bitmask.
+///
+/// Bit `(signum - 1)` is set when signal `signum` is present in the set.
+/// Because signal numbers are 1-based and capped at 63, all 63 possible
+/// signals fit in a single `u64`.
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub struct SigSet(u64);


If you are explicitly defining the internal details of how things are stored, you can just expose the value inside it via pub struct SigSet(pub u64) to remove from_u64/as_u64. Alternatively, you should not be specifying how it is being internally stored in the docs.

jaybosamiya-ms · 2026-03-02T21:23:37Z

litebox/src/platform/mod.rs

+/// Provider for consuming platform-originating signals.
+///
+/// Platforms can record signals (e.g., `SIGINT`) and the shim should call
+/// [`SignalProvider::take_pending_signals`] to consume them.
+pub trait SignalProvider {
+    /// Atomically take all pending asynchronous signals (e.g., SIGINT and SIGALRM)
+    /// for the current thread, passing each one to `f`.
+    ///
+    /// Platforms that support asynchronous signals should override this method.
+    #[allow(unused_variables, reason = "no-op by default")]
+    fn take_pending_signals(&self, f: impl FnMut(crate::shim::Signal)) {}


I am quite surprised by this provider. There are some surprising inversions happening here, for example, it depends on crate::shim in its signature.

I don't think the Signal sits well in shim, since the shim module is meant for traits that the shim must implement but there isn't a trait being implemented there.

Also, "shim should call" implies that a shim that does not call stuff is incorrect. This sort of requirement is exactly what traits are for, and thus the way the design should be is to actually have the shim implement a trait that probably looks like the following:

pub trait ConsumeSignal { fn consume_signal(Signal); }

Then the platform can always invoke this trait, and the shim can make a decision of it wants to buffer things or not. If you need/want the "all in a group" behavior, then you can make the trait (instead of the above):

pub trait ConsumeSignal { fn consume_signals(&[Signal]); }

jaybosamiya-ms · 2026-03-02T21:24:36Z

litebox/src/platform/mod.rs

+    /// for the current thread, passing each one to `f`.
+    ///
+    /// Platforms that support asynchronous signals should override this method.
+    #[allow(unused_variables, reason = "no-op by default")]


Nit: prefer expect rather than allow

jaybosamiya-ms · 2026-03-02T21:28:27Z

litebox/src/platform/mod.rs

+    /// The passed `waker` should live at least until the thread leaves the interruptible
+    /// wait (i.e., [`on_interruptible_wait_end`](Self::on_interruptible_wait_end) is called).


This is a somewhat surprising requirement; we can simplify this by updating the signature in this to take a Waker rather than a &Waker, right?

jaybosamiya-ms · 2026-03-02T21:28:41Z

litebox/src/platform/mod.rs

 pub trait RawMutexProvider {
    type RawMutex: RawMutex;
+
+    /// Called when a thread enters an interruptible wait.


Called by whom?

jaybosamiya-ms · 2026-03-02T21:29:00Z

litebox/src/platform/mod.rs

+    ///
+    /// The passed `waker` should live at least until the thread leaves the interruptible
+    /// wait (i.e., [`on_interruptible_wait_end`](Self::on_interruptible_wait_end) is called).
+    /// The platform can use the `waker` to wake up the thread while it is in the interruptible wait.


"can use" -> "uses"?

jaybosamiya-ms · 2026-03-02T21:31:18Z

litebox/src/platform/mod.rs

+    ///
+    /// This is a no-op by default.
+    #[allow(unused_variables)]
+    fn on_interruptible_wait_start(&self, waker: &crate::event::wait::Waker<Self>)


The name of this function is somewhat weird. It is not actually beginning the wait, right? Maybe something like register_waker (and the other one should be unregister_waker)?

Also, what is the expected behavior if someone runs "start" multiple times in a row?

jaybosamiya-ms · 2026-03-02T21:34:09Z

litebox/src/platform/mod.rs

+    /// Called when a thread leaves an interruptible wait.
+    ///
+    /// This is a no-op by default.
+    fn on_interruptible_wait_end(&self) {}


Can merge with the above one by taking in a Option<Waker> instead of a Waker; in that case, it is behaving more like update_interruptible_waker.

jaybosamiya-ms · 2026-03-02T21:38:30Z

litebox_common_linux/src/signal/mod.rs

+impl TryFrom<Signal> for litebox::shim::Signal {
+    type Error = Signal;
+
+    fn try_from(value: Signal) -> Result<Self, Self::Error> {
+        match value {
+            Signal::SIGINT => Ok(Self::SIGINT),
+            Signal::SIGALRM => Ok(Self::SIGALRM),
+            _ => Err(value),


This is quite surprising, haven't you designed the litebox::shim::Signal literally to support the full set of Linux signals? You've even hard-coded the "up to 64" and so on, so I see no reason why this does not support SIGILL etc converted over.

I do think that the long-term move is for litebox::shim::Signal to not have this 64 restriction, but if you are assuming that right now, then wouldn't the correct move just be Signal::from_raw(...).ok_or(Err(...))?

refactor signal infrastructure and support SIGINT (ctrl-c)

8ba98dc

CvvT marked this pull request as ready for review February 27, 2026 02:14

wdcui approved these changes Feb 27, 2026

View reviewed changes

fix

37ed0f2

jaybosamiya-ms reviewed Mar 2, 2026

View reviewed changes

		/// The passed `waker` should live at least until the thread leaves the interruptible
		/// wait (i.e., [`on_interruptible_wait_end`](Self::on_interruptible_wait_end) is called).

Conversation

CvvT commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wdcui left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wdcui left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Feb 28, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CvvT commented Feb 27, 2026 •

edited

Loading