RAID1 Disk Setup and Asynchronous Reads#223

Open
ioeddk wants to merge 5 commits into main from
yingqi/raid1_merge/setting_up_raid_qemu

Conversation

Contributor

@ioeddk ioeddk commented May 5, 2026

As one of the intermediate PRs for #170, this merges the following commits:

  1. Updated the QEMU start-up options to use three physical partitions as RAID1 member devices, rather than disk images.
  2. Updated the RAID1 process_read to perform asynchronous reads.
  3. Resolved the long-tail latency of the RAID device by creating the RAID1 thread with the real-time scheduler.
  4. Changed the OQueue used in the VirtIO module to the new OQueue.
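
For illustration only, here is a toy model of the asynchronous-read idea in item 2: the same read is dispatched to every mirror on its own thread, and the caller completes with the first reply. All names (`read_first_mirror`, the `Vec<u8>` mirrors) are hypothetical stand-ins, not the PR's actual types.

```rust
use std::sync::mpsc;
use std::thread;

// Toy stand-in for RAID1 mirrors: each Vec<u8> is one member device.
fn read_first_mirror(mirrors: Vec<Vec<u8>>, offset: usize, len: usize) -> Vec<u8> {
    let (tx, rx) = mpsc::channel();
    for disk in mirrors {
        let tx = tx.clone();
        // Dispatch the read to every mirror without blocking the caller.
        thread::spawn(move || {
            let _ = tx.send(disk[offset..offset + len].to_vec());
        });
    }
    // Complete with whichever mirror replies first.
    rx.recv().expect("at least one mirror replied")
}

fn main() {
    let mirrors = vec![vec![1u8, 2, 3, 4]; 3];
    assert_eq!(read_first_mirror(mirrors, 1, 2), vec![2, 3]);
}
```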

@ioeddk ioeddk requested a review from a team as a code owner May 5, 2026 00:10
Contributor

@arthurp arthurp left a comment


Mostly nits that should be trivial to fix.

The complex bit is creating and configuring the block devices. See comments below.

pub fn handle_requests(&self) {
    let request = self.queue.dequeue();
    info!("Handle Request: {:?}", request);
    // info!("Handle Request: {:?}", request);

Let's not leave commented code. Either keep it or remove it. I'm fine with either.

Comment on lines +286 to +287
// let status = BioStatus::Complete;
// parent.complete(status);

Remove the commented code, or justify keeping it.

self.bio_submission_oqueue().produce(bio)?;
let device_index = self.device.device_index.load(Ordering::Relaxed);
bio.prepare_enqueue(reply_handle, self.queue.clone(), device_index);
let producer = self.bio_submission_oqueue().attach_value_producer()?;

Producers are `Sync`, so you can store one in `self` if you like. I'm not totally sure of the best way to handle this pattern of producing from various threads, though. That decision is for a different PR, so whatever works is fine.

Comment on lines 53 to +57
#[cfg(not(baseline_asterinas))]
RPCError(RPCError),
/// The OQueue attachment errors
/// The OQueue errors
#[cfg(not(baseline_asterinas))]
OQueueAttachError(OQueueAttachError),
OQueueError(OQueueError),

If you use snafu and ostd_error (which you should), the `From` implementations can be generated automatically by marking the variant `#[snafu(transparent)]`. Search for that attribute for examples.
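
For context, `#[snafu(transparent)]` generates roughly the same conversion you would otherwise write by hand. A minimal std-only sketch of that hand-written equivalent (`OQueueError` and `Error` here are simplified stand-ins, not the crate's real types):

```rust
// Stand-in for the inner error type (hypothetical).
#[derive(Debug)]
struct OQueueError(String);

// Stand-in for the module's error enum (hypothetical).
#[derive(Debug)]
enum Error {
    OQueueError(OQueueError),
}

// This is the boilerplate that #[snafu(transparent)] would generate:
impl From<OQueueError> for Error {
    fn from(e: OQueueError) -> Self {
        Error::OQueueError(e)
    }
}

fn produce() -> Result<(), Error> {
    // `?` converts OQueueError into Error via the From impl above.
    Err(OQueueError("queue full".into()))?
}

fn main() {
    let err = produce().unwrap_err();
    // Debug-prints the wrapped inner error.
    println!("{:?}", err);
}
```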

Comment thread test/apps/test_common.mk
CC := gcc
C_FLAGS := -Wall -Werror
# C_FLAGS := -Wall -Werror


I do want to continue banning warnings. If specific warnings are causing problems, I'm happy to help.

Comment thread test/Makefile
Comment on lines -249 to +248
@dd if=/dev/zero of=$(EXT2_IMAGE) bs=2G count=1
@dd if=/dev/zero of=$(EXT2_IMAGE) bs=1G count=2

Why?

Comment thread tools/qemu_args.sh
Comment on lines -92 to 97
-drive if=none,format=raw,id=r0,file=./test/build/raid1_0.img,cache=directsync \
-drive if=none,format=raw,id=r1,file=./test/build/raid1_1.img,cache=directsync \
-drive if=none,format=raw,id=d0,file=./test/build/capture.img \
-drive if=none,format=raw,id=d1,file=./test/build/capture_legacy.img \
-drive if=none,format=raw,id=r0,file=/dev/nvme0n1p1,cache=directsync \
-drive if=none,format=raw,id=r1,file=/dev/nvme1n1p1,cache=directsync \
-drive if=none,format=raw,id=r2,file=/dev/nvme2n1p1,cache=directsync \
"

This can't stay like this, because it will only work on a specifically configured machine. Maybe introduce an environment variable which takes three file names "r0,r1,r2" and uses them for the RAID devices. Given file-size concerns, maybe we don't want to auto-create RAID devices in general either, so the variable could have an "auto" value which triggers the automatic file setup like before.

I'm very open to other ideas.
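
A possible sketch of that environment-variable approach. The variable name `RAID_MEMBERS`, the `auto` default, and the helper function are all hypothetical, not existing options in qemu_args.sh:

```shell
# Hypothetical: RAID_MEMBERS is either "auto" or a comma-separated
# list of three block devices / image files to use for r0,r1,r2.
build_raid_drive_args() {
    members="${RAID_MEMBERS:-auto}"
    drives=""
    if [ "$members" = "auto" ]; then
        # Fall back to auto-created sparse image files, as before.
        mkdir -p ./test/build
        for i in 0 1 2; do
            img="./test/build/raid1_${i}.img"
            [ -f "$img" ] || truncate -s 1G "$img"  # sparse, so cheap
            drives="$drives -drive if=none,format=raw,id=r${i},file=${img},cache=directsync"
        done
    else
        i=0
        old_ifs="$IFS"; IFS=','
        for dev in $members; do
            drives="$drives -drive if=none,format=raw,id=r${i},file=${dev},cache=directsync"
            i=$((i + 1))
        done
        IFS="$old_ifs"
    fi
    printf '%s\n' "$drives"
}
```

Usage would be something like `RAID_MEMBERS=/dev/nvme0n1p1,/dev/nvme1n1p1,/dev/nvme2n1p1 ./tools/qemu_args.sh` on a machine with real partitions, or leaving the variable unset to get the image-file default.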

Comment thread OSDK.toml
Comment on lines -65 to +67
-drive if=none,format=raw,id=r0,file=./test/build/raid1_0.img \
-drive if=none,format=raw,id=r1,file=./test/build/raid1_1.img \
-drive if=none,format=raw,id=r0,file=/dev/nvme0n1p1 \
-drive if=none,format=raw,id=r1,file=/dev/nvme1n1p1 \
-drive if=none,format=raw,id=r2,file=/dev/nvme2n1p1 \

I'm not sure how to handle this given the discussion of multiple configurations above. Do you know offhand when this file is actually used? I think it's ignored most of the time. If we can understand when this file is actually used, we can potentially remove the RAID devices entirely and just document that RAID isn't configured in that specific configuration.
