Skip to content

fix(kvm-clock): do not jump monotonic clock on restore#5809

Open
Manciukic wants to merge 3 commits intofirecracker-microvm:mainfrom
Manciukic:fix-clock-jump
Open

fix(kvm-clock): do not jump monotonic clock on restore#5809
Manciukic wants to merge 3 commits intofirecracker-microvm:mainfrom
Manciukic:fix-clock-jump

Conversation

@Manciukic
Copy link
Copy Markdown
Contributor

@Manciukic Manciukic commented Apr 2, 2026

Changes

This patch adds a new API flag to LoadSnapshot, clock_realtime, that advances the clock on restore when set (default is False). Rather than the clock flags being decided at snapshot time, the restore path ignores those flags and decides what to do depending on the clock_realtime flag. This is because the other available flags are ignored (KVM_CLOCK_TSC_HOST) or cannot be passed to set_clock (KVM_CLOCK_TSC_STABLE) , meaning the only valid flag is KVM_CLOCK_REALTIME.

The name of the flag was kept generic as we may add this behaviour for the other clock sources in the future, if the need arises.

Reason

Firecracker has never advanced the clock on restore for any of the supported clocksources. Since Linux 5.16, the KVM_CLOCK_REALTIME has been passed to kvm-clock, causing the monotonic time in the guest to jump when using kvm-clock as clock source.

Despite being unexpected and not what Firecracker should do, we recognize this may be a valid usecase so this patch adds a way to configure it, keeping the default to the expected documented behaviour.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkbuild --all to verify that the PR passes
    build checks on all supported architectures.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 95.23810% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.99%. Comparing base (8b55c86) to head (c7c682c).

Files with missing lines Patch % Lines
src/vmm/src/builder.rs 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5809      +/-   ##
==========================================
- Coverage   83.00%   82.99%   -0.01%     
==========================================
  Files         275      275              
  Lines       29383    29399      +16     
==========================================
+ Hits        24388    24401      +13     
- Misses       4995     4998       +3     
Flag Coverage Δ
5.10-m5n.metal 83.32% <100.00%> (+<0.01%) ⬆️
5.10-m6a.metal 82.65% <100.00%> (+0.01%) ⬆️
5.10-m6g.metal 79.89% <83.33%> (-0.01%) ⬇️
5.10-m6i.metal 83.32% <100.00%> (+<0.01%) ⬆️
5.10-m7a.metal-48xl 82.64% <100.00%> (-0.01%) ⬇️
5.10-m7g.metal 79.89% <83.33%> (-0.01%) ⬇️
5.10-m7i.metal-24xl 83.29% <100.00%> (+<0.01%) ⬆️
5.10-m7i.metal-48xl 83.29% <100.00%> (+<0.01%) ⬆️
5.10-m8g.metal-24xl 79.89% <83.33%> (-0.01%) ⬇️
5.10-m8g.metal-48xl 79.88% <83.33%> (-0.01%) ⬇️
5.10-m8i.metal-48xl 83.29% <100.00%> (-0.01%) ⬇️
5.10-m8i.metal-96xl 83.29% <100.00%> (+<0.01%) ⬆️
6.1-m5n.metal 83.34% <100.00%> (-0.01%) ⬇️
6.1-m6a.metal 82.67% <100.00%> (+<0.01%) ⬆️
6.1-m6g.metal 79.89% <83.33%> (-0.01%) ⬇️
6.1-m6i.metal 83.35% <100.00%> (+0.01%) ⬆️
6.1-m7a.metal-48xl 82.66% <100.00%> (+<0.01%) ⬆️
6.1-m7g.metal 79.88% <83.33%> (-0.01%) ⬇️
6.1-m7i.metal-24xl 83.36% <100.00%> (+<0.01%) ⬆️
6.1-m7i.metal-48xl 83.36% <100.00%> (+<0.01%) ⬆️
6.1-m8g.metal-24xl 79.88% <83.33%> (-0.01%) ⬇️
6.1-m8g.metal-48xl 79.88% <83.33%> (-0.01%) ⬇️
6.1-m8i.metal-48xl 83.36% <100.00%> (-0.01%) ⬇️
6.1-m8i.metal-96xl 83.36% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Manciukic Manciukic force-pushed the fix-clock-jump branch 4 times, most recently from 295403e to 9565fa8 Compare April 2, 2026 14:18
@Manciukic Manciukic marked this pull request as ready for review April 2, 2026 14:50
@Manciukic Manciukic requested review from kalyazin and pb8o as code owners April 2, 2026 14:50
@Manciukic Manciukic self-assigned this Apr 2, 2026
@Manciukic Manciukic added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Apr 2, 2026
When using `kvm-clock` as clock source on `x86_64`, it's possible to optionally
set the `clock_realtime: true` in the `LoadSnapshot` request to advance the
clock source on the guest at restore time. Note that this may cause issues
within the guest as the clock source will appear to suddenly jump.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: drop the "source"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the second one at line 498? yeah that makes sense

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, also in 499

CHANGELOG.md Outdated
- [#5809](https://github.com/firecracker-microvm/firecracker/pull/5809): Fixed a
bug on host Linux >= 5.16 for x86_64 guests using the `kvm-clock` clock source
causing the monotonic clock to jump on restore by the wall-clock time elapsed
since the snapshot was taken. Users that want to keep the previous behaviour
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 'previous' is ambiguous in this paragraph. I think it should be explicit what is the behaviour as of the new version and what does setting clock_realtime does.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can mention the KVM_CLOCK_REALTIME instead of just "previous behaviour".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users using kvm-clock that want to explicitly advance the clock with KVM_CLOCK_REALTIME can opt back in...

Removes trailing whitespace.

Signed-off-by: Riccardo Mancini <mancio@amazon.com>
ilstam
ilstam previously approved these changes Apr 2, 2026
Firecracker has never advanced the clock on restore for any of the
supported clocksources. Since Linux 5.16, the KVM_CLOCK_REALTIME has
been passed to kvm-clock, causing the monotonic time in the guest to
jump when using kvm-clock as clock source.

Despite being unexpected and not what Firecracker should do, we
recognize this may be a valid usecase so this patch adds a way to
configure it, keeping the default to the expected documented behaviour.

This patch adds a new API flag to LoadSnapshot, clock_realtime, that
advances the clock on restore when set (default is False). Rather than
the clock flags being decided at snapshot time, the restore path ignores
those flags and decides what to do depending on the clock_realtime flag.
This is because the other available flag (KVM_CLOCK_TSC_STABLE) cannot
even be passed to `set_clock`, meaning the only valid flag is
KVM_CLOCK_REALTIME.

The name of the flag was kept generic as we may add this behaviour for
the other clock sources in the future, if the need arises.

Signed-off-by: Riccardo Mancini <mancio@amazon.com>
We actually support all of them.

Signed-off-by: Riccardo Mancini <mancio@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Awaiting review Indicates that a pull request is ready to be reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants