Skip to content

fix: OTA partition selection producing ota_seq == 0#5346

Open
zone117x wants to merge 5 commits intoesp-rs:mainfrom
zone117x:main
Open

fix: OTA partition selection producing ota_seq == 0#5346
zone117x wants to merge 5 commits intoesp-rs:mainfrom
zone117x:main

Conversation

@zone117x
Copy link
Copy Markdown

Ota::set_current_app_partition produces ota_seq = 0 when selecting certain partitions from uninitialized otadata, for example the state espflash leaves after a direct flash. The esp-idf bootloader ignores ota_seq == 0, so the device boots the old firmware instead of the newly flashed OTA image.

This breaks the standard OTA workflow: flash via espflash -> trigger first OTA -> device stays on old firmware.

Root cause

The Factory-case increment formula:

((app.ota_app_number() as i32 + 1) + count as i32) as u32 % count as u32

For Ota1 with 2 partitions: ((1+1)+2) % 2 = 0. With both otadata slots uninitialized, new_seq = inc = 0.

Once ota_seq == 0 is on flash, current_app_partition() misinterprets it via unsigned underflow ((0u32 - 1) % 2 = 1Ota1), making subsequent set_current_app_partition(Ota1) calls no-ops. The broken state is sticky across OTA retries.

Fix

  • Increment formula: replace the Factory-case formula with app_number + 1, matching the esp-idf C implementation (SUB_TYPE_ID(subtype) + 1).
  • Sequence validation: normalize ota_seq == 0 to UNINITALIZED_SEQUENCE in get_slot_seq() so it is handled the same as an erased slot, preventing the sticky broken state.

Affected configurations

Any device with:

  • Two OTA app partitions (ota_0, ota_1) and no factory partition
  • otadata left uninitialized (e.g. after espflash flash)
  • First OTA targeting ota_1 (the non-booted slot)

In those cases, using the OtaUpdater to activate the next partition does not work once the device is reboot, sample code:

fn activate_next_ota_partition(flash: &mut esp_storage::FlashStorage) {
    use esp_bootloader_esp_idf::ota::OtaImageState;
    use esp_bootloader_esp_idf::ota_updater::OtaUpdater;
    use esp_bootloader_esp_idf::partitions::PARTITION_TABLE_MAX_LEN;

    let mut pt_buf = [0u8; PARTITION_TABLE_MAX_LEN];
    let mut ota = OtaUpdater::new(flash, &mut pt_buf).expect("OtaUpdater init failed");
    ota.activate_next_partition().expect("activate failed");
    ota.set_current_ota_state(OtaImageState::New).expect("set state failed");
}

Submission Checklist 📝

  • I have updated existing examples or added new ones (if applicable).
  • I have used cargo xtask fmt-packages command to ensure that all changed code is formatted correctly.
  • My changes were added to the CHANGELOG.md in the proper section.
  • I have added necessary changes to user code to the latest Migration Guide.
  • My changes are in accordance to the esp-rs developer guidelines

Extra:

Testing

  • New regression test test_factory_to_ota1_produces_valid_seq — verifies ota_seq >= 1 and correct partition selection for Factory → Ota1 with 2 partitions
  • All 16 existing esp-bootloader-esp-idf unit tests pass

Copilot AI review requested due to automatic review settings April 12, 2026 18:39
…a_seq == 0

`Ota::set_current_app_partition` produces `ota_seq = 0` when selecting
certain partitions from uninitialized otadata (the state espflash leaves
after a direct flash).  The esp-idf bootloader ignores `ota_seq == 0`,
so the device boots the old firmware instead of the newly flashed image.

The root cause is the Factory-case increment formula:

    ((app_number + 1) + count) % count

For Ota1 with 2 partitions this gives `((1+1)+2) % 2 = 0`.  With both
otadata slots uninitialized, `new_seq = inc = 0`.

Additionally, `get_slot_seq` treats `ota_seq == 0` as a valid sequence
number, causing `current_app_partition` to return a wrong result via
unsigned underflow (`(0u32 - 1) % 2 = 1`).  This makes the broken
state sticky: subsequent calls to `set_current_app_partition` see the
target as already selected and become no-ops.

Fix:
- Replace the Factory-case formula with `app_number + 1`, matching the
  esp-idf C implementation (`SUB_TYPE_ID(subtype) + 1`).
- Normalize `ota_seq == 0` to `UNINITALIZED_SEQUENCE` in `get_slot_seq`
  so it is handled the same as an erased slot.

Refs:
- esp-idf uninitialized case: https://github.com/espressif/esp-idf/blob/v5.4.1/components/app_update/esp_ota_ops.c#L430-L434
- esp-idf active case: https://github.com/espressif/esp-idf/blob/v5.4.1/components/app_update/esp_ota_ops.c#L418-L428
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes OTA partition selection in esp-bootloader-esp-idf when otadata is erased/uninitialized (e.g., after direct flashing), preventing ota_seq == 0 from being written/handled and ensuring the bootloader selects the intended OTA slot.

Changes:

  • Normalize ota_seq == 0 to the uninitialized sentinel when reading otadata to avoid underflow-based mis-selection.
  • Adjust the Factory-case sequence increment logic to match ESP-IDF behavior (app_number + 1) so ota_seq is always valid (>= 1).
  • Add a regression unit test covering Factory → Ota1 with 2 OTA partitions; update the crate changelog.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
esp-bootloader-esp-idf/src/ota.rs Fixes ota_seq handling/selection logic and adds a regression test to prevent ota_seq == 0 states.
esp-bootloader-esp-idf/CHANGELOG.md Documents the bugfix in the Unreleased changelog.

Comment thread esp-bootloader-esp-idf/src/ota.rs Outdated
Comment on lines +223 to +227
// The esp-idf C writer never produces ota_seq == 0 (see
// `esp_rewrite_ota_data` in esp-idf's `esp_ota_ops.c`), so if it
// appears on flash it was written by a buggy caller. Normalize it
// to UNINITALIZED_SEQUENCE so the rest of the selection logic does
// not misinterpret it via unsigned underflow in `(0u32 - 1)`.
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UNINITALIZED_SEQUENCE is misspelled (missing the second 'i' in 'uninitialized'), and the new comment repeats this typo. Since this constant is private to the module, consider renaming it (and updating uses) to improve readability and avoid propagating the typo into documentation/comments.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was already misspelled. Perhaps better to fix in a follow-up PR, because that variable is referenced in a dozen different areas.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc #5348

Comment thread esp-bootloader-esp-idf/CHANGELOG.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 12, 2026 18:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@zone117x zone117x changed the title fix(esp-bootloader-esp-idf): fix OTA partition selection producing ota_seq == 0 fix(esp-bootloader-esp-idf): OTA partition selection producing ota_seq == 0 Apr 12, 2026
@zone117x zone117x changed the title fix(esp-bootloader-esp-idf): OTA partition selection producing ota_seq == 0 fix: OTA partition selection producing ota_seq == 0 Apr 12, 2026
// `SUB_TYPE_ID(subtype) + 1` for the uninitialized case:
// https://github.com/espressif/esp-idf/blob/v5.4.1/components/app_update/esp_ota_ops.c#L430-L434
//
// Note: the previous formula `((app_number + 1) + count) %
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to talk about how things looked previously - it's in the changelog and git-history

@bugadani
Copy link
Copy Markdown
Contributor

I would like to see a reproducer that shows our code actually isn't correct here, there are some red flags in this PR.

Copilot AI review requested due to automatic review settings April 13, 2026 16:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread esp-bootloader-esp-idf/src/ota.rs Outdated
@github-actions github-actions Bot added the merge-conflict Merge conflict detected. Automatically added/removed by CI. label Apr 16, 2026
@github-actions
Copy link
Copy Markdown

New commits in main has made this PR unmergable. Please resolve the conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-conflict Merge conflict detected. Automatically added/removed by CI.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants