fix: OTA partition selection producing ota_seq == 0#5346
fix: OTA partition selection producing ota_seq == 0#5346zone117x wants to merge 5 commits intoesp-rs:mainfrom
ota_seq == 0#5346Conversation
…a_seq == 0
`Ota::set_current_app_partition` produces `ota_seq = 0` when selecting
certain partitions from uninitialized otadata (the state espflash leaves
after a direct flash). The esp-idf bootloader ignores `ota_seq == 0`,
so the device boots the old firmware instead of the newly flashed image.
The root cause is the Factory-case increment formula:
((app_number + 1) + count) % count
For Ota1 with 2 partitions this gives `((1+1)+2) % 2 = 0`. With both
otadata slots uninitialized, `new_seq = inc = 0`.
Additionally, `get_slot_seq` treats `ota_seq == 0` as a valid sequence
number, causing `current_app_partition` to return a wrong result via
unsigned underflow (`(0u32 - 1) % 2 = 1`). This makes the broken
state sticky: subsequent calls to `set_current_app_partition` see the
target as already selected and become no-ops.
Fix:
- Replace the Factory-case formula with `app_number + 1`, matching the
esp-idf C implementation (`SUB_TYPE_ID(subtype) + 1`).
- Normalize `ota_seq == 0` to `UNINITALIZED_SEQUENCE` in `get_slot_seq`
so it is handled the same as an erased slot.
Refs:
- esp-idf uninitialized case: https://github.com/espressif/esp-idf/blob/v5.4.1/components/app_update/esp_ota_ops.c#L430-L434
- esp-idf active case: https://github.com/espressif/esp-idf/blob/v5.4.1/components/app_update/esp_ota_ops.c#L418-L428
There was a problem hiding this comment.
Pull request overview
Fixes OTA partition selection in esp-bootloader-esp-idf when otadata is erased/uninitialized (e.g., after direct flashing), preventing ota_seq == 0 from being written/handled and ensuring the bootloader selects the intended OTA slot.
Changes:
- Normalize
ota_seq == 0to the uninitialized sentinel when readingotadatato avoid underflow-based mis-selection. - Adjust the Factory-case sequence increment logic to match ESP-IDF behavior (
app_number + 1) soota_seqis always valid (>= 1). - Add a regression unit test covering Factory → Ota1 with 2 OTA partitions; update the crate changelog.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| esp-bootloader-esp-idf/src/ota.rs | Fixes ota_seq handling/selection logic and adds a regression test to prevent ota_seq == 0 states. |
| esp-bootloader-esp-idf/CHANGELOG.md | Documents the bugfix in the Unreleased changelog. |
| // The esp-idf C writer never produces ota_seq == 0 (see | ||
| // `esp_rewrite_ota_data` in esp-idf's `esp_ota_ops.c`), so if it | ||
| // appears on flash it was written by a buggy caller. Normalize it | ||
| // to UNINITALIZED_SEQUENCE so the rest of the selection logic does | ||
| // not misinterpret it via unsigned underflow in `(0u32 - 1)`. |
There was a problem hiding this comment.
UNINITALIZED_SEQUENCE is misspelled (missing the second 'i' in 'uninitialized'), and the new comment repeats this typo. Since this constant is private to the module, consider renaming it (and updating uses) to improve readability and avoid propagating the typo into documentation/comments.
There was a problem hiding this comment.
It was already misspelled. Perhaps better to fix in a follow-up PR, because that variable is referenced in a dozen different areas.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
ota_seq == 0
ota_seq == 0ota_seq == 0
| // `SUB_TYPE_ID(subtype) + 1` for the uninitialized case: | ||
| // https://github.com/espressif/esp-idf/blob/v5.4.1/components/app_update/esp_ota_ops.c#L430-L434 | ||
| // | ||
| // Note: the previous formula `((app_number + 1) + count) % |
There was a problem hiding this comment.
no need to talk about how things looked previously - it's in the changelog and git-history
|
I would like to see a reproducer that shows our code actually isn't correct here, there are some red flags in this PR. |
|
New commits in main has made this PR unmergable. Please resolve the conflicts. |
Ota::set_current_app_partitionproducesota_seq = 0when selecting certain partitions from uninitializedotadata, for example the stateespflashleaves after a direct flash. The esp-idf bootloader ignoresota_seq == 0, so the device boots the old firmware instead of the newly flashed OTA image.This breaks the standard OTA workflow: flash via
espflash-> trigger first OTA -> device stays on old firmware.Root cause
The Factory-case increment formula:
For
Ota1with 2 partitions:((1+1)+2) % 2 = 0. With both otadata slots uninitialized,new_seq = inc = 0.Once
ota_seq == 0is on flash,current_app_partition()misinterprets it via unsigned underflow ((0u32 - 1) % 2 = 1→Ota1), making subsequentset_current_app_partition(Ota1)calls no-ops. The broken state is sticky across OTA retries.Fix
app_number + 1, matching the esp-idf C implementation (SUB_TYPE_ID(subtype) + 1).ota_seq == 0toUNINITALIZED_SEQUENCEinget_slot_seq()so it is handled the same as an erased slot, preventing the sticky broken state.Affected configurations
Any device with:
ota_0,ota_1) and no factory partitionotadataleft uninitialized (e.g. afterespflash flash)ota_1(the non-booted slot)In those cases, using the
OtaUpdaterto activate the next partition does not work once the device is reboot, sample code:Submission Checklist 📝
cargo xtask fmt-packagescommand to ensure that all changed code is formatted correctly.CHANGELOG.mdin the proper section.Extra:
Testing
test_factory_to_ota1_produces_valid_seq— verifiesota_seq >= 1and correct partition selection for Factory → Ota1 with 2 partitionsesp-bootloader-esp-idfunit tests pass