fix: [SDK-4754] guard IndexedDB reads + writes from iOS Safari PWA wedge #1472

Merged
sherwinski merged 4 commits into main from sherwin/sdk-4754 on Jun 9, 2026
Conversation


@sherwinski sherwinski commented Jun 5, 2026

Contributor

Description

1 Line Summary

Extend the iOS Safari PWA IndexedDB wedge guard to cover reads as well as writes, so OneSignal.init() no longer re-hangs ~30 min after a push subscription on subsequent navigations.

Details

Background: the 160605 fix (#1468) capped writes to the Options store with a timeout + page-scoped circuit breaker. On-device verification showed the WebKit readwrite wedge is DB-wide, not Options-specific, so this branch first generalized the guard to every readwrite op (put/delete/clear) across all stores.

That still wasn't enough. The key insight, confirmed with on-device breadcrumb logging:

  • Our timeout makes the JS promise resolve, but the underlying IndexedDB readwrite transaction stays open. WebKit then serializes every later op queued behind it on that object store — including reads.
  • So guarding writes alone let init() clear the wedged Options write, advance past internalInit / sessionInit, then hang on the first post-wedge read of the same store: getSubscription's Options reads inside storeInitialValues. Reads on other stores (Ids/*) completed fine, which is why the hang looked like it moved.
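
The two bullets above can be modelled without IndexedDB at all. The sketch below is a toy under stated assumptions, not SDK code: `SerialStore`, `withTimeout`, and `demo` are hypothetical names, and the queue only imitates the serialization described above. It shows that timing out the caller's promise does not release a stalled operation, so a read queued behind it stalls too.

```typescript
// Toy model (assumption): operations on one store run strictly one after another.
class SerialStore {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(op: () => Promise<T>): Promise<T> {
    // Each op starts only after the previous one has settled.
    const run = this.tail.then(op);
    // Keep the chain alive even if an op rejects.
    this.tail = run.catch(() => undefined);
    return run;
  }
}

// Resolves with `fallback` after `ms`, but does NOT cancel the underlying promise.
function withTimeout<T>(p: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    p.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

async function demo(): Promise<{ write: string; read: string }> {
  const store = new SerialStore();
  // A "wedged" write: its promise never settles.
  const wedgedWrite = store.enqueue(() => new Promise<string>(() => {}));
  const write = await withTimeout(wedgedWrite, 20, "write timed out");
  // The caller has moved on, but the write still occupies the queue,
  // so this read never even starts and also has to time out.
  const read = await withTimeout(store.enqueue(async () => "value"), 20, "read timed out");
  return { write, read };
}
```

In this model the read would return `"value"` only if the stalled write were actually released, which a caller-side timeout cannot do.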

This change routes get/getAll through the same timeout + circuit breaker as the writes:

  • Refactors guardReadwrite → a generic guard(label, op, fallback); renames readwriteWedged → dbWedged, READWRITE_TIMEOUT_MS → DB_TIMEOUT_MS. isReadwriteWedged() is kept as the exported getter (consumed by initSaveState's app-ID-deferral bail-out).
  • Once the breaker trips, every op short-circuits for the page's lifetime: writes → no-op, get → undefined, getAll → []. A read that itself stalls also trips the breaker.
  • Dropped writes are session metadata the SW re-derives, or idempotent queued operations retried on the next clean load. Dropped reads fall back to the in-memory model state hydrated before the wedge.
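
For readers who have not opened the diff, the shape of such a guard can be sketched as follows. This is an illustration under assumptions, not the merged implementation: only the `guard(label, op, fallback)` signature and the timeout-plus-breaker behavior come from the description above, while the `createGuard` factory, the `TIMED_OUT` sentinel, and the warning text are hypothetical and exist only to keep the example self-contained.

```typescript
// Hypothetical factory; the SDK presumably keeps this state at module scope.
function createGuard(timeoutMs: number) {
  // Page-scoped breaker: once set, it is never reset until the page reloads.
  let wedged = false;
  const TIMED_OUT = Symbol("timedOut");

  async function guard<T>(label: string, op: () => Promise<T>, fallback: T): Promise<T> {
    // Breaker already tripped: skip the database entirely.
    if (wedged) return fallback;

    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<typeof TIMED_OUT>((resolve) => {
      timer = setTimeout(() => resolve(TIMED_OUT), timeoutMs);
    });

    try {
      const result = await Promise.race([op(), timeout]);
      if (result === TIMED_OUT) {
        // A stalled read or write trips the breaker for every later op.
        wedged = true;
        console.warn(`${label} timed out`);
        return fallback;
      }
      return result as T;
    } finally {
      clearTimeout(timer);
    }
  }

  return { guard, isWedged: (): boolean => wedged };
}
```

Under this sketch a read would be wrapped with `undefined` as its fallback, a getAll with `[]`, and a write with a fallback that simply signals nothing was stored.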

The preview/pageA.html repro now calls login + addTag after init so the operations store keeps getting writes on every load — this mirrors the customer's app and reliably triggers the wedge + same-store read hang. (Kept in this PR to make the fix verifiable; can be reverted before/after merge.)

Systems Affected

  • WebSDK
  • Backend
  • Dashboard

Validation

Tests

Info

  • vp check: 0 errors.
  • vp test --run: 519 passed. The breaker regression test now asserts that after a wedged write trips the breaker, reads short-circuit too (get → undefined, getAll → []), not just writes.
  • vp run build:prod: clean, size-limit within budget, API validation passed.

Confirmed on device (iOS Safari PWA, clean slate per run): delete ONE_SIGNAL_SDK_DB / reinstall PWA → Page A → Page B → Register → B → A. The breaker trips once (db.put(...) timed out warning), all subsequent ops short-circuit, and init() completes in ~1.6s instead of hanging ~30 min.

Checklist

  • All the automated tests pass or I explained why that is not possible
  • I have personally tested this on my machine or explained why that is not possible
  • I have included test coverage for these changes or explained why they are not needed

Screenshots

Checklist

  • I have included screenshots/recordings of the intended results or explained why they are not needed (log-based diagnosis; behavior covered by unit tests + on-device steps above)

Related Tickets


The 160605 fix only guarded the Options store, but the iOS Safari PWA
WebKit readwrite wedge is DB-wide: once init clears the Options write,
it re-hangs at the next unguarded write (operation queue / model-store
persistence). Generalize the timeout + page-scoped circuit breaker to
every readwrite op (put/delete/clear) across all stores.
@sherwinski sherwinski requested a review from fadi-george June 5, 2026 20:09

@fadi-george fadi-george left a comment

Contributor


Changes seem fine; double-check that nothing breaks from this "wedge" logic.

The 1.5s timeout makes the readwrite promise resolve, but the underlying
WebKit IndexedDB transaction stays open and serializes every later op on
that object store behind it -- including reads. Guarding only writes let
init() crawl past the wedged Options write and then hang on the first
post-wedge read of the same store (getSubscription's Options reads during
storeInitialValues), reproducing the ~30 min stall.

Route get/getAll through the same timeout + page-scoped circuit breaker as
put/delete/clear: once wedged, reads short-circuit to a fallback
(get -> undefined, getAll -> []) and a read that itself stalls also trips
the breaker. Dropped reads fall back to the in-memory model state hydrated
before the wedge.

Extend the breaker test to verify that once a wedged write trips the
circuit breaker, reads short-circuit too (get -> undefined, getAll -> []),
locking in the guard that keeps init() from hanging on the first
post-wedge read of the wedged store.

After init, Page A now calls login + addTag so the operations store keeps
getting writes on every load. This mirrors the customer's app behavior and
reliably reproduces the readwrite wedge (and the subsequent same-store read
hang) that the guard fix addresses.
@sherwinski sherwinski changed the title from "fix: [SDK-4754] extend IndexedDB wedge guard to all readwrite ops" to "fix: [SDK-4754] guard IndexedDB reads + writes from iOS Safari PWA wedge" on Jun 9, 2026

sherwinski commented Jun 9, 2026

Contributor Author

A few things worth knowing when reviewing:

1. Why reads needed guarding too
The earlier write-only guard was correct for a minimal reproduction but init still hung. Our timeout resolves the JS promise, but the underlying WebKit readwrite transaction stays open, and on iOS the persistent (SQLite) IndexedDB backing store runs only one transaction at a time for the whole database — not one per object store. So once a write wedges, every later op queued behind it is blocked DB-wide, reads included, and WebKit has no server-side timeout to abort it (the ~30 min "recovery" is actually iOS terminating the suspended process, which finally closes the connection). That's why guarding get/getAll, not just writes, is what fully unblocks init(): at the engine level there's a single global lock, so once it's stuck nothing else can run until the page is gone. (Cases where reads of other stores seemed to succeed were reads that completed before the wedge, not concurrent with it.) Verified against the iOS 18.1 WebKit source; tracked upstream at https://bugs.webkit.org/show_bug.cgi?id=315804.

2. A benign update-subscription 400 (result = 3) can occur.
result = 3 is ExecutionResult._FailNoretry, so that specific stale/batched op is dropped, not retried. The push token still converges: it's re-derived from the browser pushManager on every load, and a fresh update-subscription op is enqueued reflecting the correct state. Even if that op's persistence is short-circuited during a wedged load, it's re-enqueued and flushed on the next clean load. In my experience, this only occurs in situations when you navigate away during the original op, interrupting the process.
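
The convergence argument can be illustrated with a small model. Everything here is hypothetical: `Outcome`, `SubscriptionOp`, `flush`, and `enqueueOnLoad` are invented for the example and are not the SDK's operation-queue API. The only detail taken from the comment above is that a fail-no-retry result drops the op rather than retrying it, and that a fresh op reflecting the current token is enqueued on each load.

```typescript
// Hypothetical outcome type; "fail-no-retry" stands in for result = 3.
type Outcome = "success" | "fail-retry" | "fail-no-retry";

interface SubscriptionOp {
  token: string;
}

// Returns the ops still queued after one flush: only retryable failures stay.
function flush(
  queue: SubscriptionOp[],
  execute: (op: SubscriptionOp) => Outcome,
): SubscriptionOp[] {
  return queue.filter((op) => execute(op) === "fail-retry");
}

// On every load, enqueue a fresh op derived from the browser's current token.
function enqueueOnLoad(queue: SubscriptionOp[], currentToken: string): SubscriptionOp[] {
  return [...queue, { token: currentToken }];
}
```

In this model a stale op that gets a fail-no-retry result simply leaves the queue, and the fresh op enqueued on load is the one that carries the correct token to the backend.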

@sherwinski sherwinski merged commit 21f1a4c into main Jun 9, 2026
3 checks passed
@sherwinski sherwinski deleted the sherwin/sdk-4754 branch June 9, 2026 02:09
@github-actions github-actions Bot mentioned this pull request Jun 9, 2026

2 participants