Summary
A TOCTOU race between comment purge and the postUpdates MFS sync can leave a purged post's CommentUpdate permanently in the postUpdates MFS tree of a community's IPFS node.
Surfaced as a flaky failure in the remote-libp2pjs CI config on the helia/libp2p dependency upgrade branch:
AssertionError: MFS path /<community>/postUpdates/86400/<postCid>/update still exists after 30s - purge cleanup did not complete
at test/node-and-browser/publications/comment-moderation/purged.test.ts:292
The slower libp2pjs client transport widens timing windows, so the race lands; the Kubo-RPC / gateway configs (same community-side code) pass. The community-side purge/sync code is not changed by the dependency-upgrade PR — this is a pre-existing concurrency bug exposed by timing.
Root cause
syncIpnsWithDb (sync loop) and storeCommentModeration (pubsub challenge handler) run concurrently with no mutual exclusion. Observed timeline (community-side log):
1. t0: updateCommentsThatNeedToBeUpdated calculates the post's CommentUpdate (post still in DB), producing a row with localMfsPath = /<addr>/postUpdates/86400/<postCid>/update.
2. t0+~150ms: a purge moderation arrives; storeCommentModeration deletes the post from the DB, queues its MFS path, and rmUnneededMfsPaths removes it from MFS (this also drains _mfsPathsToRemove).
3. t0+~190ms: the in-flight sync from step 1 reaches syncPostUpdatesWithIpfs and writes the captured (pre-purge) row back to MFS.
Because the post is now gone from the DB, it is never recalculated and never re-purged. Nothing scans the MFS tree to remove entries absent from the DB (calculateNewPostUpdates only reads bucket dirs), so the resurrected entry persists indefinitely.
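The interleaving above can be reproduced in a small in-memory model. This is only a sketch: the `db` set, the `mfs` map, and the functions `calculateRows`, `purge`, `writeRows`, and `mfsPathFor` are hypothetical stand-ins, not the project's actual code, and the real cycle is asynchronous rather than three sequential calls.

```typescript
// Hypothetical in-memory stand-ins for the community DB and the MFS tree.
type PostUpdateRow = { postCid: string; localMfsPath: string };

const db = new Set<string>();          // CIDs of posts currently in the DB
const mfs = new Map<string, string>(); // MFS path -> serialized CommentUpdate

const mfsPathFor = (postCid: string): string =>
  `/addr/postUpdates/86400/${postCid}/update`;

// Phase 1 of a sync cycle: compute rows from the DB state at this instant.
function calculateRows(): PostUpdateRow[] {
  return Array.from(db).map((postCid) => ({
    postCid,
    localMfsPath: mfsPathFor(postCid),
  }));
}

// Purge handler: delete the post from the DB and remove its MFS path right away.
function purge(postCid: string): void {
  db.delete(postCid);
  mfs.delete(mfsPathFor(postCid));
}

// Phase 2 of the sync cycle: write the rows captured in phase 1, unchecked.
function writeRows(rows: PostUpdateRow[]): void {
  for (const row of rows) {
    mfs.set(row.localMfsPath, `CommentUpdate for ${row.postCid}`);
  }
}

db.add("postA");
const captured = calculateRows(); // step 1: row captured while the post exists
purge("postA");                   // step 2: purge lands mid-cycle
writeRows(captured);              // step 3: stale row written back to MFS

// Later cycles compute rows from the DB only, so they never touch postA again.
const laterRows = calculateRows();
const resurrected = mfs.has(mfsPathFor("postA"));
console.log({ inDb: db.has("postA"), inMfs: resurrected, laterRows: laterRows.length });
```

Running the model leaves `postA` absent from the DB but present in the MFS map, and the later cycle produces no rows for it, which mirrors why the entry is never cleaned up.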
Fix
In syncPostUpdatesWithIpfs, skip writing post-update rows whose comment no longer exists in the DB (purged mid-cycle), and don't throw when the list is empty after filtering.
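A minimal sketch of that guard, assuming a row shape and two injected dependencies (`commentExists`, `writeToMfs`) that are hypothetical here; the real signature of syncPostUpdatesWithIpfs is not shown in this description and may differ.

```typescript
// Hypothetical row shape and dependencies; names are illustrative only.
type PostUpdateRow = { postCid: string; localMfsPath: string };

interface SyncDeps {
  commentExists(cid: string): boolean;         // assumed DB lookup
  writeToMfs(path: string, cid: string): void; // assumed MFS write
}

// Re-check the DB right before writing; an empty filtered list is a no-op.
function syncPostUpdates(rows: PostUpdateRow[], deps: SyncDeps): string[] {
  const live = rows.filter((row) => deps.commentExists(row.postCid));
  if (live.length === 0) return []; // all rows purged mid-cycle: do not throw
  for (const row of live) deps.writeToMfs(row.localMfsPath, row.postCid);
  return live.map((row) => row.localMfsPath);
}

// Usage with in-memory fakes: postA was purged after its row was captured.
const dbCids = new Set<string>(["postB"]);
const written: string[] = [];
const deps: SyncDeps = {
  commentExists: (cid) => dbCids.has(cid),
  writeToMfs: (path) => {
    written.push(path);
  },
};

const staleRows: PostUpdateRow[] = [
  { postCid: "postA", localMfsPath: "/addr/postUpdates/86400/postA/update" },
  { postCid: "postB", localMfsPath: "/addr/postUpdates/86400/postB/update" },
];

const result = syncPostUpdates(staleRows, deps);           // writes only postB
const emptyResult = syncPostUpdates([staleRows[0]], deps); // nothing left to write
console.log(result, emptyResult);
```

In this synchronous sketch the existence check and the write cannot be interleaved. If the real check and write are separated by awaits, the guard would narrow the window rather than strictly close it; the description does not say which is the case.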
Plan
Regression test under test/node/ (capture row → purge → stale sync write → assert not resurrected)
Fix in syncPostUpdatesWithIpfs
npm run build + test green under local-kubo-rpc