Resumable chunked upload to R2 (ALO-134) #102

Open · aloewright wants to merge 2 commits into main from conductor/alo-134-resumable-chunked-upload-to-r2

Conversation

aloewright (Owner) commented May 8, 2026

Summary

Adds resume-on-disconnect support to the chunked upload flow.

  • GET /api/videos/upload/:uploadId/status — returns the KV-persisted manifest (receivedChunks, receivedBytes, nextChunkIndex, complete) so a client that lost its connection can pick up at nextChunkIndex instead of restarting.
  • DELETE /api/videos/upload/:uploadId — aborts the underlying R2 multipart upload and clears the manifest.

The chunk manifest is now persisted as a single KV key per upload (upload:<userId>:<uploadId>) holding videoId, r2Key, the R2 multipart upload id, original metadata, and the per-part etag/size map. Single-key writes keep the manifest atomic — the previous 3-key scheme could leave orphan state if a put() failed between writes.
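For illustration, the single-key scheme might persist a record shaped like this (field names are inferred from the description above, not copied from the source):

```typescript
// Hypothetical shape of the manifest stored under upload:<userId>:<uploadId>.
// All names here are assumptions for illustration.
interface UploadManifest {
  videoId: string;
  r2Key: string;
  multipartUploadId: string;
  metadata: { filename: string; chunkCount: number; totalBytes: number };
  // Per-part etag/size map, keyed by partNumber.
  parts: Record<number, { etag: string; size: number }>;
}

// One JSON.stringify + one KV put() makes each manifest update atomic:
// either the whole new manifest lands under the key, or none of it does.
function serializeManifest(m: UploadManifest): string {
  return JSON.stringify(m);
}
```

Because every write replaces the entire value, a failed put() leaves the previous manifest intact rather than a partially-updated trio of keys.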

Re-uploading the same chunk index is idempotent: the manifest tracks parts by partNumber, so a retried chunk replaces the prior etag/size and the byte budget check subtracts the old size before re-checking. The POST path also honors a client-provided uploadId on chunk 0, so re-init of the same upload reuses the existing R2 multipart upload instead of leaking it.
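The retry bookkeeping described above can be sketched as follows (the budget constant and record shapes are assumptions, not the actual source):

```typescript
// Illustrative byte budget; the real limit lives in the worker's config.
const MAX_TOTAL_BYTES = 30 * 1024 * 1024 * 1024;

type Parts = Record<number, { etag: string; size: number }>;

// Record (or re-record) a part idempotently: subtract the prior size for a
// retried partNumber before re-checking the budget, so a retry never
// double-counts bytes.
function recordPart(
  parts: Parts,
  receivedBytes: number,
  partNumber: number,
  etag: string,
  size: number,
): { parts: Parts; receivedBytes: number } {
  const priorSize = parts[partNumber]?.size ?? 0;
  const newTotal = receivedBytes - priorSize + size;
  if (newTotal > MAX_TOTAL_BYTES) {
    throw new Error("upload exceeds byte budget");
  }
  return {
    parts: { ...parts, [partNumber]: { etag, size } },
    receivedBytes: newTotal,
  };
}
```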

The manifest is saved before completion, so a crash between uploadPart and multipart.complete() leaves a recoverable manifest with one extra part (rather than a lost one). Manifests are scoped by user — cross-tenant status queries and aborts return 404.

nextChunkIndex is computed via a single linear scan over the sorted indices, addressing the O(N²) concern raised against the earlier draft.
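A minimal sketch of that scan, assuming receivedChunks is sorted ascending with no duplicates:

```typescript
// The first index where the sorted received list diverges from
// 0..chunkCount-1 is the first missing chunk; null means all present.
function nextMissingChunk(
  receivedChunks: number[],
  chunkCount: number,
): number | null {
  for (let i = 0; i < chunkCount; i++) {
    if (receivedChunks[i] !== i) return i;
  }
  return null;
}
```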

Linear: ALO-134

Test plan

  • npm test — 513 passed (46 files)
  • npm run lint — 0 warnings, AI Gateway guard clean
  • npm run type-check
  • Manual: simulate disconnect mid-upload, GET status, resume from nextChunkIndex
  • Manual: re-POST the same chunkIndex twice; manifest should hold the latest etag and total byte count should reflect the latest size
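The manual resume steps can be reduced to a pure client-side helper; the status shape follows the PR description, while chunksToResend is a hypothetical name:

```typescript
// Shape returned by GET /api/videos/upload/:uploadId/status per the PR text.
interface UploadStatus {
  receivedChunks: number[];
  receivedBytes: number;
  nextChunkIndex: number | null;
  complete: boolean;
}

// Given a status response, compute which chunk indices the client still
// needs to POST. Gaps after nextChunkIndex are included, so out-of-order
// receipt is handled too.
function chunksToResend(status: UploadStatus, chunkCount: number): number[] {
  if (status.complete || status.nextChunkIndex === null) return [];
  const received = new Set(status.receivedChunks);
  const todo: number[] = [];
  for (let i = status.nextChunkIndex; i < chunkCount; i++) {
    if (!received.has(i)) todo.push(i);
  }
  return todo;
}
```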

🤖 Generated with Claude Code

Adds GET /api/videos/upload/:uploadId/status so clients that disconnect
mid-upload can query the KV-persisted chunk manifest and resume from the
next missing chunkIndex. Adds DELETE /api/videos/upload/:uploadId to
abort and clean up an in-progress multipart upload.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 8, 2026 14:46
@aloewright aloewright added the conductor Conductor-managed PR label May 8, 2026
coderabbitai Bot (Contributor) commented May 8, 2026

Warning: Rate limit exceeded

@aloewright has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 40 minutes and 13 seconds before requesting another review. You’ve run out of usage credits; purchase more in the billing tab.


ℹ️ Review info — Configuration: Organization UI · Review profile: CHILL · Plan: Pro · Run ID: d736d9d9-5ff8-4802-80da-c57ddb170f0b

📥 Commits: reviewing files that changed from the base of the PR and between 4d3c13f and daf0f00.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (5)
  • src/workers/chunked-upload.test.ts
  • src/workers/chunked-upload.ts
  • src/workers/upload-resume.test.ts
  • src/workers/videos.test.ts
  • src/workers/videos.ts

ecc-tools Bot (Contributor) commented May 8, 2026

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request implements a resume-on-disconnect feature for video uploads by adding a status endpoint to track received chunks and a delete endpoint to abort multipart uploads. It also includes unit tests for these new routes and updates several package dependencies. Feedback was provided regarding the efficiency of the algorithm used to identify the next missing chunk index, suggesting an optimization to reduce complexity from O(N^2) to O(N).

Comment thread: src/workers/videos.ts (Outdated), lines +335 to +341

  let nextChunkIndex: number | null = null;
  for (let i = 0; i < uploadMeta.chunkCount; i++) {
    if (!receivedChunks.includes(i)) {
      nextChunkIndex = i;
      break;
    }
  }
Severity: medium

The current loop to find nextChunkIndex has O(N^2) complexity because it calls receivedChunks.includes(i) (an O(N) operation) inside a loop that runs up to chunkCount times. Since receivedChunks is already sorted numerically (line 333), you can optimize this to O(N) by checking if the value at the current index matches the expected chunk index. This is significantly more efficient as the number of chunks grows (up to 3,072 per MAX_CHUNK_COUNT).

Suggested change:

  let nextChunkIndex: number | null = null;
  for (let i = 0; i < uploadMeta.chunkCount; i++) {
    if (receivedChunks[i] !== i) {
      nextChunkIndex = i;
      break;
    }
  }

chatgpt-codex-connector Bot left a comment
💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 24006f668f


Comment thread: src/workers/videos.ts (Outdated)

  receivedChunks,
  receivedBytes,
  nextChunkIndex,
  complete: nextChunkIndex === null,

P1: Do not mark upload complete from part presence alone

This endpoint sets complete from nextChunkIndex === null, which only means all part numbers exist in KV, not that multipart.complete(...) and the DB insert have succeeded. In the disconnect/error case on the final chunk (the exact resume scenario this feature targets), parts can be fully recorded while finalization never commits, so status reports complete: true and gives no chunk to retry, leaving the client with no reliable way to recover that upload.
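One way such a fix could look (an assumption, not the PR's actual remedy) is an explicit finalized flag written only after multipart.complete() and the DB insert succeed, letting the status endpoint distinguish three states:

```typescript
// Hypothetical manifest fields for illustration; the real shape may differ.
interface ManifestLike {
  chunkCount: number;
  parts: Record<number, { etag: string; size: number }>;
  finalized?: boolean; // set only after multipart.complete() + DB insert
}

// "awaiting-finalize" is the state the review comment describes: every part
// is recorded, but completion never committed, so the client should retry
// finalization rather than re-send chunks.
function statusOf(m: ManifestLike): "finalized" | "awaiting-finalize" | "incomplete" {
  if (m.finalized) return "finalized";
  const allParts = Object.keys(m.parts).length === m.chunkCount;
  return allParts ? "awaiting-finalize" : "incomplete";
}
```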


Comment thread: src/workers/videos.ts (Outdated)

  }
  const uploadMeta = uploadMetaPersistedSchema.parse(JSON.parse(uploadMetaJson));
  const multipart = c.env.VIDEOS.resumeMultipartUpload(uploadMeta.r2Key, multipartUploadId);
  await multipart.abort().catch(() => {});

P2: Surface abort failure before clearing session state

The delete handler suppresses all abort() errors and then deletes mpid/meta/parts anyway. If abort fails transiently, the multipart upload remains active in R2 but the server forgets the upload ID and still returns success, so the client cannot retry cleanup and the upload can remain stranded until automatic expiration.
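A sketch of the ordering the comment asks for, with hypothetical function shapes: clear the manifest only after abort() succeeds, and report the failure otherwise so the client can retry cleanup:

```typescript
// abort and deleteManifest stand in for multipart.abort() and the KV
// delete; both shapes are assumptions for illustration.
async function abortUpload(
  abort: () => Promise<void>,
  deleteManifest: () => Promise<void>,
): Promise<{ ok: boolean; error?: string }> {
  try {
    await abort();
  } catch (err) {
    // Keep the manifest so a later DELETE can retry the R2 abort.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
  await deleteManifest();
  return { ok: true };
}
```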


gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request implements a resume-on-disconnect feature for video uploads by adding status and abort endpoints to the video worker. It also includes comprehensive unit tests for these new routes and updates dependency configurations in the lockfile. A performance issue was identified in the logic for calculating the next missing chunk index, where a nested lookup could lead to CPU exhaustion for large uploads; an optimized linear approach was suggested to mitigate this risk.

Comment thread: src/workers/videos.ts (Outdated), lines +335 to +341

  let nextChunkIndex: number | null = null;
  for (let i = 0; i < uploadMeta.chunkCount; i++) {
    if (!receivedChunks.includes(i)) {
      nextChunkIndex = i;
      break;
    }
  }
Severity: high

The current implementation for finding the nextChunkIndex has a time complexity of O(N·M), where N is the number of received chunks and M is the total chunk count. For large videos (e.g., 30GB with 10MB chunks, resulting in ~3,000 chunks), this loop performs up to 9 million operations. In a Cloudflare Worker environment, this could potentially exceed the CPU time limit (especially on the free tier or bundled plan). Since receivedChunks is already sorted, you can optimize this to O(M) using a pointer approach.

  let nextChunkIndex: number | null = null;
  let j = 0;
  for (let i = 0; i < uploadMeta.chunkCount; i++) {
    if (receivedChunks[j] === i) {
      j++;
    } else {
      nextChunkIndex = i;
      break;
    }
  }

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

…O-134)

Persist a single per-upload manifest in KV (videoId, r2Key, multipartUploadId,
metadata, parts map) and expose two new endpoints so clients can resume after
disconnect:

- GET    /api/videos/upload/:uploadId/status  → receivedChunks, receivedBytes,
                                                  nextChunkIndex, complete
- DELETE /api/videos/upload/:uploadId         → abort R2 multipart + clear KV

Re-uploading the same chunk index is now idempotent: the manifest tracks parts
by partNumber, so a retried chunk replaces the prior etag/size and the byte
budget check subtracts the old size before re-checking. The POST path also
honors a client-provided uploadId on chunk 0, letting a re-init of the same
upload reuse the existing R2 multipart upload instead of leaking it.

The manifest is persisted before completion, so a crash between uploadPart
and multipart.complete() leaves a recoverable manifest with one extra part
rather than a lost one. Manifests are scoped by user, so cross-tenant status
queries / aborts return 404.

`nextChunkIndex` is computed via a linear scan over the sorted indices,
addressing the O(N²) algorithmic concern raised against the earlier draft of
this endpoint.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ecc-tools Bot (Contributor) commented May 9, 2026

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.
