fix: surface upload validation errors and remove ghost cards (#16) #402

Open
Padmajakachare1911 wants to merge 1 commit into param20h:dev from Padmajakachare1911:fix/unsupported-file-upload-error-handling

@Padmajakachare1911

Contributor

🔗 Related Issue

Closes #16

📋 Summary

Uploading an unsupported file (wrong extension, fake PDF, oversized file)
previously failed silently — the backend rejected the file but the
frontend never surfaced the error, leaving a ghost card in the list with
no content. This PR closes both failure paths: synchronous validation
errors (wrong type/size) now show a sonner toast and prevent the card
from appearing; async ingest failures show a brief error card that
auto-removes after 4 seconds.


🛠️ Changes Made

backend/app/routes/documents.py

  • Added inline validation before any DB write — runs in this order:
    1. Extension check — rejects non-.pdf with 400:
      "Unsupported file type '.docx'. Only PDF files are accepted."
    2. MIME type check — rejects non-application/pdf with 400
    3. Magic bytes check — reads first 4 bytes, rejects anything
      that does not start with %PDF with 400:
      "The uploaded file does not appear to be a valid PDF."
    4. File size check — rejects files over settings.MAX_UPLOAD_SIZE_MB
      (20 MB) with 400:
      "File too large (X MB). Maximum allowed size is 20 MB."
  • File bytes read once and reused for save (no double-read)
  • Old validate_upload helper no longer called for uploads
  • All detail strings are user-facing and consistent
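
The check order above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in documents.py: `UploadValidationError`, `validate_pdf_upload`, the MIME error wording, and the one-decimal size formatting are all assumptions made for the example. The real route presumably raises the web framework's own HTTP exception with status 400.

```python
import os

MAX_UPLOAD_SIZE_MB = 20  # assumed to mirror settings.MAX_UPLOAD_SIZE_MB


class UploadValidationError(Exception):
    """Hypothetical stand-in for an HTTP 400 response with a detail string."""

    def __init__(self, detail: str, status_code: int = 400) -> None:
        super().__init__(detail)
        self.detail = detail
        self.status_code = status_code


def validate_pdf_upload(filename: str, content_type: str, data: bytes) -> None:
    """Run the four checks in the listed order; raise on the first failure."""
    # 1. Extension check
    ext = os.path.splitext(filename)[1].lower()
    if ext != ".pdf":
        raise UploadValidationError(
            f"Unsupported file type '{ext}'. Only PDF files are accepted."
        )

    # 2. MIME type check (the PR does not quote this message; wording assumed)
    if content_type != "application/pdf":
        raise UploadValidationError(
            f"Unsupported content type '{content_type}'. Only PDF files are accepted."
        )

    # 3. Magic bytes check: a PDF starts with the four bytes %PDF
    if data[:4] != b"%PDF":
        raise UploadValidationError(
            "The uploaded file does not appear to be a valid PDF."
        )

    # 4. File size check, on the bytes that were already read once
    size_mb = len(data) / (1024 * 1024)
    if size_mb > MAX_UPLOAD_SIZE_MB:
        raise UploadValidationError(
            f"File too large ({size_mb:.1f} MB). "
            f"Maximum allowed size is {MAX_UPLOAD_SIZE_MB} MB."
        )
```

Because the bytes are passed in once and reused, the same `data` object can be written to storage after the function returns, which matches the "no double-read" note above.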

backend/tests/test_document_upload_validation.py (updated)

  • 7 tests covering all four validation failure paths + valid PDF pass
  • Smoke checks verified:
    • .docx → correct unsupported-type message
    • Fake PDF bytes → magic-bytes rejection message
    • 21 MB PDF → size rejection message with correct limit (20 MB)
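
For anyone reproducing the smoke checks without real files, in-memory payloads along these lines should be enough. The helper names are hypothetical and are not taken from the test file in this PR; the "minimal PDF" only satisfies a magic-bytes check and is not a parseable document.

```python
MB = 1024 * 1024


def make_fake_pdf() -> bytes:
    """JPEG header bytes, as if a .jpg had been renamed to .pdf."""
    return b"\xff\xd8\xff\xe0" + b"\x00" * 16


def make_oversized_pdf(size_mb: int = 21) -> bytes:
    """Bytes that start with the %PDF magic, padded to exactly size_mb MB."""
    header = b"%PDF-1.4\n"
    return header + b"\x00" * (size_mb * MB - len(header))


def make_minimal_pdf() -> bytes:
    """Smallest payload that passes a %PDF magic-bytes check (not a real PDF)."""
    return b"%PDF-1.4\n%%EOF\n"
```

Each payload can then be posted to the upload endpoint with a matching filename (for example `report.docx` or `test.pdf`) to trigger one validation path at a time.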

frontend/src/app/dashboard/page.tsx

  • Added toastedFailures ref (Set<number>) to track which failed
    doc IDs have already triggered a toast (prevents repeat fires on
    every poll interval)
  • On each document list refresh: detects status === "failed" docs,
    fires one toast.error() via sonner with doc.error_message
    (falls back to generic message if null)
  • Auto-removes the failed card from list state after 4 seconds via
    setTimeout

frontend/src/components/document/DocumentSidebar.tsx

  • "failed" status renders distinct badge (AlertCircle icon, red)
    and shows doc.error_message text below the filename
  • Dropzone narrowed to PDF-only: accept updated to
    { "application/pdf": [".pdf"] } (was also accepting .docx,
    .txt, .md) to match the backend's PDF-only validation
  • Helper text updated: "PDF files only (max 20 MB)"

Toast system

  • Uses sonner (already in the project) — no new dependencies added

✅ Test Results

Backend

7/7 passed in test_document_upload_validation.py

Manual smoke tests

| Scenario | Result |
| --- | --- |
| .txt renamed to .docx → upload | ✅ Toast: unsupported type, no card |
| .jpg renamed to test.pdf → upload | ✅ Toast: not a valid PDF, no card |
| PDF > 20 MB → upload | ✅ Toast: file too large (20 MB limit), no card |
| Valid PDF → upload | ✅ No toast, card appears, pending → ready |
| Valid PDF → delete mid-ingest | ✅ Failed badge + error text, one toast, card gone after 4s |

🔒 Edge Cases Covered

| Scenario | Behaviour |
| --- | --- |
| File renamed to .pdf but wrong bytes | Magic bytes check catches it → 400 |
| Ingest crashes after successful upload | status = "failed" + error_message persisted by document_ingestion.py (pre-existing); surfaced via polling |
| Same failed doc polled multiple times | toastedFailures ref prevents duplicate toasts |
| error_message is null | Falls back to generic "Processing failed for..." message |
| Dropzone shows unsupported type | UI now restricted to .pdf, so the mismatch is eliminated |

📝 Notes

  • document_ingestion.py already persisted status = "failed" and
    error_message before this PR — no changes needed there
  • error_message DB column confirmed present — no migration required
  • DocumentUpload.tsx is not in the active component tree; upload UX
    lives in DocumentSidebar.tsx — all changes applied there instead
  • Max size is 20 MB (settings.MAX_UPLOAD_SIZE_MB), not 50 MB
    as suggested in the issue plan — aligned to the app's actual limit

📦 Dependencies Added

None — uses sonner already present in the project.


🧪 How to Test Locally

# Backend
cd backend
pytest tests/test_document_upload_validation.py -v

# Frontend + backend running together (separate terminals)
uvicorn app.main:app --reload --port 8000   # run from backend/
cd frontend && npm run dev                  # run from the repo root

# Then in the UI:
# 1. Try uploading a .docx file → error toast, no card
# 2. Rename a .jpg to test.pdf → error toast, no card
# 3. Upload a PDF > 20 MB → error toast, no card
# 4. Upload a valid PDF → works normally

…aram20h#16)

Return consistent 400 detail strings for extension, MIME, magic bytes, and size checks. Toast upload and ingest failures in the UI, auto-remove failed cards, and restrict the sidebar dropzone to PDF only with updated helper copy.

Co-authored-by: Cursor <cursoragent@cursor.com>
Development

Successfully merging this pull request may close these issues.

fix: PDF viewer fails silently when an unsupported file format is uploaded
