[BUG] Silent Exception Swallowing in ingest_document Breaks Celery Retry #561

@ionfwsrijan

Description

ingest_document() wraps its entire pipeline in a try/except block. The except Exception handler (line 118):

  1. Logs the error
  2. Rolls back the DB session
  3. Re-queries the document
  4. Sets status to "failed" with error message and traceback
  5. Commits the failure state

It then falls through to finally, closes the DB session, and returns None. The exception is consumed entirely and never re-raised.
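The handler steps and the fall-through can be reproduced in a few lines. This is a minimal sketch, not the project's actual code: FakeDoc, run_pipeline, and ingest_document_swallowing are hypothetical stand-ins, and the DB session is left out so the example runs with the standard library only.

```python
import traceback


class FakeDoc:
    """Hypothetical stand-in for the document row (not the real ORM model)."""

    def __init__(self):
        self.status = "processing"
        self.last_error = None
        self.last_error_traceback = None


def run_pipeline(doc):
    # Stand-in for the real ingestion steps; simulates a transient failure.
    raise TimeoutError("upstream API timed out")


def ingest_document_swallowing(doc):
    """Mirrors the reported pattern: record the failure, never re-raise."""
    try:
        run_pipeline(doc)
        doc.status = "completed"
    except Exception as exc:
        # Steps 1-5 from the list above, reduced to in-memory assignments.
        doc.status = "failed"
        doc.last_error = str(exc)
        doc.last_error_traceback = traceback.format_exc()
    finally:
        pass  # the real function closes the DB session here
    # No return and no raise: the caller gets None and sees no exception.


doc = FakeDoc()
result = ingest_document_swallowing(doc)
```

Here result is None and doc.status is "failed", yet nothing reaches the caller as an exception.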

Meanwhile, the Celery task process_document is decorated with:

@celery_app.task(
    bind=True,
    name="app.tasks.process_document",
    max_retries=3,
    default_retry_delay=30,
    autoretry_for=(Exception,),
    acks_late=True,
    reject_on_worker_lost=True,
)

These settings are meant to handle transient failures (API timeout, OOM, DB hiccup): autoretry_for and max_retries re-run the task when it raises, while acks_late and reject_on_worker_lost let the message be redelivered if the worker dies mid-task. But because ingest_document never propagates an exception, the exception-driven retries never fire. The task body executes to completion, swallows the error internally, and returns:

return {"document_id": document_id, "status": "completed"}
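To illustrate why no retry fires, here is a toy model of exception-driven retry. run_with_autoretry is a hypothetical helper that only captures the idea "re-run when the body raises"; it is not how Celery is implemented, and the ingest_document below is a stand-in for the swallowing version.

```python
def run_with_autoretry(task_body, max_retries=3):
    """Toy model of autoretry_for=(Exception,): re-run only when the body raises.

    Hypothetical helper for illustration; this is not Celery's implementation.
    Returns (result, attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task_body(), attempts
        except Exception:
            if attempts > max_retries:
                raise  # retries exhausted: the failure finally surfaces


def ingest_document(document_id):
    """Stand-in for the swallowing version: the error never leaves this function."""
    try:
        raise TimeoutError("transient failure")
    except Exception:
        return None


def process_document_body(document_id="doc-1"):
    ingest_document(document_id)
    # Reached even though ingestion failed, because nothing was raised.
    return {"document_id": document_id, "status": "completed"}


result, attempts = run_with_autoretry(process_document_body)
print(attempts, result["status"])  # prints: 1 completed
```

The body runs exactly once and reports "completed", which matches the mismatch described in this issue.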

Impact

  • Every transient failure permanently marks the document as failed, with no automatic retry.
  • The Celery task result says "completed" while the DB says "failed", so monitoring sees contradictory state.
  • The 3 retries configured at the framework level can never trigger; they are effectively dead configuration.
  • doc.last_error_traceback is populated but never exposed through any API endpoint.

Fix Required (~35 lines across 2 files)

backend/app/services/document_ingestion.py:

  • After the except block cleans up DB state (rollback, mark failed, commit), re-raise the original exception so Celery's autoretry_for can catch it.
  • Alternatively, don't catch the exception at all and let it propagate naturally.
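A minimal sketch of the first option, assuming the handler keeps its current cleanup steps and only gains a re-raise. FakeSession, the dict-based doc, and the pipeline argument are hypothetical stand-ins so the example runs without a database; the real function signature may differ.

```python
import traceback


class FakeSession:
    """Hypothetical session that only records which methods were called."""

    def __init__(self):
        self.calls = []

    def rollback(self):
        self.calls.append("rollback")

    def commit(self):
        self.calls.append("commit")

    def close(self):
        self.calls.append("close")


def ingest_document(doc, session, pipeline):
    try:
        pipeline(doc)
        doc["status"] = "completed"
        session.commit()
    except Exception as exc:
        session.rollback()
        doc["status"] = "failed"
        doc["last_error"] = str(exc)
        doc["last_error_traceback"] = traceback.format_exc()
        session.commit()
        raise  # bare raise re-raises the original exception with its traceback
    finally:
        session.close()  # still runs before the exception leaves the function


def failing_pipeline(doc):
    raise TimeoutError("upstream API timed out")


doc = {"status": "processing"}
session = FakeSession()
try:
    ingest_document(doc, session, failing_pipeline)
    propagated = None
except TimeoutError as exc:
    propagated = exc
```

The failure state is still committed and the session is still closed, but the caller now sees the TimeoutError. One open design question: since a retry may follow, the "failed" status written here can be overwritten by a later attempt, so an intermediate status might be preferable.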

backend/app/tasks.py:

  • Make the return {"status": "completed"} conditional on actual success (e.g., check the return value or the doc's status in DB).
  • Ensure the task properly enters failure state when all retries are exhausted.
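Once ingestion raises on failure, the success payload becomes conditional by construction: the return line is only reached when no exception occurred. The sketch below shows this with the same kind of toy retry runner. run_with_autoretry, make_flaky_ingest, and make_task_body are hypothetical helpers, not Celery APIs, and the real task would rely on Celery's own retry handling instead.

```python
def run_with_autoretry(task_body, max_retries=3):
    """Toy model of Celery-style autoretry (hypothetical, not Celery's code)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return task_body(), attempts
        except Exception:
            if attempts > max_retries:
                raise  # retries exhausted: the task ends in failure


def make_flaky_ingest(failures):
    """Build an ingest function that raises `failures` times, then succeeds."""
    state = {"left": failures}

    def ingest(document_id):
        if state["left"] > 0:
            state["left"] -= 1
            raise TimeoutError("transient failure")

    return ingest


def make_task_body(ingest, document_id="doc-1"):
    def body():
        ingest(document_id)  # raises on failure
        # Only reached when ingestion succeeded.
        return {"document_id": document_id, "status": "completed"}

    return body


# Two transient failures, then success on the third attempt.
ok_result, ok_attempts = run_with_autoretry(make_task_body(make_flaky_ingest(2)))

# Permanent failure: 1 initial attempt + 3 retries, then the error surfaces.
try:
    run_with_autoretry(make_task_body(make_flaky_ingest(10)))
    final_error = None
except TimeoutError as exc:
    final_error = exc
```

With two transient failures the task completes on its third attempt. With a persistent failure the error surfaces after the retries are used up, instead of a "completed" result being returned.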

GSSoC '26

  • Yes, I am participating in GirlScript Summer of Code and would like to fix this.

Metadata

Labels

bug (Something isn't working), gssoc (GirlScript Summer of Code 2026 issue/PR)
