Error in scheduled email extraction - attachment_pdf - Maybe not a PDF #72

@HNygard

Getting errors like this one:

Source: scheduled-email-extraction
Time: 2025-09-21 17:28:25 CEST
Message: Unsuccessful email extraction

Error Details:
success:
message: Failed to extract text from email.
email_id: 56fa80bd-415b-43d4-b020-0e8c7c0faca2
thread_id: 4a3cafd8-0003-4525-8494-ac01ed878ee1
attachment_id: a0c349c3-a7ca-43ed-a710-826478e3b3ad
extraction_id: 44203

error: pdftotext command failed with code 1: Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
extraction_type: attachment_pdf

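Proposed change: when pdftotext exits with code 1 and its output contains "Syntax Warning: May not be a PDF file", treat the result as a warning rather than an error, or something along those lines. The attachment is probably just not a real PDF, so this should not be reported as a failed extraction.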
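
A rough sketch of the classification described above, in Python purely for illustration (it is not taken from the project's code). The names `classify_pdftotext`, `extract_pdf_text`, `ExtractionResult` and `NOT_A_PDF_MARKER` are hypothetical, and the sketch assumes the marker text appears on stderr and that `pdftotext <file> -` writes the extracted text to stdout.

```python
import subprocess
from dataclasses import dataclass

# Marker copied from the error output in this issue.
NOT_A_PDF_MARKER = "Syntax Warning: May not be a PDF file"


@dataclass
class ExtractionResult:
    status: str  # "ok", "warning" or "error"
    text: str    # extracted text, empty unless status is "ok"
    detail: str  # human-readable explanation


def classify_pdftotext(exit_code: int, stdout: str, stderr: str) -> ExtractionResult:
    """Map a finished pdftotext run to ok / warning / error (hypothetical helper)."""
    if exit_code == 0:
        return ExtractionResult("ok", stdout, stderr.strip())
    # Exit code 1 plus the "not a PDF" marker: downgrade to a warning.
    if exit_code == 1 and NOT_A_PDF_MARKER in stderr:
        return ExtractionResult("warning", "", "Attachment does not appear to be a PDF")
    # Anything else is still reported as a real failure.
    return ExtractionResult(
        "error", "", f"pdftotext command failed with code {exit_code}: {stderr.strip()}"
    )


def extract_pdf_text(path: str) -> ExtractionResult:
    """Run pdftotext on a file and classify the outcome."""
    # Assumption: "-" as the output file sends the text to stdout.
    proc = subprocess.run(["pdftotext", path, "-"], capture_output=True, text=True)
    return classify_pdftotext(proc.returncode, proc.stdout, proc.stderr)
```

With something like this, the scheduled extraction could record a "warning" result for the attachment and skip the error notification, while other non-zero exits (or exit code 1 without the marker) would still be reported as before.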
