[pypdf] Add new project integration #15789

Open
MR-SS wants to merge 4 commits into google:master from MR-SS:pypdf-integration

MR-SS commented Jun 22, 2026

Adds fuzzing integration for pypdf, a widely used Python PDF library with millions of downloads.

This integration implements a comprehensive Atheris fuzzer that exercises:

  • PDF parsing via PdfReader
  • Text extraction (in both layout and plain modes)
  • Image extraction and stream decoding
  • PDF metadata parsing
  • PDF writing via PdfWriter

The build.sh script fetches a diverse set of complex PDFs from Mozilla's pdf.js test suite to use as a seed corpus, which should give the fuzzer good initial code coverage. Common robustness exceptions (such as ValueError, TypeError, AttributeError, and OverflowError) are ignored so that the fuzzer focuses on native crashes, out-of-memory conditions, and timeouts.
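
For reviewers, the harness shape described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `exercise_pdf`, `run_guarded`, and `fuzz_one_input` are hypothetical, and the ignored-exception tuple contains only the four types named in the description. pypdf is imported lazily so the sketch loads without it; a real Atheris harness would normally import it under `atheris.instrument_imports()`.

```python
import io

# Exception types the description lists as ignored robustness errors.
# (Assumption: the real harness may ignore more than these four.)
IGNORED_EXCEPTIONS = (ValueError, TypeError, AttributeError, OverflowError)


def exercise_pdf(data: bytes) -> None:
    """Drive the pypdf code paths listed in the PR description."""
    import pypdf  # lazy import: only needed when actually fuzzing

    reader = pypdf.PdfReader(io.BytesIO(data))       # PDF parsing
    _ = reader.metadata                              # metadata parsing
    writer = pypdf.PdfWriter()
    for page in reader.pages:
        page.extract_text()                          # plain mode
        page.extract_text(extraction_mode="layout")  # layout mode
        for image in page.images:                    # image extraction
            _ = image.data                           # decoded image bytes
        writer.add_page(page)
    writer.write(io.BytesIO())                       # PDF writing


def run_guarded(target, data: bytes) -> bool:
    """Run target(data); return False if an ignored exception was swallowed.

    Any other exception propagates, which a fuzzing engine reports as a crash.
    """
    try:
        target(data)
    except IGNORED_EXCEPTIONS:
        return False
    return True


def fuzz_one_input(data: bytes) -> None:
    """Hypothetical fuzzer callback wrapping the pypdf exercise."""
    run_guarded(exercise_pdf, data)
```

Under Atheris, a callback like `fuzz_one_input` would typically be registered with `atheris.Setup(sys.argv, fuzz_one_input)` followed by `atheris.Fuzz()`. A practical harness would probably also need to catch pypdf's own error types (for example `pypdf.errors.PyPdfError`), since malformed input raises those routinely; that is an assumption here, as the description does not mention them.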

Proof of Value:
During local testing with infra/helper.py, this fuzzer successfully discovered multiple unhandled edge cases and crashes in the upstream library within minutes, including:

  • An unhandled AttributeError during dictionary casting
  • An unhandled OverflowError during startxref parsing
  • An infinite loop (timeout) during text extraction from a malformed content stream

These findings have been reported to the pypdf maintainers.

google-cla Bot commented Jun 22, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@github-actions

MR-SS is integrating a new project:
- Main repo: https://github.com/py-pdf/pypdf
- Criticality score: 0.56828
