[python-docx] Add new project integration#15793

Open
MR-SS wants to merge 6 commits into google:master from MR-SS:python-docx-integration

MR-SS commented Jun 23, 2026

Adds fuzzing integration for python-docx, the foundational Python library for reading and writing Microsoft Word (.docx) files (30M+ monthly downloads).

This integration implements an Atheris fuzzer that takes raw bytes, treats them as a .docx (ZIP) stream, and forces deep traversal of the parsed XML tree.

To maximize coverage and keep the fuzzer from stopping at shallow ZIP corruption, the harness explicitly ignores the standard decompression errors (zlib.error, EOFError, OSError, RuntimeError). This steers the fuzzer toward XML-level failures: memory exhaustion (OOM), infinite loops, and native crashes in the lxml bindings.

Proof of Value:
During local testing, this fuzzer successfully traversed the ZIP logic and discovered multiple unhandled edge cases where standard decompression errors bypassed the PackageNotFoundError handler and crashed the application. I have compiled these and reported them upstream here:

@github-actions

MR-SS is integrating a new project:
- Main repo: https://github.com/py-pdf/pypdf
- Criticality score: 0.56831

MR-SS commented Jun 23, 2026

Hello,

I am currently working on improving the fuzzer. After running it for 8 hours, I did not observe the performance/behavior I expected.

I am now in the process of optimizing and enhancing it.

Thank you.
