added feature to extract all images from the pdf #44 #74

shebinleo · 2025-07-13T11:29:56Z

Extract all embedded images from PDFs.

const fs = require('fs');
const pdf2html = require('pdf2html');

// The calls below use await, so run them inside an async function.

// From file path
const imagePaths = await pdf2html.extractImages('path/to/document.pdf');
console.log('Extracted images:', imagePaths);
// Output: ['/absolute/path/to/files/image/document1.jpg', '/absolute/path/to/files/image/document2.png', ...]

// From buffer
const pdfBuffer = fs.readFileSync('path/to/document.pdf');
const imagesFromBuffer = await pdf2html.extractImages(pdfBuffer);

// With custom output directory
const imagesInCustomDir = await pdf2html.extractImages(pdfBuffer, {
    outputDirectory: './extracted-images',
});

// With custom buffer size for large PDFs
const imagesFromLargePdf = await pdf2html.extractImages('large-document.pdf', {
    outputDirectory: './output',
    maxBuffer: 1024 * 1024 * 10, // 10 MB buffer
});
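
As a follow-up to the usage above, here is a minimal sketch of one way to post-process the result. It assumes, based only on the output comment in the description, that `extractImages` resolves to an array of path strings. The `groupByExtension` helper and the sample paths are hypothetical and not part of pdf2html.

```javascript
const path = require('path');

// Hypothetical helper (not part of pdf2html): groups image paths by file extension.
function groupByExtension(imagePaths) {
    const groups = {};
    for (const imagePath of imagePaths) {
        // path.extname returns e.g. '.jpg'; drop the dot and lower-case it.
        // Files without an extension are collected under 'unknown'.
        const ext = path.extname(imagePath).slice(1).toLowerCase() || 'unknown';
        if (!groups[ext]) {
            groups[ext] = [];
        }
        groups[ext].push(imagePath);
    }
    return groups;
}

// Made-up sample input shaped like the output comment in the description.
const sample = [
    '/tmp/files/image/document1.jpg',
    '/tmp/files/image/document2.png',
    '/tmp/files/image/document3.JPG',
];
console.log(groupByExtension(sample));
```

In real use the array passed to `groupByExtension` would be the value returned by `pdf2html.extractImages(...)`, provided the return shape matches the assumption above.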

sonarqubecloud · 2025-07-13T11:57:35Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions · 2025-07-13T11:58:17Z

Coverage after merging extract-images-from-pdf into main will be 98.48%

Coverage Report

File	Stmts	Branches	Funcs	Lines	Uncovered Lines
index.js	100%	100%	100%	100%
lib
CommandExecutor.js	94.74%	85.71%	100%	96%	34–35
FileManager.js	96.97%	100%	71.43%	100%
HTMLParser.js	100%	100%	100%	100%
ImageProcessor.js	100%	100%	100%	100%
PDFBoxWrapper.js	86.44%	61.54%	88.89%	94.59%	53–54, 60, 83, 83, 83, 83
PDFProcessor.js	98.65%	94.44%	100%	100%	82
TikaWrapper.js	100%	100%	100%	100%
config.js	100%	100%	100%	100%
errors.js	100%	100%	100%	100%

Shebin added 7 commits July 5, 2025 17:37

added feature to extract all images from the pdf #44

7fbc4a3

added feature to extract all images from the pdf #44

0065258

added feature to extract all images from the pdf #44

9154957

added feature to extract all images from the pdf #44

537196a

added feature to extract all images from the pdf #44

6668063

added feature to extract all images from the pdf #44

8ab70cf

added feature to extract all images from the pdf #44

9207bfd

shebinleo merged commit ae61af5 into main Jul 13, 2025
5 checks passed

shebinleo deleted the extract-images-from-pdf branch July 13, 2025 12:02
