When I use the pdffigures2 backend to extract images from a PDF, some images are often missed. For example, pdf_parser extracts only 3 images from a PDF file that contains 5. (In my experience, pdffigures2 is still the best of the three image extraction backends; cermine, for instance, cuts a complete image into pieces.)
My guess is that the pdffigures2 backend filters images using default parameters such as "image size" or "resolution". Is that the case?
Can you give me some advice or clues?
Thank you for your assistance.