My PDFs have a lot of math, symbols, figures, etc.
Is there any way you know of to extract text from a page, but only from within a given set of bounding boxes?
I basically want to set up a feedback loop where I:
- iterate through the pages of the pdf
- set ordered bounding boxes visually on each page
- automatically extract and concatenate text from these bounding boxes, in their indicated order (from step 2)
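
To make the last step concrete, here is a rough sketch of what I have in mind. It assumes the PDF library can give me each word together with its coordinates. The `(x0, y0, x1, y1, text)` word shape, the box shape, and the function names are all my own placeholders, not any library's API. I believe PyMuPDF's `page.get_text("words")` returns tuples that start with these five fields, but I have not checked that against my files.

```python
from typing import List, Sequence, Tuple

# Hypothetical shapes (assumptions, not a library API):
#   word = (x0, y0, x1, y1, text), box = (x0, y0, x1, y1), in page coordinates
Word = Tuple[float, float, float, float, str]
Box = Tuple[float, float, float, float]


def words_in_box(words: Sequence[Word], box: Box) -> List[Word]:
    """Keep the words whose center point falls inside the box."""
    bx0, by0, bx1, by1 = box
    inside = []
    for x0, y0, x1, y1, text in words:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        if bx0 <= cx <= bx1 and by0 <= cy <= by1:
            inside.append((x0, y0, x1, y1, text))
    return inside


def extract_ordered(words: Sequence[Word], boxes: Sequence[Box]) -> str:
    """Concatenate text box by box, in the order the boxes were given."""
    chunks = []
    for box in boxes:
        hits = words_in_box(words, box)
        # Crude reading order: top to bottom, then left to right.
        hits.sort(key=lambda w: (round(w[1]), w[0]))
        chunks.append(" ".join(w[4] for w in hits))
    return "\n".join(chunks)


# Made-up words on one page; the boxes are listed in the order I want them read.
page_words = [
    (10, 10, 40, 20, "Hello"),
    (45, 10, 80, 20, "world"),
    (10, 100, 40, 110, "Second"),
    (200, 10, 240, 20, "margin"),
]
page_boxes = [(0, 90, 100, 120), (0, 0, 100, 30)]
result = extract_ordered(page_words, page_boxes)
print(result)
```

The word "margin" falls outside both boxes, so it is dropped, and the lower box comes first in the output because I listed it first. Setting the boxes visually (step 2) is the part I have no idea how to do.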
Is this doable? Is there a simple way to do this? What do you think?