
Provide access to page::text_list #111

@stefan6419846

The current wrapper only exposes the result of the page->text method.

The original Poppler code has a similar text_list method (since version 0.63.0?) that returns the individual words together with their bounding boxes. This makes it possible to select a clipping region, re-order the text, or filter out text that is too small. It roughly corresponds to the -bbox option of the pdftotext CLI.

It would be great if the Python wrapper could provide access to the words with their bounding boxes for further post-processing.
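
To illustrate the kind of post-processing I have in mind, here is a rough sketch in plain Python. It does not use the wrapper at all: the Word class, its field names, and the helper functions are hypothetical stand-ins for whatever a text_list binding might return, so please read it as an assumption about the shape of the data rather than a proposed API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """Hypothetical stand-in for one text_list entry: a word plus its bounding box."""
    text: str
    x: float       # left edge of the box
    y: float       # top edge of the box
    width: float
    height: float


def in_region(word, left, top, right, bottom):
    """Return True if the word's box lies entirely inside the clipping rectangle."""
    return (
        word.x >= left
        and word.y >= top
        and word.x + word.width <= right
        and word.y + word.height <= bottom
    )


def post_process(words, region, min_height):
    """Clip to a region, drop words shorter than min_height, sort top-to-bottom, left-to-right."""
    kept = [w for w in words if in_region(w, *region) and w.height >= min_height]
    return sorted(kept, key=lambda w: (w.y, w.x))


# Made-up sample data, not taken from a real PDF.
words = [
    Word("world", 60, 10, 40, 12),
    Word("Hello", 10, 10, 40, 12),
    Word("footnote", 10, 30, 30, 4),    # too small, filtered out
    Word("margin", 500, 10, 40, 12),    # outside the clipping region
]

result = post_process(words, region=(0, 0, 200, 100), min_height=8)
print(" ".join(w.text for w in result))
```

With only the plain page text available, none of these three steps (clipping, size filtering, re-ordering) can be done reliably, which is why access to the per-word boxes would help.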
