find ranges of page numbers #23
base: master
Note that this adds to requirements.py, the
Ports a feature from PPtools: find page numbers in various formats and display them (roman first, arabic next). Understands id attribute formats such as Page_1, page_1, and page1, and attempts to parse the text of span class=pagenum elements, such as p. 1 and [Pg 1].
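As a rough illustration of the formats listed above, here is a minimal sketch of how a page label might be pulled out of an `id` value or out of pagenum span text. The patterns and the helper names `label_from_id` and `label_from_pagenum_text` are hypothetical; the expressions the PR actually uses are not shown in this thread. The sketch also uses the standard `re` module rather than the third-party `regex` package that pphtml.py imports.

```python
import re

# Hypothetical patterns, written only to cover the formats named in the description.
# Matches id values such as Page_1, page_1, page1, or Page_xii.
ID_PATTERN = re.compile(r"^[Pp]age_?([0-9]+|[ivxlcdmIVXLCDM]+)$")
# Matches a trailing arabic or roman number in text such as "p. 1" or "[Pg 1]".
TEXT_PATTERN = re.compile(r"\b([0-9]+|[ivxlcdm]+)\]?\s*$", re.IGNORECASE)


def label_from_id(id_value):
    """Return the page label embedded in an id attribute, or None."""
    match = ID_PATTERN.match(id_value)
    return match.group(1) if match else None


def label_from_pagenum_text(text):
    """Return the page label at the end of pagenum span text, or None."""
    match = TEXT_PATTERN.search(text.strip())
    return match.group(1) if match else None
```

Both helpers return the label as a string, so a caller can decide afterwards whether it is roman or arabic.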
b6c26ee to 4d9cbde
pphtml.py
Outdated
```python
import roman
from time import strftime
from html.parser import HTMLParser
import regex as re  # for unicode support (pip install regex)
```
Can we follow the convention of grouping the built-in packages together, a newline, and then the 3rd party?
```python
import sys
import os
import argparse
import itertools
from time import strftime
from html.parser import HTMLParser

import regex as re  # for unicode support
import roman
from PIL import Image
```

Ideally each would be alpha-sorted but I'm not going to get wound around the axle about it.
done! and sorted :)
I just pushed an update that will show

I ran all of my own projects through this and it worked for all of them except for a couple that have some weird issue uploading. I don't think that's related to this change, though. My projects have gone through Guiguts 1, ppgen, and Guiguts 2, so I think that shows this is able to handle all of those styles for page number markup.
cpeel
left a comment
A sandbox with this code is available for testing at https://www.pgdp.org/~cpeel/ppwb/pphtml.php
Ports a feature from PPtools: find page numbers in various formats and display them (roman first, arabic next).

Understands different formats like `Page_1`, `page_1`, `page1` (in `id` attribute). Also attempts to parse numbers from `<span class="pagenum">` tags (`p. 1`, `[Pg 1]`, etc.).

Example of what this looked like in PPTools:
[image: PPTools output]
Example from this change:
Can display multiple ranges if numbers are missing from the sequence:
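To make the range behavior concrete, here is a minimal sketch of collapsing page labels into ranges, roman first and arabic next, with a gap in the sequence starting a new range. Everything in it is an assumption for illustration: the function names, the `start-end` output format, and the hand-written `roman_to_int` helper, which stands in for the `roman` package that pphtml.py imports. It is not the code from this change.

```python
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(numeral):
    """Convert a roman numeral string to an integer (stand-in for the roman package)."""
    total = 0
    previous = 0
    # Walk right to left: a smaller value before a larger one is subtracted.
    for char in reversed(numeral.lower()):
        value = ROMAN_VALUES[char]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def collapse(numbers):
    """Collapse integers into sorted (start, end) runs of consecutive values."""
    runs = []
    for number in sorted(set(numbers)):
        if runs and number == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], number)  # extend the current run
        else:
            runs.append((number, number))  # gap found, start a new run
    return runs


def page_ranges(labels):
    """Format page labels as ranges, roman numerals first, then arabic."""
    roman_pages, arabic_pages = {}, {}
    for label in labels:
        if label.isdigit():
            arabic_pages[int(label)] = label
        else:
            roman_pages[roman_to_int(label)] = label
    parts = []
    for found in (roman_pages, arabic_pages):
        for start, end in collapse(found):
            if start == end:
                parts.append(found[start])
            else:
                parts.append(f"{found[start]}-{found[end]}")
    return ", ".join(parts)


print(page_ranges(["i", "ii", "iii", "1", "2", "3", "5", "6"]))
# prints "i-iii, 1-3, 5-6"
```

Keeping the original label for each numeric value means the output reuses whatever casing the book used for its roman numerals.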