fix: Update regex dependency for security fix and add uvx documentation #989

Open
ambicuity wants to merge 2 commits into datalab-to:master from ambicuity:fix/issues-975-971-regex-security-and-uvx-docs

@ambicuity

Summary

This PR addresses two open issues:

Issue #975: Security vulnerability in regex

Issue #971: Add uvx documentation

  • Added uvx installation instructions to README.md
  • Provides a fast CLI workflow for users who prefer not to install packages globally

Fixes #975
Fixes #971

…dd uvx documentation

- Update regex dependency from ^2024.4.28 to >=2024.4.28 to allow
  installation of versions that include security fixes (Fixes datalab-to#975)
- Add uvx installation instructions to README.md for fast CLI workflow
  without global pip install (Fixes datalab-to#971)
Copilot AI review requested due to automatic review settings February 9, 2026 18:25
@github-actions
Contributor

github-actions bot commented Feb 9, 2026

CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅

Copilot AI left a comment

Pull request overview

This PR addresses Issue #975 by relaxing the regex dependency constraint to allow installing versions that include security fixes, and addresses Issue #971 by documenting uvx as an alternative CLI workflow.

Changes:

  • Updated regex version constraint in pyproject.toml to permit newer releases.
  • Regenerated poetry.lock to reflect the updated dependency resolution.
  • Added uvx installation/run instructions to the README.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Relaxes regex constraint so users can install patched versions. |
| poetry.lock | Updates locked dependency set after the constraint change. |
| README.md | Adds uvx usage instructions for running the CLI without global installs. |

pyproject.toml Outdated
  rapidfuzz = "^3.8.1"
  surya-ocr = "^0.17.1"
- regex = "^2024.4.28"
+ regex = ">=2024.4.28"

Copilot AI Feb 9, 2026

regex is now unbounded above (>=2024.4.28), unlike the rest of the dependencies here which all use constraints that include an implicit upper bound (e.g., ^ / ~). Consider using a bounded range that still permits the patched releases (e.g., allow 2025.x) but avoids accidentally pulling in a future breaking release (date-based major bumps).

Comment on lines +79 to +89
Alternatively, you can use [uvx](https://docs.astral.sh/uv/guides/tools/) for a fast CLI workflow without installing packages globally:

```shell
uvx --from marker-pdf marker_single /path/to/file.pdf
```

For the full installation with additional document format support:

```shell
uvx --from "marker-pdf[full]" marker_single /path/to/file.pdf
```

Copilot AI Feb 9, 2026

The uvx examples omit --output_dir. By default marker_single writes to settings.OUTPUT_DIR, which resolves under the installed package directory (site-packages) and may be non-writable / unexpected when run via uvx. Update the examples to pass an explicit output directory (e.g., current working directory) so the command reliably produces output where users expect.
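
One way to act on this suggestion from a script is sketched below in Python. It only assembles the `uvx` invocation shown in the README examples and appends the `--output_dir` flag named in this comment. The function names `build_uvx_command` and `run_marker` are hypothetical, and placing the flag after the input path is an assumption rather than something this thread confirms.

```python
import shutil
import subprocess
from pathlib import Path


def build_uvx_command(pdf_path: str, output_dir: str, extras: str = "") -> list:
    """Assemble the uvx invocation with an explicit output directory.

    Hypothetical helper: the package name, entry point and --output_dir flag
    come from the PR thread; the argument order is an assumption.
    """
    package = f"marker-pdf[{extras}]" if extras else "marker-pdf"
    return [
        "uvx", "--from", package,
        "marker_single", pdf_path,
        "--output_dir", output_dir,
    ]


def run_marker(pdf_path: str, output_dir: str = "."):
    """Run marker_single via uvx, writing into output_dir (default: cwd)."""
    out = Path(output_dir).resolve()
    out.mkdir(parents=True, exist_ok=True)  # make sure the target is writable
    cmd = build_uvx_command(pdf_path, str(out))
    if shutil.which("uvx") is None:
        # uvx is not installed: show the command instead of failing
        print("uvx not found; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_marker("/path/to/file.pdf")` would then target the current working directory instead of relying on the default under `settings.OUTPUT_DIR`.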

@ambicuity
Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Feb 9, 2026
- Use bounded regex constraint (>=2024.4.28,<2026) to prevent
  accidental future breaking releases while still allowing 2025.x
  security-patched versions
- Add --output_dir flag to uvx examples for predictable output location
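
The three constraints discussed in this thread differ only in their upper bound. The following is a minimal Python sketch of that difference, not Poetry's resolver. It assumes plain dotted numeric versions and Poetry's caret rule for a non-zero major component (`^2024.4.28` is read as `>=2024.4.28,<2025.0.0`). The helper names and the candidate version strings are illustrative and were not checked against actual regex releases.

```python
from __future__ import annotations


def parse(version: str) -> tuple[int, ...]:
    """Turn '2024.4.28' into (2024, 4, 28); plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def caret_upper(version: str) -> tuple[int, ...]:
    """Upper bound implied by a caret constraint with a non-zero major part."""
    major = parse(version)[0]
    return (major + 1, 0, 0)


def allowed(candidate: str, lower: str, upper: tuple[int, ...] | None) -> bool:
    """True if lower <= candidate, and candidate < upper when a bound exists."""
    v = parse(candidate)
    return parse(lower) <= v and (upper is None or v < upper)


# Each constraint from the thread as (lower bound, exclusive upper bound).
CONSTRAINTS = {
    "^2024.4.28": ("2024.4.28", caret_upper("2024.4.28")),
    ">=2024.4.28": ("2024.4.28", None),
    ">=2024.4.28,<2026": ("2024.4.28", (2026,)),
}

if __name__ == "__main__":
    # Illustrative candidates, one per calendar year.
    for candidate in ("2024.11.6", "2025.1.1", "2026.1.1"):
        for spec, (lower, upper) in CONSTRAINTS.items():
            print(f"{candidate:>10}  {spec:<20} {allowed(candidate, lower, upper)}")
```

Under these assumptions, a 2025.x version is rejected by the original caret constraint and accepted by the other two, while a 2026.x version is accepted only by the unbounded `>=2024.4.28`.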

Development

Successfully merging this pull request may close these issues.

  • Security vulnerability in regex
  • Update docs with uvx instructions for fast CLI workflow

2 participants