Open
Labels
enhancement (New feature or request)
Description
Summary
Restore and finish pdf-to-md-swift as a proper Layer 3 converter in macdoc.
The `main` branch currently carries only the package manifest, README, and CLI wiring from the earlier merge; the converter source and tests are not tracked. This follow-up issue closes that gap by landing the actual implementation.
Conversion Requirements
- Input: `.pdf`
- Output: Markdown stream / `.md`
- Architecture: direct PDF → Markdown path, avoiding hub loss through LaTeX
- Protocol: implement `DocumentConverter` with `StreamingOutput`
Layer 1 / Core Dependencies
- `PDFKit` for native macOS PDF parsing
- `common-converter-swift` (`DocumentConverter`, `StreamingOutput`, `ConversionOptions`)
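
To make the protocol requirement concrete, here is a rough sketch of how a converter could push page text into a streaming sink. The `DocumentConverter` and `StreamingOutput` protocols below are local stand-ins written for this example only; the real declarations live in `common-converter-swift` and their signatures may well differ. `StringOutput` and `SketchPDFConverter` are hypothetical names, and the input is already-extracted page text rather than a PDF file.

```swift
// Stand-in protocols for illustration only. The real ones come from
// common-converter-swift and may have different signatures.
protocol StreamingOutput {
    mutating func write(_ text: String)
}

protocol DocumentConverter {
    // Assumption: takes already-extracted page texts instead of a file URL.
    func convert<O: StreamingOutput>(pages: [String], to output: inout O)
}

// Hypothetical in-memory sink that collects everything written to it.
struct StringOutput: StreamingOutput {
    var buffer = ""
    mutating func write(_ text: String) { buffer += text }
}

// Hypothetical converter: streams each page and separates pages
// with a Markdown thematic break.
struct SketchPDFConverter: DocumentConverter {
    func convert<O: StreamingOutput>(pages: [String], to output: inout O) {
        for (index, page) in pages.enumerated() {
            if index > 0 { output.write("\n---\n\n") }
            output.write(page + "\n")
        }
    }
}

var out = StringOutput()
SketchPDFConverter().convert(pages: ["First page", "Second page"], to: &out)
print(out.buffer)
```

Writing page by page means the converter never has to hold the whole Markdown document in memory, which is presumably the point of pairing the converter with a streaming output.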
Implementation Notes
- Extract PDF content page-by-page
- Heuristically map headings, paragraphs, ordered lists, unordered lists
- Preserve page boundaries with Markdown thematic breaks
- Support frontmatter + hard line break options already exposed in CLI
- Keep logic in `packages/pdf-to-md-swift/` as an independent Swift package
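
As a starting point for discussion, the heuristic mapping could look roughly like the sketch below. It works on plain extracted lines only; the `Block` enum, `classify`, `render`, the bullet set, and the 60-character heading threshold are all assumptions made for this example, not the package's actual design.

```swift
import Foundation

// Hypothetical block kinds; the real converter may model these differently.
enum Block: Equatable {
    case heading(String)
    case unordered(String)
    case ordered(Int, String)
    case paragraph(String)
}

// Classifies one extracted line using text-only heuristics (assumed rules).
func classify(_ rawLine: String) -> Block {
    let line = rawLine.trimmingCharacters(in: .whitespaces)

    // Unordered list: a common bullet glyph followed by a space.
    for bullet in ["• ", "- ", "* "] where line.hasPrefix(bullet) {
        return .unordered(String(line.dropFirst(bullet.count)))
    }

    // Ordered list: leading ASCII digits followed by ". " or ") ".
    let digits = line.prefix(while: { $0.isASCII && $0.isNumber })
    let rest = line.dropFirst(digits.count)
    if let number = Int(digits), rest.hasPrefix(". ") || rest.hasPrefix(") ") {
        return .ordered(number, String(rest.dropFirst(2)))
    }

    // Heading: short, starts uppercase, no sentence-ending punctuation.
    let endsLikeSentence = line.last.map { ".!?,;:".contains($0) } ?? true
    if line.count <= 60, line.first?.isUppercase == true, !endsLikeSentence {
        return .heading(line)
    }
    return .paragraph(line)
}

// Renders a block as a single Markdown line.
func render(_ block: Block) -> String {
    switch block {
    case .heading(let text): return "## " + text
    case .unordered(let text): return "- " + text
    case .ordered(let number, let text): return "\(number). " + text
    case .paragraph(let text): return text
    }
}

let sample = ["Test Strategy", "• Unit tests for headings",
              "2) Page-break handling", "This line ends a sentence."]
for line in sample { print(render(classify(line))) }
```

A text-only rule like this will mislabel short unpunctuated paragraphs as headings, so it is best read as a baseline that the unit tests can pin down and later refinements can replace.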
Test Strategy
- Package-level unit tests for headings / paragraphs / list detection
- Page-break handling across multi-page PDFs
- Hyphenated line join behavior
- Frontmatter and hard-break options
- `swift test` inside `packages/pdf-to-md-swift`
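
For the hyphenated line join case, the behavior under test could be something like the following sketch. `joinWrappedLines` is a hypothetical helper name, and the rule it applies (drop a trailing hyphen only when the next line starts with a lowercase letter) is an assumption for illustration, not the implemented behavior.

```swift
// Joins hard-wrapped lines into one paragraph string.
// Hypothetical helper; assumes each line is already trimmed.
func joinWrappedLines(_ lines: [String]) -> String {
    var result = ""
    for line in lines {
        if result.hasSuffix("-") {
            // A lowercase continuation suggests a word split by the wrap,
            // so drop the hyphen; otherwise keep it as a real hyphen.
            if line.first?.isLowercase == true { result.removeLast() }
            result += line
        } else if result.isEmpty {
            result = line
        } else {
            result += " " + line
        }
    }
    return result
}

print(joinWrappedLines(["The conver-", "ter streams", "Markdown."]))
print(joinWrappedLines(["A Layer-", "3 converter"]))
```

One known gap in this rule: a genuine compound split at its hyphen before a lowercase word (for example `state-` followed by `of-the-art`) would lose the hyphen, which is worth covering with an explicit test so the chosen behavior is documented.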