PDFWrite

Description

pdf-weaver is a Next.js application that converts PDF documents and images into editable Markdown. It uses AI to extract text and recognize document structure, and provides a rich text editor for refining the result.

It is designed for developers, writers, and content creators who need to convert PDFs, scanned documents, and even handwritten notes into editable Markdown. Users can edit, format, and export the content to fit their existing workflows. Cloud synchronization using Supabase is planned for future development.

Features ✨

PDF and Image Upload: Supports both PDF and image file uploads.
Intelligent Text Extraction: Utilizes AI to extract and format text from PDFs and images.
Page Range Selection: Allows users to select specific pages or page ranges to process.
WYSIWYG Editor: Provides a rich text editor based on Tiptap for editing and formatting extracted content.
Markdown Preview: Offers live Markdown and HTML previews with syntax highlighting.
Export Options: Supports exporting content to various formats, including Markdown, HTML, DOCX and PDF.
Draft Saving: Automatically saves drafts when navigating back, preventing data loss.
Cloud Sync with Supabase (Coming Soon): Will allow users to access saved projects from any device.
Local Storage: Saved projects are stored locally in the browser for offline access.
Theme Support: Light and Dark theme support using Next Themes.

Tech Stack 💻

Framework: React, Next.js
Language: TypeScript, JavaScript
Styling: Tailwind CSS, tailwindcss-animate
AI: Genkit
Editor: Tiptap
Database (Planned): Supabase
PDF Processing: pdf-lib, pdfjs-dist
Other: Node.js, Express (implied by Genkit)

Installation ⚙️

Clone the repository:

git clone https://github.com/Sonucs12/pdf-weaver.git
cd pdf-weaver

Install dependencies:
```
npm install
```

Set up environment variables:

Create a .env.local file in the root directory and add the following:

GEMINI_API_KEY=<your_gemini_api_key>
NEXT_PUBLIC_SUPABASE_URL=<your_supabase_url>
NEXT_PUBLIC_SUPABASE_ANON_KEY=<your_supabase_anon_key>
SUPABASE_SERVICE_KEY=<your_supabase_service_key>

Replace placeholders with your actual API keys and Supabase credentials.
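
Before starting the app, it can help to confirm that none of these values is missing or still a placeholder. The following is a minimal sketch, not code from the repository: `findMissingEnvVars` is a hypothetical helper, and treating all four variables as required is an assumption (the Supabase keys may be optional while cloud sync is still planned).

```typescript
// Names of the variables listed in the .env.local example above.
const REQUIRED_ENV_VARS = [
  "GEMINI_API_KEY",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_KEY",
] as const;

type EnvSource = Record<string, string | undefined>;

// Hypothetical helper (not part of the repository): returns the names of
// required variables that are unset, empty, or still a <placeholder>.
function findMissingEnvVars(
  env: EnvSource,
  required: readonly string[] = REQUIRED_ENV_VARS,
): string[] {
  return required.filter((name) => {
    const value = env[name]?.trim();
    return !value || /^<.*>$/.test(value);
  });
}

// Example: only the Gemini key has been filled in, so the three
// Supabase variable names are reported.
const missing = findMissingEnvVars({ GEMINI_API_KEY: "abc123" });
console.log(missing);
```

In a Node.js context the same helper could be called as `findMissingEnvVars(process.env)`.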

Run patch-package (if necessary):
```
npx patch-package
```
Configure Firebase (if applicable):
- This README assumes the project may integrate with Firebase. If it does, ensure the necessary configuration is in place.

Usage 🚀

Run the development server:
```
npm run dev
```
Access the application: Open your browser and navigate to http://localhost:9002.
Extract Text from PDF: Navigate to the /extract-text route, then to /extract-text/create-new.
- Upload the PDF or images you wish to process.
- Select page ranges in the PDF.
- Click "Process Pages" to extract the content using AI.
Edit Extracted Text: Once processing is complete, you'll be directed to the editor where you can modify the extracted Markdown content.
Export: Export to various file formats including Markdown, HTML, DOCX and PDF.
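
The page range step above can be illustrated with a small parser. This is a sketch under assumptions, not the application's actual logic: `parsePageRanges` is a hypothetical helper, and the `"1-3, 5"` input syntax is assumed rather than taken from the project.

```typescript
// Hypothetical helper: parses input such as "1-3, 5" into sorted, unique,
// 1-based page numbers. Throws on malformed or out-of-bounds input.
function parsePageRanges(input: string, pageCount: number): number[] {
  const pages = new Set<number>();
  for (const part of input.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(token);
    if (!match) throw new Error(`Invalid page range: "${token}"`);
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end > pageCount || start > end) {
      throw new Error(`Page range out of bounds: "${token}"`);
    }
    for (let page = start; page <= end; page++) pages.add(page);
  }
  return Array.from(pages).sort((a, b) => a - b);
}

// Pages 1 through 3 plus page 5 of a 10-page document.
console.log(parsePageRanges("1-3, 5", 10)); // [1, 2, 3, 5]
```

Validating against the document's page count up front means a bad range fails before any pages are sent for processing.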

Real-World Use Cases

Convert Scanned PDFs: Transform scanned PDFs or handwritten notes into editable text.
Content Repurposing: Extract content from PDFs for use in blogs, articles, or other documents.
Document Summarization: Summarize lengthy PDF documents into concise Markdown notes.

How to Use ✍️

Create New: Use the /extract-text/create-new route to upload and process documents.
Edit Drafts: Access and modify automatically saved drafts from the /extract-text/draft route.
Saved Documents: Manage and edit saved projects through the /extract-text/saved route.
Editor: Edit and format your content using the WYSIWYG editor at /extract-text/editor.
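
Since drafts and saved projects live in the browser's local storage, the save and restore cycle behind the draft route can be pictured roughly as follows. This is only a sketch: the `Draft` shape, the `DRAFT_KEY` storage key, and the `saveDraft` / `loadDraft` helpers are hypothetical and are not taken from the repository.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical draft shape and storage key; the real ones may differ.
interface Draft {
  markdown: string;
  updatedAt: string; // ISO 8601 timestamp
}
const DRAFT_KEY = "pdf-weaver:draft";

function saveDraft(store: KeyValueStore, markdown: string, now: Date = new Date()): void {
  const draft: Draft = { markdown, updatedAt: now.toISOString() };
  store.setItem(DRAFT_KEY, JSON.stringify(draft));
}

function loadDraft(store: KeyValueStore): Draft | null {
  const raw = store.getItem(DRAFT_KEY);
  if (raw === null) return null;
  try {
    const parsed = JSON.parse(raw);
    const valid =
      typeof parsed?.markdown === "string" && typeof parsed?.updatedAt === "string";
    return valid ? (parsed as Draft) : null;
  } catch {
    return null; // corrupted entry: treat as "no draft"
  }
}

// In-memory stand-in for window.localStorage.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}

const store = createMemoryStore();
saveDraft(store, "# Extracted notes");
console.log(loadDraft(store)?.markdown);
```

In the browser, `window.localStorage` satisfies the `KeyValueStore` interface and could be passed in directly.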

Configuration Examples

Set API keys: Ensure your .env.local file contains a valid GEMINI_API_KEY (used by the Genkit AI flows) and valid Supabase credentials.

Project Structure 📂

pdf-weaver/
├── .idx/
├── .next/
├── .vscode/
├── apphosting.yaml
├── components.json
├── docs/
├── LICENSE
├── next-sitemap.config.js
├── next.config.ts
├── package.json
├── postcss.config.mjs
├── public/
├── src/
│   ├── ai/
│   ├── app/
│   ├── components/
│   ├── extensions/
│   ├── hooks/
│   ├── lib/
│   ├── styles/
│   └── types/
├── tailwind.config.ts
├── tsconfig.json
└── package-lock.json

Key Directories:

/src/ai: Contains AI-related flows and configurations using Genkit.
/src/app: Main Next.js application directory with routes and pages.
/src/components: Reusable React components.
/src/lib: Utility functions and configurations.
/src/workers: Web worker scripts for background tasks.

API Reference 📚

The project uses Genkit for its AI flows. The key modules are:

src/ai/flows/index.ts: Exports the extractAndFormatPages function.
src/ai/flows/extract-and-format.ts: Defines the extractAndFormatPages flow for text extraction and formatting.
src/ai/genkit.ts: Manages Genkit configurations and API key handling.

Supabase is intended for the planned cloud sync feature. See the .env.local file and src/lib/supabase.ts for the Supabase client setup.

Contributing 🤝

Contributions are welcome! Here's how you can contribute:

Fork the repository.
Create a new branch for your feature or bug fix.
Implement your changes.
Submit a pull request.

License 📜

This project is licensed under the MIT License - see the LICENSE file for details.

Important Links 🔗

Repository: https://github.com/Sonucs12/pdf-weaver

Footer

PDF-weaver - https://github.com/Sonucs12/pdf-weaver - Made with ❤️ by sonucs12 - Contribute, Like, Star, or raise Issues!

Generated by ReadmeCodeGen
