Jaison - OCR Service powered by LLM.

Extract structured data from document images using multimodal LLMs.

Overview

Jaison is a platform that uses multimodal Large Language Models (LLMs) to extract structured information from document images. Unlike traditional OCR services, which only convert images to text, Jaison relies on the visual understanding capabilities of multimodal models to extract the specific data the user requests.

Users can upload images of documents (such as receipts, invoices, and tickets), specify what information they want to extract using natural language prompts, and receive structured JSON data in response.
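As a rough illustration of that flow, the sketch below assembles a request body from an image and a natural-language prompt. The function name and the field names (`image_base64`, `prompt`) are assumptions made for this example, not the documented API; docs/api.md describes the actual request schema.

```python
import base64
import json


def build_extraction_request(image_bytes: bytes, prompt: str) -> str:
    """Return a JSON body pairing a base64-encoded image with a prompt.

    The field names are hypothetical; check docs/api.md for the real schema.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    payload = {
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "prompt": prompt,
    }
    return json.dumps(payload)


# Placeholder bytes stand in for the contents of a real image file.
body = build_extraction_request(b"placeholder image bytes", "Extract the total and the date")
```

Encoding the image as base64 keeps the whole request in a single JSON document; a multipart upload would be an equally plausible design.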

Architecture

Jaison uses a microservices architecture with two separate services:

- OCR API Service (Port 8420): Handles document processing and OCR functionality
- Admin API Service (Port 8421): Handles user authentication, API key management, and database access
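The split into two services can be captured in a small helper that builds URLs for either one. This is only a sketch: the ports come from the list above, while the `SERVICE_PORTS` mapping and the `service_url` function are names invented for this example.

```python
# Ports taken from the architecture description; the names are illustrative.
SERVICE_PORTS = {"ocr": 8420, "admin": 8421}


def service_url(service: str, path: str = "", host: str = "localhost") -> str:
    """Build an http URL for one of the two services."""
    port = SERVICE_PORTS[service]  # raises KeyError for an unknown service name
    return f"http://{host}:{port}/{path.lstrip('/')}"


print(service_url("ocr", "docs"))    # http://localhost:8420/docs
print(service_url("admin", "docs"))  # http://localhost:8421/docs
```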

Features

- Upload document images via API
- Extract structured data using natural language prompts
- Receive standardized JSON responses
- Dashboard for API key management and usage tracking
- User authentication and account management
- Support for various document types (receipts, invoices, IDs, etc.)

Getting Started

Prerequisites

- Python 3.9+
- Node.js 16+ and npm (for frontend)
- Supabase account
- OpenRouter API key
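A quick way to confirm the Python requirement before installing anything is a version check like the one below. It is a minimal sketch; `MIN_PYTHON` and `meets_minimum` are names chosen for this example and are not part of the project.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version listed in the prerequisites


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


if not meets_minimum():
    print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required")
```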

Backend Setup

Clone the repository:

git clone https://github.com/zakantonio/jaison.git
cd jaison

Set up the environment:

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

Configure environment variables:

cp .env.example .env
# Edit .env with your actual values
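Before starting the services, it can help to check that the edited file contains no empty values. The sketch below parses `KEY=VALUE` lines and reports missing entries. The key names in `REQUIRED_KEYS` are guesses based on the prerequisites (Supabase and OpenRouter); the authoritative list is whatever .env.example contains.

```python
# Hypothetical key names; compare against .env.example for the real ones.
REQUIRED_KEYS = ("SUPABASE_URL", "SUPABASE_KEY", "OPENROUTER_API_KEY")


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list:
    """List required keys that are absent or have an empty value."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


sample = "SUPABASE_URL=https://example.supabase.co\nOPENROUTER_API_KEY=\n"
print(missing_keys(sample))  # ['SUPABASE_KEY', 'OPENROUTER_API_KEY']
```

To check a real file, pass its contents instead of `sample`, for example the text read from `.env` in the repository root.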

Set up the database:
- Create a new project in Supabase
- Update your .env file with the Supabase URL and keys
- Run the database migrations:
```
python scripts/run_migrations.py
```
- This will automatically bootstrap the database and create all necessary tables and indexes
- If you encounter any issues, you can run the bootstrap script manually:
```
python scripts/bootstrap_database.py
```
Create required directories:
```
mkdir -p uploads results
```
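On systems where `mkdir -p` is unavailable (for example, the default Windows shell), the same directories can be created from Python. This is a sketch rather than a script shipped with the project; `ensure_directories` is a name made up for this example.

```python
from pathlib import Path


def ensure_directories(base: Path, names=("uploads", "results")) -> list:
    """Create each named directory under base if it does not already exist."""
    created = []
    for name in names:
        path = base / name
        path.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
        created.append(path)
    return created
```

Calling `ensure_directories(Path.cwd())` from the repository root mirrors the shell command above, and running it twice is harmless because of `exist_ok=True`.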

Frontend Setup

Navigate to the frontend directory:
```
cd frontend/jaison-dashboard
```
Install dependencies:
```
npm install
```
Start the development server:
```
npm start
```

Running the Application

You can use the provided Makefile commands to run the application:

# Run the OCR API service
make ocr-api

# Run the Admin API service
make admin-api

# Run the frontend server
make frontend

- The OCR API will be available at http://localhost:8420
- The Admin API will be available at http://localhost:8421
- The frontend will be available at http://localhost:3000
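To confirm that the three servers are actually accepting connections, a plain TCP check is enough. The sketch below uses only the standard library and the ports listed above; `is_listening` is a helper written for this example, and a successful connection shows only that something is bound to the port, not that it is the expected service.

```python
import socket


def is_listening(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


for name, port in {"OCR API": 8420, "Admin API": 8421, "Frontend": 3000}.items():
    status = "up" if is_listening(port) else "not reachable"
    print(f"{name} (port {port}): {status}")
```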

Using the Makefile

The project includes a Makefile with various helpful commands:

make help  # Show all available commands

Development

Running Tests

# Run all tests
make test

# Run tests with coverage report
make test-cov

Code Formatting and Linting

# Format code
make format

# Lint code
make lint

Project Structure

- jaison/ocr_api/ - OCR API Service
  - jaison/ocr_api/api/ - OCR API endpoints and models
  - jaison/ocr_api/services/ - OCR business logic and external services
  - jaison/ocr_api/config/ - OCR API configuration settings
  - jaison/ocr_api/utils/ - OCR API utility functions
- jaison/admin_api/ - Admin API Service
  - jaison/admin_api/api/ - Admin API endpoints and models
  - jaison/admin_api/database/ - Database models and repository
  - jaison/admin_api/services/ - Admin business logic
  - jaison/admin_api/config/ - Admin API configuration settings
  - jaison/admin_api/utils/ - Admin API utility functions
- frontend/jaison-dashboard/ - React frontend application
- tests/ - Test suite
- docs/ - Documentation files
  - docs/api.md - API documentation
  - docs/architecture.md - Architecture documentation
  - docs/setup.md - Setup guide

API Documentation

Once the services are running, you can access the API documentation at:

- OCR API Swagger UI: http://localhost:8420/docs
- Admin API Swagger UI: http://localhost:8421/docs

Detailed API documentation is also available in the docs/api.md file.

License

This project is licensed under the MIT License - see the LICENSE file for details.
