eldon922/ezra-be

Ezra Backend Service

This repository contains the backend API for Ezra, a web service that accepts YouTube or Google Drive audio links, transcribes the content, and returns the result to authenticated users as plain-text, Markdown, or Word documents. The service also includes an administrative interface for managing users, prompts, and system settings.


🚀 Features

  • User authentication with JWT tokens
  • Audio retrieval from Google Drive or YouTube (via gdown/yt-dlp)
  • Optional trimming of audio using FFmpeg
  • Asynchronous processing with flask_executor
  • Transcription & proofreading logic (via external services)
  • File download endpoints (TXT, MD, Word)
  • Admin routes for managing users, prompts, transcriptions, and settings
  • PostgreSQL database managed with SQLAlchemy
  • Lightweight Flask app easily containerized or deployed with Gunicorn & Nginx
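
The optional trimming step can be pictured as assembling an FFmpeg command from the requested start and end times. The sketch below is illustrative only and is not taken from the service's code: the function name build_trim_command and the file names are hypothetical, and it assumes the times arrive as strings FFmpeg accepts (seconds or HH:MM:SS).

```python
from typing import List, Optional


def build_trim_command(src: str, dst: str,
                       start_time: Optional[str] = None,
                       end_time: Optional[str] = None) -> List[str]:
    """Return an ffmpeg argument list that cuts src to [start_time, end_time]."""
    cmd = ["ffmpeg", "-y"]           # -y: overwrite the output file if present
    if start_time:
        cmd += ["-ss", start_time]   # input option: seek to the start position
    if end_time:
        cmd += ["-to", end_time]     # input option: stop reading at this position
    cmd += ["-i", src, "-c", "copy", dst]  # copy the stream without re-encoding
    return cmd


if __name__ == "__main__":
    print(build_trim_command("in.mp3", "out.mp3", "00:01:00", "00:11:00"))
```

The returned list can be handed to subprocess.run(cmd, check=True). When neither time is given, the command reduces to a plain stream copy.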

🛠️ Tech Stack

  • Python 3.11+ (tested)
  • Flask
  • Flask-JWT-Extended
  • Flask-SQLAlchemy
  • Flask-Executor
  • PostgreSQL
  • yt-dlp, ffmpeg, pandoc

📁 Repository Structure

admin_routes.py
app.py                 # Main Flask application
database.py            # SQLAlchemy initialization
models.py              # ORM models
pandoc_service.py      # Converts documents via Pandoc
proofreading_service.py
transcription_service.py
password.py            # helper functions for password generation
wsgi.py                # Gunicorn entrypoint
migrations/            # SQL migration scripts
readme.md              # You are here
requirements.txt
Dockerfile

⚙️ Configuration

Environment Variables

| Variable | Description | Example |
| --- | --- | --- |
| DATABASE_URL | SQLAlchemy database URI | postgresql://user:pass@host/db |
| JWT_SECRET_KEY | Secret key for signing JWT tokens | a-very-secret-value |
| DEEPSEEK_BASE_URL | Base URL for Deepseek API | https://api.deepseek.com |
| DEEPSEEK_API_KEY | API key for Deepseek service | sk-xxxxxxxxxxxx |
| TRANSCRIBE_API_KEY | API key used by transcription microservice | Jsh2Y-KlsHSKhAg7K... |
| TRANSCRIBE_API_URL | Endpoint for transcription service | https://eldon922--ezra-inference-process.modal.run |
| GET_RESULT_TRANSCRIBE_API_URL | Endpoint to fetch transcription results | https://eldon922--ezra-inference-get-transcription-result.modal.run |

Load them via a .env file or your deployment environment. You can use the included .env template if available.
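
One way to catch a missing setting early is to validate the environment at startup. The following is a minimal sketch, not the project's actual startup code: load_config is a hypothetical helper, and treating every variable listed above as required is an assumption.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the list above; "all required" is an assumption.
REQUIRED_VARS = (
    "DATABASE_URL",
    "JWT_SECRET_KEY",
    "DEEPSEEK_BASE_URL",
    "DEEPSEEK_API_KEY",
    "TRANSCRIBE_API_KEY",
    "TRANSCRIBE_API_URL",
    "GET_RESULT_TRANSCRIBE_API_URL",
)


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the required settings, failing early if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling load_config() once before the app starts serving turns a misconfigured deployment into an immediate, readable error rather than a failure on the first request.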


📝 Installation (local development)

  1. Clone the repository

    git clone <repo-url> ezra-be
    cd ezra-be
  2. Create & activate a Python virtual environment

    python3 -m venv venv
    source venv/bin/activate   # Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Set environment variables (via .env or export):

    export DATABASE_URL="postgresql://..."
    export JWT_SECRET_KEY="change-this"
  5. Run migrations (if using):

    python migrations/migrate.py
  6. Start the application

    python app.py
    # or with Flask
    export FLASK_APP=app.py
    flask run

Open http://localhost:5000/ and test the /login endpoint.


📑 API Endpoints

Authentication

  • POST /login – body {username, password} → JWT access token

User Routes (require Authorization: Bearer <token>)

  • POST /process – submit a transcription request (form data: drive_link, optional start_time, end_time)
  • GET /transcriptions – list current user's transcriptions
  • GET /download/{txt|md|word}/{id} – download a completed file
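
The login and submission calls can be sketched with the standard library alone. This is an illustrative client, not part of the repository: BASE_URL, the helper names, and the URL-encoded form body (rather than multipart) are assumptions. The requests are only built here, not sent.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: local development server


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """POST /login with a JSON body of {username, password}."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_process_request(token, drive_link, start_time=None, end_time=None):
    """POST /process as form data, authenticated with a Bearer token."""
    fields = {"drive_link": drive_link}
    if start_time is not None:
        fields["start_time"] = start_time
    if end_time is not None:
        fields["end_time"] = end_time
    return urllib.request.Request(
        BASE_URL + "/process",
        data=urllib.parse.urlencode(fields).encode("utf-8"),
        headers={"Authorization": "Bearer " + token},
        method="POST",
    )
```

Each request can be sent with urllib.request.urlopen(req). The name of the JSON field that carries the token in the /login response is not documented here, so inspect the response before wiring the two calls together.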

Admin Routes (require the JWT of a user with is_admin=true)

Under /admin prefix:

  • GET /users, POST /users, DELETE /users/{id}
  • GET /transcriptions, DELETE /transcriptions/{id}
  • GET /logs
  • Prompt management (/transcribe-prompts, /proofread-prompts)
  • Settings endpoints to select active prompts

See admin_routes.py for full details and request/response shapes.


🗂 Database Schema

Tables defined by models.py include User, Transcription, ErrorLog, TranscribePrompt, ProofreadPrompt, SystemSetting, etc. Scripts in migrations/ provide initial SQL.
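
The UPDATE statement under Linux Snippets suggests that system settings are stored as key/value rows. The sketch below illustrates that pattern with an in-memory SQLite database standing in for PostgreSQL. Only the table name, the two column names, and the transcribing_allowed key come from this document; the column types, the primary key, and the helper functions are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE system_settings (setting_key TEXT PRIMARY KEY, setting_value TEXT)"
)
conn.execute(
    "INSERT INTO system_settings VALUES (?, ?)", ("transcribing_allowed", "false")
)


def set_setting(db: sqlite3.Connection, key: str, value: str) -> None:
    """Update one setting by key, using bound parameters instead of string formatting."""
    db.execute(
        "UPDATE system_settings SET setting_value = ? WHERE setting_key = ?",
        (value, key),
    )
    db.commit()


def get_setting(db: sqlite3.Connection, key: str):
    """Return the stored value for key, or None if the key does not exist."""
    row = db.execute(
        "SELECT setting_value FROM system_settings WHERE setting_key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


set_setting(conn, "transcribing_allowed", "true")
```

Values are kept as text here, matching the quoted 'true' in the SQL snippet; the real models may type them differently.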


⚠️ Deployment Tips

  • Create a virtual environment and install dependencies.
  • Run the app under Gunicorn with wsgi:app and manage it with systemd (a sample service file appears under Linux Snippets below).
  • Serve behind Nginx as a reverse proxy; make sure the service user can read and write the per-user file directories.
  • Install system packages: pandoc, ffmpeg, and keep yt-dlp up to date.
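
A quick pre-flight check can confirm that the external tools are reachable before the service starts. This is a small sketch with a hypothetical helper name, missing_tools; it assumes each tool is invoked by the executable name shown.

```python
import shutil
from typing import Iterable, List

# Executables named in the tips above; yt-dlp may instead live inside the virtualenv.
REQUIRED_TOOLS = ("pandoc", "ffmpeg", "yt-dlp")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    absent = missing_tools()
    print("All tools found" if not absent else "Missing: " + ", ".join(absent))
```

Run the check with the same PATH the service uses. The sample systemd unit under Linux Snippets restricts PATH to the virtualenv's bin directory, which is why ffmpeg and ffprobe are copied or linked there.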

📦 Docker

A Dockerfile is included for container builds. Adapt as needed for production.


📚 Additional Resources

Links to DigitalOcean tutorials (e.g., Flask+Gunicorn+Nginx, PostgreSQL setup, firewall rules) are kept for reference.

Linux Snippets

# DEPLOY BACKEND #########################################################################

cd ~/ezra-be
git checkout main
git pull
source ~/ezra-be/venv/bin/activate
pip install -r requirements.txt
sudo systemctl restart ezra-be
sleep 5
sudo systemctl status ezra-be

------------------------------------------------------------------------------------------

journalctl -e -u ezra-be
htop

------------------------------------------------------------------------------------------

cd ~/ezra-be
python3 -m venv venv                 # first-time setup only
source ~/ezra-be/venv/bin/activate

python3 app.py

------------------------------------------------------------------------------------------

curl http://149.248.36.65/login

curl -H "Content-type: application/json" -d '{
    "username": "eldon",
    "password": "eldon444"
}' 'http://149.248.36.65/login'

# COPY/CUT/REMOVE/RENAME/LINK FILES ######################################################

cp -r /usr/bin/ffmpeg /root/ezra-be/venv/bin/ffmpeg
scp root@104.248.159.174:/root/ezra-be/txt/eldon/2455-10minutes.txt .
scp -P 47903 C:/Users/AVOWS/Desktop/ASR/audio_files/3648.mp3 user@194.106.118.83:~/whisper/audio_files/3648.mp3
mv ezra-be /home/ezra_user/
ln -s /usr/bin/ffprobe /root/ezra-be/venv/bin/ffprobe

# DATABASE ###############################################################################

sudo -u ezra_user psql ezra

UPDATE system_settings SET setting_value = 'true' WHERE setting_key = 'transcribing_allowed';

psql 'postgres://avnadmin:[PASSWORD]@ezra-ezra.e.aivencloud.com:10744/ezra_be?sslmode=require'

# NGINX BACKEND ##########################################################################

sudo nano /etc/nginx/sites-available/ezra-be

------------------------------------------------------------------------------------------

server {
    listen 80;
    server_name _;

    # allow 127.0.0.1;
    # deny all;

    location / {
        include proxy_params;
        proxy_pass http://unix:/root/ezra-be/ezra-be.sock;
    }
}

------------------------------------------------------------------------------------------

sudo ln -s /etc/nginx/sites-available/ezra-be /etc/nginx/sites-enabled

cd /etc/nginx/sites-enabled
sudo rm default

sudo nginx -t
sudo systemctl restart nginx

# GUNICORN BACKEND SERVICE ###############################################################

sudo nano /etc/systemd/system/ezra-be.service

------------------------------------------------------------------------------------------

[Unit]
Description=Gunicorn instance to serve ezra-be
After=network.target

[Service]
User=root
Group=www-data
WorkingDirectory=/root/ezra-be
Environment="PATH=/root/ezra-be/venv/bin"
ExecStart=/root/ezra-be/venv/bin/gunicorn --timeout 0 --threads 3 --workers 3 --bind unix:ezra-be.sock -m 007 wsgi:app

# Memory management
MemoryAccounting=yes
MemoryHigh=400M

CPUQuota=80%

[Install]
WantedBy=multi-user.target

------------------------------------------------------------------------------------------

sudo systemctl daemon-reload

sudo systemctl start ezra-be
sudo systemctl stop ezra-be
sudo systemctl restart ezra-be
sudo systemctl enable ezra-be
sudo systemctl status ezra-be
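
The Gunicorn flags in the ExecStart line above can also be kept in a Python config file. A minimal sketch, assuming a hypothetical gunicorn.conf.py placed in the working directory (no such file is described in this repository); the values simply mirror the command-line flags:

```python
# gunicorn.conf.py (hypothetical file; values mirror the ExecStart flags)
bind = "unix:ezra-be.sock"  # --bind: socket path relative to WorkingDirectory
workers = 3                 # --workers
threads = 3                 # --threads
timeout = 0                 # --timeout 0 disables the worker timeout
umask = 0o007               # -m 007: no permissions for "other" on the socket
```

With this file in place, ExecStart could shrink to gunicorn -c gunicorn.conf.py wsgi:app, keeping tuning changes out of the unit file.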

# SSL #####################################################################################

sudo certbot --nginx -d transcript.griibandung.org -d www.transcript.griibandung.org

🙌 Contributing

Feel free to submit issues or pull requests. Follow Python style guidelines and update tests when adding features.


📜 License

Specify the license for your project here (e.g. MIT, Apache 2.0).
