FAPESP Curriculum Summary builder from multiple sources: file uploads (PDF, XLS/XLSX, TXT, MD) and URLs (Lattes, ORCID, DBLP, Google Scholar, Web of Science, personal site). Output is FAPESP-format Markdown; optional async email delivery via MailerSend.
- Inputs: PDF, XLS/XLSX, TXT, MD files; Lattes, ORCID, DBLP, Google Scholar, Web of Science, personal site URLs; BibTeX paste for Scholar (MVP).
- Processing: Text extraction and curation to structured TXT; LLM used only on curated text to generate the summary.
- Output: FAPESP-formatted curriculum summary in Markdown.
- Email: Optional async sending via MailerSend (no sign-up required for MVP).
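
As a rough illustration of how the inputs listed above might be routed to extractors, here is a minimal sketch that classifies an uploaded filename or a URL into a source kind. The function name `classify_source`, the kind labels, and the hostname table are assumptions made for this example, not the project's actual code; any unrecognized URL is treated as a personal site.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical kind labels, keyed by the file extensions the README lists.
FILE_KINDS = {
    ".pdf": "pdf",
    ".xls": "spreadsheet",
    ".xlsx": "spreadsheet",
    ".txt": "text",
    ".md": "text",
}

# Assumed hostnames for the supported profile sites (illustrative only).
URL_KINDS = {
    "lattes.cnpq.br": "lattes",
    "orcid.org": "orcid",
    "dblp.org": "dblp",
    "scholar.google.com": "google_scholar",
    "webofscience.com": "web_of_science",
}


def classify_source(source: str) -> str:
    """Return a source kind for an uploaded filename or an http(s) URL."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        for domain, kind in URL_KINDS.items():
            # Match the domain itself or any subdomain such as "www.".
            if host == domain or host.endswith("." + domain):
                return kind
        return "personal_site"
    suffix = PurePosixPath(source).suffix.lower()
    if suffix in FILE_KINDS:
        return FILE_KINDS[suffix]
    raise ValueError(f"unsupported input: {source!r}")
```

For example, `classify_source("https://orcid.org/0000-0002-1825-0097")` returns `"orcid"`, while a `.docx` filename raises `ValueError`.
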
- Python 3.11+
- Docker (PostgreSQL 16, Redis 7)
- OpenAI API key
- Clone and enter the repo: `git clone git@github.com:gustavopinto/sumula.git && cd sumula`
- Create virtualenv and install: `python3 -m venv .venv && source .venv/bin/activate`, then `pip install -e .`
- Configure environment: `cp .env.example .env`, then edit `.env` with your OPENAI_API_KEY, MailerSend SMTP, etc.
- Start dependencies: `docker compose up -d`
- Run the app: `./run.sh`. This starts the API (uvicorn) and the background worker (arq). Use `./stop.sh` to stop them.
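
If `./run.sh` fails right after `docker compose up -d`, one quick check is whether PostgreSQL and Redis are accepting connections yet. The sketch below assumes the containers publish their default ports (5432 and 6379) on `localhost`; the compose file may map them differently, and the helper names are made up for this example.

```python
import socket

# Assumed default ports; adjust to match the port mappings in the compose file.
SERVICES = {"PostgreSQL": 5432, "Redis": 6379}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port is reachable."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}
```

Calling `check_services()` returns a dictionary such as `{"PostgreSQL": True, "Redis": False}`, which only shows that a port is open, not that the service behind it is healthy.
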
See .env.example for all variables. Main ones:
- `OPENAI_API_KEY`, `OPENAI_MODEL`, `OPENAI_MAX_TOKENS`: LLM
- `DATABASE_URL`: PostgreSQL (async)
- `REDIS_URL`: Redis (for arq)
- `MAX_UPLOAD_MB`, `MAX_FILES`: Upload limits
- `WORKDIR_PATH`: Directory for job files
- MailerSend: `SMTP_HOST`, `SMTP_PORT`, `SMTP_USERNAME`, `SMTP_PASSWORD`, `MAIL_DEFAULT_SENDER`, `MAIL_DEFAULT_SENDER_NAME`
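
To show how these variables could be read and validated at startup, here is a minimal standard-library sketch. The `Settings` class, the `load_settings` function, the choice of which variables are required, and every default value are assumptions for illustration; the project's own config module and `.env.example` are authoritative.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    database_url: str
    redis_url: str
    max_upload_mb: int
    max_files: int
    workdir_path: str


# Assumption: these three must be set; the real app may require more or fewer.
REQUIRED = ("OPENAI_API_KEY", "DATABASE_URL", "REDIS_URL")


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing early if any are missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        # The defaults below are placeholders, not the project's real values.
        max_upload_mb=int(env.get("MAX_UPLOAD_MB", "10")),
        max_files=int(env.get("MAX_FILES", "5")),
        workdir_path=env.get("WORKDIR_PATH", "./workdir"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.
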
- `app/`: FastAPI app, routes, extractors, worker, config
- `migrations/`: Alembic migrations
- `run.sh` / `stop.sh`: Start/stop API and worker
- `spec.md`: Full specification (in Portuguese)
MIT License. See LICENSE.