Mode: Maintenance -- Core features complete. Accepting bug reports and minor enhancements.
EasyLit is a web application that automates structured data extraction from academic PDFs stored in your Zotero library. It uses the Claude API to analyze research articles and extract constructs, variables, hypotheses, scales, themes, and research gaps -- with live analytics, trend synthesis, and a full HTML report with bibliography.
Claude API costs are covered (donorware model) -- users only need a free Zotero account.
There are already several well-developed literature review tools out there, and EasyLit is not trying to replace any of them. It exists to fill one specific gap that the others do not address well.
| Tool | Strength | What it is |
|---|---|---|
| Elicit | Semantic search across 125M+ papers, AI-generated summary tables | Discovery and abstract-level extraction |
| Research Rabbit | Citation graph exploration, "Spotify for papers" | Discovery |
| Litmaps | Visual citation maps, alerts on new related work | Discovery and monitoring |
| Connected Papers | Visual graph of related work from a seed paper | Discovery |
| Scite | "Smart citations" showing supporting vs. contradicting claims | Citation analysis |
| Covidence / Rayyan | PRISMA-style screening workflows for systematic reviews | Team screening |
| SciSpace | Chat-with-PDF, extraction tables, paraphrasing | Full-text analysis |
These tools are all strong in their lanes, and if you need article discovery, citation graph exploration, or team-based PRISMA screening, you should use them. EasyLit does not try to compete with any of them.
All of the discovery-focused tools above run into the same wall: getting actual full text is expensive and legally complicated.
Most of them lean on Semantic Scholar (~200M papers), OpenAlex (~250M works), CrossRef, PubMed, arXiv, or DOAJ for their backing corpus. These are free or near-free sources, but they only reliably provide metadata and abstracts, plus full text for open-access papers. Anything behind a publisher paywall is out of reach without direct licensing deals. Elicit has been working on publisher agreements, but those are slow, expensive, and incomplete. Covidence and Rayyan sidestep the problem entirely by making you upload the PDFs yourself, and SciSpace increasingly does the same.
The overhead this creates is substantial:
- Embedding and indexing compute for hundreds of millions of papers
- Ongoing metadata freshness (retractions, DOI changes, new preprint versions)
- Legal and contractual work to license full text where possible
- Storage that scales linearly with every user who uploads their own library
All of that is why every full-featured tool in this space charges somewhere between $12 and $20/mo to individuals, or thousands per year to institutions.
EasyLit assumes you have already solved the hardest problem: you brought the corpus.
Your institutional library paid for the access. Zotero stored the PDFs. You did the relevance filtering and screening. At that point, the remaining job -- actually pulling structured research data out of those PDFs -- is a narrow, well-defined extraction task, and that is the only thing EasyLit does.
Because the tool never maintains a corpus, there is no indexing bill, no metadata graph to keep fresh, no licensing to negotiate, and no storage that scales with every new user. The entire cost surface is just the Claude API calls for the specific PDFs you point at it. That is precisely why the donorware model works here and would not work for Elicit.
- You already curated the corpus. EasyLit is a post-curation extraction tool, not a discovery tool. If you already have a Zotero collection of the papers you care about, this is the straight line from "200 PDFs in a folder" to "structured CSV of constructs, variables, hypotheses, scales, and gaps."
- Structured extraction tuned for empirical research. The schema is opinionated: constructs, IV/DV/moderator variables, hypothesis-level row expansion with beta coefficients and effect sizes, measurement scales, sample characteristics, themes, and research gaps. This is specifically aimed at quantitative and mixed-methods dissertation and meta-analysis work.
- Donorware -- free to end users. Shared Claude API key with a daily per-user cap. No subscription, no paywall, no credit card.
- Zotero-native. Collection tree picker, PDF resolution from local storage or the Zotero API, BibTeX round-trip. No re-uploading a library you already maintain.
- Transparent and self-hostable. Public repo. You can run it on your own Droplet with your own Anthropic key. Your PDFs and prompts do not flow through a closed SaaS stack.
- Reports out of the box. HTML report with SVG publication timeline, variable co-occurrence matrix, 10 citation styles, and BibTeX export. Most competitors stop at "here is a table, export to CSV."
- No discovery or search. If you do not already have a Zotero library, EasyLit cannot help you build one. Use Elicit, Research Rabbit, or Litmaps for that stage.
- No PRISMA screening workflow. Covidence and Rayyan own that space and do it well.
- No citation graph visualization. Research Rabbit, Litmaps, and Connected Papers own that.
- Single-worker Flask app. Not built for large team collaboration the way Covidence is.
The honest pitch: if you are a researcher who already uses Zotero and needs to pull structured empirical data out of a curated collection without paying a monthly subscription or uploading your library to someone else's server, this is for you. It is a sharp tool for a narrow slot, not an Elicit killer.
- Zotero Collection Tree Picker -- Browse and select any collection from your Zotero library directly in the UI
- Automated PDF Extraction -- Downloads PDFs via the Zotero API and sends them to Claude for analysis
- Claude-Powered Analysis -- Extracts constructs, independent/dependent variables, moderators, instrumentation, key definitions, themes, theoretical constructs, and research gaps
- Meta-Analysis Mode -- Captures hypothesis-level numerical data (beta coefficients, effect sizes, sample sizes, R values) for quantitative synthesis
- Scale & Sample Mode -- Extracts measurement scale details and sample characteristics
- Running Trend Synthesis -- Periodic Claude-generated summaries of emerging patterns as articles are processed
- Mock Mode -- Tests the full UI and visualizations with synthetic data at no API cost
- Token Pre-Estimation -- Before every real run, EasyLit scans all PDFs using `count_tokens()` (no inference cost) and shows a pre-flight modal with exact input token counts, estimated output tokens, projected API cost, and estimated run time based on historical jobs
- Daily Usage Cap -- Configurable per-user daily cost limit (default $2/day) to prevent runaway spend
- The pre-flight modal requires confirmation before launching -- Cancel discards the run entirely
- CSV Export -- One row per article (or per hypothesis in Meta mode) with all extracted fields
- HTML Report -- Rendered research report with synthesis, publication timeline, frequency tables, variable co-occurrence matrix, bibliography, and a citable EasyLit reference
- BibTeX Export -- `.bib` file generated from article metadata for direct import into reference managers
- 10-Style Citation Card -- Live citation card on the analysis panel supporting APA 7, MLA 9, Chicago (Full Note and Author-Date), IEEE, Vancouver, Nature, AMA, Elsevier Harvard, and Bluebook
- Navigation-safe jobs -- Closing or refreshing the tab does not kill the job; the server thread continues running
- Auto-reconnect -- On return, EasyLit detects any active or recently completed job and restores the log, download buttons, and citation card
- Browser notifications -- When the job finishes while the tab is in the background, a browser notification fires (requires one-time permission grant)
- Usage dashboard -- Total jobs, articles, pages, API cost, token counts; per-user breakdown and guest activity
- Prompt versioning -- View, edit, and version all six Claude prompt templates; roll back to any prior version; changes take effect on the next extraction run
- Google OAuth -- Sign in with Google to save analyses and access history
- Analysis History -- Re-download CSV, report, or BibTeX for any past extraction without re-running
- Encrypted Zotero key storage -- Zotero API keys are encrypted at rest in Supabase using Fernet symmetric encryption
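The encrypted-at-rest flow can be sketched with the `cryptography` library's Fernet primitives. This is a minimal illustration, not EasyLit's actual storage code; the key-management details (where the Fernet key lives) are assumptions:

```python
from cryptography.fernet import Fernet

# One symmetric key, generated once and kept server-side (e.g. as an
# environment variable) -- never stored alongside the encrypted data.
key = Fernet.generate_key()
fernet = Fernet(key)

# Encrypt a user's Zotero API key before persisting it.
zotero_api_key = "P9NiFoyLeZu2bZNvvZQPDWsd"  # illustrative value
token = fernet.encrypt(zotero_api_key.encode())

# Only the ciphertext is stored; decrypt when a job actually needs the key.
restored = fernet.decrypt(token).decode()
assert restored == zotero_api_key
```

Fernet tokens are authenticated, so a tampered ciphertext raises `InvalidToken` on decrypt rather than silently yielding garbage.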
- Python 3.9 or higher
- A Zotero account with API access
- A Supabase project (free tier is sufficient)
- A Google OAuth application (for sign-in)
- An Anthropic API key (server-side only, not required from users)
```bash
git clone https://github.com/opieeipo/EasyLit.git
cd EasyLit
```

```bash
# macOS / Linux
python3 -m venv venv
source venv/bin/activate

# Windows
python -m venv venv
venv\Scripts\activate
```

```bash
pip install -r requirements.txt
```

Copy `env.example` to `.env` and fill in your values. Key additions for donorware mode:
```bash
# Shared Claude API key (all users share this)
ANTHROPIC_API_KEY=sk-ant-your-key

# Daily cost cap per user in USD
DAILY_COST_CAP_USD=2.00

# App base URL (your domain in production)
APP_BASE_URL=http://localhost:8000
```

See `env.example` for the full list.
Run `schema.sql` and `schema_prompt_versions.sql` in your Supabase SQL Editor to create all required tables.
```bash
python easylit_app.py
```

Then open your browser to http://localhost:8000
On first launch, the setup wizard guides you through connecting Zotero. Claude API is provided server-side -- no user API key needed.
| Setting | Description |
|---|---|
| Zotero API Key | Found at zotero.org/settings/keys |
| Zotero Library ID | Your numeric user ID -- use Auto-detect after entering your key |
| Model | claude-sonnet-4-20250514 recommended; Opus and Haiku also available |
| Delay (seconds) | Pause between PDFs to manage API rate limits |
| Retries | Retry attempts on failed extractions |
| Citation Format | Default style for the citation card and report (10 styles supported) |
Zotero settings for registered users are stored encrypted in Supabase.
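The citation card boils down to string templates over article metadata. Here is a rough sketch of two of the supported styles; the field names and helper functions are illustrative, not EasyLit's actual renderer:

```python
# Hypothetical metadata record; field names are illustrative.
article = {
    "authors": ["Smith, J.", "Lee, K."],
    "year": 2021,
    "title": "Trust in virtual teams",
    "journal": "Journal of Management",
    "volume": 47, "issue": 3, "pages": "512-534",
}

def apa7(a):
    """Rough APA 7 journal-article pattern."""
    return (f"{' & '.join(a['authors'])} ({a['year']}). {a['title']}. "
            f"{a['journal']}, {a['volume']}({a['issue']}), {a['pages']}.")

def ieee(a):
    """Rough IEEE journal-article pattern."""
    return (f"{', '.join(a['authors'])}, \"{a['title']},\" {a['journal']}, "
            f"vol. {a['volume']}, no. {a['issue']}, pp. {a['pages']}, {a['year']}.")

print(apa7(article))
print(ieee(article))
```

Real citation styles carry many edge cases (et al. thresholds, name inversion, DOI handling), which is why the style is a per-user setting rather than a hardcoded format.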
| Option | Description |
|---|---|
| Meta-Analysis Mode | Hypothesis rows with IV/DV names, beta coefficients, effect sizes, sample sizes |
| Scale & Sample Mode | Measurement scale details and sample characteristics |
| Remove Empty Columns | Drops columns with no extracted data from the CSV |
| Running Trend Synthesis | Claude summarizes emerging themes every N articles |
| Mock Mode | Synthetic data, no API calls, no cost -- for UI testing only |
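Meta-Analysis Mode's one-row-per-hypothesis expansion can be illustrated as follows. The field names here are hypothetical placeholders, not the actual extraction schema:

```python
article = {
    "title": "Remote work and performance",
    "year": 2020,
    "hypotheses": [
        {"iv": "Autonomy",  "dv": "Performance", "beta": 0.31,  "n": 240},
        {"iv": "Isolation", "dv": "Performance", "beta": -0.18, "n": 240},
    ],
}

def expand_rows(article):
    """One CSV row per hypothesis, with article metadata repeated on each row."""
    base = {k: v for k, v in article.items() if k != "hypotheses"}
    return [{**base, **h} for h in article["hypotheses"]]

rows = expand_rows(article)
# Two hypotheses -> two rows, each carrying title/year plus its own statistics.
```

This shape is what makes the CSV directly usable for quantitative synthesis: each effect size lands on its own row with its sample size and source article attached.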
```text
EasyLit/
|-- easylit_app.py              # Flask app entry point and route definitions
|-- jobs.py                     # Background extraction thread and job state
|-- estimate.py                 # Token pre-estimation thread (count_tokens)
|-- extraction.py               # PDF/data processing helpers
|-- prompts.py                  # Claude prompt templates and versioning
|-- reports.py                  # HTML report and BibTeX generation
|-- zotero_client.py            # Zotero API helpers
|-- db.py                       # Supabase database layer + usage cap
|-- auth.py                     # Google OAuth blueprint
|-- admin.py                    # Admin dashboard blueprint
|-- digest.py                   # Usage digest mailer
|-- config.py                   # Default config and load/save stubs
|-- deploy/
|   |-- setup.sh                # Droplet provisioning script
|   |-- easylit.nginx           # Nginx reverse proxy config
|   |-- easylit.service         # systemd service unit
|-- schema.sql                  # Main Supabase schema
|-- schema_prompt_versions.sql  # Prompt versioning table
|-- requirements.txt            # Python dependencies
|-- templates/
|   |-- index.html              # Main app frontend
|   |-- report.html             # Report template
|   |-- admin/
|   |   |-- dashboard.html      # Admin dashboard
|   |   |-- login.html          # Admin login
|-- README.md                   # This file
```
- Backend: Python, Flask, gunicorn, PyZotero, Anthropic Python SDK
- Frontend: Vanilla JavaScript, HTML/CSS
- AI: Anthropic Claude (`claude-sonnet-4-20250514`)
- Database / Auth: Supabase (PostgreSQL + service role key)
- Google OAuth: Authlib
- Encryption: Python `cryptography` library (Fernet)
- Deployment: DigitalOcean Droplet (nginx + gunicorn + systemd + Let's Encrypt)
- Create a $6/mo Droplet (1 vCPU, 1GB RAM, Ubuntu 22.04)
- Point your domain's A record to the Droplet IP
- SSH in and run: `bash deploy/setup.sh your-domain.com`
- Copy your `.env` to `/opt/easylit/.env`
- Start the service: `systemctl start easylit`

To deploy updates:

```bash
cd /opt/easylit && git pull && source venv/bin/activate && pip install -r requirements.txt && systemctl restart easylit
```

- Jobs run as background threads on the server. Closing the browser tab does not interrupt an in-progress extraction.
- If a PDF cannot be retrieved, the article is logged as skipped and processing continues.
- The token pre-estimation pre-flight uses `count_tokens()`, which performs no inference and incurs no inference cost, though the endpoint has its own rate limits.
- Prompt versions are loaded from Supabase at job start. Changes in the admin Prompts tab take effect on the next run with no redeployment needed.
- Daily usage cap is checked at job start. Users who exceed the cap see a clear message with their spend and the limit.
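The pre-flight arithmetic and the cap check reduce to a few lines. A minimal sketch follows; the per-token prices are placeholders, not Anthropic's actual rates, and the function names are illustrative:

```python
# Placeholder per-million-token prices -- substitute the current rates
# for your chosen model before trusting numbers like these.
PRICE_IN_PER_MTOK = 3.00
PRICE_OUT_PER_MTOK = 15.00

def projected_cost(input_tokens, est_output_tokens):
    """USD estimate for one run, given counted input and estimated output tokens."""
    return (input_tokens * PRICE_IN_PER_MTOK +
            est_output_tokens * PRICE_OUT_PER_MTOK) / 1_000_000

def within_daily_cap(spent_today, run_cost, cap=2.00):
    """True if launching this run would keep the user at or under the daily cap."""
    return spent_today + run_cost <= cap

cost = projected_cost(400_000, 40_000)  # e.g. 200 PDFs at ~2k input tokens each
# cost == 1.8, so a user who already spent $0.50 today would be blocked:
# within_daily_cap(0.50, cost) -> False
```

Because input tokens come from `count_tokens()` rather than a guess, the only soft number in the estimate is the projected output-token count, which EasyLit derives from historical jobs.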
EasyLit is released under the Apache License 2.0. You are free to use, modify, and redistribute the code for any purpose, commercial or otherwise, subject to the terms of the license. Attribution is required; warranty is not provided.
M. Opie Frazier