Lorph is a full-stack AI research application built on the Ollama framework for interacting with cloud-based Large Language Models (LLMs). It combines a React frontend for user interaction and local file processing with a Node.js backend dedicated to deep web research, scraping, and synthesizing source-backed responses.
- Multi-Model LLM Access: Connects to multiple cloud-based LLMs through Ollama.
- Deep Web Research: The backend parses user intent, generates search queries, runs parallel web searches, and scrapes web pages to provide accurate, cited answers.
- Local File Processing: Extracts text directly in the browser from Images (OCR), PDFs, Word documents (.docx), Excel files (.xlsx), and plain text/code files.
- Rich Media & Citations: Automatically embeds inline citations, extracted images, and YouTube videos found during the research phase.
- Responsive UI: A dark-themed, chat-based interface that handles dynamic model selection, file attachments, and real-time streaming text.
> [!IMPORTANT]
> The diagram below shows Lorph's high-level deep research workflow. It is a conceptual view of the pipeline, not an exact map of the project's internal code structure.
```mermaid
flowchart TB
    subgraph Client [Frontend - React / Vite]
        UI[User Interface]
        FP[File Processor]
        MD[Markdown Renderer]
    end

    subgraph Backend [Node.js / Express Server]
        DRE[DeepResearchEngine]
    end

    subgraph External [External Services]
        OLLAMA[Ollama Cloud API]
        DDG[DuckDuckGo Lite]
        YT[YouTube Search via yt-search]
        WEB[Target Websites]
        PROXIES[Proxy Fallbacks]
    end

    UI -- "1. User Prompt + Files" --> FP
    FP -- "2. Extract Text in Browser" --> UI
    UI -- "3. Deep Research Prompt + Context" --> DRE

    subgraph Research Loop [Depth: 3 Iterations]
        DRE -- "4. Generate/Refine 15-20 Queries" --> OLLAMA
        DRE -- "5. Execute Parallel Searches" --> DDG
        DRE -- "5. Execute Parallel Searches" --> YT
        DRE -- "6. Deduplicate & Select New URLs" --> DRE
        DRE -- "7. Scrape Selected URLs" --> WEB
        DRE -. "Fallback when needed" .-> PROXIES
        PROXIES --> WEB
        DRE -- "8. Analyze Findings & Continue" --> DRE
    end

    DRE -- "9. Build Final Research Context" --> OLLAMA
    OLLAMA -- "10. Stream Research Response" --> DRE
    DRE -- "11. Stream to Client" --> UI
    UI --> MD
    MD -- "12. Render Rich Text & Media" --> UI
```
Follow these steps to run Lorph locally.
```bash
git clone https://github.com/AL-MARID/Lorph.git
cd Lorph
```

Lorph requires the Ollama client to connect to cloud models and an API key for authentication.
- Ollama Account: Create an account at Ollama.
- Email Verification: Verify your registered email address.
- Login Credentials: Have your Ollama login credentials readily available.
Install the Ollama client on your local machine.
- Linux & macOS:

  ```bash
  curl -fsSL https://ollama.com/install.sh | sh
  ```

- Windows: Download from ollama.com/download/windows

- Start Ollama Server: In a new terminal, initiate the server process:

  ```bash
  ollama serve
  ```

- Device Pairing & Login: In a separate terminal, authenticate your device:

  ```bash
  ollama signin
  ```

  Follow the on-screen instructions to open the authentication URL and connect your device.
To enable Lorph to connect with Ollama's cloud models, an API key must be configured.
- Generate API Key: After completing the device pairing, generate a new API key from your Ollama settings: ollama.com/settings/keys.
- Create `.env.local` file: In the root of the Lorph project directory, create a new file named `.env.local`.
- Add API Key: Insert the generated key into the `.env.local` file:

  ```
  OLLAMA_CLOUD_API_KEY=your_api_key_here
  ```
Use your preferred package manager to install the required packages.
```bash
npm install
```

- Development Mode:

  ```bash
  npm run dev
  ```

  This starts the Express server and Vite middleware concurrently.

- Production Build:

  ```bash
  npm run build
  npm start
  ```

Access the application at http://localhost:3000 in your browser.
Lorph is configured to interact with the following cloud-based LLMs through Ollama:
- deepseek-v3.1:671b-cloud
- gpt-oss:20b-cloud
- gpt-oss:120b-cloud
- kimi-k2:1t-cloud
- qwen3-coder:480b-cloud
- glm-4.6:cloud
- glm-4.7:cloud
- minimax-m2:cloud
- mistral-large-3:675b-cloud
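For context, a single request to one of these models might be shaped like the sketch below. The payload follows Ollama's public chat API, but the exact cloud endpoint URL and header handling are assumptions to verify against Ollama's documentation:

```typescript
// Hypothetical sketch of the request Lorph's backend might send to an
// Ollama cloud model. The endpoint URL and payload shape follow Ollama's
// public chat API, but verify both against Ollama's documentation.
function buildChatRequest(apiKey: string, model: string, prompt: string) {
  return {
    url: "https://ollama.com/api/chat", // assumed cloud host
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
        stream: true, // Lorph streams tokens back to the client
      }),
    },
  };
}
```

The object can be passed straight to `fetch(req.url, req.init)`; with `stream: true`, the response arrives as newline-delimited JSON chunks rather than a single body.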
- Frontend: React 19, Vite, Tailwind CSS, Lucide React
- Backend: Node.js, Express
- Language: TypeScript
- Markdown Rendering: React Markdown, remark-gfm, React Syntax Highlighter
- File Processing: PDF.js, Tesseract.js, Mammoth, read-excel-file
- Web Scraping & Search: Cheerio, node-fetch, DuckDuckGo Lite, yt-search
Lorph extracts text from attached files locally in the browser before sending the context to the backend.
- Images (JPEG, PNG): OCR text extraction via Tesseract.js.
- PDF: Multi-page text extraction via PDF.js.
- Word (DOCX): Raw text extraction via Mammoth.
- Excel (XLSX): Row-column parsing via read-excel-file.
- Plaintext / Code: Direct file read (TXT, MD, JSON, CSV, JS, TS, PY, etc.).
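As a rough illustration, the per-type dispatch above could be sketched as follows. The function name `pickExtractor` and the extractor labels are hypothetical, not Lorph's actual API; the real project wires each branch to Tesseract.js, PDF.js, Mammoth, or read-excel-file:

```typescript
// Hypothetical dispatch over the file types Lorph handles in the browser.
type Extractor = "ocr" | "pdf" | "docx" | "xlsx" | "text";

function pickExtractor(filename: string): Extractor {
  const ext = filename.toLowerCase().split(".").pop() ?? "";
  if (ext === "jpg" || ext === "jpeg" || ext === "png") return "ocr"; // Tesseract.js OCR
  if (ext === "pdf") return "pdf";   // PDF.js multi-page extraction
  if (ext === "docx") return "docx"; // Mammoth raw text
  if (ext === "xlsx") return "xlsx"; // read-excel-file row-column parsing
  return "text";                     // direct read for txt/md/json/code files
}
```

The catch-all `text` branch matches the behavior described above: anything that is not a recognized binary format is read directly as plaintext or code.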
The backend handles web research through an iterative, multi-step process that generates targeted queries, runs parallel searches, filters and deduplicates discovered URLs, scrapes selected pages, and synthesizes a final answer with inline citations:
- Initial Intent Parsing: The LLM analyzes the user's core intent and generates an initial batch of 15-20 highly specific search queries.
- Iterative Research Loop (Depth of 3): The engine performs 3 complete cycles of research. In each cycle, it:
  - Executes parallel searches across DuckDuckGo Lite and YouTube for all current queries.
  - Discovers and deduplicates candidate URLs across multiple search rounds before selecting a smaller subset for deeper scraping.
  - Deeply scrapes prioritized URLs, using proxy fallbacks when needed, extracting core text, OpenGraph images, and embedded videos via Cheerio.
  - Analyzes the newly gathered context to generate another 15-20 highly targeted queries for the next iteration to fill knowledge gaps.
- Source Tracking & Deduplication: Across the 3 iterations, the engine continuously tracks, filters, and deduplicates discovered sources.
- Synthesis & Citation: The most relevant extracted context is sent to the LLM to synthesize a comprehensive final response with inline citations and embedded rich media.
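The deduplication step in the loop above can be sketched as a small pure function. The normalization rules here (dropping query strings, fragments, and trailing slashes) are assumptions for illustration, not the engine's exact logic:

```typescript
// Illustrative URL dedup/selection across research iterations. A shared
// `seen` set persists between cycles so each iteration only scrapes URLs
// it has not visited before. Normalization rules are assumed, not Lorph's.
function selectNewUrls(candidates: string[], seen: Set<string>, limit: number): string[] {
  const picked: string[] = [];
  for (const url of candidates) {
    const norm = url.replace(/[?#].*$/, "").replace(/\/+$/, "");
    if (seen.has(norm)) continue; // already discovered in an earlier round
    seen.add(norm);
    picked.push(url);
    if (picked.length >= limit) break; // cap how many URLs get deep-scraped
  }
  return picked;
}
```

Keeping the `seen` set outside the function is what makes step 3 above work: each of the 3 iterations scrapes only genuinely new sources.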
> [!NOTE]
>
> - Deep research discovers more sources than it deeply reads.
> - Link deduplication happens during source collection.
> - Image OCR is currently configured with English recognition.
> - In development, the frontend is served through Vite middleware mounted inside the Express server.
- Fork the repository.
- Create a new branch (`git checkout -b feature/YourFeature`).
- Commit your changes (`git commit -m 'Add some feature'`).
- Push to the branch (`git push origin feature/YourFeature`).
- Open a Pull Request.
This project is distributed under the MIT License. See the LICENSE file for details.