An AI-powered tool that transcribes, summarizes, and extracts action items from meeting recordings, lectures, and interviews.
Katip helps you save time by turning long audio recordings into useful summaries and to-do lists. Upload your meeting, lecture, or interview recording, and Katip will:
- Transcribe the audio to text using OpenAI's Whisper
- Summarize the key points and important decisions
- Extract action items and create a task list
Available as a web app, desktop app (Windows, macOS, Linux), and mobile app (Android).
- 🎙️ Audio Transcription - Convert speech to text with Whisper
- 📝 Smart Summaries - Get structured summaries of main topics and decisions
- 🤖 Local LLM Support - Use Ollama, LM Studio, or Llama.cpp for private, offline summarization
- ☁️ Cloud LLM Support - OpenAI, Groq, OpenRouter, Gemini integration
- ✅ Task Extraction - Automatically identify and list action items
- 🌍 Multi-language - Support for 10 languages
- 💻 Cross-platform - Web, desktop, and mobile apps
- 🔒 Local-First - Your data stays on your device by default, with optional cloud sync
- 🎨 Modern UI - Clean interface with dark mode support
- 📖 Open Source - Fully transparent and customizable
- ⚡ GPU Acceleration - Vulkan support for faster transcription
Prerequisites:

- Node.js (v20 or higher)
- pnpm (v10 or higher)
- Rust (latest stable)
- LLVM (Windows only, required for whisper-rs compilation)
For mobile development:
- Android Studio (for Android)
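To confirm your toolchain meets these requirements before installing, a quick sanity check (assuming the tools are already on your PATH):

```bash
# Each command should print a version that satisfies the requirements above
node --version    # expect v20 or higher
pnpm --version    # expect 10 or higher
rustc --version   # expect a recent stable release
```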
```bash
# Clone the repository
git clone https://github.com/odest/katip.git
cd katip

# Install dependencies
pnpm install

# Start development
pnpm dev
```

Desktop App:
```bash
# CPU-only (default)
pnpm tauri dev

# With Vulkan GPU acceleration (recommended for AMD/NVIDIA GPUs)
pnpm tauri dev -- --features vulkan
```
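If you are unsure whether your system exposes a working Vulkan driver, one way to check (assuming the vulkan-tools package is installed) is:

```bash
# Prints detected GPUs and driver versions; fails if no Vulkan driver is present
vulkaninfo --summary
```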
Web App:

```bash
pnpm --filter web dev
```

Build for Production:
```bash
# Desktop (CPU-only)
pnpm tauri build

# Desktop with GPU acceleration
pnpm tauri build -- --features vulkan

# Android
pnpm tauri android build
```

How it works:

- Upload Audio - Drop your meeting or lecture recording
- Transcription - Whisper converts speech to text
- AI Processing - LLM analyzes the transcript (see the example request after this list)
- Get Results - View summary and action items
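This is not Katip's internal code, but it illustrates the kind of request the AI-processing step makes: any OpenAI-compatible provider can be driven with a chat-completions call. Ollama's local endpoint and the llama3.2 model shown here are just one possible setup:

```bash
# Hypothetical example: summarize a transcript via a local OpenAI-compatible API
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [
      {"role": "system", "content": "Summarize this transcript and list all action items."},
      {"role": "user", "content": "<transcript text here>"}
    ]
  }'
```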
If you are using the Web version and want to connect to a local LLM provider like Ollama or LM Studio, you need to configure CORS to allow requests from the browser.
For Ollama, set the OLLAMA_ORIGINS environment variable before starting the server:
```bash
# Windows (PowerShell)
$env:OLLAMA_ORIGINS="*"; ollama serve

# macOS/Linux
OLLAMA_ORIGINS="*" ollama serve
```

For LM Studio, enable CORS in the server settings (Settings → Server → Enable CORS).
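To verify the CORS change took effect, send a request with an Origin header and look for Access-Control-Allow-Origin in the response (a quick check, assuming Ollama's default port 11434):

```bash
# With OLLAMA_ORIGINS="*", the response headers should include
# Access-Control-Allow-Origin: *
curl -i -H "Origin: http://localhost:3000" http://localhost:11434/api/tags
```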
For the desktop app, you need to download a Whisper model in ggml format. Models are available at Hugging Face. Recommended models:
- tiny/base - Fast, lower accuracy
- small/medium - Balanced
- large-v3-turbo-q5_0 - Best accuracy, requires more resources
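For example, the ggml conversions in the ggerganov/whisper.cpp repository on Hugging Face can be fetched directly; the same URL pattern should work for the other sizes listed above:

```bash
# Download the ggml "base" model; swap the filename for other sizes,
# e.g. ggml-small.bin or ggml-large-v3-turbo-q5_0.bin
curl -L -o ggml-base.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin
```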
Tech stack:

- Frontend: Next.js, React, TypeScript
- Desktop/Mobile: Tauri 2.0, Rust
- Transcription: whisper.cpp (native), Transformers.js (web)
- AI: OpenAI Whisper, OpenAI-compatible LLM providers
- Styling: Tailwind CSS, shadcn/ui
- State: Zustand
- Database: SQLite (local), SQLocal (web), PostgreSQL (cloud sync)
- Build: pnpm, Turborepo
```
katip/
├── apps/
│   ├── native/              # Desktop & mobile (Tauri + Next.js)
│   │   ├── src/             # Next.js frontend
│   │   └── src-tauri/       # Rust backend
│   └── web/                 # Web app (Next.js)
└── packages/
    ├── ui/                  # Shared UI components, hooks, stores
    ├── database/            # Drizzle ORM schemas (SQLite & PostgreSQL)
    ├── i18n/                # Translations (10 languages)
    ├── eslint-config/       # Shared ESLint rules
    └── typescript-config/   # Shared TypeScript config
```
We welcome contributions! Please check CONTRIBUTING.md for guidelines.
This project is licensed under GPL-3.0. See LICENSE for details.
Built with tauri-nextjs-template