Kreuzberg is a polyglot document intelligence framework with a fast Rust core. We build tools that help developers extract, process, and understand documents at scale across 50+ formats, including PDFs, Office files, images, archives, and emails.
We're setting out to make high-performance document intelligence faster, cheaper, and more ecological.
A polyglot document intelligence engine
- Rust core
- Bindings for Python, TypeScript/Node.js, Ruby, Go, Java, C#, PHP, Elixir
- OCR with table extraction
- Streaming parsers for multi-GB files
- Built-in chunking + embeddings for RAG
- CLI, REST API, Docker, MCP server
- Read more: https://kreuzberg.dev/
A fully managed document intelligence API. Same engine, zero setup.
Planned features:
- Hosted REST API
- Async jobs + webhooks
- Built-in chunking for RAG pipelines
- Premium OCR backends
- Usage dashboard & analytics
- Simple pay-as-you-go pricing
High-performance HTML to Markdown conversion powered by Rust. Ships as a Rust crate, Python package, PHP extension, Ruby gem, Elixir Rustler NIF, Node.js bindings, WebAssembly, and a standalone CLI, all with identical rendering behaviour.
- Truly polyglot: Python, Rust, JS, Ruby, Go, Java, C#, PHP, Elixir.
- High throughput: Optimized for batch workloads and multi-GB documents.
- Memory efficient: Streaming architecture keeps RAM usage constant.
- Flexible deployment: Use as a library, CLI, Docker image, or REST API.
- MIT license: Safe for enterprise, commercial use, and closed-source products.
- Built for RAG: Native chunking + embeddings with full customization.
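To make the chunking idea concrete, here is a minimal, generic sketch of fixed-size chunking with overlap, the kind of preprocessing a RAG pipeline typically needs before embedding. It is not Kreuzberg's API or implementation: the function name `chunk_text` and the parameters `max_chars` and `overlap` are hypothetical and chosen only for illustration.

```python
def chunk_text(text: str, max_chars: int = 200, overlap: int = 40) -> list[str]:
    """Split text into character windows that overlap their neighbours.

    Hypothetical helper for illustration only; not part of Kreuzberg.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be >= 0 and smaller than max_chars")

    step = max_chars - overlap  # how far each new window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once the current window already reaches the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", max_chars=4, overlap=1):
        print(piece)
```

The overlap means the last characters of one chunk are repeated at the start of the next, so a sentence cut at a boundary still appears whole in at least one chunk. Production chunkers usually split on sentence or token boundaries rather than raw characters.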
Join our dev community, ask questions, and share what you're building.
- Discord: https://discord.gg/xzx4KkAPED
- Reddit: https://www.reddit.com/r/kreuzberg_dev/
- LinkedIn: https://www.linkedin.com/company/kreuzberg-dev/
- X/Twitter: https://x.com/kreuzberg_dev
Contributions are welcome! We follow a simple workflow:
- Open an issue to propose changes
- Submit a PR
- Maintainers review and merge
Please see CONTRIBUTING.md in the respective repos for detailed guidelines. The main Kreuzberg repo is at https://github.com/kreuzberg-dev/kreuzberg
All open-source code is MIT licensed. It's permissive, enterprise-safe, and commercial-friendly.
Built with love in the heart of Kreuzberg, Berlin's creative and gritty district.