A local-first personal AI knowledge operating system.
Dump everything, structure it, search it semantically, traverse it as a graph, and let AI agents operate it — all entirely on your Mac, with no cloud dependency and no data leaving your machine.
Most knowledge tools are either too simple (note apps that can't reason) or too complex (enterprise wikis that require a team to maintain). KnowledgeOS is built on a different premise:
Your knowledge should be a living system, not a filing cabinet.
The design decisions all follow from this:
- Local-first, always. Every byte lives under `~/KnowledgeOS/`. No sync service, no cloud account required. You own your data unconditionally.
- Agent-ready from the start. Claude, ChatGPT, Codex, and local LLMs can search, read, write, and link knowledge through a built-in MCP server, not as a bolt-on feature but as a first-class access mode.
- Structured but flexible. Everything is a typed object (`KosObject`) with a common base. Pages, sources, assets, claims, projects, and chats are all objects you can search, link, and reason over in the same interface.
- Soft everything. No hard deletes. No permanent overwrites. Every agent write is logged with a before/after diff. You can always undo, audit, or restore.
- Durable indexes, not canonical truth. Qdrant (vector) and Kùzu (graph) are re-buildable from Postgres + filesystem. Postgres is the operational source of truth. Indexes are caches.
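The "before/after diff" idea in the list above can be pictured with Python's standard `difflib`. This is only a minimal sketch, not the project's logging code: `audit_entry` and the field names in the returned dict are hypothetical, chosen for illustration.

```python
import difflib
from datetime import datetime, timezone


def audit_entry(agent: str, object_id: str, before: str, after: str) -> dict:
    """Build a log record holding a unified diff of one write (hypothetical shape)."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
    )
    return {
        "agent": agent,
        "object_id": object_id,
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "diff": "\n".join(diff),
    }


entry = audit_entry("claude", "page-1", "Draft title\nOld body", "Draft title\nNew body")
print(entry["diff"])
```

Because the record keeps both removed and added lines, the same entry can drive an audit view or an undo.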
The full product vision is in project-phases/IDEA-DRAFT.md.
Phases 1–4, Phase 5 AI Assistant, Phase 6A/6B Chat Import, Phase 7A MCP Read/Search, Phase 7B MCP Write Tools, and Phase 8A Workspace Lite are complete. Phase 8 (multi-pane workspaces), Phase 9 (career memory), and the multilingual search hardening track are in progress or planned.
See PROGRESS.md for the canonical progress tracker.
| Phase | Status |
|---|---|
| Phase 1 — Foundation | ✅ Complete |
| Phase 2 — Sources & Rich Media | ✅ Complete |
| Phase 3 — Search | ✅ Complete |
| Phase 4 — Graph Lite | ✅ Complete |
| Phase 5 — AI Assistant + Inbox/Triage | ✅ Complete |
| Phase 6A — Chat Import Lite | ✅ Complete |
| Phase 6B — Structured Chat Import | ✅ Complete |
| Phase 7A — MCP Read/Search | ✅ Complete |
| Phase 7B — MCP Write Tools | ✅ Complete |
| Phase 8A — Workspace Lite (split pane) | ✅ Complete |
| Hardening — Search Quality / Multilingual | 🚧 Partial (ILIKE fallback only) |
| Phase 8 — Multi-Pane Workspaces | ⬜ Planned |
| Phase 9 — Career & Project Memory | ⬜ Planned |
| PHASE-PHONE-00 — Mobile Architecture & Contract | ✅ Complete (docs only) |
| PHASE-PHONE-01A/B/C — Mobile Auth + Networking + iOS Scaffold | ⬜ Planned |
| PHASE-PHONE-02A/B → 06 — iOS client, capture, AI, edit-lite, offline, device install | ⬜ Planned |
Test counts (as of 2026-05-15):
- Frontend unit/component tests: 27 in `apps/web/src/**/__tests__/`
- API integration + backend unit tests: 147 across `tests/api/` and `tests/unit/` (131 api + 16 unit)
- Worker extractor tests: 19 in `tests/worker/`
- MCP package tests: 35 in `services/mcp/tests/`
- Playwright E2E tests: 10 in `tests/e2e/specs/`
- Total: 238 tests across all local suites
See each phase's plan in project-phases/ for the full subtask spec.
Browser (Next.js 14 + React + Tiptap)
└─► FastAPI /api/v1
├─► Postgres 16 ← operational source of truth
│ objects, pages, edges, chunks, agent_runs, ingestion_jobs
├─► Redis ← RQ job queue + cache
├─► ~/KnowledgeOS/library/ ← binary assets (SHA-256 content-addressed)
├─► Qdrant ← vector index (re-buildable)
└─► Kùzu ← graph index (re-buildable)
Python Workers (RQ)
└─► ingestion → text extraction → chunking → embedding → graph sync → AI extraction
MCP Server (:8765)
└─► read / search / write / ingest tools
↑
Claude / ChatGPT / Codex / Cursor / local agents
Canonical source of truth: Postgres (structured data) + local filesystem (binary assets).
Qdrant and Kùzu are indexes. Treat them as caches; they can be rebuilt from Postgres at any time.
All host ports bind to 127.0.0.1 only. LAN access is opt-in.
| Layer | Technology |
|---|---|
| Frontend | Next.js 14 + React 18 + Tailwind CSS |
| Editor | Tiptap / ProseMirror (stores content as JSON) |
| Backend API | FastAPI + SQLAlchemy 2.0 async |
| Background workers | Python + Redis + RQ |
| MCP server | Python MCP server over stdio (Phase 7A) |
| Operational DB | Postgres 16 |
| Vector DB | Qdrant (local Docker) |
| Graph DB | Kùzu (embedded in worker process) |
| Queue / cache | Redis 7 |
| Asset storage | Local filesystem, content-addressed by SHA-256 |
| AI | OpenAI API (Phase 5+); Ollama local LLMs (later) |
| Deployment | Docker Compose on Mac |
| iPhone client (planned) | SwiftUI native iOS + XcodeGen (apps/ios/, Phase PHONE-01C onward) |
- macOS, Docker Desktop, Node.js 20+, pnpm 10+, Python 3.12+, uv
# 1. Clone and enter the repo
git clone <repo-url> && cd agentic-knowledge-management
# 2. Create local directories + copy .env
bash scripts/setup.sh
# 3. Edit infra/.env as needed — all settings have working defaults
# (SESSION_SECRET is optional; see docs/SECURITY.md for the session model)
# 4. Start all services
docker compose -f infra/docker-compose.yml up -d
# 5. Open the app
open http://localhost:3000
Register an account on first visit. All data stays local.
Simpler, non-engineer walkthrough: quickstart.md.
| Symptom | What to do |
|---|---|
| `bind: address already in use` on 8001 | Free 127.0.0.1:8001 (often a local uvicorn on that port). Stop it, then run `docker compose -f infra/docker-compose.yml up -d` again. |
| Mount error / path contains `YOUR_USERNAME` | Set `LIBRARY_ROOT` in `infra/.env` to your real library path (under `~/KnowledgeOS/`). Run `bash scripts/setup.sh` (it patches a stale placeholder). If a bad volume already exists: `docker compose -f infra/docker-compose.yml down`, `docker volume rm infra_library-data`, then `up -d`. |
| Internal Server Error from the web UI / Next `Module not found` | Stale `node_modules` volume for `kos-web`: `docker compose -f infra/docker-compose.yml rm -sf web`, `docker volume rm infra_kos-web-node-modules`, `docker compose -f infra/docker-compose.yml up -d --build web`. If you use `pnpm dev` on the host with the API in Docker, add `apps/web/.env.local` from `apps/web/.env.local.example`. Confirm `docker ps` shows `kos-web` on port 3000. |
pnpm dev # Next.js dev server on :3000 with hot reload
pnpm typecheck # tsc --noEmit (strict mode)
pnpm lint # ESLint via next lint
pnpm build # production build
cd services/api
uv run uvicorn app.main:app --reload --port 8000 # dev server
uv run ruff check . # lint
uv run ruff format . # format
uv run python -c "from app.models import *; print('OK')" # sanity check
Tests run against a live knowledgeos_test Postgres database (auto-created and torn down per test). Requires Postgres running (via Docker or local).
cd tests
uv run pytest api/ unit/ -v # API integration + backend unit tests
uv run pytest worker/ -v # worker extractor tests
uv run pytest api/test_pages.py -v # single file
uv run pytest api/test_pages.py::test_create_page # single test
# MCP package tests (separate project)
cd ../services/mcp && uv run pytest tests/ -v # MCP tool + config tests
Playwright E2E tests live in tests/e2e and run against the local Compose stack. Use a sandbox library root so tests never write to the real ~/KnowledgeOS/library:
LIBRARY_ROOT=$PWD/tests/e2e/.tmp/library docker compose -f infra/docker-compose.yml up -d
pnpm test:e2e
pnpm --dir tests/e2e report

| Layer | Command |
|---|---|
| Frontend unit/component | make test-unit or pnpm test:web |
| API + unit pytest | make test-api |
| Worker extractors | make test-worker |
| MCP package | make test-mcp |
| Playwright E2E | make test-e2e or pnpm test:e2e |
| Everything | make test-all or pnpm test:all |
make test-e2e expects the Compose stack to be running with a sandbox library root:
LIBRARY_ROOT=$PWD/tests/e2e/.tmp/library docker compose -f infra/docker-compose.yml up -d --build
make test-e2e
cd services/api
uv run alembic revision --autogenerate -m "description"
uv run alembic upgrade head
docker compose -f infra/docker-compose.yml up -d # start (background)
docker compose -f infra/docker-compose.yml up # start (foreground logs)
docker compose -f infra/docker-compose.yml down # stop
docker compose -f infra/docker-compose.yml down -v # stop + wipe volumes (destructive)
Compose defaults include bind-mounted sources, Alembic before uvicorn, and optional demo seed. With seeding enabled, sign in as demo@example.com / demo-demo-demo unless overridden in infra/.env.
Docker Compose publishes services on loopback-only host ports:
| Service | Host | In-container |
|---|---|---|
| Web | 127.0.0.1:3000 | 3000 |
| API | 127.0.0.1:8001 | 8000 |
| Postgres | 127.0.0.1:5433 | 5432 |
| Redis | 127.0.0.1:6379 | 6379 |
| Qdrant HTTP | 127.0.0.1:6333 | 6333 |
| Qdrant gRPC | 127.0.0.1:6334 | 6334 |
Use http://127.0.0.1:8001 for host-side clients talking to the dockerized API. Use http://api:8000 only from inside the Docker network. If you run the API natively with uvicorn --port 8000, host-side clients should use http://127.0.0.1:8000.
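The three cases above reduce to a small lookup. The helper below is a sketch for illustration only: `api_base_url` and its parameters are hypothetical names, and the error raised for a containerized client with a native API is an assumption, since that combination is not described here.

```python
def api_base_url(client_in_docker: bool, api_in_docker: bool = True) -> str:
    """Pick the API base URL from where the client and the API are running."""
    if client_in_docker:
        if not api_in_docker:
            # Not covered by the port table; refusing is a choice made for this sketch.
            raise ValueError("no documented URL for a container client and a native API")
        return "http://api:8000"  # service name, resolvable only inside the Docker network
    if api_in_docker:
        return "http://127.0.0.1:8001"  # published loopback port of the dockerized API
    return "http://127.0.0.1:8000"  # native uvicorn started with --port 8000


print(api_base_url(client_in_docker=False))
```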
Create a local backup with:
bash scripts/backup.sh
Backups are written to ~/KnowledgeOS/backups/<timestamp>/ and include a custom-format pg_dump, a compressed library tarball, and a best-effort Qdrant snapshot metadata file. Postgres plus ~/KnowledgeOS/library/ are canonical; Qdrant is a rebuildable search index.
To restore manually, stop the app, restore postgres.dump into a clean Postgres database with pg_restore, unpack library.tar.gz back under ~/KnowledgeOS/, then rebuild/reindex derived search data as needed. Do not rely on Qdrant snapshots as the only backup of user data.
.
├── apps/web/ Next.js frontend
│ └── src/
│ ├── app/ App Router: (auth)/ and (app)/ route groups
│ ├── components/ UI components (editor, assets, auth, layout)
│ ├── lib/ API client, SWR hooks
│ └── types/ Shared TypeScript types
│
├── apps/ios/ SwiftUI iPhone client (planned — Phase PHONE-01C scaffolds it)
│ ├── project.yml XcodeGen source of truth
│ └── KnowledgeOS/ App / Core / Features / Resources
│
├── services/api/ FastAPI backend
│ └── app/
│ ├── api/v1/ Route handlers (auth, objects, pages, assets, health)
│ ├── core/ Deps, security, library, storage utilities
│ ├── db/ Session, base model
│ ├── models/ SQLAlchemy ORM models
│ ├── schemas/ Pydantic request/response schemas
│ └── services/ Business logic layer
│
├── services/worker/ RQ background worker (Phase 2+) — run as `rq worker kos-ingest`
├── services/mcp/ MCP server, stdio transport (Phase 7A)
│
├── packages/
│ ├── shared-types/ Reserved (currently empty placeholder)
│ ├── schemas/ Reserved (currently empty placeholder)
│ └── prompts/ Reserved (currently empty placeholder)
│
├── infra/ Docker Compose, Dockerfiles, .env.example
├── scripts/ setup.sh, run_tests.sh, backup.sh
├── tests/api/ Integration tests (pytest + httpx ASGI)
├── tests/unit/ Pure unit tests (parsers, URL safety)
├── docs/ Architecture, data model, API, MCP, ingestion, security
└── project-phases/ Phase-by-phase implementation plans
Every entity in the system — pages, assets, notes, bookmarks, collections — is a row in the objects table with a kind discriminator. Specialized tables (pages, assets, etc.) extend it by foreign key.
objects (id, user_id, kind, title, tags[], metadata{}, is_pinned, is_archived, deleted_at)
├── pages (content_json, content_text, word_count, version)
├── assets (sha256, storage_path, content_type, size_bytes, status)
└── edges (source_id, target_id, kind: link/child/citation/related/...)
Future object types (sources, claims, projects, chats, concepts, tasks, workspaces) extend the same base. See docs/DATA_MODEL.md for the full schema.
Binary files are stored content-addressed under LIBRARY_ROOT:
~/KnowledgeOS/library/assets/<sha256[:2]>/<sha256>/original.<ext>
Write-once semantics: if the path already exists, the upload is a no-op (deduplication by hash).
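A minimal sketch of that layout and the write-once rule, using only the standard library. The function names `asset_path` and `store_asset` are hypothetical and not taken from the codebase; the demo writes to a temporary directory rather than the real library.

```python
import hashlib
import tempfile
from pathlib import Path


def asset_path(library_root: Path, data: bytes, ext: str) -> Path:
    """Return <root>/assets/<sha256[:2]>/<sha256>/original.<ext> for the given bytes."""
    digest = hashlib.sha256(data).hexdigest()
    return library_root / "assets" / digest[:2] / digest / f"original.{ext}"


def store_asset(library_root: Path, data: bytes, ext: str) -> tuple[Path, bool]:
    """Write the bytes once; return (path, written), where written is False on a duplicate."""
    path = asset_path(library_root, data, ext)
    if path.exists():
        return path, False  # same hash already stored: the upload is a no-op
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path, True


with tempfile.TemporaryDirectory() as tmp:  # throwaway root, not ~/KnowledgeOS/library
    root = Path(tmp)
    first_path, first_written = store_asset(root, b"hello", "txt")
    second_path, second_written = store_asset(root, b"hello", "txt")
    print(first_written, second_written, first_path == second_path)
```

The two-character prefix directory keeps any single directory from accumulating every asset, which is a common reason for this kind of sharding.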
All user-owned records use deleted_at (nullable timestamp). Hard deletes are never performed. Trash is filtered out by deleted_at IS NULL in all standard queries.
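The pattern can be tried out with an in-memory SQLite table standing in for Postgres. Everything here is a simplified assumption for illustration: the table has only a few of the real columns, and `soft_delete`, `restore`, and `live_titles` are hypothetical helper names.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")  # stand-in for Postgres
conn.execute(
    "CREATE TABLE objects ("
    "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, title TEXT NOT NULL, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO objects (kind, title) VALUES (?, ?)",
    [("page", "Roadmap"), ("asset", "diagram.png")],
)


def soft_delete(conn: sqlite3.Connection, object_id: int) -> None:
    """Mark the row as trashed by stamping deleted_at; the row itself stays."""
    stamp = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "UPDATE objects SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (stamp, object_id),
    )


def restore(conn: sqlite3.Connection, object_id: int) -> None:
    """Undo a soft delete by clearing deleted_at."""
    conn.execute("UPDATE objects SET deleted_at = NULL WHERE id = ?", (object_id,))


def live_titles(conn: sqlite3.Connection) -> list[str]:
    """Standard query: trashed rows are filtered out with deleted_at IS NULL."""
    rows = conn.execute("SELECT title FROM objects WHERE deleted_at IS NULL ORDER BY id")
    return [row[0] for row in rows]


soft_delete(conn, 1)
print(live_titles(conn))
restore(conn, 1)
print(live_titles(conn))
```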
The complete product is built across 9 phases. Phases 1–6B, 7A, 7B, and 8A are done; the remaining phases are planned.
| Module | Description | Phase | Status |
|---|---|---|---|
| Wiki pages | Rich Tiptap editor, auto-save, word count, typed links | 1 | ✅ |
| Asset library | Upload, dedup, preview, gallery, full-screen modal | 1 | ✅ |
| Rich media sources | PDF text+thumbnail, YouTube oEmbed+transcript, web article, CSV preview | 2 | ✅ |
| Keyword + semantic search | Postgres FTS, Qdrant vector, hybrid reranking, Cmd+K modal | 3 | ✅ |
| Graph Lite | Typed edges, backlinks, related, ObjectPicker, LinkToModal, GraphPanel | 4 | ✅ |
| AI assistant + Inbox/Triage | Summarize, extract claims/tasks, suggest links, KB Q&A, triage inbox | 5 | ✅ |
| Chat import (raw) | ChatGPT/Claude/Markdown/plain text import → searchable chats | 6A | ✅ |
| Chat structured import | AI summary, decisions, claims/tasks extracted with turn refs | 6B | ✅ |
| MCP read/search server | stdio MCP with `search_objects`, `hybrid_search`, `get_*`, `get_related_objects` | 7A | ✅ |
| Workspace Lite (split pane) | Open any object in a side pane from search/backlinks/related | 8A | ✅ |
| MCP write tools | `create_page`, `update_page`, `create_edge`, `archive_object`, `ingest_url`, `ingest_file` | 7B | ✅ |
| Multi-pane workspaces | Persistent layouts, drag-across-pane, workspace-scoped AI | 8 | ⬜ |
| Career/project memory | Project schema, evidence-linked resume bullets, STAR stories | 9 | ⬜ |
| iPhone client — contract | Mobile spec docs: MOBILE_APP, MOBILE_API_CONTRACT, MOBILE_NETWORKING | PHONE-00 | ✅ |
| iPhone client — MVP | Bearer auth, SwiftUI scaffold, read/search/capture/AI on iOS Simulator + device | PHONE-01A/B/C → 03C | ⬜ Planned |
- Career memory: project records, AI-generated resume bullets & STAR stories — see AGENT_GUIDE.md.
| Phase | Goal | Status |
|---|---|---|
| 1 — Foundation | Docker, Postgres, auth, page CRUD, Tiptap editor, asset upload | Done |
| 2 — Sources & Rich Media | PDF/YouTube/web/CSV ingestion, RQ worker, citation edges | Done |
| 3 — Search | Chunking, Postgres FTS, Qdrant vectors, hybrid search, Cmd+K UI | Done |
| 4 — Graph Lite | Typed edge API, backlinks, related objects from Postgres edges | Done |
| 5 — AI Assistant + Inbox/Triage | AI sidebar, summarize/extract/suggest, KB Q&A, triage inbox | Done |
| 6A — Chat Import Lite | Raw upload/paste of ChatGPT/Claude/Markdown/text exports | Done |
| 6B — Structured Chat Import | AI summaries, extracted claims/tasks, turn-grounded graph links | Done |
| 7A — MCP Read/Search | stdio MCP server, read-only tools, internal-token auth | Done |
| 8A — Workspace Lite | Frontend split pane (no schema change) | Done |
| Hardening — Search Quality | Multilingual ILIKE fallback ✅; snippet sanitization, JP fixtures, debug UI, index-status endpoint pending | Partial |
| 7B — MCP Write Tools | `create_page`, `update_page`, `create_edge`, `archive_object`, `ingest_url`, `ingest_file` | Done |
| 8 — Multi-Pane Workspaces | Persistent layout engine, saved workspaces, workspace-scoped AI | Planned |
| 9 — Career Memory | Project schema UI, resume bullet generator, STAR story generator | Planned |
| PHONE-00 — Mobile Architecture & Contract | MOBILE_APP, MOBILE_API_CONTRACT, MOBILE_NETWORKING docs | Done |
| PHONE-01A — Mobile Backend Auth & API | Bearer tokens, `/auth/mobile-login`, `/mobile/bootstrap`, fix `get_current_user` stub | Planned |
| PHONE-01B — Mac ↔ iPhone Networking | `infra/docker-compose.mobile.yml`, `scripts/mobile_network_check.sh`, ATS strategy | Planned |
| PHONE-01C — iOS App Scaffold | SwiftUI app + XcodeGen `project.yml`, Connect screen, health check | Planned |
| PHONE-02A/B — API Client & Simulator QA | Typed API client, Keychain auth store, simulator MCP smoke tests | Planned |
| PHONE-03A/B/C — Read & Search / Capture / AI MVP | Hybrid search, page/source/chat/project readers, capture, KB Q&A on iPhone | Planned |
| PHONE-04 — Edit-Lite | Title/tags/plain-body edits with Tiptap JSON safety | Planned |
| PHONE-05 — Offline Cache & Queue | Recent-object cache + outgoing note queue | Planned |
| PHONE-06 — Device Install & Private Release | Physical iPhone install via Xcode, TestFlight checklist | Planned |
Each phase has a detailed spec in project-phases/.
Phases 7A + 7B are shipped. KnowledgeOS exposes a local stdio MCP server in services/mcp/ that external agents (Claude Desktop, Claude Code, Cursor, Codex) can spawn as a subprocess. The server talks to FastAPI over 127.0.0.1 using a shared MCP_INTERNAL_TOKEN.
Enable it by setting MCP_ENABLED=true and a random MCP_INTERNAL_TOKEN in infra/.env, then point your agent client at uv run --project services/mcp kos-mcp. For the dockerized API, keep MCP_API_BASE_URL=http://127.0.0.1:8001; for a native API run on port 8000, override it to http://127.0.0.1:8000.
To enable write tools, also set MCP_ALLOW_WRITE_TOOLS=true.
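One way to produce a suitably random token is Python's `secrets` module. The helper below is a hypothetical convenience, not part of the repo; it only prints lines you could paste into `infra/.env`, and the 32-byte token length is an assumption rather than a documented requirement.

```python
import secrets


def mcp_env_lines(allow_writes: bool = False) -> list[str]:
    """Return .env lines that enable the MCP server with a fresh random token."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text encoding
    lines = ["MCP_ENABLED=true", f"MCP_INTERNAL_TOKEN={token}"]
    if allow_writes:
        lines.append("MCP_ALLOW_WRITE_TOOLS=true")
    return lines


print("\n".join(mcp_env_lines(allow_writes=True)))
```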
Read/search tools (Phase 7A — live): search_objects, hybrid_search, get_object, get_page, get_source, get_related_objects, answer_from_kb
Write tools (Phase 7B — live, requires MCP_ALLOW_WRITE_TOOLS=true):
- `create_page`: create a new page
- `update_page`: update title/content/tags (supports `expected_version` for optimistic locking)
- `create_edge`: link two objects with a typed relationship
- `archive_object`: soft-archive (reversible via restore endpoint)
- `ingest_url`: ingest a URL as a new source (web/youtube)
- `ingest_file`: ingest a local file from `LIBRARY_ROOT`
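Optimistic locking with `expected_version` generally works like the sketch below: the caller states which version it last read, and the write is refused if the stored version has moved on. This is an in-memory illustration under assumed names (`Page`, `VersionConflict`, `apply_update`), not the server's implementation.

```python
from dataclasses import dataclass
from typing import Optional


class VersionConflict(Exception):
    """Raised when the caller's expected_version is stale."""


@dataclass
class Page:
    title: str
    content_text: str
    version: int = 1


def apply_update(
    page: Page,
    *,
    title: Optional[str] = None,
    content_text: Optional[str] = None,
    expected_version: Optional[int] = None,
) -> Page:
    """Apply the change only if the page is still at the version the caller read."""
    if expected_version is not None and expected_version != page.version:
        raise VersionConflict(f"expected v{expected_version}, found v{page.version}")
    if title is not None:
        page.title = title
    if content_text is not None:
        page.content_text = content_text
    page.version += 1  # every successful write bumps the version
    return page


page = Page(title="Notes", content_text="first draft")
apply_update(page, content_text="second draft", expected_version=1)
print(page.version)
```

A second writer still holding version 1 would now get `VersionConflict` instead of silently overwriting the first writer's change.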
Safety invariants (always enforced):
- Every write tool call creates an `agent_runs` audit row
- Mutating tools (`update_page`, `archive_object`) create `object_revisions` rows with before/after snapshots
- Rate-limited per agent identity: 60 writes/minute, 600 writes/hour
- Soft-delete only: `archive_object` is reversible, no hard deletes through MCP
- No arbitrary shell execution through any MCP tool
- No file access outside `~/KnowledgeOS/`
- API keys and session secrets are never returned in tool outputs
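The per-agent write limits in the list above (60 per minute, 600 per hour) could be enforced with a sliding-window counter. The class below is a sketch of that general technique; `WriteRateLimiter` is a hypothetical name, the clock is passed in explicitly to keep the example deterministic, and the real server may use a different algorithm.

```python
from collections import defaultdict, deque

# (max writes, window in seconds), taken from the limits stated above
LIMITS = ((60, 60.0), (600, 3600.0))


class WriteRateLimiter:
    """Sliding-window limiter keyed by agent identity."""

    def __init__(self, limits=LIMITS):
        self.limits = limits
        self.longest = max(window for _, window in limits)
        self.history = defaultdict(deque)  # agent_id -> timestamps of accepted writes

    def allow(self, agent_id: str, now: float) -> bool:
        events = self.history[agent_id]
        # Drop timestamps older than the longest window; they can no longer matter.
        while events and now - events[0] >= self.longest:
            events.popleft()
        for max_writes, window in self.limits:
            recent = sum(1 for t in events if now - t < window)
            if recent >= max_writes:
                return False  # rejected writes are not recorded
        events.append(now)
        return True


limiter = WriteRateLimiter()
accepted = sum(limiter.allow("claude", now=i * 0.1) for i in range(70))
print(accepted)
```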
See docs/MCP_TOOLS.md for the full tool spec and docs/SECURITY.md for the auth and audit design.
A native SwiftUI iPhone client lives alongside apps/web/ at apps/ios/ (scaffold lands in Phase PHONE-01C). It is a thin client — reads, searches, captures, and asks grounded questions through the same /api/v1 surface the browser uses. No second backend; no direct DB or filesystem access from the phone.
Phase PHONE-00 (docs-only) is complete. The contract is fixed; implementation phases are planned:
- `docs/MOBILE_APP.md`: product spec, MVP scope, screen map, endpoints intentionally NOT exposed on mobile
- `docs/MOBILE_API_CONTRACT.md`: bearer auth contract (proposed for Phase 01A), verified read-side endpoint table, error envelope, pagination
- `docs/MOBILE_NETWORKING.md`: Simulator / LAN / Tailscale profiles, `infra/docker-compose.mobile.yml` design, ATS strategy
Implementation phases live under project-phases/PHASE-PHONE-*.md (PHONE-01A backend auth → PHONE-06 device install). The big picture is in project-phases/IDEA-iPHONE-APP.md.
How to open it in Xcode (once Phase PHONE-01C lands):
cd apps/ios
xcodegen generate # produces KnowledgeOS.xcodeproj
open KnowledgeOS.xcodeproj
TestFlight distribution is Phase PHONE-06.
Copy infra/.env.example to infra/.env before starting Docker.
| Variable | Default | Required |
|---|---|---|
| `SESSION_SECRET` | (not set) | No (reserved for future signed URLs) |
| `POSTGRES_DB` | `knowledgeos` | No |
| `POSTGRES_USER` | `kos` | No |
| `POSTGRES_PASSWORD` | `kospass` | No |
| `LIBRARY_ROOT` | `~/KnowledgeOS/library` | No |
| `OPENAI_API_KEY` | (not set) | Phase 5+ |
Press ? anywhere in the app (outside a text field) to open the keyboard shortcut overlay.
| Key | Action |
|---|---|
| `⌘K` | Open search |
| `⌘\` | Collapse / expand sidebar |
| `?` | Show shortcuts |
| `j` / `↓` | Move highlight down (lists) |
| `k` / `↑` | Move highlight up (lists) |
| `Enter` | Open highlighted item (lists) |
| `Backspace` | Soft-delete highlighted item (lists) |
See docs/SHORTCUTS.md for the full reference.
If you are a coding agent (Claude, Codex, Cursor) working in this repo:
- Read `CLAUDE.md` first. It is the operating contract for agents.
- Read `docs/AGENT_GUIDE.md` for data model rules and agent write constraints.
- Postgres is the source of truth. Do not mutate the DB directly; go through the API or typed service functions.
- Every write must create an `agent_runs` audit row.
- Qdrant and Kùzu are indexes. They are re-buildable; never treat them as canonical.
- Soft-delete only. `deleted_at` is the pattern. Never call `DELETE` on user data.
| File | Contents |
|---|---|
| `docs/ARCHITECTURE.md` | System diagram, service boundaries, sequence flows |
| `docs/DATA_MODEL.md` | Full Postgres schema, object types, edge types |
| `docs/API.md` | REST API reference |
| `docs/MCP_TOOLS.md` | MCP tool spec, resources, prompts |
| `docs/INGESTION.md` | Ingestion pipeline stages, job types, media handling |
| `docs/SECURITY.md` | Auth, sessions, audit log, MCP safety model |
| `docs/AGENT_GUIDE.md` | Rules for AI agents writing to this system |
| `project-phases/IDEA-DRAFT.md` | Original full product spec |
| `project-phases/PHASE-1-FOUNDATION.md` | Phase 1 subtask spec |
| `docs/MOBILE_APP.md` | iPhone app product spec: MVP scope, screen map, endpoint exclusion list |
| `docs/MOBILE_API_CONTRACT.md` | iPhone bearer-auth contract + verified read-side endpoint table |
| `docs/MOBILE_NETWORKING.md` | Simulator / LAN / Tailscale profiles + ATS strategy |
| `project-phases/IDEA-iPHONE-APP.md` | Canonical iPhone-track concept and phase plan |