stormlightlabs/documango

Documango

Documango is a terminal-first documentation browser for Go, Rust, AT Protocol, GitHub, and other ecosystems. It ingests source materials into a single SQLite database (.usde) with compressed Markdown, full-text search, and agent-friendly metadata.

Requirements

  • Go 1.24+ (module declares go 1.24.5)
  • rg (ripgrep), optional, for section extraction; Documango falls back to grep when it is not installed
  • Git for AT Protocol ingestion

Quick Start

Install dependencies and build the CLI (the commands below assume the binary is written to ./tmp/documango):

go mod tidy
task build

Initialize a database:

./tmp/documango init -d ./tmp/docs.usde

Ingest AT Protocol documentation:

./tmp/documango add atproto -d ./tmp/docs.usde

Ingest a Hex package (Elixir/Gleam):

./tmp/documango add hex gleam_stdlib -d ./tmp/docs.usde

Ingest GitHub repository documentation:

./tmp/documango add github folke/snacks.nvim -d ./tmp/docs.usde

Search (namespace-aware):

./tmp/documango search "rust/serde/Serialize"
./tmp/documango search "atproto/lexicon/app.bsky.feed.post"
./tmp/documango search -p "go/net" "Client"

Read a document (raw markdown):

./tmp/documango read -d ./tmp/docs.usde atproto/lexicon/com.atproto.repo.createRecord

Read a document (rendered with Glamour):

./tmp/documango read -d ./tmp/docs.usde -r -w 80 atproto/lexicon/com.atproto.repo.createRecord

Extract a section by heading:

./tmp/documango read section -d ./tmp/docs.usde -q "Definition" -r atproto/lexicon/com.atproto.repo.createRecord

Usage

Configuration

Documango uses XDG Base Directory paths on Linux and the platform's Library directories on macOS. Setting DOCUMANGO_HOME overrides these defaults:

  • macOS: ~/Library/Application Support/documango/
  • Linux: ~/.local/share/documango/ (data), ~/.config/documango/ (config), ~/.cache/documango/ (cache)
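
The directory layout above can be sketched in Go as follows. This is a minimal illustration, not Documango's actual code: resolveDirs and envOr are hypothetical helpers, the use of the XDG_*_HOME variables follows the XDG specification rather than anything stated in this README, and treating DOCUMANGO_HOME as a single root for all three directories is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// Dirs holds the three directories the README lists per platform.
type Dirs struct {
	Data, Config, Cache string
}

// envOr returns the value of key, or fallback when the variable is empty.
func envOr(getenv func(string) string, key, fallback string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return fallback
}

// resolveDirs is a hypothetical helper. goos, home, and getenv are
// parameters so the logic can be exercised without touching the real
// environment.
func resolveDirs(goos, home string, getenv func(string) string) Dirs {
	const app = "documango"
	// Assumption: DOCUMANGO_HOME replaces all three directories with one root.
	if root := getenv("DOCUMANGO_HOME"); root != "" {
		return Dirs{Data: root, Config: root, Cache: root}
	}
	if goos == "darwin" {
		support := filepath.Join(home, "Library", "Application Support", app)
		return Dirs{
			Data:   support,
			Config: support,
			Cache:  filepath.Join(home, "Library", "Caches", app),
		}
	}
	return Dirs{
		Data:   filepath.Join(envOr(getenv, "XDG_DATA_HOME", filepath.Join(home, ".local", "share")), app),
		Config: filepath.Join(envOr(getenv, "XDG_CONFIG_HOME", filepath.Join(home, ".config")), app),
		Cache:  filepath.Join(envOr(getenv, "XDG_CACHE_HOME", filepath.Join(home, ".cache")), app),
	}
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	d := resolveDirs(runtime.GOOS, home, os.Getenv)
	fmt.Println("data:  ", d.Data)
	fmt.Println("config:", d.Config)
	fmt.Println("cache: ", d.Cache)
}
```

Passing the environment lookup as a function keeps the resolution logic pure, which makes the per-platform defaults easy to check in isolation.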

Configuration is stored as TOML and managed with the config subcommand:

./tmp/documango config show
./tmp/documango config set display.render_markdown true
./tmp/documango config edit

Commands

Database
  • documango init [database-name]: create a new .usde database
  • documango init -p /path/to/db.usde: create at explicit path
Add (Ingest)
  • documango add go <module>: ingest Go module from proxy.golang.org
  • documango add go --stdlib [-s <start>] [-m <max>]: ingest Go stdlib packages
  • documango add atproto: ingest AT Protocol lexicons, specs, and docs
  • documango add hex <package>: ingest Elixir or Gleam package from Hex.pm
  • documango add rust <crate>: ingest Rust crate from crates.io
  • documango add github <owner/repo>: ingest Markdown documentation from GitHub repository
Search
  • documango search [-l N] [-t TYPE] [-f FORMAT] [-p PREFIX] <query>
    • Namespace Aware: Queries starting with rust/, go/, atproto/, hex/, or github/ automatically filter by that namespace.
    • Path Qualified: Searching for rust/serde/Serialize automatically treats rust/serde/ as a package prefix and Serialize as the symbol query.
    • FTS5 Optimized: Handles special characters (/, ::, -) automatically by quoting terms to prevent SQL syntax errors.
    • Formats: table (default), json, paths
    • Types: Func, Type, Package, Lexicon, etc.
Read
  • documango read [-r] [-w N] [-s SECTION] <path>: read full document
  • documango read section -q <heading> [-r] [-w N] <path>: extract section by heading
    • Flags: --rg (force ripgrep), --gr (force grep)
List & Info
  • documango list [--type PREFIX] [--tree] [--count]: list all documentation paths
  • documango info <path>: show document metadata
Cache
  • documango cache status: show cache size and entry count
  • documango cache list [PREFIX]: list cached items
  • documango cache prune [--age N]: remove entries older than N days
  • documango cache clear [--type TYPE]: clear cache
Config
  • documango config show: display current configuration
  • documango config get <key>: get configuration value
  • documango config set <key> <value>: set configuration value
  • documango config edit: open in $EDITOR
  • documango config path: print config file path
Global Flags
  • -d, --database PATH: database path (default: XDG data directory)
  • -v, --verbose: enable verbose output
  • -q, --quiet: suppress non-error output
  • --no-color: disable colored output
MCP (Model Context Protocol)
  • documango mcp serve [--stdio] [--http ADDR]: start the MCP server
    • --stdio: use standard input/output (for Claude Desktop, etc.)
    • --http: use streamable HTTP transport on the given address
  • -d, --database PATH: specify the database to serve
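
The Search notes above describe two behaviors: splitting a path-qualified query into a package prefix and a symbol, and quoting terms so FTS5 does not misparse characters such as / or ::. The sketch below shows one way these could work. splitQuery and quoteFTS5 are hypothetical names, and the split-at-last-slash rule is an assumption inferred from the rust/serde/Serialize example, not a description of Documango's internals.

```go
package main

import (
	"fmt"
	"strings"
)

// namespaces lists the prefixes the README names as namespace-aware.
var namespaces = []string{"rust", "go", "atproto", "hex", "github"}

// isNamespace reports whether s is one of the known namespaces.
func isNamespace(s string) bool {
	for _, ns := range namespaces {
		if s == ns {
			return true
		}
	}
	return false
}

// splitQuery (hypothetical) splits "rust/serde/Serialize" into the prefix
// "rust/serde/" and the symbol "Serialize". Queries that do not start with
// a known namespace are returned unchanged as the symbol.
func splitQuery(q string) (prefix, symbol string) {
	first, _, found := strings.Cut(q, "/")
	if !found || !isNamespace(first) {
		return "", q
	}
	i := strings.LastIndex(q, "/")
	return q[:i+1], q[i+1:]
}

// quoteFTS5 (hypothetical) wraps each whitespace-separated term in double
// quotes, doubling any embedded quote, which is how FTS5 string literals
// are escaped.
func quoteFTS5(query string) string {
	terms := strings.Fields(query)
	quoted := make([]string, 0, len(terms))
	for _, t := range terms {
		quoted = append(quoted, `"`+strings.ReplaceAll(t, `"`, `""`)+`"`)
	}
	return strings.Join(quoted, " ")
}

func main() {
	prefix, symbol := splitQuery("rust/serde/Serialize")
	fmt.Printf("prefix=%q symbol=%q\n", prefix, symbol)
	fmt.Println(quoteFTS5("std::fmt net/http"))
}
```

Quoting every term trades away FTS5 operators such as AND or NEAR in exchange for never producing a syntax error on user input.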

Model Context Protocol (MCP)

Documango exposes its documentation through the Model Context Protocol, allowing AI agents to search and read documentation programmatically.

Tools

  1. search_docs(query, package): Search for documentation symbols or guides.
  2. read_doc(path): Retrieve the full decompressed Markdown content of a document.
  3. get_symbol_context(symbol): Retrieve a minimal token signature and summary for a symbol.
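
For orientation, this is roughly what a client sends to invoke one of these tools. The sketch builds a JSON-RPC 2.0 tools/call request as defined by the MCP specification. The tool name and argument names come from the list above, while the argument values (including the form of the package value) and the newToolCall helper are illustrative assumptions. A real client must also complete the MCP initialize handshake first.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the JSON-RPC envelope MCP uses for tool invocation.
type toolCall struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// callParams names the tool and carries its arguments.
type callParams struct {
	Name      string            `json:"name"`
	Arguments map[string]string `json:"arguments"`
}

// newToolCall (hypothetical helper) serializes a tools/call request.
func newToolCall(id int, tool string, args map[string]string) ([]byte, error) {
	return json.Marshal(toolCall{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: tool, Arguments: args},
	})
}

func main() {
	// Argument values here are illustrative only.
	req, err := newToolCall(1, "search_docs", map[string]string{
		"query":   "Serialize",
		"package": "rust/serde",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))
}
```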

Integration

To use Documango with Claude Desktop or Antigravity, add the following to the client's MCP configuration:

{
  "mcpServers": {
    "documango": {
      "command": "/path/to/documango",
      "args": ["mcp", "serve", "--stdio"]
    }
  }
}

Data Layer

Ingesters

Go

Documango uses a single Go ingestion pipeline for both:

  • Go modules via proxy.golang.org
  • Go standard library via pkg.go.dev (directory list) + go.googlesource.com (archive fetch)
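
As background for the module path, the Go module proxy protocol serves a module's source archive at <module>/@v/<version>.zip, with every uppercase ASCII letter in the module path and version escaped as "!" followed by its lowercase form. The sketch below builds such a URL. zipURL and escapeUpper are hypothetical helpers, and whether Documango requests this exact endpoint is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeUpper applies the module proxy's case encoding: each uppercase
// ASCII letter becomes '!' followed by the lowercase letter.
func escapeUpper(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + ('a' - 'A'))
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

// zipURL (hypothetical helper) returns the proxy URL for a module archive.
func zipURL(modulePath, version string) string {
	return "https://proxy.golang.org/" + escapeUpper(modulePath) +
		"/@v/" + escapeUpper(version) + ".zip"
}

func main() {
	fmt.Println(zipURL("github.com/BurntSushi/toml", "v1.3.2"))
	// Prints: https://proxy.golang.org/github.com/!burnt!sushi/toml/@v/v1.3.2.zip
}
```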

Caching: Module zips and stdlib tarballs are cached indefinitely in ~/.cache/documango/ to avoid re-fetching.

The standard library is stored in the Go namespace with paths like:

  • go/net/http
  • go/crypto/tls

Go ingestion extracts:

  • Markdown docs via gomarkdoc
  • FTS5 search entries (name, type, body)
  • Agent context (signatures + synopsis)

AT Protocol

Ingests three documentation sources from Bluesky's GitHub repositories:

  • Lexicons: JSON schemas converted to Markdown (atproto/lexicon/*)
  • Protocol Specs: Technical specifications from atproto-website (atproto/spec/*)
  • Developer Docs: Tutorials and guides from bsky-docs (atproto/docs/*)

Hex.pm (BEAM)

Ingests documentation for BEAM (Elixir, Erlang, and Gleam) packages directly from Hex.pm.

  • Gleam: Parses package-interface.json to extract full type signatures, function signatures with labeled parameters, type definitions with constructors, and documentation strings. Generates comprehensive Markdown with Gleam syntax code blocks.
  • Elixir/Erlang: Extracts documentation and metadata from ExDoc's search_data-*.js.

Caching: Documentation tarballs are cached in ~/.cache/documango/hex/packages/.

The packages are stored in the hex namespace with paths like:

  • hex/phoenix/Phoenix.Controller
  • hex/gleam_stdlib/gleam/list

GitHub

Ingests Markdown documentation from GitHub repositories. Automatically discovers and processes all Markdown files including README, docs/ folders, and nested documentation.

  • API mode: For smaller repositories, fetches content via the GitHub Git Trees API and raw file URLs
  • Clone mode: Falls back to git clone for large repositories when the tree API returns truncated results
  • Front matter: Extracts title from YAML front matter (e.g., title: My Doc) or falls back to the first H1 heading
  • Rate limiting: Respects GitHub API rate limits with automatic retry and wait behavior
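
The title rule in the front matter bullet can be sketched as below. This is a simplified illustration that scans for a top-level title: key by hand instead of using a YAML parser, so it will not handle multi-line or nested values. docTitle is a hypothetical name, not Documango's function.

```go
package main

import (
	"fmt"
	"strings"
)

// docTitle (hypothetical) returns the front matter title if present,
// otherwise the text of the first H1 heading, otherwise "".
func docTitle(markdown string) string {
	lines := strings.Split(markdown, "\n")
	body := lines

	// Front matter is delimited by "---" on the first line and a later line.
	if len(lines) > 0 && strings.TrimSpace(lines[0]) == "---" {
		for i := 1; i < len(lines); i++ {
			line := strings.TrimSpace(lines[i])
			if line == "---" {
				body = lines[i+1:]
				break
			}
			if strings.HasPrefix(line, "title:") {
				v := strings.TrimSpace(strings.TrimPrefix(line, "title:"))
				if t := strings.Trim(v, `"'`); t != "" {
					return t
				}
			}
		}
	}

	// Fall back to the first ATX-style H1 ("# Heading").
	for _, line := range body {
		if strings.HasPrefix(line, "# ") {
			return strings.TrimSpace(strings.TrimPrefix(line, "# "))
		}
	}
	return ""
}

func main() {
	fmt.Println(docTitle("---\ntitle: My Doc\n---\n# Other\n")) // My Doc
	fmt.Println(docTitle("intro\n# Heading One\ntext"))         // Heading One
}
```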

Caching: Cloned repositories are cached in ~/.cache/documango/github/repos/ for reuse across ingestions.

Documents are stored in the github namespace with paths like:

  • github/folke/snacks.nvim/README.md
  • github/folke/snacks.nvim/docs/dashboard.md
  • github/changesets/changesets/docs/intro.md
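
The choice between API mode and clone mode described above hinges on the truncated field that GitHub's Git Trees API includes in its response. The following sketch decodes such a response and either returns the Markdown paths or signals that a clone is needed. planIngest is a hypothetical helper, and selecting files by a .md suffix is an assumption about what counts as Markdown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// treeResponse holds the fields of a Git Trees API response used here.
type treeResponse struct {
	Tree []struct {
		Path string `json:"path"`
		Type string `json:"type"` // "blob" for files, "tree" for directories
	} `json:"tree"`
	Truncated bool `json:"truncated"`
}

// planIngest (hypothetical) returns the Markdown file paths from a tree
// listing, or needClone=true when the listing is truncated and therefore
// incomplete.
func planIngest(body []byte) ([]string, bool, error) {
	var resp treeResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, false, err
	}
	if resp.Truncated {
		return nil, true, nil
	}
	var paths []string
	for _, e := range resp.Tree {
		if e.Type == "blob" && strings.HasSuffix(strings.ToLower(e.Path), ".md") {
			paths = append(paths, e.Path)
		}
	}
	return paths, false, nil
}

func main() {
	sample := []byte(`{"tree":[{"path":"README.md","type":"blob"},{"path":"docs","type":"tree"}],"truncated":false}`)
	paths, needClone, err := planIngest(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(paths, needClone)
}
```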

Model

Documentation is stored in a single SQLite database file with the .usde extension, short for Unified Semantic Documentation Engine.

SQLite Schema

The schema of the .usde database is intentionally simple and optimized for fast local search and cheap retrieval:

  • documents holds compressed Markdown blobs, keyed by a virtual path (e.g., go/net/http)
  • search_index is an FTS5 virtual table (trigram tokenizer) that supports fast substring search and ranking
  • agent_context stores low‑token summaries and signatures for fast AI retrieval without decompressing full docs

erDiagram
  documents ||--o{ search_index : "doc_id"
  documents ||--o{ agent_context : "doc_id"

  documents {
    INTEGER id PK
    TEXT path
    TEXT format
    BLOB body
    BLOB raw_html
    TEXT hash
  }

  search_index {
    TEXT name
    TEXT type
    TEXT body
    INTEGER doc_id FK
  }

  agent_context {
    INTEGER doc_id FK
    TEXT symbol
    TEXT signature
    TEXT summary
  }

Search scoring details:

  • Exact match bonus: +100 if the symbol name exactly matches the query
  • BM25 score: Subtracted from the bonus (BM25 returns lower values for better matches)
  • Result: Higher scores = more relevant
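
The scoring rule above amounts to score = bonus - bm25. A minimal sketch follows, assuming a case-sensitive exact match and taking the BM25 value as an input (in SQLite's FTS5, bm25() returns numerically smaller values for better matches, so subtracting it raises the score). The function name and the sample values are illustrative.

```go
package main

import "fmt"

// exactMatchBonus is the +100 bonus described in the README.
const exactMatchBonus = 100.0

// score (hypothetical) combines the exact-match bonus with a BM25 value.
// Assumption: the name comparison is case-sensitive.
func score(name, query string, bm25 float64) float64 {
	bonus := 0.0
	if name == query {
		bonus = exactMatchBonus
	}
	return bonus - bm25
}

func main() {
	// Sample BM25 values are made up for illustration.
	fmt.Println(score("Serialize", "Serialize", -4.5))  // 104.5
	fmt.Println(score("Serializer", "Serialize", -6.0)) // 6
}
```

With these sample values the exact match ranks first even though the partial match has the better raw BM25 value, which is the effect the bonus is meant to have.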

Notes

  • Stdlib ingestion can be rate-limited by upstream. Use -s/-m to ingest in batches.
  • The read section command searches headings and returns the section until the next same-or-higher heading level.
  • Cache is stored in:
    • ~/.cache/documango/ (Linux)
    • ~/Library/Caches/documango/ (macOS)
  • Default database is created in:
    • ~/.local/share/documango/default.usde (Linux)
    • ~/Library/Application Support/documango/default.usde (macOS)
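
The section rule in the notes above (return everything from the matched heading up to the next heading of the same or higher level) can be sketched in pure Go. Documango itself delegates heading search to rg or grep, so this is a stand-in that only illustrates the boundary logic. extractSection and headingLevel are hypothetical names, matching is a case-insensitive substring test by assumption, and "#" lines inside fenced code blocks are not skipped.

```go
package main

import (
	"fmt"
	"strings"
)

// headingLevel returns the ATX heading level (1-6) of line, or 0 if the
// line is not a heading of the form "## Text".
func headingLevel(line string) int {
	n := 0
	for n < len(line) && line[n] == '#' {
		n++
	}
	if n == 0 || n > 6 || n == len(line) || line[n] != ' ' {
		return 0
	}
	return n
}

// extractSection (hypothetical) returns the first section whose heading
// contains query, ending before the next heading of the same or a higher
// level (a smaller or equal number of '#').
func extractSection(markdown, query string) (string, bool) {
	lines := strings.Split(markdown, "\n")
	q := strings.ToLower(query)
	start, level := -1, 0
	for i, line := range lines {
		lvl := headingLevel(line)
		if lvl == 0 {
			continue
		}
		if start < 0 {
			if strings.Contains(strings.ToLower(line[lvl:]), q) {
				start, level = i, lvl
			}
			continue
		}
		if lvl <= level {
			return strings.Join(lines[start:i], "\n"), true
		}
	}
	if start < 0 {
		return "", false
	}
	return strings.Join(lines[start:], "\n"), true
}

func main() {
	doc := "# Title\n## Definition\nbody\n### Sub\nmore\n## Errors\nx\n"
	section, ok := extractSection(doc, "definition")
	fmt.Println(ok)
	fmt.Println(section)
}
```

Nested subsections (here "### Sub") stay inside the result because their level is deeper than the matched heading.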