14 changes: 14 additions & 0 deletions .gitbook-branch-readme.md
@@ -0,0 +1,14 @@
# GitBook Documentation Branch

The `gitbook-docs` branch contains **generated** GitBook-compatible documentation,
automatically updated by GitHub Actions on every push to `main`.

**Do not edit this branch manually** — all changes will be overwritten.

## How it works

1. `scripts/prepare_gitbook_site.py` copies `docs/` into `site/`, maps root
files (`README.md`, `CONTRIBUTING.md`, `DEVELOPMENT.md`) into the site, and
expands any `{{#include ...}}` markers
2. The contents of `site/` are pushed to this branch
3. GitBook syncs from this branch
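
The first step above can be sketched roughly as follows. `scripts/prepare_gitbook_site.py` itself is not part of this excerpt, so everything here is an assumption for illustration: the function names `build_site` and `expand_includes`, the `ROOT_FILES` mapping (including `README.md` becoming `index.md`), and the reading of `{{#include path}}` as "inline the file at `path`, relative to the including page".

```python
import re
import shutil
from pathlib import Path

# Hypothetical mapping of repo-root files to site-relative destinations.
ROOT_FILES = {
    "README.md": "index.md",
    "CONTRIBUTING.md": "CONTRIBUTING.md",
    "DEVELOPMENT.md": "DEVELOPMENT.md",
}

# Assumed marker syntax: {{#include relative/path.md}}
INCLUDE_RE = re.compile(r"\{\{#include\s+([^}\s]+)\s*\}\}")


def expand_includes(page: Path) -> None:
    """Replace each include marker with the text of the referenced file."""

    def inline(match: re.Match) -> str:
        return (page.parent / match.group(1)).read_text(encoding="utf-8")

    text = page.read_text(encoding="utf-8")
    page.write_text(INCLUDE_RE.sub(inline, text), encoding="utf-8")


def build_site(repo_root: Path, site_dir: Path) -> None:
    """Copy docs/ into site/, map root files in, then expand includes."""
    if site_dir.exists():
        shutil.rmtree(site_dir)  # start from a clean output directory
    shutil.copytree(repo_root / "docs", site_dir)
    for source_name, dest_name in ROOT_FILES.items():
        source = repo_root / source_name
        if source.exists():
            shutil.copyfile(source, site_dir / dest_name)
    for page in site_dir.rglob("*.md"):
        expand_includes(page)
```

Calling `build_site(Path("."), Path("site"))` from the repository root would produce the `site/` directory that step 2 pushes to this branch.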
5 changes: 5 additions & 0 deletions .gitbook.yaml
@@ -0,0 +1,5 @@
root: ./

structure:
  readme: index.md
  summary: SUMMARY.md
83 changes: 83 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,83 @@
name: Documentation

on:
  push:
    branches: [main]
    paths:
      - "docs/**"
      - "README.md"
      - "CONTRIBUTING.md"
      - "DEVELOPMENT.md"
      - "LICENSE"
      - "SECURITY.md"
      - "CONTRIBUTOR_AGREEMENT.md"
      - "cli/**"
      - "sdk/**"
      - "server/backend/README.md"
      - "scripts/prepare_gitbook_site.py"
      - "scripts/check_docs.py"
      - ".gitbook.yaml"
      - ".gitbook-branch-readme.md"
      - ".github/workflows/docs.yml"
  pull_request:
    paths:
      - "docs/**"
      - "README.md"
      - "CONTRIBUTING.md"
      - "DEVELOPMENT.md"
      - "LICENSE"
      - "SECURITY.md"
      - "CONTRIBUTOR_AGREEMENT.md"
      - "cli/**"
      - "sdk/**"
      - "server/backend/README.md"
      - "scripts/prepare_gitbook_site.py"
      - "scripts/check_docs.py"
      - ".gitbook.yaml"
      - ".gitbook-branch-readme.md"
      - ".github/workflows/docs.yml"
  workflow_dispatch:

jobs:
  docs:
    permissions:
      contents: write
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Check documentation links
        run: python scripts/check_docs.py

      - name: Build GitBook site
        run: python scripts/prepare_gitbook_site.py

      - name: Deploy to gitbook-docs branch
        if: ${{ (github.event_name == 'push' || github.event_name == 'workflow_dispatch') && github.ref == 'refs/heads/main' }}
        run: |
          git config user.name 'github-actions[bot]'
          git config user.email 'github-actions[bot]@users.noreply.github.com'

          mv site/ /tmp/gitbook-site/

          git fetch origin gitbook-docs || true
          if git rev-parse --verify origin/gitbook-docs >/dev/null 2>&1; then
            git checkout gitbook-docs
          else
            git checkout --orphan gitbook-docs
            git rm -rf .
          fi

          rsync -a --delete --exclude='.git' /tmp/gitbook-site/ .
          git add -A
          if ! git diff --cached --quiet; then
            git commit -m "docs: update GitBook documentation from ${{ github.sha }}"
            git push origin gitbook-docs
          fi
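
The `if:` expression on the deploy step can be read as a small predicate. The sketch below restates it in Python purely for illustration; `should_deploy` is a hypothetical name, and the arguments stand in for `github.event_name` and `github.ref`.

```python
def should_deploy(event_name: str, ref: str) -> bool:
    """Mirror the `if:` expression on the deploy step."""
    return event_name in {"push", "workflow_dispatch"} and ref == "refs/heads/main"


print(should_deploy("push", "refs/heads/main"))                # True
print(should_deploy("pull_request", "refs/pull/1/merge"))      # False
print(should_deploy("workflow_dispatch", "refs/heads/topic"))  # False
```

Read this way, pull requests still run the link check and the site build but never push to `gitbook-docs`, and a manual dispatch from a branch other than `main` is skipped as well.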
1 change: 1 addition & 0 deletions .gitignore
@@ -45,3 +45,4 @@ plugins/cq/bin/

# Generated by `make sync-schema`; canonical sources live in schema/*.json.
schema/python/src/cq_schema/_data/
site/
7 changes: 7 additions & 0 deletions .pre-commit-config.yaml
@@ -23,6 +23,13 @@ repos:

  - repo: local
    hooks:
      - id: check-docs
        name: check docs links
        entry: python scripts/check_docs.py
        language: system
        pass_filenames: false
        files: '^docs/|^README\.md$|^CONTRIBUTING\.md$|^DEVELOPMENT\.md$|^LICENSE$|^SECURITY\.md$|^CONTRIBUTOR_AGREEMENT\.md$|^cli/|^sdk/|^server/backend/README\.md$'

      - id: ty-check-install
        name: ty (scripts/install)
        entry: bash -c 'cd scripts/install && uvx ty check src/cq_install --python .venv'
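
To see which changed paths would trigger the `check-docs` hook, the `files` pattern above can be tried out directly. This is a minimal sketch that assumes pre-commit applies the pattern as a Python regular-expression search against each repo-relative path; `triggers_check` and the sample paths are hypothetical.

```python
import re

# The `files` pattern from the check-docs hook, split across lines for readability.
FILES_PATTERN = re.compile(
    r"^docs/|^README\.md$|^CONTRIBUTING\.md$|^DEVELOPMENT\.md$|^LICENSE$"
    r"|^SECURITY\.md$|^CONTRIBUTOR_AGREEMENT\.md$|^cli/|^sdk/"
    r"|^server/backend/README\.md$"
)


def triggers_check(path: str) -> bool:
    """Return True when a changed path matches the hook's `files` pattern."""
    return FILES_PATTERN.search(path) is not None


for path in (
    "docs/architecture.md",
    "cli/main.go",
    "server/README.md",
    "scripts/check_docs.py",
):
    print(path, triggers_check(path))
```

The first two paths match and the last two do not. Unlike the workflow's `paths` filter, this pattern does not list `scripts/check_docs.py`, so editing the checker alone would not run the hook locally.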
2 changes: 1 addition & 1 deletion DEVELOPMENT.md
@@ -218,7 +218,7 @@ Data-plane reads remain open:

Exploratory — this is a `0.x.x` project. Expect breaking changes to the database format and SDK interfaces before v1. We'll provide migration scripts where possible so your knowledge units survive upgrades.

-See [`docs/`](docs/) for the proposal and PoC design.
+See the [proposal](docs/CQ-Proposal.md) and [architecture overview](docs/architecture.md) for the design.

### Migrating from earlier releases

5 changes: 4 additions & 1 deletion cli/README.md
@@ -8,6 +8,9 @@ frameworks.
## Installation

```bash
# Homebrew.
brew install --cask mozilla-ai/tap/cq

# Go install.
go install github.com/mozilla-ai/cq/cli@latest

@@ -129,7 +132,7 @@ Knowledge units live in one of three tiers:

With `CQ_ADDR` set, `cq propose` sends the unit straight to the remote as `private` (falling back to local if the remote is unreachable). With no remote, everything stays local. `cq status` shows the count in each tier.

-See the [top-level README](../README.md#knowledge-tiers) for the full description.
+See the [top-level README](../README.md) for the full description.

## Development

26 changes: 26 additions & 0 deletions docs/SUMMARY.md
@@ -0,0 +1,26 @@
# Table of Contents

* [Introduction](index.md)

## Guides

* [Architecture](architecture.md)
* [Development](DEVELOPMENT.md)

## Components

* [CLI](cli/README.md)
* [CLI Development](cli/DEVELOPMENT.md)
* [Go SDK](sdk/go/README.md)
* [Go SDK Development](sdk/go/DEVELOPMENT.md)
* [Python SDK](sdk/python/README.md)
* [Python SDK Development](sdk/python/DEVELOPMENT.md)
* [Server](server/README.md)

## Reference

* [Proposal](CQ-Proposal.md)

## Community

* [Contributing](CONTRIBUTING.md)
204 changes: 204 additions & 0 deletions scripts/check_docs.py
@@ -0,0 +1,204 @@
"""Validate checked-in docs before publishing.

Checks all source files that will be published to the GitBook site for broken
internal links. External links are skipped so the checker stays fast and works
offline.

Usage:
    python scripts/check_docs.py
"""

from __future__ import annotations

import re
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parent.parent
DOCS_DIR = REPO_ROOT / "docs"

# Source files outside docs/ that are published to the site.
# Must stay in sync with ROOT_FILES in prepare_gitbook_site.py.
PUBLISHED_ROOT_FILES: tuple[Path, ...] = (
    REPO_ROOT / "README.md",
    REPO_ROOT / "CONTRIBUTING.md",
    REPO_ROOT / "DEVELOPMENT.md",
    REPO_ROOT / "LICENSE",
    REPO_ROOT / "SECURITY.md",
    REPO_ROOT / "CONTRIBUTOR_AGREEMENT.md",
    REPO_ROOT / "cli" / "README.md",
    REPO_ROOT / "cli" / "DEVELOPMENT.md",
    REPO_ROOT / "sdk" / "go" / "README.md",
    REPO_ROOT / "sdk" / "go" / "DEVELOPMENT.md",
    REPO_ROOT / "sdk" / "python" / "README.md",
    REPO_ROOT / "sdk" / "python" / "DEVELOPMENT.md",
    REPO_ROOT / "server" / "backend" / "README.md",
)

# SUMMARY.md uses site-relative paths by design (GitBook navigation file).
# Source-relative resolution would produce false positives (valid navigation
# entries reported as missing), so skip it.
SKIP_LINK_CHECK: frozenset[Path] = frozenset({(DOCS_DIR / "SUMMARY.md").resolve()})

LINK_RE = re.compile(r"!\[[^\]]*\]\(([^)\n]+)\)|(?<!!)\[([^\]]*)\]\(([^)\n]+)\)")
HEADER_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
CODE_FENCE_RE = re.compile(r"^```")
SKIPPED_PREFIXES = ("http://", "https://", "mailto:", "tel:", "data:")


def all_published_sources() -> set[Path]:
    """Return resolved paths of every source file that will appear in the site."""
    sources = {p.resolve() for p in PUBLISHED_ROOT_FILES if p.exists()}
    sources.update(p.resolve() for p in DOCS_DIR.rglob("*") if p.is_file())
    return sources


def strip_code_blocks(text: str) -> str:
    """Remove fenced code blocks so code samples are not linted as page links."""
    output: list[str] = []
    in_fence = False

    for line in text.splitlines():
        if CODE_FENCE_RE.match(line):
            in_fence = not in_fence
            output.append("")
            continue
        output.append("" if in_fence else line)

    return "\n".join(output)


def slugify_heading(raw_heading: str) -> str:
    """Approximate the anchor slugs used by common Markdown site generators."""
    heading = re.sub(r"`([^`]*)`", r"\1", raw_heading.strip().lower())
    heading = re.sub(r"[^\w\s-]", "", heading)
    heading = re.sub(r"\s+", "-", heading)
    heading = re.sub(r"-{2,}", "-", heading)
    return heading.strip("-")


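def extract_anchors(path: Path) -> set[str]:
    """Collect heading anchors from a Markdown document."""
    anchors: set[str] = set()
    # Strip fenced code first so `#` comments in code samples (for example
    # shell snippets) are not mistaken for headings.
    text = strip_code_blocks(path.read_text(encoding="utf-8"))
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            anchors.add(slugify_heading(match.group(2)))
    return anchors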


def split_target(raw_target: str) -> tuple[str, str]:
    """Split a Markdown link target into path and optional anchor."""
    target = raw_target.strip()
    if target.startswith("<") and target.endswith(">"):
        target = target[1:-1]
    if " " in target and not target.startswith("#"):
        target = target.split(" ", 1)[0]
    if "#" in target:
        path_part, anchor = target.split("#", 1)
        return path_part, anchor
    return target, ""


def resolve_target(source_path: Path, target_path: str) -> Path:
    """Resolve a relative link target from a source file.

    Directory targets resolve to their `README.md` or `index.md` when one
    exists; otherwise the directory itself is returned. The returned path may
    not exist or may not be published, so the caller checks both.
    """
    base = source_path.parent
    resolved = (base / target_path).resolve()

    if resolved.is_dir():
        for name in ("README.md", "index.md"):
            candidate = resolved / name
            if candidate.exists():
                return candidate
        return resolved  # Directory with no index; caller will flag as unpublished

    if resolved.exists():
        return resolved

    if resolved.suffix == "":
        md = resolved.with_suffix(".md")
        if md.exists():
            return md

    return resolved  # May not exist; caller checks


def validate_summary(errors: list[str]) -> None:
    """Ensure docs/SUMMARY.md exists."""
    if not (DOCS_DIR / "SUMMARY.md").exists():
        errors.append("docs/SUMMARY.md is missing")


def iter_link_targets(text: str) -> list[str]:
    """Extract raw link targets from Markdown text (code blocks already stripped)."""
    targets: list[str] = []
    for m in LINK_RE.finditer(text):
        raw = m.group(1) or m.group(3)
        if raw:
            targets.append(raw)
    return targets


def main() -> int:
    """Validate docs links and anchors. Returns a process exit code."""
    errors: list[str] = []
    published = all_published_sources()

    anchors_by_file: dict[Path, set[str]] = {}
    for path in published:
        if path.suffix == ".md":
            anchors_by_file[path] = extract_anchors(path)

    validate_summary(errors)

    sources_to_check = [
        p for p in sorted(published)
        if p.suffix == ".md" and p not in SKIP_LINK_CHECK
    ]

    for source_path in sources_to_check:
        text = strip_code_blocks(source_path.read_text(encoding="utf-8"))

        for raw_target in iter_link_targets(text):
            if raw_target.startswith(SKIPPED_PREFIXES):
                continue

            target_path, anchor = split_target(raw_target)

            if target_path == "":
                target_file = source_path
            else:
                target_file = resolve_target(source_path, target_path)
                if not target_file.exists():
                    errors.append(
                        f"{source_path.relative_to(REPO_ROOT)} -> missing target `{target_path}`"
                    )
                    continue
                if target_file.resolve() not in published:
                    errors.append(
                        f"{source_path.relative_to(REPO_ROOT)} -> `{target_path}` exists but is not published to the site"
                    )
                    continue

            if anchor and target_file.suffix == ".md":
                target_anchors = anchors_by_file.get(target_file.resolve())
                if target_anchors is not None and slugify_heading(anchor) not in target_anchors:
                    errors.append(
                        f"{source_path.relative_to(REPO_ROOT)} -> missing anchor `#{anchor}` in "
                        f"{target_file.relative_to(REPO_ROOT)}"
                    )

    if errors:
        print("Documentation checks failed:\n")
        for error in errors:
            print(f"- {error}")
        return 1

    print(f"Documentation checks passed ({len(sources_to_check)} files checked).")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
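
As a quick illustration of how the checker normalises headings and link targets, the snippet below copies `slugify_heading` and `split_target` from the file above so it can run on its own, then feeds them a few inputs. The sample headings and targets are made up for the example and are not taken from the repository's docs.

```python
import re


def slugify_heading(raw_heading: str) -> str:
    """Copy of the checker's slug function, for standalone experimentation."""
    heading = re.sub(r"`([^`]*)`", r"\1", raw_heading.strip().lower())
    heading = re.sub(r"[^\w\s-]", "", heading)
    heading = re.sub(r"\s+", "-", heading)
    heading = re.sub(r"-{2,}", "-", heading)
    return heading.strip("-")


def split_target(raw_target: str) -> tuple[str, str]:
    """Copy of the checker's target splitter: returns (path, anchor)."""
    target = raw_target.strip()
    if target.startswith("<") and target.endswith(">"):
        target = target[1:-1]
    if " " in target and not target.startswith("#"):
        target = target.split(" ", 1)[0]  # drop an optional link title
    if "#" in target:
        path_part, anchor = target.split("#", 1)
        return path_part, anchor
    return target, ""


print(slugify_heading("Knowledge tiers"))              # knowledge-tiers
print(slugify_heading("`cq status`: tier counts"))     # cq-status-tier-counts
print(split_target("../README.md#knowledge-tiers"))    # ('../README.md', 'knowledge-tiers')
print(split_target('docs/architecture.md "Overview"')) # ('docs/architecture.md', '')
print(split_target("#how-it-works"))                   # ('', 'how-it-works')
```

An empty path part, as in the last call, is what makes `main()` treat the link as pointing into the same file.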