Cotabby is a macOS menu bar app for on-device inline autocomplete. The core loop is:
- Track the currently focused editable field through Accessibility.
- Monitor global keyboard input without stealing focus.
- Decide whether the field, permissions, settings, and runtime are eligible.
- Build an autocomplete request from the focused text context and optional visual context.
- Generate locally through Apple Intelligence or llama.cpp.
- Normalize the model output into a short continuation.
- Render ghost text near the caret.
- Insert accepted chunks when the user presses Tab while keeping the remaining tail alive.
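The last step of the loop can be sketched as a small value type. This is a minimal illustration, assuming a "chunk" means one word plus the whitespace in front of it; `ActiveSuggestion` and `acceptNextChunk` are hypothetical names, not Cotabby's actual API.

```swift
/// Hypothetical value type: the part of a suggestion the user has not accepted yet.
struct ActiveSuggestion {
    private(set) var remainingTail: String

    init(_ text: String) { remainingTail = text }

    /// Removes and returns the next chunk (leading whitespace plus one word).
    /// Returns nil once the tail is exhausted.
    mutating func acceptNextChunk() -> String? {
        guard !remainingTail.isEmpty else { return nil }
        var end = remainingTail.startIndex
        // Keep leading whitespace attached so the inserted text preserves spacing.
        while end < remainingTail.endIndex, remainingTail[end].isWhitespace {
            end = remainingTail.index(after: end)
        }
        // Then consume one run of non-whitespace characters.
        while end < remainingTail.endIndex, !remainingTail[end].isWhitespace {
            end = remainingTail.index(after: end)
        }
        let chunk = String(remainingTail[..<end])
        remainingTail.removeSubrange(..<end)
        return chunk
    }
}
```

Each Tab press would call `acceptNextChunk()`, insert the returned text, and keep rendering whatever is left in `remainingTail` as ghost text.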
Privacy and local-first behavior matter. Do not introduce hosted API dependencies unless the user explicitly asks for that direction.
- Explain both the "what" and the "why" for architecture and code changes.
- Assume the user is actively learning Swift, AppKit, Accessibility APIs, llama.cpp integration, async/await, actor isolation, and macOS app architecture.
- Teach at the file, type, and subsystem level, not just the line level.
- Call out tradeoffs when there are multiple valid approaches.
- Prefer clean boundaries over quick coupling, especially across App, UI, Services, Models, and Support.
When creating or editing a file, explain:
- what the file is responsible for
- why the file exists as its own boundary
- which objects own it or collaborate with it
- how data flows into and out of it
When adding a struct, class, enum, actor, or protocol, explain:
- what responsibility it owns
- what other objects it collaborates with
- why it should exist as its own type instead of being folded into another file
- how long it lives and who owns it
- Cotabby/App/: app entrypoint, composition root, lifecycle wiring, and coordinators.
- Cotabby/UI/: SwiftUI/AppKit presentation: settings, onboarding, menu views, overlays, and visual affordances.
- Cotabby/Services/: side-effectful boundaries: Accessibility, input monitoring, text insertion, screenshots/OCR, visual context, llama runtime, permissions, downloads, updates, and launch services.
- Cotabby/Models/: shared value types, settings snapshots, states, domain models, and protocol contracts.
- Cotabby/Support/: pure helper logic, prompt rendering, availability rules, normalization, reconciliation, geometry helpers, and low-level bridging utilities.
- CotabbyTests/: unit and microbench tests. Prefer testing pure Support/ and Models/ logic when possible.
- CotabbyInference: the llama.cpp wrapper, consumed as a SwiftPM package (github.com/FuJacob/cotabbyinference, pinned to main) rather than vendored in-tree.
Start here when you need to understand lifecycle:
- Cotabby/App/Core/CotabbyApp.swift
- Cotabby/App/Core/AppDelegate.swift
- Cotabby/App/Core/CotabbyAppEnvironment.swift
CotabbyAppEnvironment builds the long-lived dependency graph once. AppDelegate starts, stops,
and wires cross-subsystem subscriptions. SwiftUI views should observe objects from that graph
rather than creating services directly.
This ownership rule prevents duplicate Accessibility observers, duplicate input monitors, runtime reload races, and mismatched settings state.
Read the coordinator in this order:
- Cotabby/App/Coordinators/SuggestionCoordinator.swift
- Cotabby/App/Coordinators/SuggestionCoordinator+Lifecycle.swift
- Cotabby/App/Coordinators/SuggestionCoordinator+Input.swift
- Cotabby/App/Coordinators/SuggestionCoordinator+Prediction.swift
- Cotabby/App/Coordinators/SuggestionCoordinator+Acceptance.swift
The coordinator owns orchestration and user-facing state. It should not absorb every rule. Prefer:
- SuggestionRequestFactory for pure request construction
- SuggestionAvailabilityEvaluator for pure gating decisions
- SuggestionSessionReconciler for acceptance and active-tail reconciliation
- SuggestionTextNormalizer for backend-independent output cleanup
- SuggestionWorkController for generation task identity/cancellation
- SuggestionInteractionState for active suggestion session storage
This split matters because autocomplete is a state machine. Pure rules are easier to test and reason about than coordinator mutations.
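To make "pure rules" concrete, here is a hedged sketch of what a gating decision can look like when it is kept free of side effects. The input fields, the `Availability` enum, and the function name are all assumptions for illustration; the real SuggestionAvailabilityEvaluator is not shown in this document.

```swift
/// Hypothetical snapshot of everything the gate needs to know.
struct AvailabilityInput {
    var hasAccessibilityPermission: Bool
    var isEnabledInSettings: Bool
    var fieldIsSecure: Bool
    var runtimeIsReady: Bool
}

/// Hypothetical result type; a reason string keeps the decision loggable.
enum Availability: Equatable {
    case available
    case blocked(reason: String)
}

/// Pure function: the same input always yields the same answer,
/// and no service, timer, or global state is touched.
func evaluateAvailability(_ input: AvailabilityInput) -> Availability {
    guard input.hasAccessibilityPermission else {
        return .blocked(reason: "accessibility permission missing")
    }
    guard input.isEnabledInSettings else {
        return .blocked(reason: "disabled in settings")
    }
    guard !input.fieldIsSecure else {
        return .blocked(reason: "secure field")
    }
    guard input.runtimeIsReady else {
        return .blocked(reason: "runtime not ready")
    }
    return .available
}
```

Because the function takes a value and returns a value, a unit test can cover every branch without launching the app or granting permissions.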
Focus and geometry live in:
- FocusTracker: observes focus/value/selection changes and publishes snapshots.
- FocusSnapshotResolver: reduces raw AX elements into Cotabby-supported focus snapshots.
- AXTextGeometryResolver: resolves caret and input geometry.
- AXHelper: low-level Accessibility/Core Foundation helper calls.
- FocusModels: pure focus values, identities, capabilities, and debug inspection data.
Accessibility data is eventually consistent and app-specific. Browser editors, Electron apps,
native AppKit fields, and secure fields expose different AX shapes. Preserve stale-result guards,
focusChangeSequence, and capability checks unless the change explicitly replaces them.
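The idea behind a sequence-based stale-result guard can be sketched as follows. This is an illustration only: `FocusSequenceGuard` is a hypothetical type, and how Cotabby actually stores and compares focusChangeSequence is an assumption here.

```swift
/// Hypothetical holder for the latest focus sequence number.
struct FocusSequenceGuard {
    private(set) var current: UInt64 = 0

    /// Call when focus changes. Returns the token that work started
    /// for this focus state should capture.
    mutating func advance() -> UInt64 {
        current &+= 1
        return current
    }

    /// A late result is only usable if focus has not changed
    /// since the work that produced it began.
    func isCurrent(_ token: UInt64) -> Bool {
        token == current
    }
}
```

Work captures the token when it starts and checks `isCurrent` before publishing, so an AX read that returns after the user has moved to another field is dropped instead of overwriting newer state.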
Visual context currently flows through:
- VisualContextCoordinator: field-scoped visual-context session lifecycle.
- ScreenshotContextGenerator: screenshot -> OCR -> OCRTextHygiene cleanup -> bounded excerpt.
- WindowScreenshotService: captures the relevant window or region.
- ScreenTextExtractor: Vision OCR extraction, carrying per-line recognition confidence.
- OCRTextHygiene: pure cleanup of raw OCR (drops low-confidence lines and chrome noise). There is no model summarization step; a base model conditions fine on cleaned raw context.
- VisualContextModels: configuration, status, and excerpt values.
Do not put raw screenshots, unbounded OCR dumps, or noisy AX tree text directly into prompts. Normalize, bound, and mark unavailable states explicitly. Screen Recording permission is separate from Accessibility and Input Monitoring.
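A minimal sketch of "normalize, bound, and mark unavailable" for OCR text. The `OCRLine` type, the function name, and the threshold parameters are hypothetical; the actual OCRTextHygiene rules (including chrome-noise filtering) are not reproduced here.

```swift
import Foundation

/// Hypothetical OCR line carrying a recognition confidence in 0...1.
struct OCRLine {
    var text: String
    var confidence: Double
}

/// Drops low-confidence and empty lines, then caps the excerpt length.
/// Returns nil to mark "no usable visual context" explicitly,
/// rather than handing an empty string to the prompt.
func boundedExcerpt(from lines: [OCRLine],
                    minConfidence: Double,
                    maxCharacters: Int) -> String? {
    guard maxCharacters > 0 else { return nil }
    let kept = lines
        .filter { $0.confidence >= minConfidence }
        .map { $0.text.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
    guard !kept.isEmpty else { return nil }
    // Cap by character count so the excerpt can never grow with the screen.
    return String(kept.joined(separator: "\n").prefix(maxCharacters))
}
```

Returning an optional forces the caller to handle the unavailable case, which is the point of marking it explicitly.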
Runtime generation is split by responsibility:
- SuggestionEngineRouter: selects Apple Intelligence vs Open Source.
- FoundationModelSuggestionEngine: Apple on-device generation path.
- LlamaSuggestionEngine: request-to-prompt, llama result handling, and cache reset handoff.
- LlamaRuntimeManager: UI-facing runtime state, model selection, warmup, and lifecycle control.
- LlamaRuntimeCore: serialized actor around mutable llama.cpp pointers, prompt tokenization, KV-cache reuse, sampling, an optional deterministic constrained decoder (runConstrainedDecode, gated behind the default-off cotabbyConstrainedDecoderEnabled), and shutdown.
- BaseCompletionPromptRenderer: prompt construction for the Open Source path. The llama models are now base (non-instruct) GGUFs, so this renders a pure text continuation: no instruction preamble, custom rules and context fold into a short conditioning preface (a base model conditions on description, it does not obey commands), sections are character-budgeted via PromptSectionBudget, and the caret prefix comes last.
- FoundationModelPromptRenderer stays instruct-shaped because Apple's Foundation Models path gives us a first-class instructions channel.
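The base-model prompt shape described for BaseCompletionPromptRenderer can be sketched like this. Everything here is hypothetical: `BudgetedSection` and `renderContinuationPrompt` are illustrative names, and the truncation policy (context sections keep their head, the caret prefix keeps its tail) is an assumption, not the documented behavior of PromptSectionBudget.

```swift
/// Hypothetical section of conditioning text with its own character budget.
struct BudgetedSection {
    var text: String
    var characterBudget: Int
}

/// Renders a plain continuation prompt: budgeted preface sections first,
/// the text before the caret last, and no instruction preamble.
func renderContinuationPrompt(preface: [BudgetedSection],
                              caretPrefix: String,
                              caretBudget: Int) -> String {
    // Assumed policy: context sections are cut from the end.
    var parts = preface
        .map { String($0.text.prefix(max(0, $0.characterBudget))) }
        .filter { !$0.isEmpty }
    // Assumed policy: the caret prefix keeps its tail, because the text
    // closest to the caret is what the model continues from.
    parts.append(String(caretPrefix.suffix(max(0, caretBudget))))
    return parts.joined(separator: "\n\n")
}
```

Putting the caret prefix last means the prompt ends exactly where the continuation should begin, which is what a base model needs.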
Keep llama.cpp pointer work serialized inside LlamaRuntimeCore. The manager should publish state;
the core should own native correctness.
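A hedged sketch of why an actor fits the core. `FakeNativeContext` is a stand-in for a native handle and `RuntimeCoreSketch` is a hypothetical type; no real llama.cpp call appears here.

```swift
/// Stand-in for mutable native state that is not safe to touch concurrently.
final class FakeNativeContext {
    var tokensDecoded = 0
}

/// Hypothetical core: the actor is the only owner of the context,
/// so every mutation runs one at a time in the actor's isolation.
actor RuntimeCoreSketch {
    private let context = FakeNativeContext()

    /// Callers must `await` this, which is how Swift enforces that
    /// two decodes never interleave on the same context.
    func decode(tokenCount: Int) -> Int {
        context.tokensDecoded += tokenCount
        return context.tokensDecoded
    }
}
```

A manager on the main actor would call `await core.decode(...)` and publish the result, without ever holding the native handle itself.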
- OverlayController owns the ghost-text panel lifecycle and positioning.
- SuggestionOverlayPresenter decides whether a suggestion should be shown or hidden.
- ActivationIndicatorController owns the optional caret/field-edge indicator.
- FocusDebugOverlayController is for developer visibility and should stay gated behind debug options, not normal user settings.
- Settings panes (under Cotabby/UI/Settings/Panes/) and onboarding views should remain presentation-focused. Push behavior into services, models, or support helpers.
- Use @MainActor for UI, AppKit, SwiftUI state, most Accessibility access, and published models.
- Use actors or explicit serialization for mutable native/runtime state.
- Do not block the main actor with OCR, screenshots, model loading, or generation.
- Make cancellation and stale-result checks explicit around async work. The user can keep typing, switch apps, focus another field, or accept a partial suggestion while work is still running.
- Prefer narrow protocols from SuggestionSubsystemContracts.swift when the coordinator only needs behavior, not a concrete service.
- Treat Core Foundation and AX bridging as unsafe boundaries. Add comments that explain ownership, casting, and failure handling.
- Add real teaching comments, not labels.
- Prefer file-level and type-level /// comments that explain purpose, ownership, and design.
- Add targeted inline comments for tricky lifecycle behavior, concurrency, cancellation, AX timing, Core Foundation bridging, native pointer state, and macOS quirks.
- Comments should explain why the code is written this way, which invariant it protects, or which pitfall it avoids.
- Avoid useless comments that merely restate the code.
- If Swift syntax is likely to be unfamiliar, annotate it briefly the first time it appears in a new
concept-heavy area. Examples: @Published, @ObservedObject, @StateObject, @MainActor, Task, async/await, actor isolation, closures, convenience initializers, AXUIElement, CFTypeRef, and unsafeBitCast.
Prefer this order when changing behavior:
- Pure rules in Support/
- Domain models and contracts in Models/
- Service boundary behavior in Services/
- Coordinator orchestration in App/
- SwiftUI/AppKit presentation in UI/
This order reduces regression risk because deterministic code changes before stateful orchestration. It also makes tests easier to write, since the pure layers can be exercised without the app running.
Cotabby has a structured logging system built for AI-assisted debugging. During development the app
is launched with -cotabby-debug, which enables on-disk JSONL sinks in addition to the always-on
Console.app stream.
Log file locations (only populated when -cotabby-debug is set):
- ~/Library/Logs/Cotabby/cotabby.jsonl: main event stream. One JSON object per line, with all metadata flattened as top-level fields so it can be filtered with jq.
- ~/Library/Logs/Cotabby/llm-io.jsonl: full LLM prompts and completions, one record per generation. Shares request_id with the main log so a single suggestion can be joined across files.
- ~/Desktop/cotabby-ax-dump.txt: most recent Chrome AX tree snapshot. Overwritten on each Chrome focus change (debounced by focused-element identity).
- Rotated previous logs: *.jsonl.1 (one-step rotation when a file exceeds 10 MB).
Correlation IDs. Every prediction gets a request_id like req_a3f9k2lq, stamped on every log
line touching that request (coordinator state transitions, router selection, engine generation, LLM
I/O capture). Pull a complete history of one suggestion:
jq 'select(.request_id == "req_a3f9k2lq")' ~/Library/Logs/Cotabby/cotabby.jsonl
jq 'select(.request_id == "req_a3f9k2lq")' ~/Library/Logs/Cotabby/llm-io.jsonl
Useful jq recipes:
# Recent errors across the app
jq 'select(.level == "error")' ~/Library/Logs/Cotabby/cotabby.jsonl
# Llama generations slower than 500 ms
jq 'select(.engine == "llama" and .latency_ms > 500)' ~/Library/Logs/Cotabby/llm-io.jsonl
# Coordinator state transitions
jq 'select(.category == "suggestion" and .stage != null)' ~/Library/Logs/Cotabby/cotabby.jsonl
# Runtime model load/decode events
jq 'select(.category == "runtime")' ~/Library/Logs/Cotabby/cotabby.jsonl
Symptom → category map:
- Ghost text didn't appear → suggestion + focus
- Wrong text inserted → look up the request in llm-io.jsonl, then walk suggestion for acceptance
- Model won't load / decode fails → runtime + models
- Permission dialog loop → app (permission state transitions)
- Chrome-specific weirdness → start with ~/Desktop/cotabby-ax-dump.txt, then focus
- Wrong backend chosen → suggestion router selection log (engine, fallback_engine)
Console.app fallback (when -cotabby-debug wasn't set):
log show --predicate 'subsystem == "com.cotabby.app"' --last 10m
log stream --predicate 'subsystem == "com.cotabby.app"' --level debug
Rule of thumb. When a user reports a bug, first tail or jq the relevant file, using the
symptom → category map to pick it. Do not ask the user to re-explain symptoms before checking the logs.
Use the narrowest meaningful validation first, then broaden if the change touches shared behavior. Common commands:
xcodebuild -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS' build \
-derivedDataPath build/DerivedData
xcodebuild -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS' build-for-testing \
-derivedDataPath build/DerivedData
Always pass -derivedDataPath build/DerivedData so the output lands in the repo-scoped build/
directory (already gitignored) instead of accumulating under
~/Library/Developer/Xcode/DerivedData/Cotabby-*, where every build leaves a fresh multi-GB module
cache and SwiftPM checkout that nothing trims. When a task is done and the artifacts are no longer
needed, rm -rf build/DerivedData before reporting completion.
Run targeted tests for changed pure logic when available. If xcodebuild test fails locally because
of app-hosted test bundle signing or Team ID mismatch, report the exact failure and still provide the
successful build/build-for-testing result.
- The worktree may already contain user edits. Never revert unrelated changes.
- Before editing, inspect git status -sb and the relevant files.
- Keep commits scoped. Do not silently include unrelated dirty files.
- Avoid destructive commands such as git reset --hard or git checkout -- unless the user explicitly asks for that operation.