GitHub - mlcyclops/lucidagentide: Security-First Agentic IDE Coding Harness

_{⬆ Always the most recent successful release - links auto-update each version (no release yet? they appear after the first tagged build).}

A security · provenance · memory layer built around oh-my-pi - not a fork. A fail-closed prompt-injection gate, provenance-backed memory, sovereignty-aware model governance, AI-authorship attribution, one-command migration from ChatGPT, and a read-write IDE where even Save is scanned - wrapped in a polished desktop app, added entirely through omp's hooks, custom tools, and SDK.

_{🔒 What it does is open; how the hard parts work is not. The deepest trust, provenance, and
personalization internals are proprietary and intentionally undocumented here - this README describes the
capabilities and guarantees, not the mechanisms behind them.}

Quick start · Architecture · Security · Cost Savings · Roadmap · Decisions (ADRs)

Overview

LucidAgentIDE wraps oh-my-pi (omp) - a fast agentic coding runtime that provides tool-calling, model routing, sessions, sandboxing, and a TUI - with the security/provenance/memory layer from the project's v3 PRD. The wrapper rides omp's hundreds of releases instead of forking it: everything is added through hooks, custom tools, and the SDK.

The whole system enforces one lifecycle, end to end:

untrusted text enters → scanned → trust-labeled → sanitized → persisted with provenance → blocked at the tool / memory-promotion / dispatch boundaries → human-reviewed → and exits only as safe, audited evidence - with provenance-tracked recursive runs, replay, and a KV-cache-optimized prompt prefix proven by benchmark.

The architecture in one line: TypeScript on Bun, in-process with omp. The only Python is the pure Unicode scanner-sidecar/, behind a narrow NDJSON contract, so the fail-closed gate that consumes it can never fail open.

Security	Provenance	Memory
Unicode scanner + fail-closed quarantine gate, in-process on every tool call	Stable IDs, trust labels, and a DuckDB audit trail for every run, finding & approval	Promotion-gated semantic memory + a shipped, encrypted, cross-session personalization graph

What makes it novel

🛡️ Security around a moving target, not a fork. The injection defense lives in omp extensions, so it upgrades with omp instead of accumulating merge debt.
🔒 A gate that cannot fail open. The Unicode scanner is a pure sidecar behind an NDJSON contract; if it dies, times out, or returns garbage, the gate blocks (trust=quarantined). A test kills the sidecar mid-run and asserts the block - and stays green forever.
🧬 Provenance-gated memory. Suspicious or quarantined content can never auto-promote into semantic memory (the second correctness keystone) - trust is re-derived from the source artifact, never the caller's claim.
🧊 A byte-stable, KV-cache-optimized prompt prefix. Identity/safety/tool/security layers are frozen and byte-identical across requests; untrusted content only ever enters delimited and after the cache breakpoint - verified by a prefix-hash test and a cache-hit benchmark.
🏛️ A gov-grade gateway, gated. AskSage is integrated as an omp provider with an "AskSage-only" lockdown, scanned personas, and dataset-grounded RAG that returns expandable citations.
🧠 An encrypted personalization knowledge graph (shipped). A private, FIPS-grade-encrypted, inspectable node/edge graph the agent learns from you and recalls across sessions to tailor responses - CUI-isolated, compartmentalized (work / personal / CUI), and exportable to an Obsidian vault.
🪪 AI-authorship attribution. A tamper-evident ledger of which model wrote which lines - per repo, per identity, per session - so AI-generated code is governable, auditable, and attributable. The attribution engine is proprietary; the dashboard over it is in-app.
🌐 Sovereignty-aware model governance. Gov-only lockdown, accredited-gateway gating, curated gov-model lists, and an explicit data-sovereignty acknowledgment wall for foreign-origin models - choose raw capability and provenance, by policy, not by accident.
⬇️ One-command migration from ChatGPT / Claude / Gemini. Bring years of history in; every message is scanned through the fail-closed gate and distilled into your encrypted personal graph - onboard a new user in minutes, with a token/runtime estimate before any model call.
✍️ A read-write IDE where Save is gated. Edit code in an embedded editor and save it back through the same in-process scanner - a hidden-Unicode payload is blocked before a single byte lands on disk.
💰 Cross-model cost tracking & showback. Real-time per-model token usage, cache savings, and estimated cost with a built-in showback ledger - know exactly what every conversation costs.

💰 Token Cost Savings & Showback

Real-time cost visibility across every model and session.

LucidAgentIDE's Cost & Savings Ledger (P10.2 · ADR-0011) tracks token usage, estimated cache savings, and per-model cost breakdowns - giving you full showback visibility over your AI spend. No surprises, no black-box billing.

Metric	Value
Total Spend (all models)	$35.73
Est. Cache Savings	$73.66 (67% off full price)
Cache Hit-Rate	82%
Tokens Processed	21.34M across 1,998 turns
Models Used	29 across 1,041 sessions

Per-model breakdown (top models · 24 more in the ledger):

Model	Turns	Tokens	Cost	Saved	Cache %
claude-opus-4-8	242	18.26M	$32.51	$67.89	84%
claude-opus-4-6	14	791.7k	$1.32	$3.17	92%
gpt-5.5	21	659.8k	$1.15	$2.24	76%
claude-sonnet-4-5	4	114.6k	$0.31	$0.18	64%
claude-sonnet-4-6	4	141.9k	$0.30	$0.18	48%

LucidAgentIDE Cost & Savings Ledger - real-time cross-model token usage, estimated prompt-cache savings, cache hit-rate, and per-model cost showback

_{↑ Cost & Savings Ledger - spend, cache savings, and cost per model, live}

LucidAgentIDE AI-authored code ledger - lines of code attributed per model, repo, and identity (AI authorship attribution / provenance)

_{↑ AI-authored Code Ledger - which model wrote which lines, by repo & identity}

Key capabilities:

📊 Cross-model cost ledger - unified spend view across Claude, GPT, Gemini, and all AskSage-routed models
💵 Estimated cache savings - see how much the KV-cache-optimized prompt prefix saves you in real dollars
📈 Cache hit-rate tracking - per-model cache efficiency metrics updated in real time
🔍 Per-session drill-down - break costs down by model, turn count, and token volume
🏷️ Showback-ready - built for teams that need to attribute AI costs to projects or users
🪪 AI-authored code ledger - a tamper-evident count of which model wrote which lines, per repo and identity (authorship attribution, not just git activity)

Architecture

harness/                  # ALL TypeScript (Bun)
  contracts.ts              # FROZEN: TrustLabel · AgentMode · EventName · ToolResult · Finding
  security/                 # scanner_client (NDJSON, fail-closed) · gate (scanAndDecide)
  memory/                   # DuckDB store · promotion gate (keystone #2) · cross-session recall · migrations 0001–0008
  personal/                 # encrypted personalization graph · distiller · CUI isolation · ChatGPT/Claude/Gemini import
  telemetry/                # stable-id event stream → DuckDB (replayable)
  runs/                     # provenance lineage · sandbox profiles · replay
  export/                   # safe_export: escaped, sanitized-only by default
  prompt/                   # the frozen prefix + delimited untrusted tail (assembler)
  omp/                      # security_extension (the in-process gate) · asksage_extension (provider)
scanner-sidecar/          # the ONLY Python (uv-managed): pure Unicode scanner + tests
desktop/                  # Electron shell + Bun dev server (chat + live dashboards)
observable/               # P10 observability: activity HUD, context windows, cost ledger
.github/                  # CI (desktop installer build) + brand assets

Trust boundary, layered: the frozen prefix (identity → tool policy → coding rules → security policy) is cached; everything volatile - instruction files, delimited retrieved content, the task, session state, working memory - lives in the tail after the cache breakpoint. Untrusted bytes never touch the prefix.

Security model

Stage	Mechanism	Guarantee
Scan	`scanner-sidecar/` (pure Unicode) behind NDJSON	finds zero-width, bidi, tag-block, homoglyph, PUA, `Cf`
Decide	`gate.ts` → `scanAndDecide`	any scan failure ⇒ block / quarantine (never "safe")
Gate	`harness/omp/security_extension.ts` (omp pre-hook)	runs in-process on every tool call
Label	closed set `trusted · untrusted · suspicious · quarantined`	no other values exist
Promote	`promotion_gate.ts`	suspicious/quarantined sources can't enter semantic memory
Export	`safe_export.ts`	invisibles escaped to `\u{..}`; raw referenced by `sha256`, never inline

Try it live - a planted file hides a zero-width character in a shell command; the agent reads it, tries to run it, and the gate blocks the bash call:

🛡️  [LucidAgentIDE] [BLOCKED tool_call:bash] source=bash trust=quarantined severity=high findings=zero-width

The gate that blocks here is the exact one the test suite proves - see CLAUDE.md for the load-bearing invariants (fail-closed, extend-don't-fork, frozen contracts, byte-stable prefix).

Memory and the personalization graph

Shipped. A DuckDB store (schema frozen on first write, evolved only by numbered migrations) holds working state, archived chunks, and a promotion-gated semantic graph of entities/facts/links - each fact carrying provenance and a trust label. Memory fills from ordinary turns, and poisoned content is blocked from promotion.

Shipped (ADR-0009 / ADR-0010). A private personalization knowledge graph - a "second brain" of your preferences, decisions, interests, personality, and sanitized-but-working links that the agent learns, recalls across sessions, and uses to tailor responses (and that you can seed in minutes by importing an existing ChatGPT / Claude / Gemini history). It is:

Opt-in and local-first, stored in a dedicated AES-256-GCM encrypted store (key sealed by the OS keystore via Electron safeStorage, with a PBKDF2 passphrase fallback).
Inspectable as an interactive, hand-drawn SVG node/edge graph with drill-down - exportable to an Obsidian vault with [[wikilinks]].
Honest about FIPS: FIPS-approved algorithms + OS-keystore custody + a documented deployment checklist (the runtime is Bun/BoringSSL, so there is no FIPS mode in-process - true 140-3 validation is an OS/module concern, not something the app self-certifies).

Models and the AskSage gateway

Models from any omp provider work out of the box (Claude, GPT, Gemini, …). On top of that, the AskSage accredited government AI gateway is integrated as an omp provider extension (ADR-0007):

Lockdown mode routes every turn through the gov gateway and hides direct providers.
Scanned personas - server-supplied persona text passes the same Unicode scanner before it can enter a prompt; flagged personas are blocked.
Dataset-grounded RAG via AskSage's /query route, returning expandable citations grounded on the knowledge bases you select.
Premium model picker with per-model Token Expense + Intelligence Level ratings and a monthly token-quota meter.

Optionally, the on-device headroom token-compression proxy can be enabled to stretch a gov token quota (ADR-0008).

Built on

LucidAgentIDE is a thin, principled layer over best-in-class building blocks - credit where it's due:

Project	What it is	How LucidAgentIDE uses it
oh-my-pi (omp) _{· repo}	A fast agentic coding runtime: tool-calling, model routing, sessions, sandboxing, ACP, extensions, skills	The host. Everything is added via omp hooks / custom tools / SDK - never a fork
DuckDB	An in-process analytical (OLAP) SQL database	The append-only provenance + memory store (findings, telemetry, semantic memory, run lineage)
Obsidian	A local-first Markdown knowledge base with `[[wikilinks]]` + a graph view	The export format for the personalization knowledge graph (roadmap)
BoringSSL	Google's streamlined fork of OpenSSL (Bun's crypto backend)	Context for the FIPS posture - FIPS-approved algorithms; no FIPS mode in Bun's runtime
headroom	An on-device, OpenAI-compatible token-compression proxy (60–95% reduction)	Opt-in context compression to stretch gov token quotas
AskSage	An accredited government generative-AI gateway fronting OpenAI/Anthropic/Google	An omp provider extension: lockdown, scanned personas, dataset-grounded RAG

Runtime stack: Bun (harness + dev server), Electron (desktop), uv-managed Python (scanner sidecar).

Quick start

bun install                       # harness deps (Bun >= 1.3)
cd scanner-sidecar && uv sync     # pinned Python sidecar venv

# prove it end-to-end
bun run demo-00                   # omp echo round-trip + scanner + fail-closed proof
bun test harness                  # harness suite (incl. the fail-closed keystone)
bun run demo-P4.3                 # poisoned memory can't auto-promote (keystone #2)

Requires Bun and uv. make is optional - the Makefile is the canonical task spec, mirrored as bun scripts on hosts without make.

Desktop app

A polished Electron shell: a gated agent chat, plus live Security and Memory & Context inspectors (collapsible sections, custom tooltips, ⌘K palette, a non-modal fly-in toast when the gate quarantines a tool call).

bun run desktop:web      # http://localhost:5319 - full GUI (chat + dashboards) in a browser
bun run dashboard:web    # http://localhost:4317 - dashboards only, live, read-only
cd desktop && bun install && bun run start   # the packaged Electron app

desktop:web runs the exact same renderer with a real omp chat backend (the dev server drives omp acp -e harness/omp/security_extension.ts), so the security gate stays loaded in-process on the chat path and you get genuine model replies in a plain browser - no Electron needed. See desktop/README.md and ADR-0006.

Platform Builds

CI builds desktop installers for both platforms on every tag push:

Platform	Artifact	Status	Download (latest release)
Windows	NSIS installer + portable `.exe` (x64)		Installer · Portable
macOS	`.zip` app bundle (arm64 + x64)		Apple Silicon · Intel

Both builds bundle Bun and uv runtimes so the installed app needs zero prerequisites. Code-signing and notarization are supported when certs are configured.

macOS: the download is a zipped LucidAgentIDE.app - unzip it and drag the app into Applications. (Builds ship as .zip rather than .dmg; in-app auto-update uses the same zip feed.)

Roadmap

Shipped - Increment 0–2 + Phases 2–10 + the personalization, attribution, migration, and IDE phases: the full security lifecycle, provenance lineage, replay, the cache-optimized prefix, the desktop GUI, the AskSage gov gateway, cross-model observability, CUI isolation, the encrypted personalization graph with cross-session recall, AI-authorship attribution, one-command ChatGPT/Claude/Gemini migration, and a read-write IDE with gated saves. Everything green: 413 harness tests, 258 desktop tests, 54 sidecar tests, tsc --noEmit clean across 3 projects (TypeScript 6.0 + Python).

Recent updates

Phase	Feature	ADR
P-IDE.5–6	Read-write Monaco IDE - Save routed through the scanner gate (≥high finding or dead scanner blocks the write), Save-As, conflict banner, Send-to-chat	ADR-0036/0037
P-IMP.1–2	One-command ChatGPT/Claude/Gemini import - shard-aware, fully gated, with a first-run onboarding nudge + token/runtime estimate	ADR-0034/0035
P-LOC.1–2	AI-authorship attribution - per-model/repo/identity LOC ledger + dashboard rollup	ADR-0031
P-IDE.1	Sovereignty-aware model governance - gov curation, accredited-gateway gating, foreign-origin acknowledgment wall	ADR-0029
P8.1	Cross-session memory recall - prior-session facts resurface as delimited, post-cache context	ADR-0009
P9.5	Hard CUI isolation - separate encrypted CUI store	ADR-0014
P10.2	Cross-model usage & cost ledger	ADR-0011

Next - designed in ADRs, building one increment per session:

Theme	ADR
Monaco language-service workers under strict CSP (semantic IntelliSense) · packaged-build verification	ADR-0036
Prompt/response traceability · dev-mode logging deepening	ADR-0009

See PROGRESS.md for the per-session log (shipped / stubbed / next).

Project docs

Doc	What's in it
`CLAUDE.md`	Read first. The load-bearing invariants (fail-closed, extend-don't-fork, frozen contracts, byte-stable prefix)
`DECISIONS.md`	Architecture decision records (ADR-0001 … ADR-0037)
`PROGRESS.md`	Per-session build log: shipped / stubbed / next
`desktop/README.md`	The desktop GUI + dev server
`CHEATSHEET.md`	Day-to-day commands

_{Built around oh-my-pi · extend, never fork · fail-closed by construction}

Name		Name	Last commit message	Last commit date
Latest commit History 207 Commits
.agents		.agents
.claude		.claude
.github		.github
.omp/commands		.omp/commands
desktop		desktop
docs		docs
extensions		extensions
harness		harness
observable		observable
repos/project-alpha		repos/project-alpha
scanner-sidecar		scanner-sidecar
tools		tools
.gitattributes		.gitattributes
.gitignore		.gitignore
AGENTS.md		AGENTS.md
BUILD PLAN omp.md		BUILD PLAN omp.md
CHEATSHEET.md		CHEATSHEET.md
CLAUDE.md		CLAUDE.md
DECISIONS.md		DECISIONS.md
HANDOFF.md		HANDOFF.md
LucidAgentIDE.bat		LucidAgentIDE.bat
Makefile		Makefile
PROGRESS.md		PROGRESS.md
README.md		README.md
bun.lock		bun.lock
custom_agentic_ide_prd_v3.md		custom_agentic_ide_prd_v3.md
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Table of contents

Overview

What makes it novel

💰 Token Cost Savings & Showback

Architecture

Security model

Memory and the personalization graph

Models and the AskSage gateway

Built on

Quick start

Desktop app

Platform Builds

Roadmap

Recent updates

Project docs

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Table of contents

Overview

What makes it novel

💰 Token Cost Savings & Showback

Architecture

Security model

Memory and the personalization graph

Models and the AskSage gateway

Built on

Quick start

Desktop app

Platform Builds

Roadmap

Recent updates

Project docs

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages