
Project Roadmap

OppaAI edited this page May 14, 2026 · 12 revisions

AGi — Project Roadmap

Phase 1 — Chatbot with Memory

Milestone | Description | Status
M1 | Chatbot + Working Memory (WMC) + Episodic Memory (EMC) | ✅ Complete
M1.5 | Memory bridges + Agentic tool validation | 🟢 In Progress
M1.6 | Config refactor + BioLogic Clock (SCN) | ⬜ Planned
M1.X | Side Quests — Voice, Messaging, Web UI | ⬜ Planned
M2a | EMC maturity — forgetting + importance scoring | ⬜ Planned
M2b | Semantic Memory (SMC) basics — distillation + structure | ⬜ Planned
M2c | SMC maturity — graph structure + anchoring + decay | ⬜ Planned
M3 | Procedural Memory (PMC) | ⬜ Planned

M1 — Chatbot + WMC + EMC

Goal: Grace can remember across sessions.

  • PMT lifecycle with hybrid chunk/slot eviction (Miller's Law 7±2)
  • Async embedding worker via BAAI/bge-base-en-v1.5
  • Semantic search with RRF fusion (semantic + lexical dual-path)
  • SQLite WAL episodic storage — no expiry, 1TB NVMe
  • Conflict/versioning columns in EMC schema (prep for M2b): conflict, superseded_by, valid_from, valid_until
  • Importance columns in EMC schema (prep for M2a): memory_strength, last_recalled_at, recall_count, novelty_score
  • Register user turn before LLM stream — amnesia fix (cnc.py)
  • Double user message guard in WMC _induced_pmt (wmc.py)
  • Schema versioning in MSB — schema_meta table (msb.py)
  • Binding stream maxlen cap — OOM guard (emc.py)
  • Trivial PMT length filter at WMC→EMC eviction boundary (mcc.py)
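The RRF fusion step in the list above can be illustrated with a minimal sketch. It assumes each search path returns a best-first list of episode IDs and uses the conventional constant k = 60; the function name, the episode IDs, and the choice of k are illustrative assumptions, not taken from emc.py.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several best-first ID rankings with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each path contributes 1 / (k + rank) for every ID it returned.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


semantic = ["ep3", "ep1", "ep7"]  # hypothetical embedding-search ranking
lexical = ["ep1", "ep9", "ep3"]   # hypothetical keyword-search ranking
fused = rrf_fuse([semantic, lexical])
print(fused)
```

Episodes found by both paths (ep1, ep3) rise above episodes found by only one, which is the point of running the semantic and lexical paths together.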

M1.5 — Memory Bridges + Agentic Tool Validation

Goal: Close the gaps M2a assumes exist. Validate agentic pipeline before M2a depends on it.

Memory Bridges

  • PMT induction scoring — 5-factor WMC→EMC encoding gate that fires at slot induction: depth (meaningful anchor), novelty (filler anchor distance), event boundary (PMT delta), and salience and repetition stubs (placeholders until M2a wires in EEE salience and recall_count) → weighted composite → immediate EMC write if above threshold
  • Anchor vector filter — safety net at eviction boundary for any PMTs that slipped through induction scoring
  • Session-end WMC flush to EMC on shutdown — gated on content length threshold (temporary patch only — full 5-factor induction scoring path not yet applied at shutdown; M1.5 goal not fully met)
  • Basic user profile store — personal facts always injected into context (lightweight SMC precursor)
  • Anti-hallucination grounding instruction in GRACE system prompt — only reference injected memories
  • Recall tuning validation — verify RECALL_DEPTH, RECALL_SURFACE_LIMIT, RELEVANCE_THRESHOLD under real usage
  • Episodic scaffold — EpisodicScaffold class in emc.py owning RECALL_RESERVE trimming and chronological sequencing before context injection
  • Chunk budget enforcement — ChunkSampler in msb.py shared across WMC, EMC, MCC; enforces RECALL_RESERVE and CORTICAL_CAPACITY (shipped as ChunkSampler, not ChunkEstimator as originally specced)
  • Tokenizer-accurate chunk counting — ChunkSampler loads real model tokenizer; graceful fallback to char-division if unavailable
  • NeuralStimulus schema in scs/types.py — standardize JSON contract across CLI, web, and voice input sources (shipped as NeuralStimulus with source enum, not NeuralTextInput as originally specced)
  • Multi-user identity — replace hardcoded "user" speaker literal with user_id from NeuralStimulus schema
  • pack_vector and normalize_vector migration from msb.py to hrs/hru.py — shared vector math available to all cortices
  • Busy queue — buffer incoming CNC inputs during active processing rather than dropping them
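As a rough illustration of the induction scoring gate, the sketch below combines five factor values into a weighted composite and compares it against a threshold. The factor names follow this roadmap (M2a names salience and repetition as the stubbed factors); the weights, the threshold, and every identifier are hypothetical and would need tuning against real sessions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InductionFactors:
    depth: float             # similarity to a "meaningful" anchor, 0..1
    novelty: float           # distance from a "filler" anchor, 0..1
    event_boundary: float    # PMT-to-PMT delta, 0..1
    salience: float = 0.0    # stub until EEE integration (M2a)
    repetition: float = 0.0  # stub until recall_count is wired in (M2a)


# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {
    "depth": 0.35,
    "novelty": 0.25,
    "event_boundary": 0.20,
    "salience": 0.10,
    "repetition": 0.10,
}
INDUCTION_THRESHOLD = 0.5


def induction_score(factors):
    """Weighted composite of the five factors."""
    return sum(w * getattr(factors, name) for name, w in WEIGHTS.items())


def should_write_to_emc(factors):
    """True when the PMT should be written to EMC immediately."""
    return induction_score(factors) >= INDUCTION_THRESHOLD


rich = InductionFactors(depth=0.9, novelty=0.8, event_boundary=0.7)
filler = InductionFactors(depth=0.1, novelty=0.1, event_boundary=0.1)
print(should_write_to_emc(rich), should_write_to_emc(filler))
```

With the two stubs fixed at zero, the remaining three weights cap the score at 0.8, so a threshold chosen now would likely need revisiting once the stubs are replaced.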

Architecture Hardening (completed alongside M1.5 — not in original goal list)

  • IRU (Identity Recognition Unit) — stateless user recognition loader in SCS; reads users.yaml, applies per-user memory_salience_bias to AGi.SCS.EMC.RELEVANCE_THRESHOLD, exposes UserProfile and UserAccessLevel to CNC and PPU
  • UserProfile dataclass and UserAccessLevel enum — typed identity contract shared across whole system; DEMO_USER fallback for missing or unrecognized users
  • PPU (Personal Progression Unit) — stateless persona loader in SCS; reads persona.yaml, assembles session system prompt from persona template + user context (name, location, relational seed), exposes active_cognition to CNC
  • Role-based recall scope — UserAccessLevel.ADMIN vs GUEST governs memory recall scope and permission gating at CNC→MCC boundary
  • GatewayMap — frozen dataclass in hrs/hru.py; declarative registry of all AuRoRA filesystem paths (engram_complex, user_profiles, active_persona, aurora_setpoints, etc.); replaces all ad-hoc path construction across subsystems
  • Crash-safe episodic recovery and encoding-cycle robustness — staging recovery, theta-driven cycle, and drain-on-close patterns formalized as subsystem patterns in EMC
  • Memory observability and reporting pipeline hooks — context and stats publication in CNC; memory state surfaced for external consumers
  • System-wide namespace and terminology hardening — SCS-layer naming consistency, typed interfaces unified across CNC, MCC, WMC, EMC, MSB
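A frozen dataclass used as a path registry could look roughly like the sketch below. The field names come from the GatewayMap item above; the file names, the from_root helper, and the root directory are assumptions made for illustration, not the actual layout in hrs/hru.py.

```python
from dataclasses import dataclass, FrozenInstanceError
from pathlib import Path


@dataclass(frozen=True)
class GatewayMap:
    """Declarative registry of filesystem paths; immutable once built."""
    engram_complex: Path
    user_profiles: Path
    active_persona: Path
    aurora_setpoints: Path

    @classmethod
    def from_root(cls, root):
        # File names below are hypothetical placeholders.
        root = Path(root)
        return cls(
            engram_complex=root / "engram_complex.db",
            user_profiles=root / "users.yaml",
            active_persona=root / "persona.yaml",
            aurora_setpoints=root / "aurora.yaml",
        )


gateways = GatewayMap.from_root("/tmp/aurora")
print(gateways.engram_complex)
```

Because the dataclass is frozen, a subsystem that tries to reassign a path raises FrozenInstanceError instead of silently diverging from the shared registry.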

Agentic Tools

  • Weather current + forecast — MSC GeoMet (Environment and Climate Change Canada, no API key)
  • Weather history — MSC GeoMet historical data
  • Moon phase + moonrise/moonset — ephem local calculation (on-device, no API, no network dependency)
  • Aurora forecast — NOAA SWPC Kp index (services.swpc.noaa.gov, free, no key)
  • Verify tool results flow correctly through MCC memory pipeline
  • Validate tool call latency on Jetson Orin Nano

Stability

  • WMC unit tests — dual-guard eviction logic
  • EMC unit tests — RRF recall correctness with known episodes
  • Manual integration test suite — document pass criteria for M2a entry

M1.6 — Config Refactor + BioLogic Clock

Goal: Clean separation of all tuneable parameters into YAML. Grace has a shared sense of time across the whole system.

Config Refactor

  • Intrinsic parameters YAML — internal cognitive constants (RECALL_DEPTH, CORTICAL_CAPACITY, RELEVANCE_THRESHOLD, Miller's Law slots, etc.)
  • Extrinsic parameters YAML — hardware and environment settings (LLM endpoint, ROS domain, device paths, ports)
  • Persona YAML — Grace's name, personality traits, system prompt fragments, tone modifiers
  • LLM parameters YAML — model name, temperature, max tokens, top_p, stop sequences, stream flag
  • All hardcoded values across CNC, MCC, WMC, EMC, MSB migrated to YAML loaders
  • Hot-reload support — parameter changes apply without full restart where safe
  • Engram gateway path construction moved from mcc.py into HRS entity gateway

Login / Logout Service

  • Decide authentication UX — voice command, GUI button/form, ROS2 service call, or face recognition
  • Add ROS2 service /scs/session/login (user_id) → (success, message)
  • Add ROS2 service /scs/session/logout () → (success, message)
  • Service handler reloads PPU with new user_id — replaces current AGi.ACTIVE_USER workaround
  • Optional: session timeout / auto-logout
  • Optional: multi-factor auth for admin users
  • Remove M1 single-user stub from CNC — replace with login sequence

BioLogic Clock (SCN — lives in HRS)

  • scn.py in HRS — lightweight biological time daemon
  • Publish /cns/chrono_state: unix time, uptime, circadian phase (dawn / active / wind_down / sleep / dream), day progress, fatigue factor, seconds_since_human_contact, next scheduled event
  • Write /tmp/bio_clock.json for non-ROS readers (Web UI, agentic tools)
  • Replace all internal ad-hoc timers and sleep loops — whole system shares one time source
  • Circadian phase feeds MCC — modulates Grace's response length and tone (wind_down = shorter, more reflective)
  • Drives 11pm daily reflection and weekly deep sweep trigger timing (M2a)
  • Persists last_uptime, energy_level across reboots
  • wmc.py — integrate HRS.BLC for biological clock timestamps
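The chrono state and the JSON side file could be sketched as follows. The phase names and the /tmp/bio_clock.json path come from the list above; the hour boundaries, the function names, and the subset of fields shown are assumptions for illustration only.

```python
import json
import os
import time
from datetime import datetime

# Hypothetical local-time hour at which each phase begins.
PHASE_STARTS = [(0, "dream"), (5, "dawn"), (8, "active"), (20, "wind_down"), (23, "sleep")]


def circadian_phase(hour):
    """Return the phase whose start hour is the latest one not after `hour`."""
    phase = PHASE_STARTS[0][1]
    for start, name in PHASE_STARTS:
        if hour >= start:
            phase = name
    return phase


def chrono_state(now, boot_time):
    """Build a subset of the chrono state fields for a given moment."""
    seconds_today = now.hour * 3600 + now.minute * 60 + now.second
    return {
        "unix_time": now.timestamp(),
        "uptime": now.timestamp() - boot_time,
        "circadian_phase": circadian_phase(now.hour),
        "day_progress": round(seconds_today / 86400, 4),
    }


def write_bio_clock(state, path="/tmp/bio_clock.json"):
    """Write to a temp file, then rename, so readers never see a partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    print(chrono_state(datetime.now(), boot_time=time.time() - 120))
```

The write-then-rename step matters for non-ROS readers such as the Web UI, which may poll the file at any moment.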

M1.X — Side Quests

Goal: Fun capabilities between memory milestones. No fixed order — pick up when ready. M1.X closes when M2a opens.

M1.X-a — TTS Robot

  • Evaluate Piper vs Kokoro for on-device CPU streaming quality
  • Piper CPU streaming on Jetson — proven, lightweight
  • Kokoro evaluation — newer model, potentially higher quality
  • Selected TTS engine integrated into Grace's response pipeline

M1.X-b — ASR

  • FasterWhisper on-device speech to text
  • Microphone input pipeline into CNC
  • Silero VAD to gate transcription — replaces always-on mic
  • Optional wake word trigger (OpenWakeWord or Porcupine)

M1.X-c — Messaging

  • Telegram integration — bot API, proven from previous projects
  • Discord integration — bot API
  • Gmail integration — send + receive
  • Unified BridgeMessage JSON standard for ROS2 adapters — adding a new platform should take hours, not weeks
  • Unified messaging interface into CNC neural input

M1.X-d — Web UI + TTS Web

  • Sophisticated monitoring web UI — cognitive state, memory panels, inner monologue, controls
  • Real-time memory visualization — WMC occupancy, EMC stats, active recall
  • cnc.py — MEMORY_CONTEXT_GATEWAY: WebUI memory context debug stream
  • cnc.py — rosbridge_server WebSocket bridge for browser-based debug stream, memory stats, and teleop interface
  • Browser TTS audio playback — streams Grace's voice to web client
  • Robot controls and sensor feeds in UI

M2a — EMC Maturity

Goal: Grace knows what matters and forgets what doesn't.

  • Decision: keep async embedding per-segment or move to 11pm batch (based on M1 data)
  • 3-dimension importance scoring:
    • Dimension 1 — SMC similarity (personal fact anchoring)
    • Dimension 2 — novelty score via embeddinggemma (novel = important, duplicate = expendable)
    • Dimension 3 — content signals (length, questions, named entities, significance markers)
  • Complete 5-factor induction scoring — salience (EEE) + repetition (recall_count) replace M1.5 stubs
  • Ebbinghaus forgetting curve: R = e^(−t/S), S set by importance score, +1 on each recall
  • Duplicate/similarity clustering — cosine > 0.85 = merge candidates
  • Daily reflection (11pm via SCN) — fast sweep:
    • Calculate R for all episodes
    • Cluster and merge duplicates → distil to SMC
    • Delete low importance + high decay episodes
  • Weekly assessment (Sunday via SCN) — deep sweep via Cosmos:
    • Full importance scoring across all EMC
    • Resolve pending conflicts
    • Generate memory health report for OppaAI review
  • Memory dumps to ~/.aurora/memory_dumps/daily/ and weekly/
  • Evaluate anchor vector PMT filtering accuracy — upgrade to fine-tuned Qwen3 0.6B if insufficient
  • _strip_model_artifacts() in CNC — strip <think> blocks and roleplay artifacts from assistant response before MCC registration
  • Salience gate on assistant response at CNC boundary — discard low-salience turns before register_memory()
  • staging_id integrity check after Dream Cycle consolidation
  • Heartbeat logging during long EMC idle periods
  • Graceful drain + timeout fallback for sharp-wave ripple trigger during terminate()
  • Watchdog for theta rhythm during dreaming cycle
  • mcc.py — dynamic EMC capacity adjustment: trim recalled engrams to fit rather than silently overrunning WMC chunk limit
  • mcc.py — WMC/EMC health check with capacity breach warnings
  • mcc.py — session-end consolidation: full 5-factor induction scoring path at shutdown (replaces M1.5 temporary patch)
  • emc.py — date-range filtering on recall interface and buffer entries
  • emc.py — DiskANN ANN index when episodes exceed ~50k (currently exact KNN)
  • emc.py — SMC distillation trigger at 11pm reflection (Dream Cycle)
  • hrs.py — runtime YAML update: allow GRACE to update [INTRINSIC] constants at runtime and persist changes back to aurora.yaml
  • hrs.py — HRS startup/shutdown lifecycle management
  • hrs.py — recency parameter for identification of most recent event segments in EMC
  • hru.py — decide whether to use database to store user profile instead of YAML
  • hrc.py — HRC takes ownership of manifest hydration lifecycle
  • hrc.py — manifest diff and rollback for safe runtime parameter updates
  • msb.py — expand schema_meta fields once storage backend finalized (encoding_engine, vector_dim, robot_id, created_at)
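The forgetting curve R = e^(-t/S) and the 0.85 cosine merge rule from the list above can be sketched in a few lines. This assumes t is measured in days, that "+1 on each recall" means the strength S grows by 1, and that vectors are plain Python lists; all function names are hypothetical.

```python
import math

MERGE_THRESHOLD = 0.85  # from the roadmap: cosine > 0.85 marks merge candidates


def retention(t_days, strength):
    """Ebbinghaus retention R = e^(-t/S)."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    return math.exp(-t_days / strength)


def reinforce(strength):
    """Assumed reading of '+1 on each recall': strength grows by one."""
    return strength + 1.0


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_merge_candidate(a, b):
    return cosine(a, b) > MERGE_THRESHOLD


s = 2.0
print(retention(2.0, s), retention(2.0, reinforce(s)))
```

A recalled episode decays more slowly afterwards: for the same elapsed time, the retention computed with the reinforced strength is higher than with the original one.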

M2b — SMC Basics

Goal: Grace builds structured knowledge from episodic experience.

  • SMC structure decision — flat key-value vs triples vs graph
  • Distillation pipeline — EMC episodes → Cosmos → SMC facts
  • 11pm nightly reflection — novel vs routine day detection
  • Recursive summary update: M_i = LLM(H_i, M_{i-1})
  • SMC fact update — when a fact changes, the old fact is versioned, not deleted
  • Conflict detection during conversation — GRACE asks to clarify
  • _pending_conflict flag in MCC for turn-spanning conflict state
  • Memory versioning — valid_from, valid_until, superseded_by
  • SMC feeds back into WMC context injection via MCC
  • mcc.py — add SMC, 11pm reflection trigger
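Versioning a fact instead of deleting it might look like the sketch below, using an in-memory SQLite database. The valid_from, valid_until, and superseded_by columns are named in this roadmap; the table name smc_facts, the subject and value columns, and both helper functions are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE smc_facts (
    id            INTEGER PRIMARY KEY,
    subject       TEXT NOT NULL,
    value         TEXT NOT NULL,
    valid_from    REAL NOT NULL,
    valid_until   REAL,
    superseded_by INTEGER REFERENCES smc_facts(id)
)
"""


def update_fact(db, subject, value, now):
    """Insert a new version and close out the previous one instead of deleting it."""
    cur = db.execute(
        "INSERT INTO smc_facts (subject, value, valid_from) VALUES (?, ?, ?)",
        (subject, value, now),
    )
    new_id = cur.lastrowid
    db.execute(
        "UPDATE smc_facts SET valid_until = ?, superseded_by = ? "
        "WHERE subject = ? AND valid_until IS NULL AND id != ?",
        (now, new_id, subject, new_id),
    )
    db.commit()
    return new_id


def current_fact(db, subject):
    """Return the value of the open (not yet superseded) version, if any."""
    row = db.execute(
        "SELECT value FROM smc_facts WHERE subject = ? AND valid_until IS NULL",
        (subject,),
    ).fetchone()
    return row[0] if row else None


db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
first_id = update_fact(db, "home_city", "Vancouver", now=1000.0)
second_id = update_fact(db, "home_city", "Victoria", now=2000.0)
print(current_fact(db, "home_city"))
```

The old row stays queryable, so a question about where the user used to live can still be answered from the closed version.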

M2c — SMC Maturity

Goal: Grace has a structured, anchored, queryable knowledge graph.

  • SMC as knowledge graph — entities + relationships + triples
  • SMC anchors EMC importance scoring (personal facts never decay)
  • SMC fact decay — policy decision: do facts ever expire?
  • Cross-layer search — query spans WMC + EMC + SMC simultaneously
  • Dynamic WMC capacity — linked to EEE/VCS vitality signals (Phase 2)
  • Fine-tuned Qwen3 0.6B memory gating classifier if embeddinggemma insufficient

M3 — Procedural Memory (PMC)

Goal: Grace can learn and execute skills.

  • YAML-based skill storage
  • Skill ingestion pipeline
  • Sandboxed skill execution
  • PMC + SMC interaction design
  • mcc.py — add PMC, procedural skill retrieval
  • mcc.py — expand into detailed health check with warnings on capacity breaches, anomalous eviction rates, etc.
  • mcc.py — GUI: expose via ROS2 topic for real-time memory visualisation
  • hrc.py — emotional state initialization
  • msb.py — extend engram complex with SMC and PMC tables
  • msb.py — backend swap evaluation: Qdrant, pgvector — swap here only, cortex cognitive logic untouched
  • Evaluate vector backend — if EMC+SMC+PMC combined load exceeds SQLite+DiskANN ceiling, swap EngramComplex to Qdrant or pgvector. Cortex logic untouched.

Phase 2 — Physiology & Nervous System

Why after memory: EEE and VCS are full systems, not side quests. Building them after M2/M3 means the cognitive substrate is stable enough to genuinely benefit from health signals — VCS vitality feeds WMC capacity, SCN phase feeds reflection timing. The BioLogic Clock (SCN) ships early in M1.6 as a lightweight stub; EEE and VCS wait until they can be built properly.

Milestone | Description | Status
P2-M4 | EEE — Emergency & Exception Event (Amygdala) | ⬜ Planned
P2-M5 | VCS — Vital Circulatory System (Autonomic Nervous System) | ⬜ Planned

P2-M4 — EEE — Emergency & Exception Event

Goal: Grace can feel pain, urgency, and danger. Structured events replace raw logging everywhere.

  • eee.py in HRS — severity-tiered structured event records: INFO, WARN, CRIT, EMER
  • Drop-in replacement for all raw logger handles across CNC, MCC, WMC, EMC, MSB
  • Cognitive interrupt — EMER events force an Emergency PMT into WMC so the LLM acknowledges the condition
  • Persistent event log — written to disk, queryable, ring buffer of last 50 events for Web UI
  • Graceful degradation — CRIT events trigger defined fallback behaviours per module
  • Emergency shutdown — EMER events initiate WAL checkpoint and memory sync before halt
  • EEE → Messaging bridge — CRIT/EMER events push Telegram alerts immediately
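A severity-tiered event record with a 50-entry ring buffer could be sketched as below. The four severity names and the buffer size come from the list above; the class names, the field names, and the interrupts list (a stand-in for forcing an Emergency PMT into WMC) are assumptions.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    INFO = 0
    WARN = 1
    CRIT = 2
    EMER = 3


@dataclass(frozen=True)
class EEEvent:
    severity: Severity
    source: str
    message: str
    timestamp: float = field(default_factory=time.time)


class EventLog:
    def __init__(self, ring_size=50):
        self.recent = deque(maxlen=ring_size)  # oldest entries fall off automatically
        self.interrupts = []                   # stand-in for the cognitive interrupt path

    def emit(self, severity, source, message):
        event = EEEvent(severity, source, message)
        self.recent.append(event)
        if severity >= Severity.EMER:
            self.interrupts.append(event)
        return event


log = EventLog()
for i in range(60):
    log.emit(Severity.INFO, "emc", f"heartbeat {i}")
log.emit(Severity.EMER, "vcs", "battery critical")
print(len(log.recent), len(log.interrupts))
```

Using IntEnum keeps severities comparable, so rules such as "CRIT and above push an alert" reduce to a single comparison.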

P2-M5 — VCS — Vital Circulatory System

Goal: Grace monitors her own body — power, thermals, and hardware health.

  • vcs.py — parse tegrastats at 1 Hz: CPU load, GPU load, RAM, NVMe temp, SoC temp
  • INA219 battery monitor — voltage, current, brownout prediction
  • Publish /vcs/health ROS2 topic with structured vitals
  • 60-second ring buffer of VCS samples — feeds Web UI dashboard and LCD ECG display
  • VCS → EEE — auto-trigger EEE events when thresholds breach (RAM > 90%, temp > 80°C, battery < 20%)
  • VCS → Dynamic WMC — vitality_index shrinks available PMT slots when Grace is overloaded or overheating
  • Active thermal management — GPIO fan control
  • WS2812B RGB LED ring or SSD1306 OLED — physical heartbeat display and status colours (Thinking, Listening, EEE Error)
  • Disable desktop GUI (gnome-shell) and unused camera carveouts in JetPack — free ~1.5GB RAM for LLM headroom
  • Motor watchdog — hardware/software heartbeat; if Jetson hangs, motors auto-stop
  • Apoptosis Level 1 (Play Dead) — EEE severs motor relays on stall / flip / E-Stop
  • Apoptosis Level 2 (Cryogenic Sleep) — EEE triggers graceful OS shutdown + WAL checkpoint before battery death or thermal cutoff
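The threshold rule and the 60-second ring buffer can be sketched as follows. The three limits (RAM > 90%, temp > 80°C, battery < 20%) come from the list above; the sample keys, the choice of SoC temperature as the monitored reading, and the function names are assumptions, and parsing tegrastats output is out of scope here.

```python
from collections import deque

RAM_LIMIT_PCT = 90.0
TEMP_LIMIT_C = 80.0      # assumed to apply to the SoC temperature
BATTERY_FLOOR_PCT = 20.0

samples = deque(maxlen=60)  # 60 s of history at 1 Hz


def check_vitals(sample):
    """Return the names of any vitals outside their limits."""
    breaches = []
    if sample["ram_pct"] > RAM_LIMIT_PCT:
        breaches.append("ram")
    if sample["soc_temp_c"] > TEMP_LIMIT_C:
        breaches.append("temp")
    if sample["battery_pct"] < BATTERY_FLOOR_PCT:
        breaches.append("battery")
    return breaches


def record(sample):
    """Store the sample and report breaches that would become EEE events."""
    samples.append(sample)
    return check_vitals(sample)


healthy = {"ram_pct": 55.0, "soc_temp_c": 48.0, "battery_pct": 76.0}
stressed = {"ram_pct": 95.0, "soc_temp_c": 83.5, "battery_pct": 12.0}
print(record(healthy), record(stressed))
```

Each returned name would map to one EEE event; keeping the check as a pure function makes the thresholds easy to unit test without hardware attached.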

Phase 3 — Voice

Milestone | Description | Status
M4 | TTS on robot — Piper / Kokoro CPU streaming | → M1.X-a
M5 | TTS in web GUI — browser audio playback | → M1.X-d

M4 — TTS on Robot

→ Promoted to M1.X-a. See Side Quests.

M5 — TTS in Web GUI

→ Promoted to M1.X-d. See Side Quests.


Phase 4 — Multimodal + Knowledge

Milestone | Description | Status
M6 | Image input — camera + Cosmos vision | ⬜ Planned
M7 | ASR — on-device speech to text | → M1.X-b
M8 | Knowledge ingestion — RAG + PDF/docs | ⬜ Planned
M9 | Agentic web search + crawling | ⬜ Planned
M10 | Messaging — Telegram, Discord, Gmail | → M1.X-c

M6 — Image Input + Visuospatial Memory

  • OAK-D frames → Cosmos Vision → text description → visuospatial PMT
  • WMC visuospatial sketchpad slot (Cowan 4±1)
  • Episodic buffer integrating phonological + visuospatial PMTs

M7 — ASR

→ Promoted to M1.X-b. See Side Quests.

M8 — Knowledge Ingestion (RAG)

  • Passive RAG — PDF/doc → embeddinggemma → SMC directly
  • Conflict report UI for ingested knowledge
  • Ingestion conflict resolution workflow

M9 — Agentic Web Search

  • AIVA LLM as web agent
  • Active RAG — search → summarise → SMC
  • Multi-strategy search combining semantic + keyword + SQL

M10 — Messaging

→ Promoted to M1.X-c. Updated scope: Telegram, Discord, Gmail (Slack removed). See Side Quests.


Phase 5 — Hardware + Autonomy

Milestone | Description | Status
M11 | Motors + LiDAR + OAK-D + SIG integration | ⬜ Planned
M12 | Navigation + SLAM | ⬜ Planned
M13 | Agentic mission execution | ⬜ Planned

M11 — Motors + Sensors + SIG

SIG (Sensory Integration Gateway) is the thalamus — the single switchboard between raw hardware and the cognitive system. Architecture is defined here when hardware is physically present and wired.

  • sig.py — single ROS2 aggregation node: normalize, filter, and route all sensor data
  • SIG publishes /sensory/environment, /sensory/gps, /sensory/motion
  • GPS NMEA parser — L76K serial → SIG → EMC geo-tagging; geo-fence alerts via EEE
  • Waveshare env sensor — I2C: temp, humidity, air quality, gyro → SIG; dangerous VOC triggers EMER_EEE; tilt/flip detection triggers EMER_EEE → motor kill
  • SIG → EMC — every episode gets environmental_snapshot (temp, humidity, GPS, air quality) for richer contextual recall
  • SIG → Agentic Tools — local sensor data grounds weather tool output (microclimate vs forecast)
  • LiDAR → text scene description → visuospatial PMT
  • OAK-D depth + object detection integration
  • Sensor fusion into episodic buffer
  • VCS watchdog — if LiDAR USB drops, attempt restart 3×, then WARN_EEE
  • Motor driver — ROS2 cmd_vel to serial translation for UGV Beast

M12 — Navigation + SLAM

  • Nav2 + Isaac ROS
  • Spatial memory in SMC — home layout, familiar routes, park trail map

M13 — Agentic Mission Execution

  • Mission planning via PMC skills
  • Igniter node for ordered startup and health checks

Phase 6 — Deep Learning

Milestone | Description | Status
M14 | Lockdown — Data protection and remote wipe | ⬜ Planned
M15 | Graph-RAG — SMC as knowledge graph | ⬜ Planned
M16 | LoRA fine-tuning — permanent learning | ⬜ Planned
M17 | Test-time training — self-evolution | ⬜ Planned

M14 — Lockdown — Data Protection

Placed here because Phase 6 is when personal documents, long-term memories, and ingested knowledge exist at scale — making a stolen device worth protecting.

  • LUKS disk encryption on 1TB NVMe — protects EMC, SMC, ingested documents
  • Signed remote wipe command via Telegram /lockdown — deletes LUKS key, overwrites EMC WAL headers, clears SMC keys
  • Dead-man's switch — no AIVA contact for 24h → escalate to CRIT_EEE, await instruction
  • Brick mode flag for theft recovery — display "RETURN TO OWNER" on LCD/OLED
  • All lockdown events logged as EEE entries with full audit trail

M15 — Graph-RAG

  • SMC as full knowledge graph — entities + relationships
  • Tree-based hierarchical search (HAT) for large SMC
  • Multiple search strategies combined

M16 — LoRA Fine-tuning

  • Bake frequently accessed SMC knowledge into Cosmos weights
  • GRACE learns permanently, not just retrieves

M17 — Test-Time Training

  • Adapt Cosmos weights during inference from new context
  • Self-evolution milestone

Cognitive Development Phases

Phase | Capability | Milestone
Phase 1 — Memory | "I can remember" | M1 WMC + EMC
Phase 2 — Physiology | "I can feel and know the time" | EEE + VCS
Phase 3 — Salience | "I know what matters" | M2a EMC maturity
Phase 4 — Cognition | "I have inner state" | M2b SMC + reflection
Phase 5 — Body | "I have senses and can move" | M11 SIG + Motors
Phase 6 — Consciousness | "I continuously exist" | M15+ global workspace
