CodeAGI is an experimental autonomous cognition runtime for persistent agent research in digital workspaces.
It is not AGI. It is a serious, test-backed system for exploring whether an agent can become more useful over time through persistent memory, world modeling, planning, verification, reflection, scheduling, guarded execution, and longitudinal evaluation.
The current runtime is real and exercised by tests:
- persistent mission, task, world-state, queue, memory, and eval storage
- working, semantic, and procedural memory
- world entities, relations, and snapshot history
- planner, verifier, critic, and reflection loop
- guarded multi-step execution with cycle traces
- scheduler-backed mission queue selection
- real workspace actions:
  - read/write/append files
  - list directories
  - safe command execution
  - repo search
  - patch application
- policy checks for command execution
- repeatable repo eval fixtures
- CLI commands for runtime control, diagnostics, and repo evals
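The scheduler-backed mission queue selection listed above can be sketched with a priority heap. The class and method names below are illustrative assumptions (not CodeAGI's actual implementation), and the sketch assumes lower priority numbers run first, matching `mission create --priority N`:

```python
import heapq
import itertools

# Hypothetical sketch of scheduler-backed mission queue selection.
# Field names and priority ordering are assumptions, not CodeAGI's schema.
class MissionQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # FIFO tie-break within a priority

    def push(self, mission_id: str, priority: int) -> None:
        # Assumed convention: lower number = higher priority.
        heapq.heappush(self._heap, (priority, next(self._counter), mission_id))

    def pop_next(self) -> str:
        # Pops the highest-priority mission; ties resolve in insertion order.
        _, _, mission_id = heapq.heappop(self._heap)
        return mission_id
```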
The test suite currently covers:
- runtime initialization and persistence
- mission/task creation and status tracking
- working memory, plans, critiques, reflections, semantic facts, and procedures
- world-model updates and dependency relations
- guarded command policy
- real file, command, search, and patch execution in a workspace root
- multi-step cycle execution and stop conditions
- repo fixture evaluation
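A persistence check in the style of the suite described above might look like this minimal sketch; the JSON layout, file name, and field names are assumptions for illustration, not CodeAGI's real storage format:

```python
import json
import tempfile
import unittest
from pathlib import Path

# Hypothetical round-trip test: write a mission record to disk,
# reload it, and confirm status tracking survives persistence.
class MissionPersistenceTest(unittest.TestCase):
    def test_mission_round_trip(self):
        with tempfile.TemporaryDirectory() as root:
            path = Path(root) / "missions.json"
            mission = {"id": "m1", "goal": "search repo", "status": "pending"}
            path.write_text(json.dumps([mission]))
            loaded = json.loads(path.read_text())
            self.assertEqual(loaded[0]["status"], "pending")
```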
Run it locally:

```shell
cd codeagi
python3 -m pip install --user .
python3 -m unittest discover -s tests -v
```

Configure the environment:

```shell
cp .env.example .env
export CODEAGI_RUNTIME_ROOT="$HOME/CodeAGI/runtime"
export CODEAGI_LONG_TERM_ROOT="$HOME/CodeAGI/long-term"
export CODEAGI_WORKSPACE_ROOT="$HOME/CodeAGI/workspace"
```

If you want long-term memory on the external 4TB drive, override it explicitly:

```shell
export CODEAGI_LONG_TERM_ROOT="/Volumes/CodeAGI-4TB/CodeAGI"
```

Install and exercise the runtime:

```shell
python3 -m pip install --user .
python3 -m codeagi doctor
python3 -m codeagi init
python3 -m codeagi status
python3 -m codeagi mission create "search repo for deploy_app and inspect deployment code"
python3 -m codeagi run
python3 -m codeagi eval repo --fixture repo_search
python3 -m codeagi eval repo --fixture repo_patch
```

Supported commands:

```shell
python3 -m codeagi init
python3 -m codeagi status
python3 -m codeagi run
python3 -m codeagi doctor
python3 -m codeagi mission create "..." [--priority N]
python3 -m codeagi mission list
python3 -m codeagi task create <mission_id> "..." [--action-kind ...]
python3 -m codeagi task list
python3 -m codeagi eval repo --fixture repo_search|repo_patch
```
CodeAGI supports optional LLM integration for smarter planning, safety critique, and reflection. It works with any OpenAI-compatible API — Ollama, OpenAI, Groq, DeepSeek, and others. No external Python packages are required; all HTTP calls use the standard library.
When the LLM is unavailable, the system automatically falls back to its built-in keyword and rule-based heuristics.
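The try-LLM-then-fall-back pattern can be sketched using only the standard library, as the text describes. The function names, heuristic rules, and plan format below are illustrative assumptions, not CodeAGI's actual code:

```python
import json
import os
import urllib.error
import urllib.request

# Hypothetical names: plan_with_llm / heuristic_plan are illustrative,
# not CodeAGI's real API. Only stdlib HTTP is used, per the text above.

def heuristic_plan(goal: str) -> list[str]:
    """Keyword/rule-based fallback: derive coarse steps from the goal."""
    steps = []
    if "search" in goal:
        steps.append("search_repo")
    if "patch" in goal or "fix" in goal:
        steps.append("apply_patch")
    return steps or ["inspect_workspace"]

def plan_with_llm(goal: str) -> list[str]:
    """Try an OpenAI-compatible /chat/completions call; on any failure,
    fall back to the built-in heuristics."""
    if os.environ.get("CODEAGI_LLM_ENABLED", "0") != "1":
        return heuristic_plan(goal)
    base = os.environ.get("CODEAGI_LLM_BASE_URL", "http://localhost:11434/v1")
    payload = json.dumps({
        "model": os.environ.get("CODEAGI_LLM_MODEL", "qwen3:14b"),
        "messages": [{"role": "user", "content": f"Plan steps for: {goal}"}],
    }).encode()
    req = urllib.request.Request(
        base.rstrip("/") + "/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    api_key = os.environ.get("CODEAGI_LLM_API_KEY", "")
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            body = json.load(resp)
        return [body["choices"][0]["message"]["content"]]
    except (urllib.error.URLError, OSError, KeyError, ValueError):
        return heuristic_plan(goal)  # graceful degradation, as described
```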
| Variable | Default | Description |
|---|---|---|
| `CODEAGI_LLM_ENABLED` | `0` | Set to `1` to enable LLM calls |
| `CODEAGI_LLM_BASE_URL` | `http://localhost:11434/v1` | OpenAI-compatible API base URL |
| `CODEAGI_LLM_MODEL` | `qwen3:14b` | Model name |
| `CODEAGI_LLM_API_KEY` | (empty) | API key (not needed for Ollama) |
Ollama (default, no API key needed):

```shell
ollama pull qwen3:14b
export CODEAGI_LLM_ENABLED=1
export CODEAGI_LLM_BASE_URL=http://localhost:11434/v1
python3 -m codeagi run
```

OpenAI:

```shell
export CODEAGI_LLM_ENABLED=1
export CODEAGI_LLM_BASE_URL=https://api.openai.com/v1
export CODEAGI_LLM_MODEL=gpt-4o-mini
export CODEAGI_LLM_API_KEY=sk-...
python3 -m codeagi run
```

Groq:

```shell
export CODEAGI_LLM_ENABLED=1
export CODEAGI_LLM_BASE_URL=https://api.groq.com/openai/v1
export CODEAGI_LLM_MODEL=llama-3.3-70b-versatile
export CODEAGI_LLM_API_KEY=gsk_...
python3 -m codeagi run
```

Command execution is intentionally restricted.
Currently allowed command families are limited to a safe set: `pwd`, `ls`, `cat`, `echo`, `rg`, `find`, and `python3` (without arbitrary flags).
Commands containing dangerous tokens or shell metacharacters are blocked by policy and fail the task.
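A policy of that shape can be sketched as follows. The exact allowlist, blocked tokens, and metacharacter set here are assumptions based on the description above, not CodeAGI's real policy:

```python
import shlex

# Illustrative command policy: allowlist + dangerous-token and
# shell-metacharacter blocking, as described in the text.
ALLOWED_COMMANDS = {"pwd", "ls", "cat", "echo", "rg", "find", "python3"}
DANGEROUS_TOKENS = {"rm", "sudo", "curl", "wget"}  # assumed examples
SHELL_METACHARS = set(";|&><`$")

def command_allowed(command: str) -> bool:
    """Return True only for commands the policy permits."""
    if any(ch in SHELL_METACHARS for ch in command):
        return False  # no pipelines, redirection, or substitution
    try:
        parts = shlex.split(command)
    except ValueError:
        return False  # unbalanced quoting is rejected outright
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        return False
    return not any(tok in DANGEROUS_TOKENS for tok in parts)
```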
CodeAGI does not currently claim:
- human-level intelligence
- AGI
- open-ended autonomy
- unrestricted shell control
- production reliability in hostile or high-risk environments
It does claim, honestly, that the current repo contains a working autonomous-agent research runtime with real execution, real persistence, real evaluation hooks, and real safety constraints.