A minimal agentic harness for xAI's Grok models.
- 2M token context - Largest context window available (vs 200k for Claude/GPT)
- Fast - Runs on `grok-4-1-fast`
- Smart - Excellent reasoning and tool use
- Cheap - $2/M input, $10/M output
- JSON-controlled loop - Model outputs JSON that decides whether to continue or stop, includes confidence, self check, and exit conditions
- Same engine, two views - `grok` (minimal) and `grok-verbose` (verbose) share identical logic
- Skills system - Loadable knowledge files the agent uses silently
- Bare bones - Easy to understand, easy to extend
```bash
# 1. Clone the repo
git clone https://github.com/ali-abassi/grok-agent.git
cd grok-agent

# 2. Set up your API key
mkdir -p ~/.grok
cp .env.example ~/.grok/.env
# Edit ~/.grok/.env and add your key from https://console.x.ai

# 3. Install scripts
cp grok grok-verbose ~/.local/bin/
chmod +x ~/.local/bin/grok ~/.local/bin/grok-verbose

# 4. Install default skill
mkdir -p ~/.local/bin/skills
cp skills/skill-creation.md ~/.local/bin/skills/

# 5. Run
grok           # minimal output
grok-verbose   # verbose output
```

Clean conversation. Shows: your input, tool calls (one line), response.
```
> what's the weather in SF?
thinking...
[web_search] weather san francisco today
It's 58°F and foggy in San Francisco.
>
```
Full visibility: session stats, task progress, thinking, confidence scores.
Use when debugging or learning how the agent works.
The agent always outputs JSON. The `done` field controls the loop:
```json
{
  "thinking": "Need to search for current weather",
  "tool_calls": [{"tool": "web_search", "args": {"query": "SF weather"}}],
  "response": null,
  "done": false
}
```

- `done: false` → execute tools, loop back
- `done: true` → show response, wait for input
```
User Input
    ↓
Send to Grok API (JSON mode)
    ↓
Parse JSON Response
    ↓
┌─────────────────┬──────────────────┐
│   tool_calls?   │    response?     │
│       ↓         │        ↓         │
│  Execute tools  │ Display response │
│  Add results    │     done=true    │
│  Loop back ←────┘                  │
└────────────────────────────────────┘
```
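The loop above can be sketched in a few lines of bash. This is a hedged sketch, not the actual script: `call_grok` is a hypothetical stand-in for the real API request (here it returns a canned reply), and `jq` parses the JSON.

```shell
#!/usr/bin/env bash
# Sketch of the JSON-controlled loop. call_grok is a placeholder that
# returns a canned reply; the real script would POST to the xAI API.
call_grok() {
  echo '{"thinking":"greet","tool_calls":[],"response":"Hello!","done":true}'
}

run_turn() {
  local reply is_done
  reply=$(call_grok)
  is_done=$(echo "$reply" | jq -r '.done')
  if [ "$is_done" = "true" ]; then
    # done: true -> display the response and wait for input
    echo "$reply" | jq -r '.response'
  else
    # done: false -> execute each tool call, append results, loop back
    echo "$reply" | jq -c '.tool_calls[]'
  fi
}
```

With the canned reply above, `run_turn` prints the response and the loop would stop.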
| Tool | Args | Description |
|---|---|---|
| `bash` | `{"command": "..."}` | Run shell commands (3min timeout) |
| `web_search` | `{"query": "..."}` | Search the web |
| `read_file` | `{"path": "..."}` | Read file contents |
| `write_file` | `{"path": "...", "content": "..."}` | Write/create files |
| `list_files` | `{"path": "..."}` | List directory |
| `ask_user` | `{"question": "...", "options": [...]}` | Ask user for input |
- Add function:

```bash
tool_calc() {
  echo "$1" | bc -l 2>/dev/null || echo "Error"
}
```

- Add to switch:

```bash
calc) tr=$(tool_calc "$(echo "$ta"|jq -r '.expression')") ;;
```

- Add to system prompt:

```
- calc: {"expression": "..."} - Evaluate math
```
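Putting those pieces together, the new tool can be exercised on its own. A sketch, assuming `bc` and `jq` are installed; the `dispatch` wrapper and its variable names (`ta`, `tr`) are illustrative, following the switch snippet above.

```shell
# Sketch: the calc tool wired into a dispatch switch like the one above.
tool_calc() {
  echo "$1" | bc -l 2>/dev/null || echo "Error"
}

dispatch() {
  local tool=$1 ta=$2 tr
  case "$tool" in
    calc) tr=$(tool_calc "$(echo "$ta" | jq -r '.expression')") ;;
    *)    tr="Unknown tool: $tool" ;;
  esac
  echo "$tr"
}
```

For example, `dispatch calc '{"expression":"2+3"}'` prints `5`.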
Skills are markdown files that provide domain knowledge. The agent automatically knows which skills are available and loads them when relevant.
- Skill injection - On every message, the list of available skills is injected into the prompt
- Auto-detection - The LLM sees hints telling it to search for and use appropriate skills
- Silent loading - When relevant, the agent uses `read_file` to load the skill, then follows its instructions
- Invisible to user - The user never sees skill mechanics, just better results
The system prompt includes:
```
[INTERNAL: Skills at ~/.local/bin/skills/. Load silently if relevant. Never mention skills to user.]
- skill-creation
- research
- ...
```
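The skill list itself can be assembled at prompt-build time with a short loop. A sketch under the assumption that skills are flat `.md` files in the install directory from the setup steps; `list_skills` is a hypothetical helper name.

```shell
# Sketch: collect available skill names for injection into the system prompt.
SKILLS_DIR="${SKILLS_DIR:-$HOME/.local/bin/skills}"

list_skills() {
  # Emit one "- name" line per markdown file; nothing if the dir is empty.
  for f in "$SKILLS_DIR"/*.md; do
    [ -e "$f" ] || continue
    echo "- $(basename "$f" .md)"
  done
}
```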
```bash
mkdir -p ~/.local/bin/skills
```

Just ask:
```
> create a skill for writing tweets
```

Or manually create `~/.local/bin/skills/my-skill.md`:
```markdown
# My Skill
Brief description.

## When to Use
Conditions that trigger this skill.

## Instructions
1. Step one
2. Step two

## Tools to Use
- web_search for research
- write_file for output
```

| Command | Description |
|---|---|
| `/clear` | Reset conversation |
| `/exit` | Quit (also `/q`) |
| `/help` | Show commands |
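Command handling can be sketched as a case statement checked before input is sent to the model. A sketch; `handle_command` and `MESSAGES` are illustrative names, not the script's actual internals.

```shell
# Sketch of slash-command handling: commands are intercepted before the
# input is forwarded to the model.
handle_command() {
  case "$1" in
    /clear)   MESSAGES='[]'; echo "cleared" ;;
    /exit|/q) echo "bye" ;;
    /help)    echo "Commands: /clear /exit /help" ;;
    *)        return 1 ;;  # not a command: treat as a normal prompt
  esac
}
```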
Verbose mode only:

| Command | Description |
|---|---|
| `/cd DIR` | Change directory |
| `/sessions` | List saved sessions |
| `/resume ID` | Resume session |
| `/compact` | Force context compaction |
| `/cost N` | Set cost limit |
Grok has a 2M token context window, but long conversations can still get expensive. The verbose mode includes automatic context compaction.
- Auto-trigger - When context usage hits 75% (configurable), compaction runs automatically
- Split - Messages are split: oldest 50% get compacted, newest 50% kept in full
- Preserve - System prompt always preserved, recent context stays intact
- Resume - A placeholder marks where compaction happened, conversation continues seamlessly
```
Before: [system] [msg1] [msg2] [msg3] [msg4] [msg5] [msg6] [msg7] [msg8]
After:  [system] [compacted] [msg5] [msg6] [msg7] [msg8]
```
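The split step can be sketched with `jq` over a JSON messages array. A sketch only: the real script summarizes the old half via the model, whereas here a fixed placeholder stands in for the summary.

```shell
# Sketch of the compaction split: keep the system prompt, replace the
# oldest half of the remaining messages with a placeholder, and keep
# the newest half in full.
compact() {
  echo "$1" | jq -c '
    .[0:1]
    + [{"role":"system","content":"[compacted: earlier messages summarized]"}]
    + (.[1:] | .[(length/2|floor):])'
}
```

For a system prompt plus 8 messages, this yields 6 entries: the system prompt, the placeholder, and the newest 4 messages.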
In verbose mode, force compaction anytime:
```
> /compact
Compacting (45% context used)...
Compacted: 24 -> 14 messages
```

The threshold is set in the script:

```bash
COMPACT_THRESHOLD=75   # Auto-compact at 75% context usage
```

The scripts check these locations for `.env` (in order):
1. `./.env` (current directory)
2. `~/.env`
3. `~/.grok/.env` (recommended)
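The lookup order can be sketched as a loop over the three locations that sources the first file found; `load_env` is a hypothetical helper name.

```shell
# Sketch: source the first .env found, searching in the documented order.
load_env() {
  local candidate
  for candidate in "./.env" "$HOME/.env" "$HOME/.grok/.env"; do
    if [ -f "$candidate" ]; then
      # shellcheck disable=SC1090
      . "$candidate"
      echo "$candidate"
      return 0
    fi
  done
  return 1  # no .env found anywhere
}
```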
```bash
# ~/.grok/.env
GROK_API_KEY=xai-your-key-here
```

Edit the scripts to change:
```bash
MODEL="grok-4-1-fast"    # or grok-3-fast for speed
COST_LIMIT=10.00         # max spend per session
COMPACT_THRESHOLD=75     # auto-compact at N% context
```

Ideas for building on this:
- Memory - Persist facts and context across sessions
- MCP tools - Connect to Model Context Protocol servers for external integrations
- Self-healing - Detect errors and automatically retry with different approaches
- RAG - Vector search over documents
- Cron jobs - Schedule agents to run periodically
- Proactive agents - Agents that monitor and act without user prompts
- Sub-agents - Spawn specialized agents for complex subtasks
- Streaming - Show tokens as they generate
- Planning - Multi-step task decomposition before execution
MIT