Problem
Nanocoder is local-first, but many of its defaults (prompt size, tool count, context management) are tuned for larger models. When using small local models (1B-8B parameters via Ollama), several things break down:
| Problem | Detail |
|---|---|
| Large system prompt | ~2,000 words (~14KB) before any conversation starts — eats 10-15% of a small model's context |
| 36+ tool definitions | XML fallback injects all tool schemas into the system prompt, adding ~7,000 tokens |
| Small context windows | Most small models have 4K-32K context — the system prompt + tools leave little room for actual work |
| Unreliable tool calling | Malformed XML leads to correction loops that waste tokens and context |
| Weak multi-step reasoning | Small models struggle to chain 5+ tool calls effectively |
The goal is to make Nanocoder work well with small local models — either standalone or in a hybrid setup alongside a frontier model.
Proposed Features
1. Slim System Prompt
Create a drastically reduced prompt (~200-300 words instead of ~2,000). Strip out philosophy, examples, and edge-case guidance. Small models perform better with short, direct instructions — they can't effectively use long prompts anyway.
A `main-prompt-slim.md` alongside the existing `main-prompt.md`, selected based on mode or model size.
2. Tool Subsetting
Instead of giving the model all 36+ tools, provide a focused subset. This is likely the single highest-impact change for small model usability.
Static tool profiles — predefined sets the user can select:
- code-edit: read_file, string_replace, write_file, find_files, search_file_contents, execute_bash
- explore: read_file, find_files, search_file_contents, list_directory
- git: git_status, git_diff, git_log, git_add, git_commit
- minimal: read_file, string_replace, execute_bash
Dynamic tool selection — use keyword matching on the user's message to pick ~5-8 relevant tools per turn.
Progressive tool loading — start with read-only tools, unlock write tools after the model reads relevant files.
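The dynamic variant could be sketched as a keyword scorer. The keyword map, scoring, and fallback set below are illustrative, not Nanocoder's actual implementation; the tool names match the profiles above.

```typescript
// Map each tool to trigger keywords (illustrative).
const TOOL_KEYWORDS: Record<string, string[]> = {
  read_file: ["read", "show", "open", "look"],
  string_replace: ["replace", "change", "fix", "edit"],
  write_file: ["create", "write", "new file"],
  find_files: ["find", "locate", "where"],
  search_file_contents: ["search", "grep", "contains"],
  execute_bash: ["run", "test", "build", "install"],
};

function selectTools(userMessage: string, max = 8): string[] {
  const msg = userMessage.toLowerCase();
  // Score each tool by how many of its keywords appear in the message.
  const scored = Object.entries(TOOL_KEYWORDS)
    .map(([tool, words]) => ({
      tool,
      score: words.filter((w) => msg.includes(w)).length,
    }))
    .filter((t) => t.score > 0)
    .sort((a, b) => b.score - a.score);
  const picked = scored.slice(0, max).map((t) => t.tool);
  // Always keep a minimal fallback set so the model can still act.
  return picked.length > 0 ? picked : ["read_file", "execute_bash"];
}
```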
3. Single-Tool Mode
Force the model to call one tool at a time instead of attempting to chain multiple calls in one response. Small models are much more reliable when focused on one action per turn.
Implemented as a prompt instruction combined with validation that rejects multi-tool responses and asks the model to pick one.
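A sketch of that validation, assuming the XML tool-call format; the KNOWN_TOOLS list and retry wording are illustrative:

```typescript
const KNOWN_TOOLS = ["read_file", "write_file", "string_replace", "execute_bash"];

// Count opening tool tags in the response. Closing tags (</tool>) are
// not counted because "<tool" does not match "</tool".
function countToolCalls(response: string): number {
  return KNOWN_TOOLS.reduce(
    (n, tool) =>
      n + (response.match(new RegExp(`<${tool}[\\s>]`, "g"))?.length ?? 0),
    0
  );
}

function validateSingleTool(response: string): { ok: boolean; retryPrompt?: string } {
  const calls = countToolCalls(response);
  if (calls <= 1) return { ok: true };
  return {
    ok: false,
    retryPrompt: `You called ${calls} tools. Call exactly ONE tool: the most important next step.`,
  };
}
```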
4. Guided Workflows
Instead of open-ended "figure it out" prompting, provide step-by-step scaffolding:
- User says "fix this bug" → Nanocoder injects: "Step 1: Read the file. Step 2: Identify the issue. Step 3: Apply the fix."
- The model only handles one step at a time, reducing reasoning load
- Implemented as prompt templates per task type (fix, explain, refactor, etc.)
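One possible shape for those per-task templates; the step wording below is illustrative:

```typescript
// Step lists per task type; only one step is shown to the model at a time.
const WORKFLOWS: Record<string, string[]> = {
  fix: [
    "Read the file mentioned by the user.",
    "Identify the issue and state it in one sentence.",
    "Apply the fix with a single edit.",
  ],
  explain: [
    "Read the relevant file.",
    "Summarise what it does in plain language.",
  ],
};

// Returns the prompt for the current step, or null when the workflow is done.
function nextStepPrompt(task: string, stepIndex: number): string | null {
  const steps = WORKFLOWS[task];
  if (!steps || stepIndex >= steps.length) return null;
  return `Step ${stepIndex + 1} of ${steps.length}: ${steps[stepIndex]}`;
}
```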
5. Frontier Model as Planner (Hybrid Mode)
Use a frontier model (via OpenRouter or other provider) for planning and a local small model for execution:
- Frontier model analyses the task and creates a step-by-step plan with specific tool calls
- Small model executes each step (or Nanocoder auto-executes the plan)
- Dramatically cheaper — frontier model called once for planning, small model handles the rest
- Could also use the frontier model as a fallback when the small model's XML is repeatedly malformed
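The hand-off might look like this; `callPlanner` and `executeStep` are hypothetical stand-ins for the actual provider calls, and the PlanStep shape is illustrative:

```typescript
// A structured plan step produced by the frontier model.
interface PlanStep {
  tool: string;
  args: Record<string, string>;
  note: string;
}

// One frontier call produces the plan; the cheap local model (or
// Nanocoder itself) executes each step in order.
async function runHybrid(
  task: string,
  callPlanner: (task: string) => Promise<PlanStep[]>,
  executeStep: (step: PlanStep) => Promise<string>
): Promise<string[]> {
  const plan = await callPlanner(task); // single expensive call
  const results: string[] = [];
  for (const step of plan) {
    results.push(await executeStep(step)); // cheap local execution
  }
  return results;
}
```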
6. Aggressive Context Management
Small models need tighter context management than the current defaults:
- Lower auto-compact threshold (40% instead of 60%)
- More aggressive compression (shorter summaries, smaller truncation limits)
- Auto-drop tool result contents after they've been consumed
- Sliding window that keeps only the last N messages at full fidelity
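The sliding window could be sketched as follows; the message shape and summary wording are illustrative:

```typescript
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Keep the system prompt and the last N messages at full fidelity;
// collapse everything older into a one-line placeholder.
function slidingWindow(history: Message[], keepLast = 6): Message[] {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  if (rest.length <= keepLast) return history;
  const dropped = rest.slice(0, rest.length - keepLast);
  const summary: Message = {
    role: "system",
    content: `[${dropped.length} earlier messages compacted]`,
  };
  return [...system, summary, ...rest.slice(-keepLast)];
}
```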
7. Simplified Tool Schemas
For XML fallback, current schemas include full descriptions, parameter types, and examples. For small models:
- One-line descriptions
- Skip optional parameters
- Remove examples from schemas
- Simpler parameter names
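A sketch of the slimming step, assuming a hypothetical FullSchema shape for the current definitions:

```typescript
interface Param {
  name: string;
  type: string;
  required: boolean;
  description?: string;
}

interface FullSchema {
  name: string;
  description: string;
  params: Param[];
  examples?: string[];
}

// Keep only the first sentence of the description, drop optional
// parameters, and omit examples entirely.
function slimSchema(s: FullSchema): { name: string; description: string; params: string[] } {
  return {
    name: s.name,
    description: s.description.split(". ")[0],
    params: s.params.filter((p) => p.required).map((p) => p.name),
  };
}
```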
8. Prefill / Constrained Output
For XML tool calling, prefill the assistant response with the opening XML tag when the expected tool is known. For example, after asking the model to read a file, prefill with `<read_file>` so it only needs to fill in the parameters. Some providers support this via an assistant message prefix.
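A minimal sketch of the prefill step; the message shape is illustrative, and whether the trailing assistant message is honoured as a completion prefix depends on the provider:

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

// Append an assistant message containing the opening tag. Providers that
// support prefix completion will continue from it, so the model only
// has to emit the parameters and closing tag.
function withPrefill(messages: ChatMessage[], expectedTool?: string): ChatMessage[] {
  if (!expectedTool) return messages;
  return [...messages, { role: "assistant", content: `<${expectedTool}>` }];
}
```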
9. Smart Retry with Simplification
When a small model fails (malformed XML, wrong tool), instead of sending the same error back:
- Simplify the available tools (remove irrelevant ones for the retry)
- Rephrase the instruction more directly
- Provide the specific XML template to fill in
- After N failures, offer to hand off to a frontier model
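A sketch of the escalation ladder; MAX_RETRIES and the retry wording are illustrative:

```typescript
const MAX_RETRIES = 3;

// Each failed attempt gets a progressively more constrained retry:
// first a direct restatement, then the exact template to fill in,
// and finally a hand-off to the frontier model.
function buildRetry(
  attempt: number,
  tool: string,
  template: string
): { prompt: string; escalate: boolean } {
  if (attempt >= MAX_RETRIES) {
    return { prompt: "", escalate: true }; // hand off to the frontier model
  }
  if (attempt === 0) {
    return {
      prompt: `That tool call was malformed. Use the ${tool} tool.`,
      escalate: false,
    };
  }
  return {
    prompt: `Fill in this template exactly:\n${template}`,
    escalate: false,
  };
}
```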
10. Task-Specific Micro-Agents
Pre-built prompt + tool combos for common tasks small models handle well:
- Read and explain — read_file only, no write tools needed
- Find and replace — search_file_contents + string_replace
- Run and fix — execute_bash + read_file + string_replace
User selects the micro-agent, or Nanocoder picks based on intent.
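A registry sketch pairing each micro-agent with its tool subset; the prompts are illustrative:

```typescript
// Each micro-agent bundles a focused prompt with the small tool set
// listed above.
const MICRO_AGENTS: Record<string, { tools: string[]; prompt: string }> = {
  "read-and-explain": {
    tools: ["read_file"],
    prompt: "Read the requested file and explain what it does. Do not edit anything.",
  },
  "find-and-replace": {
    tools: ["search_file_contents", "string_replace"],
    prompt: "Find the pattern the user describes, then apply exactly one replacement.",
  },
  "run-and-fix": {
    tools: ["execute_bash", "read_file", "string_replace"],
    prompt: "Run the command, read the failing file, apply one fix, then re-run.",
  },
};

// Look up a micro-agent by name; returns null when it doesn't exist.
function microAgent(name: string): { tools: string[]; prompt: string } | null {
  return MICRO_AGENTS[name] ?? null;
}
```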
11. Auto-Detection
Automatically enable small-model optimisations based on the model name. If the name contains a size indicator like `1b`, `3b`, `7b`, or `8b`, enable the mode without requiring manual configuration. Users can override this.
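A sketch of the detection, treating anything at or below an assumed 8B parameter threshold as small:

```typescript
// Match a size suffix like "3b", "7b", or "3.8b" in the model name
// and compare it against the threshold (in billions of parameters).
function isSmallModel(modelName: string, maxParamsB = 8): boolean {
  const m = modelName.toLowerCase().match(/(\d+(?:\.\d+)?)b\b/);
  return m !== null && parseFloat(m[1]) <= maxParamsB;
}
```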
Configuration
```json
{
  "smallModelMode": {
    "enabled": true,
    "slimPrompt": true,
    "toolProfile": "code-edit",
    "maxToolsPerTurn": 1,
    "aggressiveCompact": true,
    "simplifiedSchemas": true,
    "plannerModel": "openrouter/claude-sonnet-4-5"
  }
}
```
Or at the provider/model level:
```json
{
  "providers": [
    {
      "name": "ollama",
      "baseUrl": "http://localhost:11434/v1",
      "models": ["llama3.2:3b"],
      "smallModelMode": {
        "enabled": true,
        "toolProfile": "minimal"
      }
    }
  ]
}
```
Implementation Priority
Phase 1 — High impact, lower effort
- Slim system prompt
- Static tool profiles (tool subsetting)
- Single-tool mode
- Aggressive auto-compact defaults for small models
Phase 2 — High impact, higher effort
- Simplified tool schemas
- Smart retry with simplification
- Auto-detection by model name/size
Phase 3 — Architectural
- Frontier-as-planner hybrid mode
- Guided workflows with task classification
- Task-specific micro-agents
- Prefill / constrained output