feat: PI agent (local mlx_lm backend) + task-discipline quality pass #10
Merged
added 3 commits
May 10, 2026 23:11
deimagjas
commented
May 12, 2026
deimagjas
commented
May 13, 2026
- Dockerfile.pi: shorten the header to operator-facing info only; design rationale lives in docs/agents/pi-agent.md.
- Dockerfile.pi: drop the cargo-based builder stage. Ubuntu 26.04 ships ripgrep / fd-find / bat / eza in apt, so the multi-stage build was paying compile time for tools we did not need to build from source. Symlink fdfind→fd and batcat→bat for ergonomics. Trims build from ~3 min to ~50 s. Drops dust / procs / btm (they are not in apt and were nice-to-have only).
- entrypoint-pi.sh: collapse the redundant "starting" phase. The gap between it and "working" is empty in PI agents (no credential copy), so we go straight to "working": one write_status, one emit_marker.
- entrypoint-pi.sh: add a comment explaining why su-exec is required: the entrypoint does root-only work first (chown of the host-mounted worktree, models.json under /home/agent), then drops to `agent` so `pi` runs unprivileged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
- Adds a PI agent (`@earendil-works/pi-coding-agent` npm) inside a hardened Ubuntu 26.04 container (`Dockerfile.pi`), backed by a local `mlx_lm.server` running on the host (managed by a new `iac` CLI). No Anthropic API credits consumed. Coexists with the existing Claude agents via Open/Closed: new image, new entrypoint (`entrypoint-pi.sh`), new Makefile section, new CLI subapp (`q pi`), new tests/evals/docs, all without touching the existing Claude paths.
- `entrypoint-pi.sh` now wraps every `--task` with a structural preamble before invoking `pi -p`: it enforces relative paths, narrow scope, and a mandatory `git add -A && git commit && git log -1 --oneline` postcondition. Verified end-to-end on `pi/format-bytes-v2` (commit `fe7d407`, 9/9 functional cases pass, zero side-effects in the main repo).
- `iac server start` defaults to `temp=0.2`, `top_p=0.9` (was `0.9`/`0.95`) since `pi-coding-agent` does not expose per-request sampling, so the server defaults are what every agent actually uses. New `--temp` and `--top-p` flags override at start time.

Test plan
- `cd app/cli && uv run pytest -q` → 92 passed (16 new pi_agents unit tests, 10 new pi_agents acceptance scenarios)
- `uv run ruff check src tests` → clean
- `cd config && make build-pi` → image `claude-pi:ubuntu` built successfully, `pi 0.74.0` installed
- `uv run iac server start` → reports `temp=0.2 top_p=0.9`, `/v1/models` reachable
- `curl http://192.168.100.1:8080/v1/models` works from inside the container network (`host.containers.internal` is NOT supported by Apple Container CLI; gateway IP workaround used and documented)
- `q pi spawn --branch pi/smoke --task "reply OK"` returns "OK" from local Gemma-26b in 5s
- `q pi spawn --branch pi/format-bytes-v2 --task "Add format_bytes(n) to iac/main.py..."` → `commits=1`, only `iac/main.py` touched on the branch, no unsolicited files in the main repo, 9/9 functional cases pass

Architecture (Open/Closed)
- Existing Claude agents: `claude-agent:wolfi` (Chainguard); `config/entrypoint.sh`; `q agents …`; `spawn`, `list-agents`, `stop-agent`, …; `CLAUDE_CONTAINER_OAUTH_TOKEN`
- New PI agent: `claude-pi:ubuntu` (Ubuntu 26.04, kernel 7.x); `config/entrypoint-pi.sh`; `mlx_lm.server` on host (local); `q pi …`; `spawn-pi`, `list-pi-agents`, `stop-pi-agent`, …; `MAX_PI_AGENTS=1` (Gemma-26b + 6 GB cache leaves little headroom)

Notable design decisions verified during testing
- Apple Container CLI does not support `host.containers.internal` (apple/container#346). The PI container reaches the host server via the bridge gateway IP (`192.168.100.1` for the default subnet). Documented and exposed as `PI_BASE_URL`.
- `pi-coding-agent` does not accept per-request `temperature`/`top_p` (verified in its `models.json` schema and CLI flags). Sampling control therefore lives in `iac`, at server-start time.
- […] `exit_code: 0` without an actual commit, and a high-temp model inventing extra files. All three are now structurally prevented by the preamble + low temp defaults, not just by hoping the orchestrator phrases the task well.

Files of interest
- `iac/main.py`, `iac/pyproject.toml`
- `config/Dockerfile.pi`, `config/entrypoint-pi.sh`
- `config/Makefile` (new PI section)
- `app/cli/src/container_cli/commands/pi_agents.py`, `app/cli/src/container_cli/main.py`
- `.claude/skills/spawn-agent/SKILL.md`, `.claude/skills/spawn-agent/evals/evals.json` (evals 9-11)
- `docs/agents/pi-agent.md` (new), `docs/agents/cli.md`, `docs/agents/container-agent.md`
- `app/cli/tests/test_pi_agents.py` (new, 16 cases), `app/cli/tests/acceptance/features/pi_agents.feature` (new, 10 scenarios)

Generated with Claude Code
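For reviewers who want a feel for the task-discipline preamble described in the Summary, here is a minimal Python sketch of the wrapping step. It is illustrative only: the real logic lives in `config/entrypoint-pi.sh` (shell), and the exact preamble wording is not shown in this PR description. The names `PREAMBLE` and `wrap_task`, the rule text, and the `-m "<message>"` form of the commit command are all assumptions.

```python
# Hypothetical sketch of the task preamble; the actual text and mechanism
# are in config/entrypoint-pi.sh and may differ from what is shown here.
PREAMBLE = """\
Rules for this task:
1. Use relative paths only, resolved from the current worktree.
2. Keep the scope narrow: change only what the task asks for.
3. When done, run exactly:
   git add -A && git commit -m "<message>" && git log -1 --oneline
"""


def wrap_task(task: str) -> str:
    """Return the task text prefixed with the structural preamble."""
    task = task.strip()
    if not task:
        # An empty task would leave the agent with rules but nothing to do.
        raise ValueError("task must not be empty")
    return f"{PREAMBLE}\nTask:\n{task}\n"


if __name__ == "__main__":
    # The wrapped text is what would be handed to `pi -p` in this sketch.
    print(wrap_task("reply OK"))
```

Keeping the rules in a fixed prefix means the three constraints hold regardless of how the orchestrator phrases `--task`, which is the point of the design decision above.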