feat: local Docker sandbox + A2A inner loop with pluggable backends #196
mdear wants to merge 3 commits
Local Docker sandbox runtime:
- DockerSandbox provider with shell executor and port pool manager
- Orphan container cleanup with configurable TTL
- Docker compose local stack with stack_control.sh tooling
- e2b.Dockerfile: gh CLI v2.89+ installed, adapter deps included
- Storage proxy router for MinIO-backed local file serving
- Frontend: local sandbox support in workspace state and UI

A2A inner loop framework:
- A2AInnerLoop strategy with SSE streaming client
- CircuitBreaker (threshold=5) with automatic native fallback
- EventStreamAdapter mapping SSE events to agent runtime events
- ContextAdapter for conversation history parity with native loop
- ToolBridge for bidirectional tool registration between backends
- AdapterServer (FastAPI/uvicorn) running inside sandboxes on :18100
- Backend registry: simulate, copilot, claude_code, codex

Copilot backend:
- CopilotBackend using GitHub Copilot SDK (github-copilot-sdk>=0.1.25)
- 15 native sandbox tools bridged to Copilot CLI
- Fresh sessions per run for reliable tool availability
- CLI path resolution: bundled SDK binary primary, gh fallback

Bug fixes:
- A2A reasoning events visible in frontend (delta_status tracking)
- SSE stream kept open across SDK continuation turns
- ToolInvocation TypedDict argument extraction + sandbox FOWNER

Config & infra:
- AgentSettings: AGENT_INNER_LOOP_MODE, AGENT_A2A_BACKEND, fallback
- SandboxSettings: SANDBOX_PROVIDER=docker, host config
- StorageSettings: STORAGE_SERVE_BASE_URL for proxied URLs
- CreditsSettings: CREDITS_BILLING_ENABLED toggle for self-hosted
- Alembic migration: add summary_authority column
- Untrack docker/.stack.env.local (contains local secrets)

Tests: 730+ unit tests, E2E test plan (17/23 PASS, 6 DEFERRED)
Docs: design docs, implementation guide, E2E test plan
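For reviewers, the "CircuitBreaker (threshold=5) with automatic native fallback" item can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: only the class name and the threshold of 5 come from the commit message, while `run_with_fallback`, `flaky_a2a`, and `native` are hypothetical names, and the sketch assumes the breaker counts consecutive failures and has no half-open or reset timer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CircuitBreaker:
    """Counts consecutive failures; 'open' once the threshold is reached."""

    threshold: int = 5  # value quoted in the commit message
    failures: int = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold

    def record_success(self) -> None:
        # Any success resets the consecutive-failure count.
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1


def run_with_fallback(
    breaker: CircuitBreaker,
    a2a_call: Callable[[], str],
    native_call: Callable[[], str],
) -> str:
    """Hypothetical helper: try the A2A path, fall back to the native loop."""
    if breaker.is_open:
        # Breaker is open: skip the A2A backend entirely.
        return native_call()
    try:
        result = a2a_call()
    except Exception:
        breaker.record_failure()
        return native_call()
    breaker.record_success()
    return result


def flaky_a2a() -> str:
    # Stand-in for an unreachable adapter server.
    raise ConnectionError("adapter unreachable")


def native() -> str:
    return "native result"


breaker = CircuitBreaker()
# Five failing attempts open the breaker; the sixth call never touches A2A.
results = [run_with_fallback(breaker, flaky_a2a, native) for _ in range(6)]
```

In this sketch every call still returns a usable result, which is the point of the fallback: callers do not need to know which loop served the request.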
This rebased PR is based on #172.
Coming soon: A2A replacement inner loop backends for Chat Mode
Hi @mdear, could you split this PR so that DockerSandbox is resolved first? I think it could be split into 3 PRs, and we can work with you to resolve each one individually. Thank you so much for your constant support!
Chat A2A turn loop:
- Add A2A turn loop service for chat mode with event translation layer
- Route chat requests through A2A adapter (Copilot) in local mode
- A2A client singleton with URL tracking and auto-refresh on sandbox change
- Council service with parallel LLM execution and synthesis via A2A
- Fix council text doubling: separate delta collection from full_content
- Fix async coroutine bug in file_processor, vectorstore, LLM providers (await get_storage().read() instead of anyio.to_thread.run_sync)

Session lifecycle:
- Add delete_after column and schedule-delete endpoint for timed cleanup
- Orphan sandbox cleanup with async-safe threading
- Frontend session delete UI (sidebar + project list)

Infrastructure:
- Expand stack_control.sh (build-sandbox, patch-sandbox, cleanup, setup)
- Health endpoint reports A2A inner loop mode and backend
- Agent settings for A2A billing strategy and multipliers
- Migration for session delete_after column

Testing:
- Add comprehensive E2E test suite (32 tests, 11 categories) with content doubling detection and server error scanning
- Add unit tests: council billing, A2A turn loop, orphan cleanup
- Add repository integration tests
- E2E test-cycle prompt for autonomous fix/retest workflow

Docs:
- A2A billing model, inner loop parity, tool bridge gap analysis
- Chat A2A integration assessment, implementation guide
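The async coroutine fix in this commit is easier to follow with a small reproduction. The sketch below is illustrative only: `FakeStorage` and the `path` argument are hypothetical, and the standard library's `asyncio.to_thread` is used as a stand-in for `anyio.to_thread.run_sync` so the example has no third-party dependency. Both helpers call the given function in a worker thread, so handing them an `async def` method only creates a coroutine object instead of running it.

```python
import asyncio


class FakeStorage:
    """Hypothetical stand-in for whatever get_storage() returns."""

    async def read(self, path: str) -> bytes:
        await asyncio.sleep(0)  # pretend to do async I/O
        return f"contents of {path}".encode()


def get_storage() -> FakeStorage:
    return FakeStorage()


async def read_buggy(path: str):
    # Bug pattern: the worker thread calls read(), which merely builds a
    # coroutine object. Nothing awaits it, so no bytes are ever produced.
    return await asyncio.to_thread(get_storage().read, path)


async def read_fixed(path: str) -> bytes:
    # Fix pattern from the commit message: await the coroutine directly.
    return await get_storage().read(path)


async def main():
    buggy = await read_buggy("a.txt")
    buggy_is_coroutine = asyncio.iscoroutine(buggy)
    buggy.close()  # suppress the "never awaited" warning in this demo
    fixed = await read_fixed("a.txt")
    return buggy_is_coroutine, fixed


buggy_is_coroutine, fixed_bytes = asyncio.run(main())
```

Downstream code that expects bytes and receives a coroutine object tends to fail far from the real cause, which would be consistent with the bug surfacing in several call sites at once.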
Yes, I can do that. Stand by: I'll split this into three progressive PRs, each able to stand on its own and each building on the previous one, with appropriate unit and E2E tests accompanying each. I'm happy to say the local model with the Copilot inner loop appears to be stable and working well. I haven't done extensive testing yet, but the proof of concept is holding up.

First PR: Docker local and core unit test rewrite (most unit tests were written against the outdated develop branch). I held off on writing many frontend unit tests because my expertise is primarily backend, and these features are heavily weighted toward the backend and sandbox.
Second PR: Agentic A2A inner loop replacement
Third PR: Chat A2A inner loop replacement, council billing foundation, A2A council billing enhancements
This PR has been split into three progressive PRs for easier review. All content from this PR is covered across:
Merge order: #198 → #199 → #200
Closing this PR in favour of the split.
Summary
Local Docker sandbox runtime and A2A inner loop framework with pluggable backends for self-hosted deployments.
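As a rough illustration of what "pluggable backends" could look like, here is a minimal registry sketch. It is not taken from the PR: the backend name `simulate` and the `AGENT_A2A_BACKEND` setting appear in the commit message, but `Backend`, `register_backend`, `get_backend`, `backend_from_env`, and the `run` method are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, Mapping, Protocol


class Backend(Protocol):
    """Hypothetical minimal interface every backend would implement."""

    def run(self, prompt: str) -> str: ...


# Maps a backend name to a zero-argument factory.
_REGISTRY: Dict[str, Callable[[], Backend]] = {}


def register_backend(name: str):
    """Class decorator that records a backend under the given name."""

    def decorator(factory: Callable[[], Backend]):
        _REGISTRY[name] = factory
        return factory

    return decorator


@register_backend("simulate")
class SimulateBackend:
    def run(self, prompt: str) -> str:
        return f"[simulate] {prompt}"


def get_backend(name: str) -> Backend:
    try:
        factory = _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"unknown backend {name!r}; known: {known}") from None
    return factory()


def backend_from_env(env: Mapping[str, str], default: str = "simulate") -> Backend:
    # Assumes the backend is selected by the AGENT_A2A_BACKEND setting.
    return get_backend(env.get("AGENT_A2A_BACKEND", default))
```

Under this assumption, the other backends named in the PR (copilot, claude_code, codex) would register the same way, and callers would pass `os.environ` to `backend_from_env` without importing any backend class directly.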
Local Docker Sandbox
- DockerSandbox provider with shell executor and port pool manager
- stack_control.sh tooling (build-sandbox, patch-sandbox, cleanup, setup)
- e2b.Dockerfile: gh CLI v2.89+ installed, adapter deps included

A2A Inner Loop Framework
- A2AInnerLoop strategy with SSE streaming client
- CircuitBreaker (threshold=5) with automatic native fallback
- EventStreamAdapter mapping SSE events to agent runtime events
- ContextAdapter for conversation history parity with native loop
- ToolBridge for bidirectional tool registration between backends
- AdapterServer (FastAPI/uvicorn) running inside sandboxes on :18100

Chat A2A Turn Loop
- A2ATurnLoopService for chat mode with event translation layer
- Fix async coroutine bug (await get_storage().read() instead of anyio.to_thread.run_sync)

Copilot Backend
- CopilotBackend using GitHub Copilot SDK (github-copilot-sdk>=0.1.25)

Session Lifecycle
- delete_after column and schedule-delete endpoint for timed session cleanup
- Migration for delete_after column

Bug Fixes
- Async coroutine bug (get_storage().read() not awaited in 5 locations)

Config & Infra
- AgentSettings: A2A billing strategy, multipliers, inner loop mode config
- SandboxSettings: SANDBOX_PROVIDER=docker, host config
- StorageSettings: STORAGE_SERVE_BASE_URL for proxied URLs
- CreditsSettings: CREDITS_BILLING_ENABLED toggle for self-hosted
- Untrack docker/.stack.env.local (contains local secrets)

Testing
- E2E test suite categories (INF, CHAT, IMG, WEB, CODE, SESS, AGEN, XFEAT, HIST, CNCL, A2A)

Docs