Track A: lock-in & cleanup — BLE001, catalog retry, python_repl sandbox, dead-backend removal, ~/.squish centralization by konjoinfinity · Pull Request #58 · konjoai/squish

konjoinfinity · 2026-06-17T20:41:26Z

Summary

First follow-up sprint from the post-review roadmap (PR #57). Track A — lock-in & cleanup: low-risk, fully Linux-validatable items that cement the earlier except sweep and clear small debts. All 5 planned items are included.

Local CI-mode run (CI=1, which unlocks the integration/Metal-gated suite per tests/conftest.py): 4014 passed, 127 skipped, ruff check squish/ tests/ clean.

Changes

1. Enable ruff BLE001 (pyproject.toml) — any new except Exception: in squish/ now fails CI (Wall-2 enforcement of python-conventions.md). The intentional boundaries carry # noqa: BLE001; tests/** are exempted via per-file-ignores. Fixed two malformed except (ImportError, Exception) probes in astc_loader.py.

2. Catalog fetch retry/backoff (squish/catalog.py) — _background_fetch did a single urlopen with no retry. Extracted a testable _fetch_catalog_bytes() that retries transient errors (3 attempts, exponential backoff 0.5/1.0/2.0s); non-transient ValueError is not retried. Stays non-blocking; bundled catalog remains on total failure. opener/sleeper are injectable so the 5 new unit tests are fully isolated from any in-flight background-refresh thread. +5 tests.

3. python_repl resource sandbox (squish/agent/builtin_tools.py) — the REPL previously exec'd in-process with only a SIGALRM timeout, so an allocation bomb could exhaust the host. A naive in-process RLIMIT_AS cap is unworkable (the model's mmap'd weights count against the process address space). Instead it now runs in a freshly spawned child (no inherited model, ~30 MiB baseline) with hard RLIMIT_AS (default 512 MiB) + RLIMIT_CPU limits, stdout captured over a pipe, wall-clock guard in the parent. Falls back to the in-process path (logged) when isolation is unavailable. +11 tests (the tool had none before).

4. Remove dead inference-backend abstraction (squish/server.py) — _InferenceBackend/_MLXEagerBackend/_MLCBackend + _active_backend were never wired into dispatch (only read as None) and exercised solely by their own test. Removed ~53 lines + tests/integration/test_phase_f_backends.py. The --inference-backend CLI flag (a separate string global) is untouched.

5. Centralize ~/.squish (squish/config.py + 8 modules) — added squish_home() honoring $SQUISH_HOME (default ~/.squish), routed config_path() through it, replaced 19 inline Path.home() / ".squish" computations. With SQUISH_HOME unset the paths are byte-identical (no behaviour change); setting it relocates the whole on-disk footprint. +5 tests.

Notes

A3's memory cap is a resource sandbox, not a security sandbox (open is still exposed — unchanged). The spawn approach was validated against the address-space concern: a fresh child baselines at ~30 MiB, so the cap is meaningful and Linux-testable.
Tracks B (medium perf) and C (architectural perf) from the roadmap remain MLX/Apple-Silicon-only and will follow separately, validated via the macOS CI jobs.

🤖 Generated with Claude Code

Locks in the repo-wide except sweep: any new `except Exception:` in squish/ now fails CI (Wall-2 enforcement of python-conventions.md). The 12 intentional boundaries already carry `# noqa: BLE001`. Fixed two malformed probes in astc_loader.py (`except (ImportError, Exception)` → logged boundary). tests/** are exempted via per-file-ignores — they use broad catches for failure-path assertions and best-effort probes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv

_background_fetch did a single urlopen with no retry. Extract a testable _fetch_catalog_bytes() that retries transient errors (OSError/URLError/timeout) up to 3 attempts with exponential backoff (0.5/1.0/2.0s); non-transient ValueError (bad URL) is not retried. Stays non-blocking (daemon thread); bundled catalog remains in effect on total failure. Adds 5 unit tests covering success, retry-then-succeed, exhaustion, non-retryable, and backoff schedule. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv

…uish path Two cleanup items from the post-review roadmap (Track A): MLC backend decision — _InferenceBackend / _MLXEagerBackend / _MLCBackend and the _active_backend global were never wired into dispatch (_active_backend is only ever read as None); they were exercised solely by their own test. Removed the ~53 lines of dead abstraction and tests/integration/test_phase_f_backends.py. The --inference-backend CLI flag (a separate string global) is unchanged. Centralize ~/.squish — add squish_home() to config.py honoring $SQUISH_HOME (default ~/.squish), route config_path() through it, and replace 19 inline `Path.home() / ".squish"` computations across cli, catalog, server, daemon (squishd/launchagent), hardware/chip_detector, platform/ane_router, and runtime/auto_profile. With SQUISH_HOME unset the paths are byte-identical, so behaviour is unchanged; setting it relocates the entire on-disk footprint. Adds 5 squish_home() unit tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv

…olation The retry unit tests patched the shared globals urllib.request.urlopen and time.sleep, so a background catalog-refresh daemon thread spawned by another test (test_catalog_branches via _try_refresh_catalog) raced on them and perturbed the call counts in the full-suite run. Make opener/sleeper injectable (defaulting to the real functions) and have the tests pass their own mocks — fully deterministic regardless of in-flight background threads. Production behaviour is unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv

squish_python_repl previously exec'd code in-process with only a SIGALRM timeout, so an allocation bomb could exhaust the host serving process. An in-process RLIMIT_AS cap is unworkable here — the loaded model's mmap'd weights count against the process address space. Instead, run the code in a freshly *spawned* child (no inherited model, ~30 MiB baseline) with hard RLIMIT_AS (default 512 MiB) and RLIMIT_CPU limits, capturing stdout over a pipe and enforcing a wall-clock guard in the parent. Falls back to the original in-process path (logged) when process isolation is unavailable. - Shared restricted namespace (_repl_namespace) across both paths; no DRY drift. - New max_memory_mb parameter (exposed in the tool schema). - Adds 11 behavioural tests (the tool had none): output capture, error capture, blocked imports, enforced memory limit, runaway-loop termination, and the in-process fallback. Full CI-mode suite: 4014 passed, 127 skipped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv

…_AS is no-op on macOS) macOS/Darwin accepts setrlimit(RLIMIT_AS) but does not enforce it, so the 320 MB allocation ran fine under the 64 MB cap on the macOS CI runners. Skip test_memory_limit_enforced on non-Linux and document the caveat in the function: the spawn isolation, RLIMIT_CPU, and wall-clock timeout still apply everywhere, but the memory cap is Linux-enforced / best-effort on macOS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv

claude added 5 commits June 17, 2026 20:25

konjoinfinity changed the title ~~Track A: lock-in & cleanup — ruff BLE001, catalog retry, dead-backend removal, ~/.squish centralization~~ Track A: lock-in & cleanup — BLE001, catalog retry, python_repl sandbox, dead-backend removal, ~/.squish centralization Jun 17, 2026

wesleyscholl marked this pull request as ready for review June 17, 2026 21:53

wesleyscholl merged commit 6929e01 into main Jun 17, 2026
16 checks passed

wesleyscholl deleted the claude/track-a-lockin-cleanup branch June 17, 2026 21:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Track A: lock-in & cleanup — BLE001, catalog retry, python_repl sandbox, dead-backend removal, ~/.squish centralization#58

Track A: lock-in & cleanup — BLE001, catalog retry, python_repl sandbox, dead-backend removal, ~/.squish centralization#58
wesleyscholl merged 6 commits into
mainfrom
claude/track-a-lockin-cleanup

konjoinfinity commented Jun 17, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

konjoinfinity commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Notes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

konjoinfinity commented Jun 17, 2026 •

edited

Loading