Track A: lock-in & cleanup — BLE001, catalog retry, python_repl sandbox, dead-backend removal, ~/.squish centralization#58

Merged
wesleyscholl merged 6 commits into main from claude/track-a-lockin-cleanup on Jun 17, 2026

Conversation

@konjoinfinity konjoinfinity commented Jun 17, 2026

Collaborator

Summary

First follow-up sprint from the post-review roadmap (PR #57). Track A — lock-in & cleanup: low-risk, fully Linux-validatable items that cement the earlier except sweep and clear small debts. All 5 planned items are included.

Local CI-mode run (CI=1, which unlocks the integration/Metal-gated suite per tests/conftest.py): 4014 passed, 127 skipped, ruff check squish/ tests/ clean.

Changes

1. Enable ruff BLE001 (pyproject.toml) — any new except Exception: in squish/ now fails CI (Wall-2 enforcement of python-conventions.md). The intentional boundaries carry # noqa: BLE001; tests/** are exempted via per-file-ignores. Fixed two malformed except (ImportError, Exception) probes in astc_loader.py.

2. Catalog fetch retry/backoff (squish/catalog.py) — _background_fetch did a single urlopen with no retry. Extracted a testable _fetch_catalog_bytes() that retries transient errors (3 attempts, exponential backoff 0.5/1.0/2.0s); non-transient ValueError is not retried. Stays non-blocking; bundled catalog remains on total failure. opener/sleeper are injectable so the 5 new unit tests are fully isolated from any in-flight background-refresh thread. +5 tests.

3. python_repl resource sandbox (squish/agent/builtin_tools.py) — the REPL previously exec'd in-process with only a SIGALRM timeout, so an allocation bomb could exhaust the host. A naive in-process RLIMIT_AS cap is unworkable (the model's mmap'd weights count against the process address space). Instead it now runs in a freshly spawned child (no inherited model, ~30 MiB baseline) with hard RLIMIT_AS (default 512 MiB) + RLIMIT_CPU limits, stdout captured over a pipe, wall-clock guard in the parent. Falls back to the in-process path (logged) when isolation is unavailable. +11 tests (the tool had none before).

4. Remove dead inference-backend abstraction (squish/server.py) — _InferenceBackend/_MLXEagerBackend/_MLCBackend + _active_backend were never wired into dispatch (only read as None) and exercised solely by their own test. Removed ~53 lines + tests/integration/test_phase_f_backends.py. The --inference-backend CLI flag (a separate string global) is untouched.

5. Centralize ~/.squish (squish/config.py + 8 modules) — added squish_home() honoring $SQUISH_HOME (default ~/.squish), routed config_path() through it, replaced 19 inline Path.home() / ".squish" computations. With SQUISH_HOME unset the paths are byte-identical (no behaviour change); setting it relocates the whole on-disk footprint. +5 tests.
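
A minimal sketch of the lookup described in item 5, for illustration only. The `squish_home()` and `config_path()` names and the `SQUISH_HOME` variable come from this PR; the `config.json` file name and the choice to treat an empty value as unset are assumptions, not taken from the diff.

```python
import os
from pathlib import Path


def squish_home() -> Path:
    """Return the root directory for on-disk state.

    Honors $SQUISH_HOME when set; otherwise falls back to ~/.squish.
    Treating an empty string as "unset" is an assumption of this sketch.
    """
    override = os.environ.get("SQUISH_HOME")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".squish"


def config_path() -> Path:
    # "config.json" is a placeholder file name, not the project's actual one.
    return squish_home() / "config.json"
```

With the variable unset, `squish_home()` resolves to the same `Path.home() / ".squish"` value the inline computations produced, which is why the change is behaviour-neutral by default.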

Notes

  • Item 3's memory cap (A3) is a resource sandbox, not a security sandbox: open is still exposed, as before. The spawn approach was validated against the address-space concern: a fresh child baselines at ~30 MiB, so the cap is meaningful and Linux-testable.
  • Tracks B (medium perf) and C (architectural perf) from the roadmap remain MLX/Apple-Silicon-only and will follow separately, validated via the macOS CI jobs.
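
The resource-sandbox idea from the first note (fresh child, hard `RLIMIT_AS` and `RLIMIT_CPU`, stdout over a pipe, wall-clock guard in the parent) can be sketched roughly as follows. This is not the code in `builtin_tools.py`: it substitutes `subprocess` with `preexec_fn` for whatever spawn mechanism the PR uses, `run_limited` and `_limit_setter` are hypothetical names, and it is Unix-only because it relies on the `resource` module.

```python
import resource
import subprocess
import sys


def _limit_setter(max_memory_bytes: int, cpu_seconds: int):
    """Build a callable that applies hard limits inside the child process."""
    def _apply() -> None:
        for res, wanted in ((resource.RLIMIT_AS, max_memory_bytes),
                            (resource.RLIMIT_CPU, cpu_seconds)):
            _soft, hard = resource.getrlimit(res)
            # Never ask for more than the inherited hard limit allows.
            if hard != resource.RLIM_INFINITY:
                wanted = min(wanted, hard)
            resource.setrlimit(res, (wanted, wanted))
    return _apply


def run_limited(code: str, *, max_memory_mb: int = 512,
                cpu_seconds: int = 5, wall_seconds: float = 10.0):
    """Run `code` in a fresh interpreter; return (returncode, stdout, stderr)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # fresh child, nothing inherited
            capture_output=True,                 # stdout/stderr over pipes
            text=True,
            timeout=wall_seconds,                # wall-clock guard in the parent
            preexec_fn=_limit_setter(max_memory_mb * 1024 * 1024, cpu_seconds),
        )
    except subprocess.TimeoutExpired:
        return -1, "", "wall-clock timeout"
    return proc.returncode, proc.stdout, proc.stderr
```

Because the limits are set in the child before the interpreter starts, the parent's address space (including any mmap'd weights) does not count against the cap. On Linux an oversized allocation in the child is expected to fail with `MemoryError`.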

🤖 Generated with Claude Code

claude added 5 commits June 17, 2026 20:25
Locks in the repo-wide except sweep: any new `except Exception:` in squish/ now
fails CI (Wall-2 enforcement of python-conventions.md). The 12 intentional
boundaries already carry `# noqa: BLE001`. Fixed two malformed probes in
astc_loader.py (`except (ImportError, Exception)` → logged boundary). tests/**
are exempted via per-file-ignores — they use broad catches for failure-path
assertions and best-effort probes.
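
For illustration, a logged boundary of the kind this commit describes might look like the sketch below. `probe_optional` is a hypothetical helper, not the code in astc_loader.py; it only shows why `except (ImportError, Exception)` is malformed and how an intentional broad catch is marked.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def probe_optional(module_name: str):
    """Return the imported module, or None when it cannot be loaded."""
    try:
        return importlib.import_module(module_name)
    # `except (ImportError, Exception)` would be redundant here: ImportError
    # already subclasses Exception, so the tuple adds nothing. The broad catch
    # below is intentional, logged, and explicitly exempted from BLE001.
    except Exception as exc:  # noqa: BLE001
        logger.debug("optional module %s unavailable: %s", module_name, exc)
        return None
```

Any broad catch without the `# noqa: BLE001` marker would be reported by ruff once the rule is enabled.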

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
_background_fetch did a single urlopen with no retry. Extract a testable
_fetch_catalog_bytes() that retries transient errors (OSError/URLError/timeout)
up to 3 attempts with exponential backoff (0.5/1.0/2.0s); non-transient
ValueError (bad URL) is not retried. Stays non-blocking (daemon thread); bundled
catalog remains in effect on total failure. Adds 5 unit tests covering success,
retry-then-succeed, exhaustion, non-retryable, and backoff schedule.
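
A rough sketch of the retry loop described above, under stated assumptions. `fetch_bytes_with_retry` is a hypothetical stand-in for `_fetch_catalog_bytes()`, whose real signature is not shown in this PR. The sketch sleeps only between attempts, so three attempts use the first two delays; whether the real function also waits after the final failure is not stated here.

```python
import time
import urllib.request
from typing import Callable

# Delay schedule from the commit message, in seconds.
_BACKOFF_SECONDS = (0.5, 1.0, 2.0)


def fetch_bytes_with_retry(
    url: str,
    *,
    attempts: int = 3,
    timeout: float = 10.0,
    opener: Callable = urllib.request.urlopen,   # injectable for tests
    sleeper: Callable[[float], None] = time.sleep,
) -> bytes:
    """Fetch `url`, retrying transient errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as exc:
            # URLError and socket timeouts are OSError subclasses (transient).
            # ValueError (for example a malformed URL) is not caught, so it
            # propagates immediately without a retry.
            last_error = exc
            if attempt < attempts - 1:
                delay_index = min(attempt, len(_BACKOFF_SECONDS) - 1)
                sleeper(_BACKOFF_SECONDS[delay_index])
    raise last_error
```

Passing `opener` and `sleeper` as parameters lets a test supply its own fakes instead of patching `urllib.request.urlopen` and `time.sleep` globally.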

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…uish path

Two cleanup items from the post-review roadmap (Track A):

MLC backend decision — _InferenceBackend / _MLXEagerBackend / _MLCBackend and the
_active_backend global were never wired into dispatch (_active_backend is only ever
read as None); they were exercised solely by their own test. Removed the ~53 lines
of dead abstraction and tests/integration/test_phase_f_backends.py. The
--inference-backend CLI flag (a separate string global) is unchanged.

Centralize ~/.squish — add squish_home() to config.py honoring $SQUISH_HOME (default
~/.squish), route config_path() through it, and replace 19 inline
`Path.home() / ".squish"` computations across cli, catalog, server, daemon
(squishd/launchagent), hardware/chip_detector, platform/ane_router, and
runtime/auto_profile. With SQUISH_HOME unset the paths are byte-identical, so
behaviour is unchanged; setting it relocates the entire on-disk footprint. Adds
5 squish_home() unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…olation

The retry unit tests patched the shared globals urllib.request.urlopen and
time.sleep, so a background catalog-refresh daemon thread spawned by another
test (test_catalog_branches via _try_refresh_catalog) raced on them and
perturbed the call counts in the full-suite run. Make opener/sleeper injectable
(defaulting to the real functions) and have the tests pass their own mocks —
fully deterministic regardless of in-flight background threads. Production
behaviour is unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
squish_python_repl previously exec'd code in-process with only a SIGALRM
timeout, so an allocation bomb could exhaust the host serving process. An
in-process RLIMIT_AS cap is unworkable here — the loaded model's mmap'd weights
count against the process address space. Instead, run the code in a freshly
*spawned* child (no inherited model, ~30 MiB baseline) with hard RLIMIT_AS
(default 512 MiB) and RLIMIT_CPU limits, capturing stdout over a pipe and
enforcing a wall-clock guard in the parent. Falls back to the original
in-process path (logged) when process isolation is unavailable.

- Shared restricted namespace (_repl_namespace) across both paths; no DRY drift.
- New max_memory_mb parameter (exposed in the tool schema).
- Adds 11 behavioural tests (the tool had none): output capture, error capture,
  blocked imports, enforced memory limit, runaway-loop termination, and the
  in-process fallback.

Full CI-mode suite: 4014 passed, 127 skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
@konjoinfinity konjoinfinity changed the title Track A: lock-in & cleanup — ruff BLE001, catalog retry, dead-backend removal, ~/.squish centralization Track A: lock-in & cleanup — BLE001, catalog retry, python_repl sandbox, dead-backend removal, ~/.squish centralization Jun 17, 2026
…_AS is no-op on macOS)

macOS/Darwin accepts setrlimit(RLIMIT_AS) but does not enforce it, so the 320 MB
allocation ran fine under the 64 MB cap on the macOS CI runners. Skip
test_memory_limit_enforced on non-Linux and document the caveat in the function:
the spawn isolation, RLIMIT_CPU, and wall-clock timeout still apply everywhere,
but the memory cap is Linux-enforced / best-effort on macOS.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
@wesleyscholl wesleyscholl marked this pull request as ready for review June 17, 2026 21:53
@wesleyscholl wesleyscholl merged commit 6929e01 into main Jun 17, 2026
16 checks passed
@wesleyscholl wesleyscholl deleted the claude/track-a-lockin-cleanup branch June 17, 2026 21:53

3 participants