Skip to content

Repo review: python-conventions except sweep, MLX import gate, agent tool fix, honest size gate#57

Merged
wesleyscholl merged 8 commits into
mainfrom
claude/squish-repo-review-b7imv0
Jun 17, 2026
Merged

Repo review: python-conventions except sweep, MLX import gate, agent tool fix, honest size gate#57
wesleyscholl merged 8 commits into
mainfrom
claude/squish-repo-review-b7imv0

Conversation

@konjoinfinity

@konjoinfinity konjoinfinity commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Summary

Full-repo review hardening pass. Two parts:

  1. Targeted correctness/constraint fixes (verified bugs + violations).
  2. Repo-wide except Exception sweep — eliminates every non-boundary broad catch, which the project's own .claude/rules/python-conventions.md forbids ("No bare except: or except Exception: — catch specific exceptions; log and re-raise").

Final state: 2691 passed, 109 skipped (MLX/platform-gated), ruff check clean across squish/, zero non-boundary except Exception remaining.

Part 1 — correctness / constraint fixes

  • MLX import gatingquant/int3_linear.py imported mlx.core/mlx.nn at module top level (violates "MLX must be gated, never imported on Linux paths"). Added the darwin/arm64 guard mirroring sqint2_linear.py; clear ImportError on Linux instead of a crash.
  • Agent tool-dispatch bugagent/tool_name_map.py mapped VSCode tool names to backend tools that were never registered (squish_list_directory/run_terminal/rename_file/create_directory), so those client calls silently failed. Remapped to the real tools, added a genuine squish_create_directory builtin, dropped the dead get_workspace_info entry, and added a regression test asserting every mapping resolves to a registered tool.
  • Latent bugs surfaced by narrowing_configure_blazing_mode called ChipDetector.detect() on the class (it's an instance method) → fixed to ChipDetector().detect().
  • Perf quick winserving/scheduler.py caches the 64-token prefix hash on _Request instead of recomputing it every batch-formation pass.
  • Docs / CI-gate honesty — the 500-line file-size gate was continue-on-error (never blocked despite the docs). Now blocking for new files, with the 39 current oversized files grandfathered in .konjo/oversized-allowlist.txt. Fixed stale MODULES.md version (v9.34.1 → v9.34.2) and a dangling CLAUDE.md reference.

Part 2 — except Exception sweep (~278 sites, 37 files)

Each broad catch was narrowed to an explicit exception tuple matching what the guarded block can raise, bound as exc, and logged (_LOG.debug for expected/optional/probe/cleanup failures, _LOG.warning when functionality degrades). Control flow was preserved everywhere — same fallbacks, returns, re-raises.

12 genuine top-level boundaries were intentionally kept as logged catch-alls marked # noqa: BLE001, where narrowing would risk crashing a server/daemon/worker or breaking a must-degrade-gracefully contract: FastAPI/SSE stream consumers, the token-generation worker thread, the background preload thread, the daemon connection handler, two GPU worker run-loops, the memory-governor callback dispatch, the layer-overlap background worker, tool-registry tool execution, the python_repl sandbox exec(), the startup warm-up, the sparse-ffn mask loader, and the squishd chat-template fallback (arbitrary Jinja).

Module loggers were added where missing and existing ones reused (no duplicates). _fast_imports.py uses a lazy in-handler logger to avoid cold-start import overhead.

Deferred (planned, not in this PR)

  • Perf: server.py hot-path string-join deferral, KV old-tier buffer pre-allocation, vectorized KITTY eviction, tokenizer-decode LUT — darwin-only MLX paths needing benchmarking on Apple Silicon.
  • Architectural: system-prompt KV prefix caching, greedy speculative decoding, continuous batching / paged KV.
  • Follow-ups: enable ruff BLE001 now that the tree is clean; MLC backend decision; catalog fetch retry/backoff.

🤖 Generated with Claude Code

claude added 6 commits June 17, 2026 13:50
…est size gate

Phase 1 of the full-repo review. Addresses verified constraint violations,
a real agent tool-dispatch bug, and CI gate honesty.

Correctness / constraint fixes:
- quant/int3_linear: add darwin/arm64 platform guard before MLX imports,
  mirroring sqint2_linear. Module no longer crashes on direct import under
  Linux; raises a clear ImportError instead (CLAUDE.md: MLX must be gated).
- Replace silent `except Exception: pass`/`return 0` swallows with specific
  exception types + logging in catalog, local_model_scanner, platform/detector,
  and platform/platform_router (python-conventions: no silent failures).

Agent tool dispatch bug:
- tool_name_map pointed VSCode names at backend tools that were never
  registered (squish_list_directory/run_terminal/rename_file/create_directory),
  so those client tool calls failed. Remap list_directory->squish_list_dir,
  run_terminal->squish_run_shell, rename_file->squish_move_file; add a real
  squish_create_directory builtin; drop the dangling get_workspace_info entry.
- Add a regression test asserting every mapped backend name is registered.

Perf quick win:
- scheduler: cache the 64-token prefix hash on _Request instead of recomputing
  it on every batch-formation pass.

Docs / gate honesty:
- File-size gate is now blocking for new files, with current oversized files
  grandfathered in .konjo/oversized-allowlist.txt (was continue-on-error, so
  it never blocked despite the docs). Update CLAUDE.md and fix the stale
  MODULES.md version (v9.34.1 -> v9.34.2) and a dangling doc reference.

Full suite: 2691 passed, 109 skipped (MLX/platform-gated).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…fic types + logging

Part of the repo-wide python-conventions sweep (no bare `except:`/`except Exception:`).
Narrows ~34 sites across the serving/daemon/grammar modules to explicit exception
tuples and adds logging where failures were silently swallowed. Control flow is
unchanged. Four genuine top-level boundaries (daemon connection handler, two GPU
worker run-loops, memory-governor callback dispatch) are kept as logged catch-alls
marked `# noqa: BLE001` because narrowing there would risk crashing the daemon/worker.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
… types + logging

Part of the repo-wide python-conventions sweep. Narrows 35 sites across the
quant/kv/io/context modules to explicit exception tuples and adds _LOG.debug at
sites that previously swallowed silently. Control flow unchanged; no top-level
boundary catch-alls needed. compressed_loader.py gains a module-level
`import subprocess` so the handler can reference subprocess.SubprocessError even
when the in-try import fails.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
Part of the repo-wide python-conventions sweep. Narrows all 62 handlers in cli.py
to explicit exception tuples and adds a module logger (squish.cli): _LOG.debug for
optional/probe failures (hardware detection, optional rich UI, best-effort fetches),
_LOG.warning where functionality degrades. Control flow, _die() calls, re-raises,
and user-facing prints are unchanged; two now-unnecessary `# noqa: BLE001` markers
were narrowed and removed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…n with specific types + logging

Part of the repo-wide python-conventions sweep. Narrows 31 sites across 17 files
(speculative, hardware, platform, catalog, telemetry, experimental, agent, backend,
convert, ui, _term, _fast_imports) to explicit exception tuples with logging.

Notes:
- catalog download paths catch httpx.HTTPError (not an OSError subclass) when httpx
  is present, preserving the SSL-retry fallback.
- _fast_imports.py uses a lazy in-handler logger to avoid cold-start import overhead.
- Three genuine boundaries kept as logged `# noqa: BLE001` catch-alls: the
  layer-overlap background worker (re-raised on the consumer thread), tool_registry
  tool execution (error surfaced to the agent), and the builtin python_repl sandbox
  exec (error returned as text). Control flow unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…ing; fix latent bugs

Completes the repo-wide python-conventions sweep (zero non-boundary
`except Exception` remain across squish/).

server.py: 78 handlers narrowed to explicit exception tuples + module logger
(_LOG); 5 genuine top-level boundaries kept as logged `# noqa: BLE001` catch-alls
(SSE stream consumers, the token-generation worker thread, agent tool execution,
and the background preload thread) where a crash would kill the request/server.

Bug fixes surfaced by narrowing (previously masked by the broad catches):
- _configure_blazing_mode called ChipDetector.detect() on the class; detect() is
  an instance method — fixed to ChipDetector().detect().
- _warmup_model and the sparse-ffn mask loader are best-effort startup steps that
  must never crash boot; kept as logged boundaries (the sparse-ffn handler retains
  its `_e82b` marker required by the wave125 guard test).

squishd._messages_to_prompt: chat templates are arbitrary Jinja, so any render
failure must fall back to plain concatenation — kept as a logged boundary.

Test: simulate "mlx-lm not installed" with the realistic
importlib.metadata.PackageNotFoundError (a subclass of ImportError the probe
catches) instead of a generic Exception.

Full suite: 2691 passed, 109 skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
@konjoinfinity konjoinfinity changed the title Phase 1 repo review: gate MLX import, log silent failures, fix agent tool map, honest size gate Repo review: python-conventions except sweep, MLX import gate, agent tool fix, honest size gate Jun 17, 2026
claude added 2 commits June 17, 2026 14:24
…narrowing

CI surfaced sites the sweep over-narrowed — code paths designed to fall back or
exit cleanly on ANY failure, with tests that mock a generic Exception as the spec.
Restored these to logged `# noqa: BLE001` boundaries (control flow unchanged):

- quant/compressed_loader._configure_metal_memory: add TypeError — mlx
  set_memory_limit() signature drift (`relaxed=` kwarg removed in mlx 0.31.x)
  raised TypeError at module import, breaking macOS test collection.
- server._apply_chat_template (x2), serving/ollama_compat: chat templates are
  arbitrary Jinja; any render failure must fall back to manual formatting.
- server /sys-stats RSS probe: best-effort stats must not 500 on probe failure.
- backend.load_tensors: any torch-load failure falls back to the numpy loader.
- serving/memory_governor._run: docstring contract is "empty string on any
  failure" (incl. IndexError on empty argv).
- convert.py + cli.cmd_convert_model: any conversion failure cleans partial
  output / exits cleanly via _die (mlx_lm.convert can raise anything).

Local CI-mode run (CI=1): 4006 passed, 127 skipped. Zero non-boundary
`except Exception` remain across squish/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
_ActivationHook accepts either an MLX array or a numpy array. On macOS (real
mlx) a numpy input makes `x.astype(mx.float32)` raise TypeError ("Cannot
interpret 'mlx.core.float32' as a data type"); the original `except Exception`
caught it and fell back to `np.asarray(x, dtype=np.float32)`. The sweep's
narrowed tuple omitted TypeError, so the numpy path failed on macOS only (on
Linux the `import mlx.core` raises ImportError first, masking it). Add TypeError
to the explicit tuple.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants