Repo review: python-conventions except sweep, MLX import gate, agent tool fix, honest size gate#57
Merged
Merged
Conversation
…est size gate Phase 1 of the full-repo review. Addresses verified constraint violations, a real agent tool-dispatch bug, and CI gate honesty. Correctness / constraint fixes: - quant/int3_linear: add darwin/arm64 platform guard before MLX imports, mirroring sqint2_linear. Module no longer crashes on direct import under Linux; raises a clear ImportError instead (CLAUDE.md: MLX must be gated). - Replace silent `except Exception: pass`/`return 0` swallows with specific exception types + logging in catalog, local_model_scanner, platform/detector, and platform/platform_router (python-conventions: no silent failures). Agent tool dispatch bug: - tool_name_map pointed VSCode names at backend tools that were never registered (squish_list_directory/run_terminal/rename_file/create_directory), so those client tool calls failed. Remap list_directory->squish_list_dir, run_terminal->squish_run_shell, rename_file->squish_move_file; add a real squish_create_directory builtin; drop the dangling get_workspace_info entry. - Add a regression test asserting every mapped backend name is registered. Perf quick win: - scheduler: cache the 64-token prefix hash on _Request instead of recomputing it on every batch-formation pass. Docs / gate honesty: - File-size gate is now blocking for new files, with current oversized files grandfathered in .konjo/oversized-allowlist.txt (was continue-on-error, so it never blocked despite the docs). Update CLAUDE.md and fix the stale MODULES.md version (v9.34.1 -> v9.34.2) and a dangling doc reference. Full suite: 2691 passed, 109 skipped (MLX/platform-gated). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…fic types + logging Part of the repo-wide python-conventions sweep (no bare `except:`/`except Exception:`). Narrows ~34 sites across the serving/daemon/grammar modules to explicit exception tuples and adds logging where failures were silently swallowed. Control flow is unchanged. Four genuine top-level boundaries (daemon connection handler, two GPU worker run-loops, memory-governor callback dispatch) are kept as logged catch-alls marked `# noqa: BLE001` because narrowing there would risk crashing the daemon/worker. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
… types + logging Part of the repo-wide python-conventions sweep. Narrows 35 sites across the quant/kv/io/context modules to explicit exception tuples and adds _LOG.debug at sites that previously swallowed silently. Control flow unchanged; no top-level boundary catch-alls needed. compressed_loader.py gains a module-level `import subprocess` so the handler can reference subprocess.SubprocessError even when the in-try import fails. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
Part of the repo-wide python-conventions sweep. Narrows all 62 handlers in cli.py to explicit exception tuples and adds a module logger (squish.cli): _LOG.debug for optional/probe failures (hardware detection, optional rich UI, best-effort fetches), _LOG.warning where functionality degrades. Control flow, _die() calls, re-raises, and user-facing prints are unchanged; two now-unnecessary `# noqa: BLE001` markers were narrowed and removed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…n with specific types + logging Part of the repo-wide python-conventions sweep. Narrows 31 sites across 17 files (speculative, hardware, platform, catalog, telemetry, experimental, agent, backend, convert, ui, _term, _fast_imports) to explicit exception tuples with logging. Notes: - catalog download paths catch httpx.HTTPError (not an OSError subclass) when httpx is present, preserving the SSL-retry fallback. - _fast_imports.py uses a lazy in-handler logger to avoid cold-start import overhead. - Three genuine boundaries kept as logged `# noqa: BLE001` catch-alls: the layer-overlap background worker (re-raised on the consumer thread), tool_registry tool execution (error surfaced to the agent), and the builtin python_repl sandbox exec (error returned as text). Control flow unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…ing; fix latent bugs Completes the repo-wide python-conventions sweep (zero non-boundary `except Exception` remain across squish/). server.py: 78 handlers narrowed to explicit exception tuples + module logger (_LOG); 5 genuine top-level boundaries kept as logged `# noqa: BLE001` catch-alls (SSE stream consumers, the token-generation worker thread, agent tool execution, and the background preload thread) where a crash would kill the request/server. Bug fixes surfaced by narrowing (previously masked by the broad catches): - _configure_blazing_mode called ChipDetector.detect() on the class; detect() is an instance method — fixed to ChipDetector().detect(). - _warmup_model and the sparse-ffn mask loader are best-effort startup steps that must never crash boot; kept as logged boundaries (the sparse-ffn handler retains its `_e82b` marker required by the wave125 guard test). squishd._messages_to_prompt: chat templates are arbitrary Jinja, so any render failure must fall back to plain concatenation — kept as a logged boundary. Test: simulate "mlx-lm not installed" with the realistic importlib.metadata.PackageNotFoundError (a subclass of ImportError the probe catches) instead of a generic Exception. Full suite: 2691 passed, 109 skipped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
…narrowing CI surfaced sites the sweep over-narrowed — code paths designed to fall back or exit cleanly on ANY failure, with tests that mock a generic Exception as the spec. Restored these to logged `# noqa: BLE001` boundaries (control flow unchanged): - quant/compressed_loader._configure_metal_memory: add TypeError — mlx set_memory_limit() signature drift (`relaxed=` kwarg removed in mlx 0.31.x) raised TypeError at module import, breaking macOS test collection. - server._apply_chat_template (x2), serving/ollama_compat: chat templates are arbitrary Jinja; any render failure must fall back to manual formatting. - server /sys-stats RSS probe: best-effort stats must not 500 on probe failure. - backend.load_tensors: any torch-load failure falls back to the numpy loader. - serving/memory_governor._run: docstring contract is "empty string on any failure" (incl. IndexError on empty argv). - convert.py + cli.cmd_convert_model: any conversion failure cleans partial output / exits cleanly via _die (mlx_lm.convert can raise anything). Local CI-mode run (CI=1): 4006 passed, 127 skipped. Zero non-boundary `except Exception` remain across squish/. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
_ActivationHook accepts either an MLX array or a numpy array. On macOS (real
mlx) a numpy input makes `x.astype(mx.float32)` raise TypeError ("Cannot
interpret 'mlx.core.float32' as a data type"); the original `except Exception`
caught it and fell back to `np.asarray(x, dtype=np.float32)`. The sweep's
narrowed tuple omitted TypeError, so the numpy path failed on macOS only (on
Linux the `import mlx.core` raises ImportError first, masking it). Add TypeError
to the explicit tuple.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
This was referenced Jun 17, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Full-repo review hardening pass. Two parts:
except Exceptionsweep — eliminates every non-boundary broad catch, which the project's own.claude/rules/python-conventions.mdforbids ("No bareexcept:orexcept Exception:— catch specific exceptions; log and re-raise").Final state: 2691 passed, 109 skipped (MLX/platform-gated),
ruff checkclean acrosssquish/, zero non-boundaryexcept Exceptionremaining.Part 1 — correctness / constraint fixes
quant/int3_linear.pyimportedmlx.core/mlx.nnat module top level (violates "MLX must be gated, never imported on Linux paths"). Added the darwin/arm64 guard mirroringsqint2_linear.py; clearImportErroron Linux instead of a crash.agent/tool_name_map.pymapped VSCode tool names to backend tools that were never registered (squish_list_directory/run_terminal/rename_file/create_directory), so those client calls silently failed. Remapped to the real tools, added a genuinesquish_create_directorybuiltin, dropped the deadget_workspace_infoentry, and added a regression test asserting every mapping resolves to a registered tool._configure_blazing_modecalledChipDetector.detect()on the class (it's an instance method) → fixed toChipDetector().detect().serving/scheduler.pycaches the 64-token prefix hash on_Requestinstead of recomputing it every batch-formation pass.continue-on-error(never blocked despite the docs). Now blocking for new files, with the 39 current oversized files grandfathered in.konjo/oversized-allowlist.txt. Fixed staleMODULES.mdversion (v9.34.1 → v9.34.2) and a danglingCLAUDE.mdreference.Part 2 —
except Exceptionsweep (~278 sites, 37 files)Each broad catch was narrowed to an explicit exception tuple matching what the guarded block can raise, bound
as exc, and logged (_LOG.debugfor expected/optional/probe/cleanup failures,_LOG.warningwhen functionality degrades). Control flow was preserved everywhere — same fallbacks, returns, re-raises.12 genuine top-level boundaries were intentionally kept as logged catch-alls marked
# noqa: BLE001, where narrowing would risk crashing a server/daemon/worker or breaking a must-degrade-gracefully contract: FastAPI/SSE stream consumers, the token-generation worker thread, the background preload thread, the daemon connection handler, two GPU worker run-loops, the memory-governor callback dispatch, the layer-overlap background worker, tool-registry tool execution, thepython_replsandboxexec(), the startup warm-up, the sparse-ffn mask loader, and the squishd chat-template fallback (arbitrary Jinja).Module loggers were added where missing and existing ones reused (no duplicates).
_fast_imports.pyuses a lazy in-handler logger to avoid cold-start import overhead.Deferred (planned, not in this PR)
BLE001now that the tree is clean; MLC backend decision; catalog fetch retry/backoff.🤖 Generated with Claude Code