
fix: HQQ decode (group-size + axis-1), sink util_fraction, + catalog BLE001 lint break #65

Merged
wesleyscholl merged 1 commit into main from claude/bugfix-sweep-round3 on Jun 18, 2026


@konjoinfinity

Collaborator

Summary

Third internal correctness sweep. Three verified bugs (each Linux-reproduced) plus a pre-existing lint break on main. +15 regression tests, full suite green, ruff check squish/ clean.

Fixes

1. quant/hqq.py: decode recomputed the group size
decode derived group_size = ceil(dim_size / n_groups) instead of using the stored config value. For a dim_size that isn't an exact multiple of group_size, the recomputed size differs from the one used at encode time, so every group misaligned against its scale/zero. Verified on a (4,100) tensor with bits=4, group_size=30: relative error dropped from 0.30 to 0.07. (Same class as the grouped-INT8 bug in #60; this is a recurring pattern worth watching in the quant loaders.)
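The arithmetic behind the misalignment can be sketched in a few lines. This is only an illustration of the group-size mismatch, not the code in quant/hqq.py: `group_bounds` is a hypothetical helper, and the real implementation may pad the last group rather than truncate it.

```python
import math


def group_bounds(dim_size: int, group_size: int) -> list[tuple[int, int]]:
    """Return (start, stop) index pairs for consecutive groups along one axis.

    Hypothetical helper for illustration; the last group is truncated here.
    """
    return [
        (start, min(start + group_size, dim_size))
        for start in range(0, dim_size, group_size)
    ]


dim_size, stored_group_size = 100, 30

# Encode time: groups are cut using the configured group size.
n_groups = math.ceil(dim_size / stored_group_size)
encode_bounds = group_bounds(dim_size, stored_group_size)

# Buggy decode: the group size is re-derived from the group count.
recomputed_group_size = math.ceil(dim_size / n_groups)
decode_bounds_buggy = group_bounds(dim_size, recomputed_group_size)

print(stored_group_size, recomputed_group_size)
print(encode_bounds)
print(decode_bounds_buggy)
```

Both layouts contain four groups, but their boundaries disagree, so each decoded group would be paired with a scale/zero computed over a different slice. When dim_size is an exact multiple of group_size the two sizes coincide, which is why aligned inputs never exposed the problem.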

2. quant/hqq.py: axis=1 was broken end-to-end
encode stores axis=1 codes transposed back to the original shape, but decode never undid that transpose before the group reshape, so axis=1 raised a broadcast error on every input, even with aligned dims. Verified: a (96,4) input with axis=1 raised ValueError and now round-trips at ~0.07 relative error. Found while fixing item 1 (same method).
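A toy round trip shows where the transpose has to sit. This is a minimal sketch using a plain min/max group quantizer, not HQQ's solver; the function names, the grouping direction, and the requirement that the grouped dimension divide evenly are all assumptions made for illustration.

```python
import numpy as np


def encode(w: np.ndarray, group_size: int, bits: int = 4, axis: int = 0):
    """Toy group quantizer (assumed layout, not the HQQ algorithm)."""
    work = w.T if axis == 1 else w            # group along the last dim of `work`
    rows, cols = work.shape
    assert cols % group_size == 0, "sketch only handles aligned dims"
    g = work.reshape(rows, cols // group_size, group_size)
    lo = g.min(axis=-1, keepdims=True)
    hi = g.max(axis=-1, keepdims=True)
    scale = np.where(hi > lo, (hi - lo) / (2**bits - 1), 1.0)
    codes = np.rint((g - lo) / scale).astype(np.uint8).reshape(rows, cols)
    if axis == 1:
        codes = codes.T                       # stored in the original shape
    return codes, scale, lo


def decode(codes: np.ndarray, scale, lo, group_size: int, axis: int = 0):
    """Inverse of encode; undoes the storage transpose before regrouping."""
    work = codes.T if axis == 1 else codes    # the step decode must not skip
    rows, cols = work.shape
    g = work.reshape(rows, cols // group_size, group_size).astype(np.float64)
    out = (g * scale + lo).reshape(rows, cols)
    return out.T if axis == 1 else out


w = np.arange(96 * 4, dtype=np.float64).reshape(96, 4)
codes, scale, lo = encode(w, group_size=32, axis=1)
out = decode(codes, scale, lo, group_size=32, axis=1)
print(codes.shape, out.shape, float(np.max(np.abs(out - w))))
```

Without the first line of `decode`, the codes would be regrouped in the stored (96,4) layout while scale and zero still describe the (4,96) layout, so the shapes no longer line up.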

3. streaming/streaming_sink.py: util_fraction always returned 1.0
min(1.0, n_tokens_seen / max(1, n_tokens_seen)) divides the count by itself, so it returns 1.0 for any nonzero count; the window size never entered the calculation. A cache with window_size=256 that had seen 5 tokens reported 100% utilization. Added a window_size field (populated by get_stats() from the config) and divide by it instead. Verified: 5/256 → 0.0195; an overfull cache clamps to 1.0; an empty one returns 0.0.
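The corrected calculation can be sketched as follows. The class layout is an assumption: the commit message names SinkStats and the window_size field, but whether util_fraction is a property, and how the other fields are declared, is guessed here.

```python
from dataclasses import dataclass


@dataclass
class SinkStats:
    """Illustrative stand-in for the stats object; field layout is assumed."""

    n_tokens_seen: int = 0
    window_size: int = 0  # assumed to be filled in from the cache config

    @property
    def util_fraction(self) -> float:
        # Divide by the window size, not by the token count itself,
        # and clamp so an overfull cache reports at most 1.0.
        return min(1.0, self.n_tokens_seen / max(1, self.window_size))


print(SinkStats(n_tokens_seen=5, window_size=256).util_fraction)
print(SinkStats(n_tokens_seen=300, window_size=256).util_fraction)
print(SinkStats(n_tokens_seen=0, window_size=256).util_fraction)
```

The `max(1, ...)` guard from the original expression is kept, but it now protects against a zero window size rather than cancelling the numerator.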

4. catalog.py: pre-existing BLE001 lint break on main
Two except Exception handlers in the HF repo-file listing were tripping the BLE001 gate (enabled in #58) and failing the lint job on main; they entered via a merge whose own PR diff never showed them. They're documented degrade-gracefully boundaries ([] on any error → local-compression fallback), so they are now marked # noqa: BLE001 with a debug log of the swallowed cause, matching the 12 intentional boundaries from #57. Not part of the bug sweep, but folded in because whole-tree ruff would otherwise fail this PR's lint job too.

Validation

  • CI=1 full suite: 4125 passed, 277 skipped; ruff check squish/ clean (was 2 errors on main).
  • Existing HQQ + streaming-sink suites pass unchanged (no behavior regression).

Note

The third sweep also confirmed no verified bugs in grammar/, milo_quant, aqlm, int3_runtime, prompt_lookup, cli URI handling, or chunked_prefill (the one issue there is MLX-only). The high-yield areas are now well covered — returns are diminishing.

🤖 Generated with Claude Code



Third internal correctness sweep — three verified bugs plus a pre-existing lint
break on main. +15 regression tests.

1. quant/hqq.py — HQQ.decode recomputed group_size as ceil(dim_size/n_groups)
   instead of using the stored config group size. For a dim_size not an exact
   multiple of group_size, every group misaligned against its scale/zero and
   reconstruction error blew up (e.g. (4,100) g=30: relative error 0.30 → 0.07).
   Same class as the grouped-INT8 bug in #60.

2. quant/hqq.py — HQQ.decode never transposed the stored axis=1 codes back
   (encode stores them transposed to the original shape), so axis=1 raised a
   broadcast error end-to-end, even on aligned dims. Transpose on decode.

3. streaming/streaming_sink.py — SinkStats.util_fraction divided n_tokens_seen
   by itself (max(1, n_tokens_seen)), always returning 1.0 for any nonzero
   count. Added a window_size field (populated by get_stats from the config) and
   divide by it, so a cache that has seen 5 of 256 tokens reports ~0.02, not 1.0.

4. catalog.py — two pre-existing `except Exception` handlers in the HF repo-file
   listing tripped the BLE001 gate (enabled in #58) and were failing the lint
   job on main. They are documented degrade-gracefully boundaries ([] on any
   error), so marked `# noqa: BLE001` with a debug log of the swallowed cause.

CI-mode suite: 4125 passed, 277 skipped. ruff check squish/ clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
@wesleyscholl wesleyscholl marked this pull request as ready for review June 18, 2026 13:22
@wesleyscholl wesleyscholl merged commit 452574c into main Jun 18, 2026
18 checks passed
@wesleyscholl wesleyscholl deleted the claude/bugfix-sweep-round3 branch June 18, 2026 13:24

3 participants