fix: HQQ decode (group-size + axis-1), sink util_fraction, + catalog BLE001 lint break #65
Merged
Third internal correctness sweep: three verified bugs plus a pre-existing lint break on main. +15 regression tests.
1. quant/hqq.py: HQQ.decode recomputed group_size as ceil(dim_size/n_groups) instead of using the stored config group size. For a dim_size not an exact multiple of group_size, every group misaligned against its scale/zero and reconstruction error blew up (e.g. (4,100) g=30: relative error 0.30 → 0.07). Same class as the grouped-INT8 bug in #60.
2. quant/hqq.py: HQQ.decode never transposed the stored axis=1 codes back (encode stores them transposed to the original shape), so axis=1 raised a broadcast error end-to-end, even on aligned dims. Transpose on decode.
3. streaming/streaming_sink.py: SinkStats.util_fraction divided n_tokens_seen by itself (max(1, n_tokens_seen)), always returning 1.0 for any nonzero count. Added a window_size field (populated by get_stats from the config) and divide by it, so a cache that has seen 5 of 256 tokens reports ~0.02, not 1.0.
4. catalog.py: two pre-existing `except Exception` handlers in the HF repo-file listing tripped the BLE001 gate (enabled in #58) and were failing the lint job on main. They are documented degrade-gracefully boundaries ([] on any error), so marked `# noqa: BLE001` with a debug log of the swallowed cause.
CI-mode suite: 4125 passed, 277 skipped. ruff check squish/ clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W8bTep4nw7ybFHhx7QjzMv
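The `util_fraction` change in item 3 can be illustrated with a minimal standalone sketch. The class name `SinkStatsSketch`, the `util_fraction_buggy` property, and the `max(1, window_size)` guard are assumptions made for illustration; only the field names `n_tokens_seen` and `window_size` and the old divide-by-itself behavior come from this PR, and the real `SinkStats` may differ.

```python
from dataclasses import dataclass


@dataclass
class SinkStatsSketch:
    """Hypothetical stand-in for SinkStats; not the project's actual class."""

    n_tokens_seen: int = 0
    window_size: int = 0  # assumed to be filled in from the config

    @property
    def util_fraction_buggy(self) -> float:
        # Pre-fix behavior: the count is divided by itself, so any
        # nonzero count yields 1.0 and the window size never enters.
        return min(1.0, self.n_tokens_seen / max(1, self.n_tokens_seen))

    @property
    def util_fraction(self) -> float:
        # Post-fix idea: divide by the window size and clamp to 1.0.
        # The max(1, ...) guard against a zero window is an assumption.
        return min(1.0, self.n_tokens_seen / max(1, self.window_size))


stats = SinkStatsSketch(n_tokens_seen=5, window_size=256)
print(stats.util_fraction_buggy)      # 1.0
print(round(stats.util_fraction, 4))  # 0.0195
```

An overfull cache (for example 300 tokens seen against a window of 256) clamps to 1.0 in this sketch, and an empty one reports 0.0.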
Summary
Third internal correctness sweep. Three verified bugs (each Linux-reproduced) plus a pre-existing lint break on `main`. +15 regression tests, full suite green, `ruff check squish/` clean.
Fixes
1. `quant/hqq.py`: `decode` recomputed the group size
`decode` derived `group_size = ceil(dim_size / n_groups)` instead of using the stored config value. For a `dim_size` that isn't an exact multiple of `group_size`, every group misaligned against its scale/zero. Verified: for shape `(4,100)` with `bits=4 group_size=30`, relative error dropped from 0.30 to 0.07. (Same class as the grouped-INT8 bug in #60; a recurring pattern worth watching in the quant loaders.)
2. `quant/hqq.py`: `axis=1` was broken end-to-end
`encode` stores `axis=1` codes transposed back to the original shape, but `decode` never transposed them again before the group reshape, so `axis=1` raised a broadcast error on every input, even aligned dims. Verified: `(96,4)` with `axis=1` raised `ValueError`, and now round-trips at ~0.07 relative error. Found while fixing item 1 (same method).
3. `streaming/streaming_sink.py`: `util_fraction` always returned 1.0
`min(1.0, n_tokens_seen / max(1, n_tokens_seen))` divides the count by itself, giving `1.0` for any nonzero count; the window size never entered. A cache with `window_size=256` that had seen 5 tokens reported 100% utilization. Added a `window_size` field (populated by `get_stats()` from the config) and divide by it. Verified: 5/256 → `0.0195`; overfull clamps to `1.0`; empty → `0.0`.
4. `catalog.py`: pre-existing `BLE001` lint break on `main`
Two `except Exception` handlers in the HF repo-file listing were tripping the BLE001 gate (enabled in #58) and failing the lint job on `main` (they entered via a merge whose own PR diff never showed them). They're documented degrade-gracefully boundaries (`[]` on any error → local-compression fallback), so they are marked `# noqa: BLE001` with a debug log of the swallowed cause, matching the 12 intentional boundaries from #57. Not part of the bug sweep, but folded in because whole-tree ruff would otherwise fail this PR's lint job too.
Validation
`CI=1` full suite: 4125 passed, 277 skipped; `ruff check squish/` clean (was 2 errors on `main`).
Note
The third sweep also confirmed no verified bugs in grammar/, milo_quant, aqlm, int3_runtime, prompt_lookup, cli URI handling, or chunked_prefill (the one issue there is MLX-only). The high-yield areas are now well covered — returns are diminishing.
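For reviewers who want to see why recomputing the group size misaligns groups, here is a minimal standalone sketch of the arithmetic behind the HQQ `decode` fix. It assumes element `i` along the quantized dimension belongs to group `i // group_size`; the helper names `group_index` and `count_misaligned` are hypothetical, and the real layout in `quant/hqq.py` (for example any padding) is not shown in this PR and may differ.

```python
import math


def group_index(i: int, group_size: int) -> int:
    """Group (and hence scale/zero pair) that element i falls into."""
    return i // group_size


def count_misaligned(dim_size: int, stored_group_size: int) -> int:
    """Count elements that land in a different group when the group size
    is re-derived as ceil(dim_size / n_groups) instead of read from config."""
    n_groups = math.ceil(dim_size / stored_group_size)
    derived_group_size = math.ceil(dim_size / n_groups)
    return sum(
        1
        for i in range(dim_size)
        if group_index(i, stored_group_size) != group_index(i, derived_group_size)
    )


# dim_size=100, group_size=30: 4 groups, derived size ceil(100/4) = 25
print(count_misaligned(100, 30))  # 30
# dim_size=96, group_size=32: exact multiple, derived size stays 32
print(count_misaligned(96, 32))   # 0
```

Under these assumptions the two derivations agree only when `dim_size` is an exact multiple of the stored group size, which is consistent with the bug appearing only on unaligned dims.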
🤖 Generated with Claude Code