feat(encoding): numeric compression levels (1-22) API#63
Conversation
- add CompressionLevel::Level(i32) and canonical from_level mapping - port level table, source-size hinting, and matcher sizing behavior - align CLI parsing/range validation and update docs/tests
📝 Walkthrough

Adds C zstd–compatible numeric compression levels via `CompressionLevel::Level(i32)` and `from_level()`.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as CLI/User
    participant Encoder as StreamingEncoder
    participant Matcher as MatchGenerator
    participant Compressor as FrameCompressor
    CLI->>Encoder: set_source_size_hint(size)
    activate Encoder
    Encoder->>Matcher: set_source_size_hint(size)
    activate Matcher
    Matcher->>Matcher: store hint for next reset
    deactivate Matcher
    deactivate Encoder
    CLI->>Encoder: compress(data, level, store)
    activate Encoder
    Encoder->>Matcher: reset(level)
    activate Matcher
    Matcher->>Matcher: resolve LevelParams<br/>apply source_size_hint<br/>clamp window_log
    deactivate Matcher
    Encoder->>Compressor: compress_block_encoded(...)
    activate Compressor
    Compressor->>Matcher: get_next_space()
    Matcher-->>Compressor: buffer sized by adjusted window
    deactivate Compressor
    Encoder-->>CLI: emit compressed frame (header window adjusted)
    deactivate Encoder
```
```mermaid
sequenceDiagram
    participant Caller as Caller
    participant API as CompressionLevel::from_level
    participant Table as Level Parameter Table
    Caller->>API: from_level(7)
    activate API
    API->>Table: lookup level 7
    activate Table
    Table-->>API: LevelParams{matcher: Lazy2, window_log: X, ...}
    deactivate Table
    API-->>Caller: CompressionLevel::Better
    deactivate API
    Caller->>API: from_level(15)
    activate API
    API->>Table: lookup level 15
    activate Table
    Table-->>API: LevelParams{matcher: BtLazy2, window_log: Y, ...}
    deactivate Table
    API-->>Caller: CompressionLevel::Level(15)
    deactivate API
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks: ✅ 5 passed

Codecov Report: ❌ Patch coverage is
@coderabbitai re-review

✅ Actions performed: Full review triggered.
Pull request overview
Adds a C zstd-compatible numeric compression level API (including negative “ultra-fast” levels) and introduces source-size-aware tuning so small inputs can avoid oversized windows/tables, with corresponding CLI and test updates.
Changes:
- Introduces `CompressionLevel::Level(i32)` + `CompressionLevel::from_level(i32)` and a 1–22 level parameter table (plus negative levels).
- Adds `set_source_size_hint()` plumbing through `Matcher`, `FrameCompressor`, and `StreamingEncoder`, and forwards pledged content size as a matcher hint.
- Updates CLI to accept numeric levels (including negative) and adds `--store`; expands roundtrip tests for new behavior.
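The level-to-variant mapping described above can be sketched as a standalone model. This is an illustrative sketch, not the crate's actual definition: only the mappings confirmed in this PR (`Fastest` = 1, `Default` = 3 and 0, `Better` = 7, `Best` = 11, everything else `Level(n)`) are assumed.

```rust
// Hypothetical model of the numeric-level mapping; the real LevelParams table
// and enum live in zstd/src/encoding and may carry more variants.
#[derive(Debug, PartialEq)]
enum CompressionLevel {
    Fastest,
    Default,
    Better,
    Best,
    Level(i32),
}

impl CompressionLevel {
    /// Map a C-zstd-style numeric level to a canonical variant.
    /// Level 0 means "default" (level 3), matching C zstd semantics.
    fn from_level(level: i32) -> Self {
        match level {
            0 | 3 => CompressionLevel::Default,
            1 => CompressionLevel::Fastest,
            7 => CompressionLevel::Better,
            11 => CompressionLevel::Best,
            // Other levels, including negatives, stay numeric.
            n => CompressionLevel::Level(n),
        }
    }
}

fn main() {
    assert_eq!(CompressionLevel::from_level(7), CompressionLevel::Better);
    assert_eq!(CompressionLevel::from_level(15), CompressionLevel::Level(15));
    assert_eq!(CompressionLevel::from_level(0), CompressionLevel::Default);
    println!("ok");
}
```

This matches the second sequence diagram, where `from_level(7)` canonicalizes to `Better` while `from_level(15)` stays `Level(15)`.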
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| zstd/src/encoding/mod.rs | Extends CompressionLevel and Matcher with numeric levels + source-size hint hook. |
| zstd/src/encoding/match_generator.rs | Implements level parameter resolution, source-size-based parameter capping, and negative-level handling. |
| zstd/src/encoding/frame_compressor.rs | Exposes set_source_size_hint() and supports Level(_) in encoded compression path. |
| zstd/src/encoding/streaming_encoder.rs | Forwards pledged size/hints to matcher; supports Level(_) across streaming paths; adds tests. |
| zstd/src/tests/roundtrip_integrity.rs | Adds comprehensive roundtrip coverage for numeric levels, clamping, negative levels, and size-hint behavior. |
| cli/src/main.rs | Switches --level to i32 numeric levels with bounds checking; adds --store; forwards source size hint. |
| README.md | Documents numeric levels and usage examples. |
@coderabbitai re-review

✅ Actions performed: Full review triggered.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'structured-zstd vs C FFI'.
The benchmark result of this commit is worse than the previous benchmark result, exceeding the 1.15 threshold.
| Benchmark suite | Current: 88f0576 | Previous: bbd826e | Ratio |
|---|---|---|---|
| `compress/default/small-1k-random/matrix/pure_rust` | 6.186 ms | 4.409 ms | 1.40 |
| `compress/default/small-10k-random/matrix/pure_rust` | 7.582 ms | 5.932 ms | 1.28 |
| `compress/default/low-entropy-1m/matrix/pure_rust` | 10.553 ms | 8.754 ms | 1.21 |
| `compress/default/large-log-stream/matrix/c_ffi` | 3.905 ms | 3.324 ms | 1.17 |
| `compress/best/large-log-stream/matrix/c_ffi` | 12.561 ms | 10.602 ms | 1.18 |
| `decompress/fastest/high-entropy-1m/c_stream/matrix/c_ffi` | 0.035 ms | 0.026 ms | 1.35 |
| `decompress/default/high-entropy-1m/c_stream/matrix/c_ffi` | 0.035 ms | 0.027 ms | 1.30 |
| `decompress/better/high-entropy-1m/c_stream/matrix/c_ffi` | 0.035 ms | 0.026 ms | 1.35 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
@coderabbitai re-review

✅ Actions performed: Full review triggered.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Around line 49-50: Remove the duplicated checklist entry by consolidating the
two bullets into one: keep the existing "Numeric levels `0` (default), `1–22`,
and negative ultra-fast levels via `CompressionLevel::from_level(n)` (C zstd
compatible numbering)" bullet and delete the separate "Negative levels for
ultra-fast compression" bullet so the README checklist only mentions negative
levels once.
In `@zstd/src/encoding/match_generator.rs`:
- Around line 402-403: The Simple backend's reuse of pooled SuffixStore buffers
doesn't validate that a reused store's slot count matches the current block
size: when self.slice_size is reduced (set by
self.base_slice_size.min(max_window_size) and self.reported_window_size =
max_window_size) a previously pooled SuffixStore may be too small for the new
space.len(), causing high collision rates; update the reuse logic that pulls
from suffix_pool (and the analogous code at the other occurrence) to check the
chosen pool entry's capacity/slot count against space.len() and recreate a fresh
SuffixStore when the pooled store's slot count is smaller than space.len(),
ensuring we only reuse pools that are at least as large as the current space.
- Around line 76-79: The reset() path currently derives each level's hash_bits
via dfast_hash_bits_for_window(max_window_size), which changes defaults when no
source-size hint is provided; instead, only shrink the Dfast table when a hint
exists. In reset(), gate the call to dfast_hash_bits_for_window behind
hint.is_some() (or otherwise only apply the smaller value when hint.is_some()),
and otherwise set hash_bits to the documented default DFAST_HASH_BITS
(respecting MIN_WINDOW_LOG where applicable). Apply the same guard to the other
occurrences noted (around the lines referenced) so unhinted resets preserve the
original DFAST defaults.
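The pooled-store capacity guard asked for in the first `match_generator.rs` comment can be modeled in isolation. This is a sketch with a stand-in `SuffixStore`; the crate's real type and fields may differ.

```rust
// Stand-in for the crate's SuffixStore: a flat slot table sized to the block.
struct SuffixStore {
    slots: Vec<u32>,
}

impl SuffixStore {
    fn with_capacity(len: usize) -> Self {
        SuffixStore { slots: vec![0; len] }
    }
}

/// Reuse a pooled store only when it is at least as large as the current
/// space; otherwise allocate a fresh one so a shrunken hinted frame cannot
/// leave an undersized table behind for the next full-size frame.
fn take_suffix_store(pool: &mut Vec<SuffixStore>, space_len: usize) -> SuffixStore {
    match pool.pop() {
        Some(store) if store.slots.len() >= space_len => store,
        _ => SuffixStore::with_capacity(space_len),
    }
}

fn main() {
    // A 1 KiB store left over from a hinted frame must not serve a 4 KiB block.
    let mut pool = vec![SuffixStore::with_capacity(1024)];
    let store = take_suffix_store(&mut pool, 4096);
    assert_eq!(store.slots.len(), 4096);

    // A sufficiently large pooled store is reused as-is.
    let mut pool = vec![SuffixStore::with_capacity(4096)];
    let store = take_suffix_store(&mut pool, 1024);
    assert!(store.slots.len() >= 1024);
    println!("ok");
}
```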
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6d342997-4603-4877-a6a5-1ca2e012f54e
📒 Files selected for processing (2)
README.md, zstd/src/encoding/match_generator.rs
Actionable comments posted: 1
♻️ Duplicate comments (2)
zstd/src/encoding/match_generator.rs (2)
563-566: ⚠️ Potential issue | 🟠 Major — Reject undersized pooled `SuffixStore`s when the slice grows again.

`get_next_space()` now resizes pooled `Vec<u8>` buffers correctly, but lines 563-566 still reuse whatever `suffix_pool.pop()` returns. After a 1 KiB hinted frame, the next full-size Fastest/negative-level frame can index a much larger block with that 1 KiB suffix table, which stays correct but tanks match quality. Recreate the store when `store.slots.len() < space.len()`.

🔧 Suggested capacity check for `suffix_pool` reuse

```diff
-        let suffixes = self
-            .suffix_pool
-            .pop()
-            .unwrap_or_else(|| SuffixStore::with_capacity(space.len()));
+        let suffixes = match self.suffix_pool.pop() {
+            Some(store) if store.slots.len() >= space.len() => store,
+            _ => SuffixStore::with_capacity(space.len()),
+        };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/match_generator.rs` around lines 563 - 566, The pooled SuffixStore popped from self.suffix_pool may be undersized for the new space; update the reuse logic around the suffixes assignment (the pop/unwrap_or_else block that currently constructs suffixes) to check the popped store's capacity/slot count (e.g., store.slots.len()) against space.len() and if store.slots.len() < space.len() create a new SuffixStore::with_capacity(space.len()) instead of reusing the undersized store; keep using self.suffix_pool.pop() when the capacity is sufficient so we only recreate when necessary.
359-360: ⚠️ Potential issue | 🟠 Major — Keep unhinted Dfast levels on the documented default table size.

Line 424 still derives `hash_bits` from `max_window_size` even when line 359 consumed no hint, so plain `CompressionLevel::Level(2)` now gets a 19-bit Dfast table while levels 3/4 get 20 bits. That changes the default Dfast profile on the non-hinted path; if levels 2–4 are still supposed to share sizing until explicit per-level tuning exists, keep `DFAST_HASH_BITS` unless a source-size hint was supplied, or make `hash_bits` an explicit `LevelParams` field.

🔧 One safe way to keep shrinking on the hint path only

```diff
 fn reset(&mut self, level: CompressionLevel) {
     let hint = self.source_size_hint.take();
+    let hinted = hint.is_some();
     let params = Self::level_params(level, hint);
     let max_window_size = 1usize << params.window_log;
@@
     let dfast = self
         .dfast_match_generator
         .get_or_insert_with(|| DfastMatchGenerator::new(max_window_size));
     dfast.max_window_size = max_window_size;
     dfast.lazy_depth = params.lazy_depth;
-    dfast.set_hash_bits(dfast_hash_bits_for_window(max_window_size));
+    dfast.set_hash_bits(if hinted {
+        dfast_hash_bits_for_window(max_window_size)
+    } else {
+        DFAST_HASH_BITS
+    });
     let vec_pool = &mut self.vec_pool;
     dfast.reset(|mut data| {
```

Also applies to: 423-424
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/match_generator.rs` around lines 359 - 360, The code currently derives hash_bits from max_window_size in Self::level_params even when no source_size_hint was provided (because self.source_size_hint was consumed earlier), causing unhinted CompressionLevel::Level(2) to get an unexpectedly larger Dfast table; change level parameter logic so that when no hint was supplied (i.e., source_size_hint is None) level_params uses the documented default DFAST_HASH_BITS instead of computing from max_window_size—either by adding an explicit flag/field to LevelParams indicating whether a hint was present or by making hash_bits an explicit LevelParams field and only compute/override it when source_size_hint.is_some(); update functions referencing hash_bits (e.g., level_params and any callers that use LevelParams.hash_bits) to follow this contract so unhinted levels retain the default DFAST_HASH_BITS.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/mod.rs`:
- Around line 91-110: Update the CompressionLevel docs to explain that
Level(i32) values above 11 (e.g., Level(12..=22)) exceed the named preset
hierarchy so “Best” is only the best named preset, not the absolute
highest-ratio mode, and add the same HC/hash-chain caveat already present on
Better/Best: large numeric levels rely on the lazy2 hash-chain backend and are
subject to the ~4 GiB + window limitation caused by the early-return guard in
HcMatchGenerator::insert_position (abs_pos >= u32::MAX) until table rebasing is
implemented; mirror the wording and placement used for Better/Best so readers
understand Level(n) may be stronger but not safer for very large streams.
---
Duplicate comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 563-566: The pooled SuffixStore popped from self.suffix_pool may
be undersized for the new space; update the reuse logic around the suffixes
assignment (the pop/unwrap_or_else block that currently constructs suffixes) to
check the popped store's capacity/slot count (e.g., store.slots.len()) against
space.len() and if store.slots.len() < space.len() create a new
SuffixStore::with_capacity(space.len()) instead of reusing the undersized store;
keep using self.suffix_pool.pop() when the capacity is sufficient so we only
recreate when necessary.
- Around line 359-360: The code currently derives hash_bits from max_window_size
in Self::level_params even when no source_size_hint was provided (because
self.source_size_hint was consumed earlier), causing unhinted
CompressionLevel::Level(2) to get an unexpectedly larger Dfast table; change
level parameter logic so that when no hint was supplied (i.e., source_size_hint
is None) level_params uses the documented default DFAST_HASH_BITS instead of
computing from max_window_size—either by adding an explicit flag/field to
LevelParams indicating whether a hint was present or by making hash_bits an
explicit LevelParams field and only compute/override it when
source_size_hint.is_some(); update functions referencing hash_bits (e.g.,
level_params and any callers that use LevelParams.hash_bits) to follow this
contract so unhinted levels retain the default DFAST_HASH_BITS.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: bd1c5550-fd80-4298-9c10-c14dacdad625
📒 Files selected for processing (7)
README.md, cli/src/main.rs, zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs, zstd/src/encoding/streaming_encoder.rs, zstd/src/tests/roundtrip_integrity.rs
@coderabbitai full review
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 968-993: The test
source_size_hint_does_not_shrink_window_below_dictionary_history should also
perform a decode roundtrip to ensure the produced frame is actually decodable;
after using FrameCompressor (set_dictionary, set_source, set_drain, compress)
and reading the frame header via crate::decoding::frame::read_frame_header,
invoke the crate's decoder with the same crate::decoding::Dictionary used
earlier to decompress output.as_slice() and assert the decompressed bytes equal
the original payload; update the test to call the existing project decode helper
(the same approach used in the companion test) or the appropriate decode API so
the test both verifies the advertised window and that the frame roundtrips
correctly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: cb5c60f9-d99a-46a9-bdfb-a41d8cf6add8
📒 Files selected for processing (1)
zstd/src/encoding/frame_compressor.rs
@coderabbitai full review

✅ Actions performed: Full review triggered.

@coderabbitai full review

✅ Actions performed: Full review triggered.

@coderabbitai full review

✅ Actions performed: Full review triggered.

@coderabbitai full review

✅ Actions performed: Full review triggered.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cli/src/main.rs (1)
132-148: ⚠️ Potential issue | 🟠 Major — Use metadata length as a hint, not a hard pledge.

Line 133 reads an advisory size from filesystem metadata, and line 147 turns it into a fatal contract. This breaks readable pseudo-files (e.g., `/proc/self/status` reports size 0 but reads 64+ bytes) and any input that changes while being compressed: the first extra byte fails with `write would exceed pledged content size`. For a path-based CLI, `set_source_size_hint()` is safer since it only optimises matcher parameters without enforcing an exact byte count.

🔧 Minimal fix

```diff
-            encoder
-                .set_pledged_content_size(source_size as u64)
-                .wrap_err("failed to configure pledged content size")?;
+            encoder
+                .set_source_size_hint(source_size as u64)
+                .wrap_err("failed to configure source size hint")?;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/src/main.rs` around lines 132 - 148, The code currently reads file metadata into source_size and then calls StreamingEncoder::set_pledged_content_size(...), which enforces an exact byte count and breaks on pseudo-files or growing inputs; change that to use StreamingEncoder::set_source_size_hint(source_size as u64) so the size is only a performance hint (remove the wrap_err on the pledged call). Locate the creation of encoder in the closure (structured_zstd::encoding::StreamingEncoder::new) and replace the set_pledged_content_size call with set_source_size_hint on the same encoder variable; keep source_file, source_size, encoder_input and create_temporary_output_file usage the same. Ensure any error handling around the old pledged call is removed or adjusted since the hint setter does not need to be treated as a fatal error.
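The hint-vs-pledge distinction driving this comment can be modeled with a toy encoder. The `Encoder` below is a stand-in, not the crate's `StreamingEncoder`; it only mimics the contract the review describes (a hint tunes sizing and never fails, a pledge rejects excess writes).

```rust
use std::io;

// Toy model: tracks a pledged size (hard contract) vs. a size hint (advisory).
struct Encoder {
    pledged: Option<u64>,
    hint: Option<u64>,
    written: u64,
}

impl Encoder {
    fn new() -> Self {
        Encoder { pledged: None, hint: None, written: 0 }
    }

    /// Advisory: tunes matcher/window sizing, never fails a later write.
    fn set_source_size_hint(&mut self, size: u64) {
        self.hint = Some(size);
    }

    /// Contract: writes beyond the pledged size are rejected.
    fn set_pledged_content_size(&mut self, size: u64) {
        self.pledged = Some(size);
    }

    fn write(&mut self, len: u64) -> io::Result<()> {
        if let Some(p) = self.pledged {
            if self.written + len > p {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidInput,
                    "write would exceed pledged content size",
                ));
            }
        }
        self.written += len;
        Ok(())
    }
}

fn main() {
    // A pseudo-file like /proc/self/status can report size 0 yet yield bytes:
    // pledging the metadata size makes the very first write fail...
    let mut pledged = Encoder::new();
    pledged.set_pledged_content_size(0);
    assert!(pledged.write(64).is_err());

    // ...while a hint of the same value stays purely advisory.
    let mut hinted = Encoder::new();
    hinted.set_source_size_hint(0);
    assert!(hinted.write(64).is_ok());
    println!("ok");
}
```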
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 1525-1528: Rename the method hash_bits(&self, value: u64) to a
clearer name (e.g., compute_hash_bits or hash_index) to avoid collision with the
field hash_bits: usize; change the function signature accordingly and update
every call site that invokes the method (calls like self.hash_bits(value) or
hash_bits(self, value)) to the new name, and ensure the method body still
references self.hash_bits (the field) for the shift amount.
In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 1019-1052: Add a small regression case after writing to the
StreamingEncoder in the existing source_size_hint_directly_reduces_window_header
test that calls StreamingEncoder::set_source_size_hint() once data has been
written and asserts it returns an error with io::ErrorKind::InvalidInput
(matching the existing late-call regression for set_pledged_content_size); place
this check after with_hint.write_all(...) and before finish(), referencing
StreamingEncoder::set_source_size_hint, write_all, and set_pledged_content_size
so the post-write rejection behavior is enforced and cannot drift.
---
Outside diff comments:
In `@cli/src/main.rs`:
- Around line 132-148: The code currently reads file metadata into source_size
and then calls StreamingEncoder::set_pledged_content_size(...), which enforces
an exact byte count and breaks on pseudo-files or growing inputs; change that to
use StreamingEncoder::set_source_size_hint(source_size as u64) so the size is
only a performance hint (remove the wrap_err on the pledged call). Locate the
creation of encoder in the closure
(structured_zstd::encoding::StreamingEncoder::new) and replace the
set_pledged_content_size call with set_source_size_hint on the same encoder
variable; keep source_file, source_size, encoder_input and
create_temporary_output_file usage the same. Ensure any error handling around
the old pledged call is removed or adjusted since the hint setter does not need
to be treated as a fatal error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3133f5b6-c129-493a-a61c-8929ccc2b8f8
📒 Files selected for processing (7)
README.md, cli/src/main.rs, zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs, zstd/src/encoding/streaming_encoder.rs, zstd/src/tests/roundtrip_integrity.rs
- use source_size_hint in CLI instead of hard pledged size from metadata
- rename dfast hash helper to hash_index for clearer field/method separation
- pin late set_source_size_hint InvalidInput behavior after write
@coderabbitai Fixed outside-diff note in
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 2196-2202: Replace the loose inequality check with a deterministic
assertion pinning the hinted Dfast table size: instead of asserting
hinted_tables < full_tables, assert that driver.dfast_matcher().short_hash.len()
equals the expected size computed by calling
dfast_hash_bits_for_window(MIN_WINDOW_LOG) (or the equivalent constant value for
MIN_WINDOW_LOG) so regressions in dfast_hash_bits_for_window() are caught.
In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 104-109: Update the doc comment for set_source_size_hint to
describe its visible effects: state that besides tuning matcher parameters it
can also reduce the emitted frame window and influence block sizing (thus
changing the layout of emitted blocks/frames), and note it must be called before
the first write and that it is different from set_pledged_content_size;
reference set_source_size_hint, set_pledged_content_size, and write so reviewers
can locate the docs and the related behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a253bb40-d7e1-4ccf-9b54-bbb91762387f
📒 Files selected for processing (3)
cli/src/main.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/streaming_encoder.rs
- keep dictionary source-size hint scoped to payload bytes in FrameCompressor
- relax dictionary window assertions to roundtrip + nonincreasing-window guarantees
- pin dfast hinted table size and document visible set_source_size_hint effects
@coderabbitai re-review

✅ Actions performed: Review triggered.
♻️ Duplicate comments (1)
zstd/src/encoding/frame_compressor.rs (1)
163-170: ⚠️ Potential issue | 🟡 Minor — Update the doc comment to match the new payload-only hint semantics.

Lines 168-170 still describe the old `dict + payload` behavior, but lines 190-193 now forward only `size_hint`, and the new dictionary tests expect a non-increasing advertised window. As written, this public contract promises a larger window than callers will actually get.

📝 Suggested doc fix

```diff
-    /// When dictionary priming is active, the effective matcher hint also
-    /// includes primed dictionary history bytes in addition to frame payload
-    /// bytes so the advertised window can cover both.
+    /// When dictionary priming is active, this hint still refers to frame
+    /// payload bytes only. Dictionary history is applied separately during
+    /// priming and does not expand the advertised frame window.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/frame_compressor.rs` around lines 163 - 170, The doc comment for the frame size hint still claims the hint includes primed dictionary bytes, but the implementation now forwards only size_hint; update the comment above the Provide a hint... block to state that the hint is payload-only (referring to size_hint) and does NOT include any primed dictionary history, and that the advertised window is computed from that payload-only size_hint (so callers may observe a non-increasing advertised window when a dictionary is present). Mention the symbol size_hint and the frame compressor behavior (advertised window selection) so readers know the contract matches the new forwarding logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 163-170: The doc comment for the frame size hint still claims the
hint includes primed dictionary bytes, but the implementation now forwards only
size_hint; update the comment above the Provide a hint... block to state that
the hint is payload-only (referring to size_hint) and does NOT include any
primed dictionary history, and that the advertised window is computed from that
payload-only size_hint (so callers may observe a non-increasing advertised
window when a dictionary is present). Mention the symbol size_hint and the frame
compressor behavior (advertised window selection) so readers know the contract
matches the new forwarding logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6ab25ddc-cfba-4e85-898b-9d03286e0989
📒 Files selected for processing (3)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/streaming_encoder.rs
@coderabbitai Fixed duplicate note in

(ミ・ᴥ・ミ) 🐇 noted!
Summary
- `CompressionLevel::Level(i32)` variant and `from_level()` constructor for a C zstd compatible numeric level API
- `set_source_size_hint()` on the `Matcher` trait, `FrameCompressor`, and `StreamingEncoder` — caps matcher/window parameters from `ceil(log2(src_size))` with a `MIN_WINDOW_LOG` floor
- `--store` flag for uncompressed frames

Technical Details
Each numeric level resolves to a `LevelParams` struct containing the matcher backend, window log, hash fill step, lazy depth, and hash-chain config. Named variants map exactly to their numeric equivalents: `Fastest` = 1, `Default` = 3, `Better` = 7, `Best` = 11. Level 0 = default (level 3), matching C zstd semantics.

For small inputs, `set_source_size_hint()` adjusts `window_log` and HC table sizes proportionally to source size, including `src_size == 0` (clamped to minimum window), avoiding multi-MiB allocations for tiny payloads. `StreamingEncoder::set_pledged_content_size()` also forwards the pledged size as a matcher hint.

Window sizes are capped at 64 MiB (window_log = 26) to stay within the decoder's 100 MiB limit.
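The sizing rule above can be sketched numerically. Constants here are partly assumed: the 64 MiB cap at `window_log = 26` comes from the description, while `MIN_WINDOW_LOG = 10` is an illustrative floor, not necessarily the crate's value.

```rust
const MIN_WINDOW_LOG: u32 = 10; // assumed floor; the crate's real constant may differ
const MAX_WINDOW_LOG: u32 = 26; // 64 MiB cap stated in the description

/// ceil(log2(src_size)) clamped into [MIN_WINDOW_LOG, MAX_WINDOW_LOG];
/// src_size == 0 falls through to the minimum window.
fn window_log_for_hint(src_size: u64) -> u32 {
    let needed = if src_size <= 1 {
        0
    } else {
        64 - (src_size - 1).leading_zeros() // ceil(log2(src_size)) for src_size > 1
    };
    needed.clamp(MIN_WINDOW_LOG, MAX_WINDOW_LOG)
}

fn main() {
    assert_eq!(window_log_for_hint(0), MIN_WINDOW_LOG); // empty hint clamps to minimum
    assert_eq!(window_log_for_hint(1 << 20), 20);       // 1 MiB input fits a 1 MiB window
    assert_eq!(window_log_for_hint(u64::MAX), MAX_WINDOW_LOG); // huge inputs hit the cap
    println!("ok");
}
```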
Known Limitations
Test Plan
- `from_level(n)` matches named variants for n = 1, 3, 7, 11
- (`MatchGeneratorDriver`)
- `cargo nextest run --workspace`, `cargo build --workspace`, `cargo fmt --all`

Closes #21
Summary by CodeRabbit
New Features
Documentation
Tests
Compatibility Note
`CompressionLevel::Level(i32)` expands the public enum shape and is semver-breaking for downstream crates that exhaustively `match` on `CompressionLevel` without a wildcard arm.