feat(encoding): numeric compression levels (1-22) API#61
📝 Walkthrough

Adds numeric Zstandard-compatible compression levels (signed `i32`, 1–22 and negative ultra-fast), plumbing to resolve level parameters, and per-frame source-size hints propagated to matchers/encoders.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as CLI / User
    participant Main as CLI main
    participant Resolver as LevelResolver
    participant Encoder as StreamingEncoder / FrameCompressor
    participant Matcher as MatchGenerator
    participant Backend as CompressionBackend
    CLI->>Main: compress(file, level = n, --store?)
    Main->>Resolver: CompressionLevel::from_level(n)
    alt store == true
        Main->>Encoder: call compress(..., level = Uncompressed, store = true)
        Encoder->>Backend: write stored (uncompressed) blocks
        Backend-->>Encoder: stored output
        Encoder-->>Main: finished
    else store == false
        Main->>Encoder: call compress(..., level = Level(n), store = false)
        Encoder->>Resolver: resolve level params (n)
        Resolver-->>Encoder: LevelParams (backend, window_log, hashes, ...)
        Encoder->>Matcher: set_source_size_hint(size?)
        Encoder->>Matcher: reset(params)
        Matcher->>Backend: configure tables (hash_bits, slice_size, ...)
        Encoder->>Backend: compress_block_encoded(using matcher)
        Backend-->>Encoder: compressed block
        Encoder-->>Main: emit compressed output
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Codecov Report: ❌ Patch coverage is
Pull request overview
Adds a zstd-compatible numeric compression level API to the encoder and updates internal tuning to be level-driven, plus adjusts the CLI/docs to expose numeric levels.
Changes:
- Introduces `CompressionLevel::Level(i32)` plus `CompressionLevel::from_level()` and derives `Debug` for better diagnostics.
- Adds a per-level parameter table (levels 1–22 + negative "ultra-fast") and routes matcher configuration through resolved `LevelParams`.
- Updates encoder/CLI paths to recognize numeric levels and expands tests/docs to cover the new API.
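The normalization described above (level 0 maps to the default, out-of-range values are clamped) can be sketched as a small pure function. This is a hypothetical illustration, not the crate's code; the constant names (`MIN_LEVEL`, `MAX_LEVEL`) follow the PR text, but the real implementation may differ.

```rust
// Hypothetical sketch of numeric level normalization; values follow the PR
// text (MIN_LEVEL = -131072, MAX_LEVEL = 22, default level = 3).
const MIN_LEVEL: i32 = -131_072;
const MAX_LEVEL: i32 = 22;
const DEFAULT_LEVEL: i32 = 3;

fn normalize_level(level: i32) -> i32 {
    if level == 0 {
        DEFAULT_LEVEL // level 0 maps to the default (3), matching C zstd
    } else {
        level.clamp(MIN_LEVEL, MAX_LEVEL) // saturates, so i32::MIN is safe
    }
}

fn main() {
    assert_eq!(normalize_level(0), 3);
    assert_eq!(normalize_level(100), 22);
    assert_eq!(normalize_level(i32::MIN), MIN_LEVEL);
}
```

Because `clamp` saturates, even `i32::MIN` resolves to `MIN_LEVEL` without the overflow that a naive `-level` negation would hit.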
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| zstd/src/encoding/mod.rs | Adds numeric level variant/constructor and documents level semantics/mappings. |
| zstd/src/encoding/match_generator.rs | Implements level parameter table and resolves numeric levels into backend/window/config choices. |
| zstd/src/encoding/streaming_encoder.rs | Treats CompressionLevel::Level(_) as a compressing mode in streaming paths. |
| zstd/src/encoding/frame_compressor.rs | Treats CompressionLevel::Level(_) as a compressing mode and updates debug formatting/tests. |
| zstd/src/tests/roundtrip_integrity.rs | Adds roundtrip + equivalence tests for numeric levels (including negative/out-of-range). |
| cli/src/main.rs | Switches CLI to accept numeric levels (including negative) and maps CLI level to library levels. |
| README.md | Documents numeric level usage and updates feature list accordingly. |
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli/src/main.rs`:
- Around line 37-55: The CLI flag `level: i32` is currently interpreted so that
`compress()` special-cases 0 into `CompressionLevel::Uncompressed`, which makes
`--level 0` emit a raw frame instead of zstd level 0; change the logic so that
`level == 0` maps to the zstd level via `CompressionLevel::from_level(0)` (or
equivalent) and remove the special-case that converts 0 to
`CompressionLevel::Uncompressed`; if you need an uncompressed/raw mode, add a
separate flag (e.g., `--raw` or `--uncompressed`) and have `compress()` check
that flag to return `CompressionLevel::Uncompressed` instead of overloading the
numeric `level` field.
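The reviewer's suggestion — a dedicated flag for raw frames instead of overloading level 0 — boils down to a mapping like the following. This is a hypothetical sketch with a stand-in enum, not the CLI's actual code:

```rust
// Stand-in for the library enum; only the two variants needed here.
#[derive(Debug, PartialEq)]
enum CompressionLevel {
    Uncompressed,
    Level(i32),
}

// Hypothetical mapping implementing the reviewer's suggestion: a separate
// `--store` flag selects the raw frame, so level 0 stays on the numeric path.
fn select_mode(store: bool, level: i32) -> CompressionLevel {
    if store {
        // Raw frame requested explicitly; the numeric level is ignored.
        CompressionLevel::Uncompressed
    } else {
        // Level 0 is no longer special-cased here; from_level(0) would
        // resolve it to the default level downstream.
        CompressionLevel::Level(level)
    }
}

fn main() {
    assert_eq!(select_mode(true, 0), CompressionLevel::Uncompressed);
    assert_eq!(select_mode(false, 0), CompressionLevel::Level(0));
}
```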
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ed87d76e-03be-48a9-96ac-25559edee973
📒 Files selected for processing (7)
README.md, cli/src/main.rs, zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs, zstd/src/encoding/streaming_encoder.rs, zstd/src/tests/roundtrip_integrity.rs
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli/src/main.rs`:
- Around line 51-66: The CLI currently restricts --level to -5..=22 which
prevents using the full CompressionLevel numeric API; update the arg parser for
the level field in main.rs (the #[arg(...)] on level: i32) so it does not
hardcode -5: either remove the .range(...) and accept any i32 then
validate/convert with CompressionLevel::from_level() at compress-time, or set
the range to use the library constants
(CompressionLevel::MIN_LEVEL..=CompressionLevel::MAX_LEVEL) so the parser
accepts the full allowed range; also ensure the --store flag handling (store:
bool) continues to ignore the level value at runtime by short-circuiting before
validating/converting the level when store is true.
In `@zstd/src/encoding/match_generator.rs`:
- Around line 76-113: The resolver always picks a single preset from LEVEL_TABLE
for a given CompressionLevel, ignoring zstd's source-size classes; update the
API and logic so resolve_level_params can consider input size (e.g., add a
src_size parameter or a CompressionSizeHint) and use that hint to choose the
appropriate size-class variant of the level preset (either by consulting a small
mapping of clevel size classes or by selecting alternate entries/adjusted
window_log/target_len for small inputs) before returning LevelParams; modify
call sites to pass the source-size hint and update any tests to cover small vs
large input behavior (refer to LEVEL_TABLE and
resolve_level_params/CompressionLevel for where to apply the change).
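The size-class idea the prompt describes — derive a window bound from the source-size hint before applying the level preset — can be sketched as below. The names (`MIN_WINDOW_LOG`, `hinted_window_log`) are illustrative assumptions, not the crate's API:

```rust
// Sketch of size-aware window selection: cap the preset window_log at
// ceil(log2(src_size)), floored for tiny/empty inputs. Illustrative only.
const MIN_WINDOW_LOG: u32 = 10;

fn hinted_window_log(level_window_log: u32, src_size: Option<u64>) -> u32 {
    match src_size {
        None => level_window_log, // no hint: keep the level preset
        Some(n) => {
            // ceil(log2(n)) via leading_zeros; n == 0 falls through to the floor
            let bits = 64 - n.saturating_sub(1).leading_zeros();
            level_window_log.min(bits.max(MIN_WINDOW_LOG))
        }
    }
}

fn main() {
    assert_eq!(hinted_window_log(24, None), 24);
    assert_eq!(hinted_window_log(24, Some(2 * 1024)), 11); // 2 KiB input
    assert_eq!(hinted_window_log(24, Some(0)), MIN_WINDOW_LOG);
}
```

With this shape, a 2 KiB payload at a high level gets an ~2 KiB window instead of the preset multi-MiB one, which is the small-input behavior the review asks for.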
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a287a2da-3082-4842-9514-0272b8164c32
📒 Files selected for processing (2)
cli/src/main.rs, zstd/src/encoding/match_generator.rs
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cli/src/main.rs (1)
120-145: ⚠️ Potential issue | 🟠 Major — Forward the known input size into `StreamingEncoder`.

This path already reads the file length for progress reporting, but it never calls `set_source_size_hint()` (or `set_pledged_content_size()`), so the CLI still skips the new size-aware window/table selection on the main compression path. Tiny files at higher levels will keep paying the full matcher footprint unnecessarily.

Suggested fix:

```diff
 let mut encoder =
     structured_zstd::encoding::StreamingEncoder::new(temporary_output, compression_level);
+encoder
+    .set_source_size_hint(source_size as u64)
+    .wrap_err("failed to configure source size hint")?;
 std::io::copy(&mut encoder_input, &mut encoder).wrap_err("streaming compression failed")?;
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cli/src/main.rs` around lines 120-145: In compress(), after computing source_size and before streaming data into the encoder, pass the known input size to the zstd encoder so it can use size-aware window/table selection: when you construct the StreamingEncoder (structured_zstd::encoding::StreamingEncoder::new(...)) call set_source_size_hint(source_size) (or set_pledged_content_size(source_size) if that API is used) on the encoder instance (the variable named encoder) before calling std::io::copy(&mut encoder_input, &mut encoder); this forwards the file length you already computed into the encoder and enables the new size-aware behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 121-142: The small-input path adjusts LevelParams in
adjust_params_for_source_size but only clamps HC table logs, leaving the DFAST
tables at full size; update either adjust_params_for_source_size to reduce the
effective hash table size for MatcherBackend::Dfast (e.g. derive a reduced dfast
hash bits value from src_log/params.window_log) or modify
DfastMatchGenerator::ensure_hash_tables to consult the adjusted LevelParams
(instead of always allocating 1 << DFAST_HASH_BITS) and allocate smaller
short/long tables when the source is tiny; reference
adjust_params_for_source_size, LevelParams, MatcherBackend::Dfast,
DfastMatchGenerator::ensure_hash_tables and DFAST_HASH_BITS so the dfast
allocations are proportional to the hinted small-input params.
- Around line 348-350: The resolved window_log can make max_window_size smaller
than the normal block/slice size, so clamp self.slice_size to not exceed
max_window_size when you compute params (use the computed max_window_size to set
self.slice_size = min(self.base_slice_size, max_window_size)) and ensure
commit_space() / get_next_space() honor this reduced slice size when allocating
or reusing pooled vectors; this prevents commit_space() from passing a too-large
block to MatchGenerator::reserve / DfastMatchGenerator::add_data /
HcMatchGenerator::add_data and avoids the assertion failure.
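The clamp this prompt asks for is small but load-bearing: the per-block slice handed to the matcher must never exceed the resolved window, or the `add_data` assertions described above can fire. A minimal sketch (names are assumptions based on the review text):

```rust
// Clamp the per-block slice to the resolved window so commit_space() never
// hands the matchers a block larger than max_window_size. Illustrative only.
fn effective_slice_size(base_slice_size: usize, max_window_size: usize) -> usize {
    base_slice_size.min(max_window_size)
}

fn main() {
    // A 1 KiB hinted window must shrink a 128 KiB default slice.
    assert_eq!(effective_slice_size(128 * 1024, 1024), 1024);
    // A large window leaves the default slice untouched.
    assert_eq!(effective_slice_size(128 * 1024, 8 * 1024 * 1024), 128 * 1024);
}
```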
In `@zstd/src/tests/roundtrip_integrity.rs`:
- Around line 664-672: Update the out_of_range_level_clamped test to also
exercise the i32::MIN edge case: call roundtrip_at_level(&data,
CompressionLevel::from_level(i32::MIN)) and assert equality with data just like
the other cases so the test covers the overflow-saturation path in
CompressionLevel::from_level rather than only large negative values like
-200000; keep the existing positive out-of-range case too.
- Around line 698-730: The test only checks roundtrip and output length but must
also assert that the size hint changed the encoder's emitted window size; after
producing with_hint, parse the frame header (or use the decoder's/window
inspector API) to read the emitted window_size and assert it's smaller than the
full level-11 window (e.g. < 16 * 1024 * 1024) and different from the no_hint
frame; update source_size_hint_reduces_window_for_small_input to call
FrameCompressor::set_source_size_hint, compress, then extract and assert the
frame's window_size() (or decoder.window_size()) reflects the hint so the hint
plumbing is actually exercised.
---
Outside diff comments:
In `@cli/src/main.rs`:
- Around line 120-145: In compress(), after computing source_size and before
streaming data into the encoder, pass the known input size to the zstd encoder
so it can use size-aware window/table selection: when you construct the
StreamingEncoder (structured_zstd::encoding::StreamingEncoder::new(...)) call
set_source_size_hint(source_size) (or set_pledged_content_size(source_size) if
that API is used) on the encoder instance (the variable named encoder) before
calling std::io::copy(&mut encoder_input, &mut encoder); this forwards the file
length you already computed into the encoder and enables the new size-aware
behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3d7de7cb-33f1-4d41-bcc0-47bcb28d6a76
📒 Files selected for processing (6)
cli/src/main.rs, zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs, zstd/src/encoding/streaming_encoder.rs, zstd/src/tests/roundtrip_integrity.rs
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'structured-zstd vs C FFI'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.
| Benchmark suite | Current: 734f748 | Previous: bbd826e | Ratio |
|---|---|---|---|
| compress/fastest/low-entropy-1m/matrix/c_ffi | 0.188 ms | 0.157 ms | 1.20 |
| compress/default/large-log-stream/matrix/pure_rust | 108.565 ms | 92.164 ms | 1.18 |
| compress/best/large-log-stream/matrix/c_ffi | 15.011 ms | 10.602 ms | 1.42 |
| decompress/fastest/high-entropy-1m/c_stream/matrix/c_ffi | 0.035 ms | 0.026 ms | 1.35 |
| decompress/default/high-entropy-1m/c_stream/matrix/c_ffi | 0.036 ms | 0.027 ms | 1.33 |
| decompress/better/high-entropy-1m/c_stream/matrix/c_ffi | 0.035 ms | 0.026 ms | 1.35 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
- Add CompressionLevel::Level(i32) variant and from_level() constructor
- Port C zstd level parameter table (levels 1-22) with per-level window size, hash/chain config, search depth, and lazy depth
- Negative levels (-1..-131072) select ultra-fast mode via Simple backend with progressively coarser hash insertion
- Level 0 maps to default (level 3), matching C zstd semantics
- Named variants map exactly to their numeric equivalents: Fastest=1, Default=3, Better=7, Best=11
- Refactor MatchGeneratorDriver::reset() to use centralized LevelParams instead of per-variant hardcoded constants
- Update CLI to accept zstd-compatible numeric levels (-5..22)
- Add 11 roundtrip tests covering equivalence, all 22 levels, negative levels, monotonic ratio, streaming, and clamping

Closes #21
- Use saturating_abs() instead of (-n) cast to prevent overflow when n == i32::MIN in negative level resolution
- Use CompressionLevel::MIN_LEVEL constant instead of hardcoded 131072
- CLI level 0 now maps to default compression (level 3), consistent with CompressionLevel::from_level(0) and C zstd semantics
- Add --store flag for uncompressed zstd frames (replaces old level 0 = Uncompressed behavior)
- Remove "C zstd numbering" claim from CLI help text since the CLI extends standard numbering with --store
Source-size-aware selection (issue #21 requirement):
- Add Matcher::set_source_size_hint() trait method (default no-op)
- MatchGeneratorDriver stores hint and adjusts window_log + HC table sizes for small inputs, following C zstd source-size-class behavior
- FrameCompressor::set_source_size_hint() passes through to matcher
- StreamingEncoder::set_pledged_content_size() also sets the hint
- StreamingEncoder::set_source_size_hint() for hint without enforcement
- For small inputs, window and hash tables are capped proportionally to source size, avoiding multi-MB allocations for tiny payloads

Review fixes:
- Derive Level(0) index from DEFAULT_LEVEL constant (Copilot #5)
- Tighten test doc comment to match strict assertion (Copilot #6)
- Widen CLI level range to MIN_LEVEL..=MAX_LEVEL (CodeRabbit #7)
- Use saturating_abs() for negative level overflow (Copilot #1, prev)
- Pass source size from CLI into StreamingEncoder before writing
- Resize dfast hash tables based on hinted window and clamp slice size to window
- Strengthen roundtrip tests for i32::MIN clamping and hinted frame window assertions
@coderabbitai Fixed outside-diff item:
Force-pushed 301f37b to 625c1f0.
The encoder block currently reads:

```rust
let mut encoder =
    structured_zstd::encoding::StreamingEncoder::new(temporary_output, compression_level);
std::io::copy(&mut encoder_input, &mut encoder).wrap_err("streaming compression failed")?;
encoder.finish().wrap_err("failed to finalize zstd frame")
```

Once the fix lands, the encoder block should look roughly like:

```rust
let mut encoder =
    structured_zstd::encoding::StreamingEncoder::new(temporary_output, compression_level);
encoder
    .set_source_size_hint(source_size as u64)
    .wrap_err("failed to set source size hint")?;
std::io::copy(&mut encoder_input, &mut encoder).wrap_err("streaming compression failed")?;
encoder.finish().wrap_err("failed to finalize zstd frame")
```

Could you double-check whether the commit containing that change has been pushed to the PR branch?
♻️ Duplicate comments (2)
zstd/src/tests/roundtrip_integrity.rs (1)
748-764: ⚠️ Potential issue | 🟡 Minor — This test doesn't lock in the automatic hint plumbing yet.

If `set_pledged_content_size()` stops forwarding to `set_source_size_hint()`, this still passes because level 11 roundtrips fine without the hint. Please assert an observable encoder outcome such as the advertised `window_size()` versus a no-hint streaming frame.

Suggested assertion shape:

```diff
 fn streaming_pledged_size_uses_source_hint() {
     use crate::encoding::StreamingEncoder;
     use crate::io::Write;
     let data = generate_compressible(9602, 2 * 1024); // 2 KiB
+    let no_hint = {
+        let mut encoder = StreamingEncoder::new(Vec::new(), CompressionLevel::from_level(11));
+        encoder.write_all(&data).unwrap();
+        encoder.finish().unwrap()
+    };
     let mut encoder = StreamingEncoder::new(Vec::new(), CompressionLevel::from_level(11));
     encoder.set_pledged_content_size(data.len() as u64).unwrap();
     encoder.write_all(&data).unwrap();
     let compressed = encoder.finish().unwrap();
+
+    let no_hint_window = crate::decoding::frame::read_frame_header(no_hint.as_slice())
+        .unwrap()
+        .0
+        .window_size()
+        .unwrap();
+    let hinted_window = crate::decoding::frame::read_frame_header(compressed.as_slice())
+        .unwrap()
+        .0
+        .window_size()
+        .unwrap();
+    assert!(hinted_window <= no_hint_window);
+    assert!(hinted_window < 16 * 1024 * 1024);
     let mut decoder = StreamingDecoder::new(compressed.as_slice()).unwrap();
     let mut result = Vec::new();
     decoder.read_to_end(&mut result).unwrap();
     assert_eq!(data, result, "Pledged-size streaming must roundtrip");
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/tests/roundtrip_integrity.rs` around lines 748-764: The test streaming_pledged_size_uses_source_hint() only checks roundtrip and can pass even if set_pledged_content_size() no longer forwards to set_source_size_hint(); update the test to assert an observable encoder behavior by capturing the encoder's advertised window size (call window_size() or equivalent after writing but before finish) and compare it to the window size produced when you omit the pledged-size hint (create a second StreamingEncoder without calling set_pledged_content_size() and write the same data) to ensure set_pledged_content_size() actually affects set_source_size_hint() and changes the encoder framing; reference StreamingEncoder::new, set_pledged_content_size(), set_source_size_hint(), and window_size() to locate the relevant code paths.

cli/src/main.rs (1)
51-68:⚠️ Potential issue | 🟡 Minor
--storestill doesn't fully ignore--level.The help text says the level is ignored, but the clap range check still rejects inputs like
--store --level 23beforecompress()can short-circuit. Either move level validation out of the parser, or tighten the doc string so it only promises runtime ignore after successful parsing.One minimal fix is to make the help text match the current behavior
- /// When set, --level is ignored and the input is wrapped in a - /// raw zstd frame without any compression. + /// When set, the parsed `--level` value is ignored and the input is + /// wrapped in a raw zstd frame without any compression. + /// Values outside the supported range are still rejected by the CLI parser.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/src/main.rs` around lines 51 - 68, The CLI currently enforces a clap range validator on the level argument so values like `--store --level 23` are rejected during parsing even though the code promises to ignore level at runtime; update the argument handling so parsing does not fail when --store is set — e.g., make `level` an Option<i32> or remove the range validator and perform range validation inside your runtime logic (in compress() or the function that dispatches compression) and skip that validation when `store` is true; reference the `level` and `store` arg definitions and the `compress()` call to implement the runtime-only validation or adjust the help text to state parsing may still validate level.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@cli/src/main.rs`:
- Around line 51-68: The CLI currently enforces a clap range validator on the
level argument so values like `--store --level 23` are rejected during parsing
even though the code promises to ignore level at runtime; update the argument
handling so parsing does not fail when --store is set — e.g., make `level` an
Option<i32> or remove the range validator and perform range validation inside
your runtime logic (in compress() or the function that dispatches compression)
and skip that validation when `store` is true; reference the `level` and `store`
arg definitions and the `compress()` call to implement the runtime-only
validation or adjust the help text to state parsing may still validate level.
In `@zstd/src/tests/roundtrip_integrity.rs`:
- Around line 748-764: The test streaming_pledged_size_uses_source_hint() only
checks roundtrip and can pass even if set_pledged_content_size() no longer
forwards to set_source_size_hint(); update the test to assert an observable
encoder behavior by capturing the encoder's advertised window size (call
window_size() or equivalent after writing but before finish) and compare it to
the window size produced when you omit the pledged-size hint (create a second
StreamingEncoder without calling set_pledged_content_size() and write the same
data) to ensure set_pledged_content_size() actually affects
set_source_size_hint() and changes the encoder framing; reference
StreamingEncoder::new, set_pledged_content_size(), set_source_size_hint(), and
window_size() to locate the relevant code paths.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: dd4a6344-209b-45f2-9c75-97a89e7d4e62
📒 Files selected for processing (7)
README.md, cli/src/main.rs, zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs, zstd/src/encoding/streaming_encoder.rs, zstd/src/tests/roundtrip_integrity.rs
- Make pledged-size streaming test assert observable window-size effect
- Clarify CLI help: --store skips compression but parse-time level validation remains
- Add regression test for --store with out-of-range --level
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli/src/main.rs`:
- Around line 419-455: Tests use hardcoded numeric limits (e.g., 23 and -131073)
that duplicate the production bounds; update the test inputs to derive
out-of-range and boundary values from CompressionLevel constants instead. In the
tests cli_rejects_unsupported_compression_level_at_parse_time,
cli_rejects_too_negative_compression_level, and
cli_store_still_validates_level_range_at_parse_time, replace the literal "23"
and "-131073" with expressions based on CompressionLevel::MAX_LEVEL and
CompressionLevel::MIN_LEVEL (e.g., (CompressionLevel::MAX_LEVEL + 1).to_string()
for an unsupported-high value and (CompressionLevel::MIN_LEVEL - 1).to_string()
for an unsupported-low value) so the parser-boundary test inputs track the
actual API limits.
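The suggested test style — deriving boundary probes from the library constants rather than hardcoding 23 or -131073 — looks like this in miniature. The constants follow the PR text and the validator is hypothetical:

```rust
// Boundary probes derived from constants, so parser tests track the real
// limits if MIN_LEVEL/MAX_LEVEL ever change. Illustrative sketch only.
const MIN_LEVEL: i32 = -131_072;
const MAX_LEVEL: i32 = 22;

fn level_in_range(level: i32) -> bool {
    (MIN_LEVEL..=MAX_LEVEL).contains(&level)
}

fn main() {
    assert!(!level_in_range(MAX_LEVEL + 1)); // tracks the real upper bound
    assert!(!level_in_range(MIN_LEVEL - 1)); // tracks the real lower bound
    assert!(level_in_range(MAX_LEVEL));
    assert!(level_in_range(MIN_LEVEL));
}
```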
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ed8f871b-6068-40ea-af4f-0248064f2eb6
📒 Files selected for processing (2)
cli/src/main.rs, zstd/src/tests/roundtrip_integrity.rs
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/mod.rs`:
- Around line 127-129: from_level currently always returns
CompressionLevel::Level(level) which breaks the "3 equals Default" expectation;
change the const fn from_level(level: i32) to return CompressionLevel::Default
when level == 3 and CompressionLevel::Level(level) otherwise so that calls like
from_level(3) produce CompressionLevel::Default; update the match/if logic in
from_level to implement this check while preserving const fn semantics.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d9e37710-37d8-44f2-a8f3-d4314d310d08
📒 Files selected for processing (1)
zstd/src/encoding/mod.rs
@coderabbitai resume and full review

✅ Actions performed: Reviews resumed. Full review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/mod.rs`:
- Around line 169-177: The set_source_size_hint hook is currently ignored by the
built-in non-driver matchers; update NoDictionaryMatcher and TinyMatcher to
override set_source_size_hint(&mut self, size: u64) and store/apply the hint so
FrameCompressor and StreamingEncoder forwards are honored; specifically, add an
implementation in the NoDictionaryMatcher and TinyMatcher types that records the
provided size (and adjusts any internal sizing decisions or exposes it to the
match selection logic used by find_matches/next_match), ensuring the matchers no
longer drop hints forwarded by FrameCompressor and StreamingEncoder.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 32870fe6-8472-477f-90cb-e1168ac43493
📒 Files selected for processing (2)
zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs
```diff
 fn get_next_space(&mut self) -> Vec<u8> {
-    self.vec_pool.pop().unwrap_or_else(|| {
-        let mut space = alloc::vec![0; self.slice_size];
-        space.resize(space.capacity(), 0);
-        space
-    })
+    if let Some(mut space) = self.vec_pool.pop() {
+        if space.len() > self.slice_size {
+            space.truncate(self.slice_size);
+        }
+        if space.capacity() > self.slice_size {
+            space.shrink_to(self.slice_size);
+        }
+        if space.len() < self.slice_size {
+            space.resize(self.slice_size, 0);
+        }
+        return space;
+    }
+    alloc::vec![0; self.slice_size]
```
`get_next_space()` now calls `shrink_to(self.slice_size)` when reusing a buffer from `vec_pool`. This can trigger reallocations (and a memcpy of up to `slice_size` bytes) in a hot path, especially when alternating between hinted small windows and normal windows, which risks a measurable compression-throughput regression. Consider avoiding unconditional shrinking here (e.g., only shrink when capacity is far above `slice_size`, or perform capacity trimming once at `reset()` when the slice size decreases) so the pool continues to amortize allocations.
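The shrink-at-reset alternative mentioned in that comment could look like the sketch below. Field and method names follow the surrounding diff, but this is an illustrative shape under those assumptions, not the crate's code:

```rust
// Trim pooled buffers once when the slice size decreases at reset(),
// keeping get_next_space() free of shrink/realloc work in the hot path.
struct SlicePool {
    slice_size: usize,
    vec_pool: Vec<Vec<u8>>,
}

impl SlicePool {
    fn reset(&mut self, new_slice_size: usize) {
        if new_slice_size < self.slice_size {
            // Pay the trim cost once per reset instead of on every reuse.
            for buf in &mut self.vec_pool {
                buf.truncate(new_slice_size);
                buf.shrink_to(new_slice_size);
            }
        }
        self.slice_size = new_slice_size;
    }

    fn get_next_space(&mut self) -> Vec<u8> {
        if let Some(mut space) = self.vec_pool.pop() {
            space.resize(self.slice_size, 0); // grow-only; no shrink here
            return space;
        }
        vec![0; self.slice_size]
    }
}

fn main() {
    let mut pool = SlicePool { slice_size: 4096, vec_pool: vec![vec![0u8; 4096]] };
    pool.reset(1024);
    assert_eq!(pool.get_next_space().len(), 1024);
}
```

The trade-off: a reset to a smaller window still copies once, but steady-state reuse stays allocation-free even when window sizes alternate.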
Closing to reopen as a clean squashed PR with rewritten history.
Summary
- `CompressionLevel::Level(i32)` variant and `from_level()` constructor for a C-zstd-compatible numeric level API
- `set_source_size_hint()` on the `Matcher` trait, `FrameCompressor`, and `StreamingEncoder` — caps matcher/window parameters from `ceil(log2(src_size))` with a `MIN_WINDOW_LOG` floor
- `--store` flag for uncompressed frames

Technical Details
Each numeric level resolves to a `LevelParams` struct containing the matcher backend, window log, hash fill step, lazy depth, and hash-chain config. Named variants map exactly to their numeric equivalents: `Fastest`=1, `Default`=3, `Better`=7, `Best`=11. Level 0 = default (level 3), matching C zstd semantics.

For small inputs, `set_source_size_hint()` adjusts `window_log` and HC table sizes proportionally to source size, including `src_size == 0` (clamped to minimum window), avoiding multi-MiB allocations for tiny payloads. `StreamingEncoder::set_pledged_content_size()` also forwards the pledged size as the matcher hint.

Window sizes are capped at 64 MiB (window_log=26) to stay within the decoder's 100 MiB limit.
Known Limitations
Test Plan
- `from_level(n)` matches named variants for n=1,3,7,11 (`MatchGeneratorDriver`)
- `cargo nextest run --workspace`, `cargo build --workspace`, `cargo fmt --all`

Closes #21
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Compatibility Note
`CompressionLevel::Level(i32)` expands the public enum shape and is semver-breaking for downstream crates that exhaustively `match` on `CompressionLevel` without a wildcard arm.
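The breakage and its mitigation can be shown in a few lines. The enum below is a stand-in for illustration, not the crate's actual definition:

```rust
// Stand-in enum: adding Level(i32) breaks any exhaustive match downstream,
// while a wildcard arm keeps the match source-compatible.
enum CompressionLevel {
    Uncompressed,
    Fastest,
    Default,
    Level(i32),
}

fn describe(level: &CompressionLevel) -> &'static str {
    match level {
        CompressionLevel::Uncompressed => "stored",
        CompressionLevel::Fastest => "fastest",
        CompressionLevel::Default => "default",
        // Wildcard arm absorbs newly added variants such as Level(i32).
        _ => "numeric",
    }
}

fn main() {
    assert_eq!(describe(&CompressionLevel::Level(19)), "numeric");
    assert_eq!(describe(&CompressionLevel::Default), "default");
}
```

Marking the enum `#[non_exhaustive]` would force downstream wildcard arms at compile time, at the cost of preventing exhaustive matching even within the same major version.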