chore(release): bump version to 0.22.4 #302
Merged
Bump Rust workspace, Python package, commitizen version, and Cargo.lock workspace package versions to 0.22.4.

Validation:
- git diff --check
- cargo fmt --all -- --check
- cargo clippy --workspace --all-targets -- -D warnings (pre-commit, passed)
- cargo check --workspace --offline

cargo test --release skipped at commit time: all pegaflow-core tests panic on this host with cudarc 'libcudart.so: undefined symbol: cudaEventElapsedTime_v2' (env-only, unrelated to this version-string bump); will be exercised by CI.
chore(release): bump version to 0.22.4
Bumps the Rust workspace, the pegaflow-llm Python package, the commitizen version, and the Cargo.lock workspace package versions from 0.22.3 to 0.22.4.

Release notes: 0.22.3 → 0.22.4
18 PRs landed on master since v0.22.3 (2026-05-15). Grouped below for release notes.

Highlights
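The lease-based query flow in this release (an opaque lease carried with num_hit_blocks, released on cleanup or failure, with a TTL sweeper reclaiming abandoned leases) can be pictured with a small sketch. This is only an illustration under assumptions: the LeaseTable class, its method names, and the injected clock are hypothetical and are not PegaFlow's actual implementation. Only the error text "query lease is unknown or expired" comes from these notes.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Lease:
    num_hit_blocks: int
    expires_at: float


class LeaseTable:
    """Hypothetical in-memory lease registry (illustration only)."""

    def __init__(self, ttl_secs: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_secs
        self._clock = clock  # injectable so tests can control time
        self._leases: Dict[str, Lease] = {}

    def grant(self, num_hit_blocks: int) -> str:
        """Record a lease and return an opaque token the holder passes along."""
        token = secrets.token_hex(16)
        self._leases[token] = Lease(num_hit_blocks, self._clock() + self._ttl)
        return token

    def release(self, token: str) -> Lease:
        """Release a lease on cleanup/failure; unknown or expired tokens are rejected."""
        lease = self._leases.pop(token, None)
        if lease is None or lease.expires_at <= self._clock():
            raise KeyError("query lease is unknown or expired")
        return lease

    def sweep(self) -> int:
        """Drop leases whose TTL has passed and return how many were reclaimed."""
        now = self._clock()
        expired = [t for t, lease in self._leases.items() if lease.expires_at <= now]
        for token in expired:
            del self._leases[token]
        return len(expired)
```

In this sketch the token is the only thing that crosses from the granting side to the releasing side, which is why a lost token can only be recovered by the sweeper.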
- PdConnector) plus a v2 RDMA transfer engine (pegaflow-transfer/src/v2). KV is pushed prefill→decode layer-by-layer via one-sided RDMA WRITE as each attention layer completes, overlapping transfer with the forward pass instead of pulling after prefill finishes (vLLM NIXL model). On H20 / Qwen3-8B the added TTFT is 2–4× lower than NIXL across 512–16k input lengths.
- Loading/Ready only; Ready carries num_hit_blocks plus an opaque lease that transfers scheduler→worker and is released on cleanup/failure, with a TTL sweeper reclaiming abandoned leases.
- pegaflow.mode config; save_only skips Pega query/load while still advancing save metadata, so an instance can populate the cache without serving reads.

Features
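The --qps-per-peer feature in this section spreads work requests (WQEs) across N queue pairs per peer, round-robin, and the handshake checks that both sides agree on N. A minimal sketch of that idea follows, assuming a per-task modulo assignment. The function names are hypothetical and the real engine may keep its counter across tasks or differ in other ways.

```python
from typing import List

DEFAULT_QPS_PER_PEER = 2  # the documented default for --qps-per-peer


def check_handshake(local_qps: int, remote_qps: int) -> int:
    """Return the agreed QP count, or fail if the two sides disagree."""
    if local_qps < 1:
        raise ValueError("qps-per-peer must be at least 1")
    if local_qps != remote_qps:
        raise ValueError(
            f"qps-per-peer mismatch: local={local_qps}, remote={remote_qps}"
        )
    return local_qps


def assign_wqes(num_wqes: int, num_qps: int = DEFAULT_QPS_PER_PEER) -> List[List[int]]:
    """Return, for each QP, the WQE indices that would be posted on it."""
    queues: List[List[int]] = [[] for _ in range(num_qps)]
    for wqe in range(num_wqes):
        # WQE-level round-robin: consecutive WQEs go to consecutive QPs.
        queues[wqe % num_qps].append(wqe)
    return queues
```

Because the rotation happens per WQE rather than per task, a single task with many WQEs already touches every QP in this model.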
- --qps-per-peer (default 2), round-robin at WQE level so one in-flight task saturates all QPs; handshake validates both sides agree on N (feat(rdma): per-peer N QPs with WQE-level round-robin #291)
- --node-stale-secs (feat(metaserver): add node lifecycle fencing #285, closes feat(metaserver): add server heartbeat liveness and graceful Bye RPC #222)

Fixes
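The --nics fix in this section (#283) rejects empty names in a comma-separated list such as mlx5_0,,mlx5_1 rather than silently misparsing it. A rough sketch of strict parsing of that kind is shown here. The parse_nics helper is hypothetical, and trimming surrounding whitespace is an assumption not stated in these notes.

```python
from typing import List


def parse_nics(value: str) -> List[str]:
    """Split a comma-separated NIC list, failing on any empty entry."""
    # Assumption: whitespace around each name is not significant.
    names = [part.strip() for part in value.split(",")]
    if any(not name for name in names):
        raise ValueError(f"--nics contains an empty NIC name: {value!r}")
    return names
```

Raising instead of dropping the empty entry mirrors the stated intent of failing at startup rather than running with a silently different configuration.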
- query lease is unknown or expired (fix(connector): allow query leases across workers #288)
- --nics, reject empty names, propagate RDMA init failures instead of silently disabling P2P (fix(server): fail on invalid rdma nics #283, fixes pegaflow-server: --nics silently misparses comma-separated values #276)
- cache_lookup_reuse log from INFO to DEBUG to stop log spam under cache pressure (fix(connector): demote cache_lookup_reuse log to debug #280)

Performance
- query_prefetch_lease/32768 ~12.3 ms → ~6.1 ms, save_flush_unique/8192 ~21.3 ms → ~13.1 ms via reduced prefix-key cloning, ordered multi-layer save grouping, and RawBlock inline-segment allocation (perf: add cpu path benchmarks and optimize long-block saves #290)

Internal / refactor / tests
- build_buckets helper (refactor(metrics): use build_buckets helper for histogram buckets #298)

Notable behavior & config changes (upgrade notes)
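For the pegaflow.mode setting covered in these upgrade notes (read_write as the default, save_only to populate the cache without serving reads), the gating logic might look roughly like the sketch here. The enum, the flat config mapping, and the helper names are all assumptions made for illustration. How the connector actually reads its config is not described in these notes.

```python
from enum import Enum
from typing import Mapping


class PegaflowMode(str, Enum):
    READ_WRITE = "read_write"  # documented default
    SAVE_ONLY = "save_only"


def parse_mode(config: Mapping[str, str]) -> PegaflowMode:
    """Read pegaflow.mode from a (hypothetical) flat config mapping."""
    raw = config.get("pegaflow.mode", PegaflowMode.READ_WRITE.value)
    return PegaflowMode(raw)  # raises ValueError on an unrecognized mode


def should_query(mode: PegaflowMode) -> bool:
    """save_only skips query/load; read_write performs them."""
    return mode is PegaflowMode.READ_WRITE


def should_save(mode: PegaflowMode) -> bool:
    """Both modes keep saving, so save metadata continues to advance."""
    return True
```

Rejecting unknown values at parse time keeps a typo in the mode string from quietly behaving like the default.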
- Loading/Ready only; Ready exposes num_hit_blocks + an opaque lease. Pin/unpin refcount semantics are gone (feat(connector): replace query pinning with leases #284, fix(connector): allow query leases across workers #288).
- FailedPrecondition instead of being silently accepted (test(server): add mock vLLM RPC E2E coverage #289).
- --nics: now rejects empty entries (e.g. mlx5_0,,mlx5_1) and fails startup on RDMA init errors rather than silently falling back to no-P2P (fix(server): fail on invalid rdma nics #283).
- --qps-per-peer (default 2) (feat(rdma): per-peer N QPs with WQE-level round-robin #291), --node-stale-secs for metaserver (feat(metaserver): add node lifecycle fencing #285).
- pegaflow.mode with read_write (default) / save_only (feat(connector): add save-only pegaflow mode #300).

Full PR list