chore(release): bump version to 0.22.4 #302

Merged
xiaguan merged 1 commit into master from chore/bump-0.22.4
May 29, 2026

Conversation

Collaborator

@xiaguan xiaguan commented May 29, 2026

chore(release): bump version to 0.22.4

Bumps the Rust workspace, the pegaflow-llm Python package, the commitizen version, and the Cargo.lock workspace package versions from 0.22.3 to 0.22.4.


Release notes — 0.22.3 → 0.22.4

18 PRs landed on master since v0.22.3 (2026-05-15). Grouped below for release notes.

Highlights

  • Disaggregated prefill/decode over RDMA push (#297): a brand-new vLLM v1 KV connector (PdConnector) plus a v2 RDMA transfer engine (pegaflow-transfer/src/v2). KV is pushed from prefill to decode layer by layer via one-sided RDMA WRITE as each attention layer completes, so the transfer overlaps with the forward pass rather than being pulled after prefill finishes (the vLLM NIXL model). On H20 / Qwen3-8B the added TTFT is 2–4× lower than with NIXL across 512–16k input lengths.
  • Query leases replace query pinning (#284, #288): the query/load/release control path moved from pin refcounts to lease-backed ownership. Query results collapse to Loading/Ready only; Ready carries num_hit_blocks plus an opaque lease that is handed from scheduler to worker and released on cleanup or failure, with a TTL sweeper reclaiming abandoned leases.
  • Save-only connector mode (#300): new pegaflow.mode config; save_only skips Pega query/load while still advancing save metadata, so an instance can populate the cache without serving reads.
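
The lease model in the second highlight can be illustrated with a minimal Python sketch. This is not PegaFlow's implementation: `LeaseTable`, its methods, and the integer lease token are hypothetical names chosen for illustration. Only the Loading/Ready split, the `num_hit_blocks` field, the opaque lease, and the TTL sweep come from the notes above.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass(frozen=True)
class Loading:
    """Prefetch still in flight; the caller is expected to query again later."""


@dataclass(frozen=True)
class Ready:
    """Blocks are resident; the lease keeps them owned until release or expiry."""
    num_hit_blocks: int
    lease: int  # opaque token (hypothetical: a plain int) passed scheduler -> worker


QueryResult = Union[Loading, Ready]


class LeaseTable:
    """Hypothetical lease registry illustrating lease-backed ownership."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._ids = itertools.count(1)
        self._deadlines: Dict[int, float] = {}

    def grant(self, num_hit_blocks: int) -> Ready:
        """Issue a new lease that expires ttl_s seconds from now."""
        lease = next(self._ids)
        self._deadlines[lease] = self._clock() + self._ttl_s
        return Ready(num_hit_blocks=num_hit_blocks, lease=lease)

    def release(self, lease: int) -> bool:
        """Release on cleanup or failure; returns False if the lease is already gone."""
        return self._deadlines.pop(lease, None) is not None

    def sweep(self) -> int:
        """Reclaim leases whose deadline has passed; returns the number reclaimed."""
        now = self._clock()
        expired = [lease for lease, deadline in self._deadlines.items() if deadline <= now]
        for lease in expired:
            del self._deadlines[lease]
        return len(expired)

    def active(self) -> int:
        """Number of leases currently held."""
        return len(self._deadlines)
```

Unlike a pin refcount, a lease that is never released (for example because a worker died) does not hold its blocks forever: the next `sweep()` after its deadline reclaims it. Injecting `clock` keeps the sketch testable without sleeping.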

Features

Fixes

Performance

  • perf: CPU-path Criterion benchmarks and long-block save optimizations (#290). For example, query_prefetch_lease/32768 drops from ~12.3 ms to ~6.1 ms and save_flush_unique/8192 from ~21.3 ms to ~13.1 ms, via reduced prefix-key cloning, ordered multi-layer save grouping, and RawBlock inline-segment allocation.
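
For readers who prefer ratios, the two benchmark pairs quoted above can be converted into speedup factors with a few lines of Python. The timings are the approximate figures from the bullet; the `speedup` helper is a hypothetical name for illustration and not part of the benchmark suite.

```python
def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times faster the 'after' timing is than the 'before' timing."""
    return before_ms / after_ms


# Approximate timings (ms) quoted in the release notes: (before, after).
results = {
    "query_prefetch_lease/32768": (12.3, 6.1),
    "save_flush_unique/8192": (21.3, 13.1),
}

for name, (before, after) in results.items():
    print(f"{name}: {speedup(before, after):.2f}x")
```

With these inputs the ratios come out to roughly 2.0x for the query path and roughly 1.6x for the save path.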

Internal / refactor / tests

Notable behavior & config changes (upgrade notes)

Full PR list

#298 refactor(metrics): use build_buckets helper for histogram buckets
#300 feat(connector): add save-only pegaflow mode
#299 feat(storage): add sharded SSD cache support
#297 feat(pd): RDMA push connector for disaggregated prefill/decode
#290 perf: add cpu path benchmarks and optimize long-block saves
#295 fix(connector): preserve non-MLA kv layout registration
#293 fix(numa): allocate pinned pools on GPU-local NUMA nodes
#292 fix(connector): handle split physical kv blocks
#291 feat(rdma): per-peer N QPs with WQE-level round-robin
#285 feat(metaserver): add node lifecycle fencing
#287 refactor(core): make prefetch task terminal
#289 test(server): add mock vLLM RPC E2E coverage
#288 fix(connector): allow query leases across workers
#284 feat(connector): replace query pinning with leases
#283 fix(server): fail on invalid rdma nics
#282 fix(connector): remove scheduler save limit
#281 chore: tune transfer duration buckets
#280 fix(connector): demote cache_lookup_reuse log to debug

Bump Rust workspace, Python package, commitizen version, and Cargo.lock workspace package versions to 0.22.4.

Validation:
- git diff --check
- cargo fmt --all -- --check
- cargo clippy --workspace --all-targets -- -D warnings (pre-commit, passed)
- cargo check --workspace --offline

cargo test --release skipped at commit time: all pegaflow-core tests panic
on this host with cudarc 'libcudart.so: undefined symbol:
cudaEventElapsedTime_v2' (env-only, unrelated to this version-string bump);
will be exercised by CI.
Contributor

@feifei-111 feifei-111 left a comment

LGTM

@xiaguan xiaguan merged commit ae27de5 into master May 29, 2026
12 checks passed
@xiaguan xiaguan deleted the chore/bump-0.22.4 branch May 29, 2026 05:50
2 participants