[feat,CI] Improve KV API usability and KVBatchMeta interactions#57
Conversation
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
KVBatchMeta interactions
There was a problem hiding this comment.
Pull request overview
This PR improves the high-level key-value (KV) interface to be more user-friendly by returning metadata from put operations, adding “get by meta” helpers, and renaming the “fields selection” parameter for clarity.
Changes:
kv_put/kv_batch_put(and async variants) now returnKVBatchMetacontaining keys/tags/partition and cumulative field names.- Added
kv_batch_get_by_meta/async_kv_batch_get_by_metaconvenience helpers. - Renamed
kv_batch_get/async_kv_batch_getkeyword argument fromfieldstoselect_fields, and updated tutorial + e2e tests (including sync/async parametrization).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
tutorial/02_kv_interface.py |
Updates tutorial usage to select_fields=... for partial-field retrieval. |
transfer_queue/interface.py |
Updates KV API to return KVBatchMeta, adds get-by-meta helpers, and renames fields → select_fields for get calls. |
tests/e2e/test_kv_interface_e2e.py |
Expands e2e coverage for new return metadata semantics and runs the suite against both sync and async KV APIs via a wrapper fixture. |
Comments suppressed due to low confidence (1)
transfer_queue/interface.py:388
- In
kv_put's docstring, the Raises section mentions “length ofkeys”, but this API accepts a singlekey. Updating the wording would keep the documentation accurate and reduce confusion for users.
ValueError: If nested tensors are provided (use kv_batch_put instead)
RuntimeError: If retrieved BatchMeta size doesn't match length of `keys`
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| async def async_kv_batch_get( | ||
| keys: list[str] | str, partition_id: str, fields: Optional[list[str] | str] = None | ||
| keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None | ||
| ) -> TensorDict: |
There was a problem hiding this comment.
async_kv_batch_get also renames the public keyword argument from fields to select_fields, which will break callers using fields=.... Consider supporting fields as a deprecated alias here as well (and error if both are provided) for API compatibility.
| def kv_batch_get( | ||
| keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None | ||
| ) -> TensorDict: |
There was a problem hiding this comment.
kv_batch_get renamed the public keyword argument from fields to select_fields. This is a backwards-incompatible API change for callers using fields=.... Consider accepting both (fields as a deprecated alias) and raising if both are provided, to avoid breaking existing users while still promoting select_fields.
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (1)
transfer_queue/interface.py:491
- The error message says "fields or tag" but this function parameter is
tags(plural). Update the message to avoid confusing users (especially now thatkv_batch_putis part of the public KV API).
if fields is None and tags is None:
raise ValueError("Please provide at least one parameter of fields or tag.")
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def kv_batch_get( | ||
| keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None | ||
| ) -> TensorDict: |
There was a problem hiding this comment.
kv_batch_get renamed the keyword argument from fields to select_fields, which is a breaking change for any callers using fields=. Consider keeping fields as a deprecated alias (e.g., accept both params and error if both are set) to preserve backward compatibility.
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try: | ||
| retrieved = tq.kv_batch_get(keys="nonexistent_key", partition_id=partition_id) | ||
| retrieved = tq_api.kv_batch_get(keys="nonexistent_key", partition_id=partition_id) | ||
| # If it returns, it should be empty | ||
| assert retrieved.batch_size[0] == 0 | ||
| except RuntimeError as e: | ||
| # Or it might raise an error about keys not found | ||
| assert "not found" in str(e).lower() or "empty" in str(e).lower() |
There was a problem hiding this comment.
This test still catches RuntimeError, but kv_batch_get was updated to raise ValueError for missing keys/partitions. Update the assertion to expect ValueError (or broaden the exception type) so the test matches the new interface contract.
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
2 similar comments
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
transfer_queue/interface.py:490
- The error message says "fields or tag" but this function takes
tags(plural). Updating the message to referencetagswould make the API clearer for users.
if fields is None and tags is None:
raise ValueError("Please provide at least one parameter of fields or tag.")
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def kv_batch_get( | ||
| keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None | ||
| ) -> TensorDict: |
There was a problem hiding this comment.
Renaming the public kwarg from fields to select_fields in kv_batch_get is a breaking change for any external callers using keyword arguments. Consider keeping fields as a deprecated alias (e.g., accept both and error if both are set) to preserve backwards compatibility while transitioning to select_fields.
| async def async_kv_batch_get( | ||
| keys: list[str] | str, partition_id: str, fields: Optional[list[str] | str] = None | ||
| keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None | ||
| ) -> TensorDict: |
There was a problem hiding this comment.
Same as the sync API: renaming the kwarg from fields to select_fields in async_kv_batch_get breaks callers that pass fields=. Consider supporting fields as a deprecated alias (and/or emitting a DeprecationWarning) to avoid a hard break.
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com> # Conflicts: # tests/e2e/test_kv_interface_e2e.py
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
CLA Signature Pass0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍 |
…cend#57) ## 📝 Description This PR aims to improve the usability and flexibility of the KV (Key-Value) API. It optimizes how `KVBatchMeta` is handled during read and write operations, refactors some internal naming for clarity, and adds unit test coverage for the asynchronous interfaces to ensure stability. ## 🔄 What's Changed 1. **Enhanced `kv_put`**: The `kv_put` operation now directly returns a `KVBatchMeta` object containing all the available fields after the put operation. 2. **Simplified `kv_get`**: Provide an additional API `(async_)kv_batch_get_by_meta` that supports taking a `KVBatchMeta` object directly as input, simplifying the calling logic. 3. **Flexible `KVBatchMeta` Fields**: Introduced public interfaces that allow for more flexible and customizable field selection within `KVBatchMeta`. 4. **Naming Refactor**: Refactored the `fields` variables to `select_fields` during `kv_batch_get` to semantically distinguish from the `fields` (real data TensorDict) in `kv_put`. 5. **Async UT Coverage**: Added UT specifically for the async KV interfaces to ensure robustness. --------- Signed-off-by: 0oshowero0 <o0shower0o@outlook.com> Signed-off-by: Chi Zhang <czhangseu@gmail.com>
…rollout and add CI (#63) ## Summary - **Rewrite `single_controller_demo.py`** to showcase a realistic agentic RLHF training loop: replaces the previous hardcoded tensor inputs and `AsyncvLLMServer` with a multi-turn `AgentLoop` that interleaves LLM generation with simulated tool calls, an OpenAI-style `MessageDataset` with interleaved image support, a proper `DataLoader`-based training loop, and an explicit `compute_reward` step that writes advantages back through TQ. - **Adopt the new KV APIs** introduced in #57 throughout the recipe — uses `kv_batch_get_by_meta`, the return value of `kv_batch_put` (cumulative `KVBatchMeta`), and removes all manual `kv_meta.fields.append(...)` calls. - **Add structured configuration** via `@dataclass` classes (`TrainerConfig`, `AgentLoopConfig`, `MessageDatasetConfig`) and `argparse` CLI, replacing the flat `OmegaConf.create` dict. Trainer config and TQ config are now cleanly separated. - **Add `recipe-check.yml` CI workflow** that runs the demo end-to-end on every push/PR with reduced parameters (`--num-samples 8 --global-batch-size 4 --rollout-agent-num-workers 1`) to keep CI fast. ## Key changes | Area | Before | After | |------|--------|-------| | Rollout | `AsyncvLLMServer` (Ray actor, single-turn) | `AgentLoop` (multi-turn with tool calls, per-sample async) | | Data | Hardcoded `[[1,2],[3,4],...]` tensors | `MessageDataset` + `DataLoader` with random multi-modal messages | | Reward | Simulated inline (`time.sleep`) | `compute_reward()` producing per-token advantages via TQ | | KV API | `kv_batch_get(keys=..., fields=...)` + manual field tracking | `kv_batch_get_by_meta(meta=...)` + `kv_batch_put` return value | | Config | Flat `OmegaConf` dict | Typed `@dataclass` hierarchy + `argparse` CLI | | CI | None | `recipe-check.yml` workflow | ## Test plan - [ ] Run the recipe locally: `python recipe/simple_use_case/single_controller_demo.py --num-samples 8 --global-batch-size 4` - [ ] Verify the new `recipe-check.yml` workflow passes in CI - [ ] Confirm existing tests (`pytest tests`) still pass --------- Signed-off-by: Chi Zhang <czhangseu@gmail.com>
📝 Description
This PR aims to improve the usability and flexibility of the KV (Key-Value) API. It optimizes how
KVBatchMetais handled during read and write operations, refactors some internal naming for clarity, and adds unit test coverage for the asynchronous interfaces to ensure stability.🔄 What's Changed
kv_put: Thekv_putoperation now directly returns aKVBatchMetaobject containing all the available fields after the put operation.kv_get: Provide an additional API(async_)kv_batch_get_by_metathat supports taking aKVBatchMetaobject directly as input, simplifying the calling logic.KVBatchMetaFields: Introduced public interfaces that allow for more flexible and customizable field selection withinKVBatchMeta.fieldsvariables toselect_fieldsduringkv_batch_getto semantically distinguish from thefields(real data TensorDict) inkv_put.