Skip to content

[feat,CI] Improve KV API usability and KVBatchMeta interactions#57

Merged
0oshowero0 merged 11 commits into
Ascend:mainfrom
0oshowero0:han/api
Mar 25, 2026
Merged

[feat,CI] Improve KV API usability and KVBatchMeta interactions#57
0oshowero0 merged 11 commits into
Ascend:mainfrom
0oshowero0:han/api

Conversation

@0oshowero0

@0oshowero0 0oshowero0 commented Mar 25, 2026

Copy link
Copy Markdown
Collaborator

📝 Description

This PR aims to improve the usability and flexibility of the KV (Key-Value) API. It optimizes how KVBatchMeta is handled during read and write operations, refactors some internal naming for clarity, and adds unit test coverage for the asynchronous interfaces to ensure stability.

🔄 What's Changed

  1. Enhanced kv_put: The kv_put operation now directly returns a KVBatchMeta object containing all the available fields after the put operation.
  2. Simplified kv_get: Provide an additional API (async_)kv_batch_get_by_meta that supports taking a KVBatchMeta object directly as input, simplifying the calling logic.
  3. Flexible KVBatchMeta Fields: Introduced public interfaces that allow for more flexible and customizable field selection within KVBatchMeta.
  4. Naming Refactor: Refactored the fields variables to select_fields during kv_batch_get to semantically distinguish from the fields (real data TensorDict) in kv_put.
  5. Async UT Coverage: Added UT specifically for the async KV interfaces to ensure robustness.

Copilot AI review requested due to automatic review settings March 25, 2026 02:03
@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

@0oshowero0 0oshowero0 changed the title [feat] User-friendly KV API [feat,CI] Improve KV API usability and KVBatchMeta interactions Mar 25, 2026
@0oshowero0 0oshowero0 changed the title [feat,CI] Improve KV API usability and KVBatchMeta interactions [feat,CI] Improve KV API usability and KVBatchMeta interactions Mar 25, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the high-level key-value (KV) interface to be more user-friendly by returning metadata from put operations, adding “get by meta” helpers, and renaming the “fields selection” parameter for clarity.

Changes:

  • kv_put / kv_batch_put (and async variants) now return KVBatchMeta containing keys/tags/partition and cumulative field names.
  • Added kv_batch_get_by_meta / async_kv_batch_get_by_meta convenience helpers.
  • Renamed kv_batch_get / async_kv_batch_get keyword argument from fields to select_fields, and updated tutorial + e2e tests (including sync/async parametrization).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
tutorial/02_kv_interface.py Updates tutorial usage to select_fields=... for partial-field retrieval.
transfer_queue/interface.py Updates KV API to return KVBatchMeta, adds get-by-meta helpers, and renames fieldsselect_fields for get calls.
tests/e2e/test_kv_interface_e2e.py Expands e2e coverage for new return metadata semantics and runs the suite against both sync and async KV APIs via a wrapper fixture.
Comments suppressed due to low confidence (1)

transfer_queue/interface.py:388

  • In kv_put's docstring, the Raises section mentions “length of keys”, but this API accepts a single key. Updating the wording would keep the documentation accurate and reduce confusion for users.
        ValueError: If nested tensors are provided (use kv_batch_put instead)
        RuntimeError: If retrieved BatchMeta size doesn't match length of `keys`


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 899 to 901
async def async_kv_batch_get(
keys: list[str] | str, partition_id: str, fields: Optional[list[str] | str] = None
keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None
) -> TensorDict:

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async_kv_batch_get also renames the public keyword argument from fields to select_fields, which will break callers using fields=.... Consider supporting fields as a deprecated alias here as well (and error if both are provided) for API compatibility.

Copilot uses AI. Check for mistakes.
Comment thread transfer_queue/interface.py Outdated
Comment thread transfer_queue/interface.py Outdated
Comment thread transfer_queue/interface.py Outdated
Comment on lines +567 to +569
def kv_batch_get(
keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None
) -> TensorDict:

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kv_batch_get renamed the public keyword argument from fields to select_fields. This is a backwards-incompatible API change for callers using fields=.... Consider accepting both (fields as a deprecated alias) and raising if both are provided, to avoid breaking existing users while still promoting select_fields.

Copilot uses AI. Check for mistakes.
@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (1)

transfer_queue/interface.py:491

  • The error message says "fields or tag" but this function parameter is tags (plural). Update the message to avoid confusing users (especially now that kv_batch_put is part of the public KV API).
    if fields is None and tags is None:
        raise ValueError("Please provide at least one parameter of fields or tag.")


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +568 to +570
def kv_batch_get(
keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None
) -> TensorDict:

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kv_batch_get renamed the keyword argument from fields to select_fields, which is a breaking change for any callers using fields=. Consider keeping fields as a deprecated alias (e.g., accept both params and error if both are set) to preserve backward compatibility.

Copilot uses AI. Check for mistakes.
Comment thread transfer_queue/interface.py Outdated
Comment thread transfer_queue/interface.py
Comment thread transfer_queue/interface.py Outdated
Comment thread transfer_queue/interface.py Outdated
Comment thread transfer_queue/interface.py Outdated
Comment thread transfer_queue/interface.py
Comment thread tests/e2e/test_kv_interface_e2e.py
@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread transfer_queue/__init__.py
Comment on lines 604 to 610
try:
retrieved = tq.kv_batch_get(keys="nonexistent_key", partition_id=partition_id)
retrieved = tq_api.kv_batch_get(keys="nonexistent_key", partition_id=partition_id)
# If it returns, it should be empty
assert retrieved.batch_size[0] == 0
except RuntimeError as e:
# Or it might raise an error about keys not found
assert "not found" in str(e).lower() or "empty" in str(e).lower()

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test still catches RuntimeError, but kv_batch_get was updated to raise ValueError for missing keys/partitions. Update the assertion to expect ValueError (or broaden the exception type) so the test matches the new interface contract.

Copilot uses AI. Check for mistakes.
Comment thread transfer_queue/interface.py Outdated
@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

2 similar comments
@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

transfer_queue/interface.py:490

  • The error message says "fields or tag" but this function takes tags (plural). Updating the message to reference tags would make the API clearer for users.
    if fields is None and tags is None:
        raise ValueError("Please provide at least one parameter of fields or tag.")

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread transfer_queue/interface.py Outdated
Comment thread transfer_queue/interface.py Outdated
Comment on lines +588 to +590
def kv_batch_get(
keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None
) -> TensorDict:

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming the public kwarg from fields to select_fields in kv_batch_get is a breaking change for any external callers using keyword arguments. Consider keeping fields as a deprecated alias (e.g., accept both and error if both are set) to preserve backwards compatibility while transitioning to select_fields.

Copilot uses AI. Check for mistakes.
Comment on lines 944 to 946
async def async_kv_batch_get(
keys: list[str] | str, partition_id: str, fields: Optional[list[str] | str] = None
keys: list[str] | str, partition_id: str, select_fields: Optional[list[str] | str] = None
) -> TensorDict:

Copilot AI Mar 25, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the sync API: renaming the kwarg from fields to select_fields in async_kv_batch_get breaks callers that pass fields=. Consider supporting fields as a deprecated alias (and/or emitting a DeprecationWarning) to avoid a hard break.

Copilot uses AI. Check for mistakes.
Comment thread transfer_queue/interface.py
Comment thread transfer_queue/interface.py
@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>

# Conflicts:
#	tests/e2e/test_kv_interface_e2e.py
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
@ascend-robot

Copy link
Copy Markdown

CLA Signature Pass

0oshowero0, thanks for your pull request. All authors of the commits have signed the CLA. 👍

@0oshowero0 0oshowero0 merged commit 28b846b into Ascend:main Mar 25, 2026
6 checks passed
vermouth1992 pushed a commit to vermouth1992/TransferQueue that referenced this pull request Mar 28, 2026
…cend#57)

## 📝 Description
This PR aims to improve the usability and flexibility of the KV
(Key-Value) API. It optimizes how `KVBatchMeta` is handled during read
and write operations, refactors some internal naming for clarity, and
adds unit test coverage for the asynchronous interfaces to ensure
stability.

## 🔄 What's Changed
1. **Enhanced `kv_put`**: The `kv_put` operation now directly returns a
`KVBatchMeta` object containing all the available fields after the put
operation.
2. **Simplified `kv_get`**: Provide an additional API
`(async_)kv_batch_get_by_meta` that supports taking a `KVBatchMeta`
object directly as input, simplifying the calling logic.
3. **Flexible `KVBatchMeta` Fields**: Introduced public interfaces that
allow for more flexible and customizable field selection within
`KVBatchMeta`.
4. **Naming Refactor**: Refactored the `fields` variables to
`select_fields` during `kv_batch_get` to semantically distinguish from
the `fields` (real data TensorDict) in `kv_put`.
5. **Async UT Coverage**: Added UT specifically for the async KV
interfaces to ensure robustness.

---------

Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: Chi Zhang <czhangseu@gmail.com>
0oshowero0 pushed a commit that referenced this pull request Mar 28, 2026
…rollout and add CI (#63)

## Summary

- **Rewrite `single_controller_demo.py`** to showcase a realistic
agentic RLHF training loop: replaces the
previous hardcoded tensor inputs and `AsyncvLLMServer` with a multi-turn
`AgentLoop` that interleaves
LLM generation with simulated tool calls, an OpenAI-style
`MessageDataset` with interleaved image
support, a proper `DataLoader`-based training loop, and an explicit
`compute_reward` step that writes
  advantages back through TQ.
- **Adopt the new KV APIs** introduced in #57 throughout the recipe —
uses `kv_batch_get_by_meta`,
the return value of `kv_batch_put` (cumulative `KVBatchMeta`), and
removes all manual
  `kv_meta.fields.append(...)` calls.
- **Add structured configuration** via `@dataclass` classes
(`TrainerConfig`, `AgentLoopConfig`,
`MessageDatasetConfig`) and `argparse` CLI, replacing the flat
`OmegaConf.create` dict. Trainer config
  and TQ config are now cleanly separated.
- **Add `recipe-check.yml` CI workflow** that runs the demo end-to-end
on every push/PR with reduced
parameters (`--num-samples 8 --global-batch-size 4
--rollout-agent-num-workers 1`) to keep CI fast.

## Key changes

| Area | Before | After |
|------|--------|-------|
| Rollout | `AsyncvLLMServer` (Ray actor, single-turn) | `AgentLoop`
(multi-turn with tool calls, per-sample async) |
| Data | Hardcoded `[[1,2],[3,4],...]` tensors | `MessageDataset` +
`DataLoader` with random multi-modal messages |
| Reward | Simulated inline (`time.sleep`) | `compute_reward()`
producing per-token advantages via TQ |
| KV API | `kv_batch_get(keys=..., fields=...)` + manual field tracking
| `kv_batch_get_by_meta(meta=...)` + `kv_batch_put` return value |
| Config | Flat `OmegaConf` dict | Typed `@dataclass` hierarchy +
`argparse` CLI |
| CI | None | `recipe-check.yml` workflow |

## Test plan

- [ ] Run the recipe locally: `python
recipe/simple_use_case/single_controller_demo.py --num-samples 8
--global-batch-size 4`
- [ ] Verify the new `recipe-check.yml` workflow passes in CI
- [ ] Confirm existing tests (`pytest tests`) still pass

---------

Signed-off-by: Chi Zhang <czhangseu@gmail.com>
@0oshowero0 0oshowero0 deleted the han/api branch April 2, 2026 07:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants