[feat] Add StreamingTokenBudgetSampler for token-budget streaming fetch #3
Merged
9147a0b to 739506f
Introduce StreamingTokenBudgetSampler and wire up the controller, client,
and storage managers to support a token-budget fetch mode for fully-async /
dynamic-batch consumers:
- get_metadata/get_meta accept token_budget (mutually exclusive with
batch_size); the controller polls the streaming sampler instead of waiting
for N ready samples.
- user_custom_meta is written before samples are marked ready, so streaming
consumers never observe a ready sample without its custom_meta.
- async_put accepts inline custom_meta that lands atomically with readiness,
avoiding the put/set_custom_meta round-trip and race.
Signed-off-by: 宁本哲 <ningbenzhe@xiaohongshu.com>
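A minimal sketch of the token-budget fetch mode described above, not the PR's actual implementation. The helper names `validate_fetch_args` and `pick_within_budget` are hypothetical, as are two behaviours the commit message does not specify: that exactly one of `batch_size` / `token_budget` must be passed, and that a single oversized sample is still returned so it cannot stall the stream.

```python
from typing import Optional


def validate_fetch_args(batch_size: Optional[int], token_budget: Optional[int]) -> str:
    """Return the fetch mode; the two knobs are mutually exclusive.

    Assumption: passing neither is also rejected in this sketch.
    """
    if (batch_size is None) == (token_budget is None):
        raise ValueError("pass exactly one of batch_size or token_budget")
    return "batch" if batch_size is not None else "token_budget"


def pick_within_budget(ready: list[tuple[int, int]], token_budget: int) -> list[int]:
    """Greedily take ready samples, in order, while they fit in the budget.

    ready: (sample_index, token_count) pairs in arrival order.
    Assumption: the first ready sample is always taken, even if it alone
    exceeds the budget, so an oversized sample never blocks the consumer.
    """
    picked: list[int] = []
    used = 0
    for idx, n_tokens in ready:
        if picked and used + n_tokens > token_budget:
            break
        picked.append(idx)
        used += n_tokens
    return picked
```

In this sketch a controller would call `pick_within_budget` each time it polls, instead of blocking until N samples are ready; an empty result simply means nothing is ready yet.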
739506f to f98d76a
5f2de10 to 7d5979c
- Track per-partition completion via the producer's is_last signal instead of
pre-allocating to global_batch_size: add production_completed,
actual_sample_count, pending_last_indexes/fields on DataPartitionStatus
- Thread is_last through put -> get_meta(insert) -> get_metadata; accumulate
the true inserted sample count on the insert path only
- Flip production_completed only once the is_last batch is produced for the
PRODUCER's own fields (not downstream-backfilled columns), avoiding a
completion deadlock when advantages/ref backfill extra columns
- Add check_stream_drained (production_completed AND all inserted samples
consumed) and check_production_completed (producer-side gate) ZMQ ops +
client wrappers
- Pad/grow the lazily-sized per-task consumption tensor in is_stream_drained
(read-only) and mark_consumed (write) so a high global index never overruns
it (was: IndexError that killed the controller request thread)
- Re-run the completion check right after the insert sets has_pending_last, in
case the is_last batch's production NOTIFY arrived before the insert RPC
- At EOS slice each DP bucket by token budget over successive batch_index
rounds (not one oversized pop -> OOM); cache an explicit empty result for an
empty DP so batch_index alignment is not frozen and residue is never stranded
---
- production_completed gating, backfilled-field independence, OOB consumption
tensor, notify-before-is_last race, EOS multi-round drain within token budget
Signed-off-by: 宁本哲 <ningbenzhe@xiaohongshu.com>
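The drained check, the out-of-bounds fix, and the EOS multi-round drain can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code in this PR: `PartitionSketch` and `drain_rounds` are hypothetical names, a plain list of booleans stands in for the per-task consumption tensor, and returning a single empty round for an empty DP bucket is one possible reading of "cache an explicit empty result".

```python
from dataclasses import dataclass, field


@dataclass
class PartitionSketch:
    """Hypothetical stand-in for the per-partition status."""
    production_completed: bool = False
    inserted: set[int] = field(default_factory=set)      # global indexes actually inserted
    consumed: list[bool] = field(default_factory=list)   # lazily sized consumption flags

    def _padded(self, size: int) -> list[bool]:
        # Returns a new list padded with False; never mutates self.consumed.
        return self.consumed + [False] * max(0, size - len(self.consumed))

    def mark_consumed(self, indexes: list[int]) -> None:
        # Write path: grow the stored flags so a high global index cannot overrun them.
        if not indexes:
            return
        self.consumed = self._padded(max(indexes) + 1)
        for i in indexes:
            self.consumed[i] = True

    def is_stream_drained(self) -> bool:
        # Read-only path: production finished AND every inserted sample consumed.
        if not self.production_completed:
            return False
        if not self.inserted:
            return True
        flags = self._padded(max(self.inserted) + 1)
        return all(flags[i] for i in self.inserted)


def drain_rounds(bucket: list[tuple[int, int]], token_budget: int) -> list[list[int]]:
    """Split leftover (index, token_count) pairs into rounds that respect the budget.

    Assumption: an empty bucket yields one explicit empty round so that every
    DP rank still advances through the same number of rounds.
    """
    rounds: list[list[int]] = []
    current: list[int] = []
    used = 0
    for idx, n_tokens in bucket:
        if current and used + n_tokens > token_budget:
            rounds.append(current)
            current, used = [], 0
        current.append(idx)
        used += n_tokens
    if current:
        rounds.append(current)
    return rounds or [[]]
```

Keeping the padding in a helper that returns a copy lets the drained check stay read-only while `mark_consumed` remains the only place that resizes the stored flags.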
7d5979c to a447841