[wip][nightly] RAPIDS 26.08* / Ray 3* / Dynamo 1.3* + bump transformers 5 + data-designer 0.61 by praateekmahajan · Pull Request #2065 · NVIDIA-NeMo/Curator

praateekmahajan · 2026-06-11T00:03:36Z

Tracks dynamo/ray/RAPIDS nightlies along with transformers and data-designer at their newest releases, so a weekly benchmark surfaces upstream breakage early. Migrates Curator to the new APIs and clears accumulated CVE constraint/override tech-debt.

Dependencies (pyproject.toml, uv.lock):

RAPIDS cudf/cuml/cugraph/raft/rmm/rapidsmpf -> 26.08 nightly (a*) from the rapids-nightly index; transitive nightly libs listed explicitly so prerelease="if-necessary-or-explicit" stays scoped (no stray PyPI betas).
Add cudf-streaming-cu12 (partition_and_pack/unpack_and_concat moved here out of rapidsmpf in 26.08).
transformers>=5,<6 override (defeats nemo-toolkit[asr]'s 4.57 pin), huggingface-hub>=1.5,<2, packaging>=25, pandas>=3 overrides.
data-designer 0.5.5 -> 0.6.1.
Drop the huggingface-hub<1.0 override and the numpy<=2.2 / protobuf<7 caps.
Remove all 12 CVE constraint floors (verified redundant: the nightly stack already resolves at/above every CVE fix).

transformers 5:

batch_encode_plus -> __call__ (text/models/tokenizer.py, text/embedders/vllm.py, text/io/writer/megatron_tokenizer.py).
data_designer: add __deepcopy__ so Xenna pipeline_spec deepcopy survives hf-hub>=1.0 caching an unpickleable DuckDBPyConnection.
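The deepcopy fix can be illustrated with a minimal, standard-library-only sketch. It is not the Curator implementation: `StageWithConnection` is a hypothetical class, and `sqlite3.Connection` stands in for `duckdb.DuckDBPyConnection` because it is likewise rejected by `copy.deepcopy` by default. One common approach, assumed here, is to copy every attribute except the live handle and give the clone a fresh handle.

```python
import copy
import sqlite3


class StageWithConnection:
    """Hypothetical stand-in for a stage that caches a live, uncopyable handle."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name
        # Assumption: sqlite3.Connection plays the role of DuckDBPyConnection.
        # copy.deepcopy raises TypeError on it unless __deepcopy__ is defined.
        self._conn = sqlite3.connect(":memory:")

    def __deepcopy__(self, memo: dict) -> "StageWithConnection":
        cls = self.__class__
        clone = cls.__new__(cls)
        # Register the clone first so reference cycles resolve to it.
        memo[id(self)] = clone
        for key, value in self.__dict__.items():
            if key == "_conn":
                # Do not copy the live handle; open a fresh one for the clone.
                clone._conn = sqlite3.connect(":memory:")
            else:
                setattr(clone, key, copy.deepcopy(value, memo))
        return clone


# A pipeline-spec-like container holding the stage can now be deep-copied.
stage = StageWithConnection("demo")
spec = {"stages": [stage]}
cloned_spec = copy.deepcopy(spec)
```

The clone shares no connection with the original, so closing one does not affect the other.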

cuml 26.08:

semantic dedup KMeans -> cuml.cluster.kmeans_mg.KMeansMG (single-GPU KMeans dropped handle=; private _fit(multigpu=True) removed -> KMeansMG.fit()).

rapidsmpf 26.08 (deduplication/shuffle_utils/rapidsmpf_shuffler.py):

imports -> memory.{buffer,buffer_resource,spill} + integrations.ray.RapidsMPFActor.
BufferResource(memory_limits={DEVICE:int}, statistics=...); Statistics(enable=) (dropped mr); direct Shuffler(comm,0,nparts,br); insert_finished() once; wait()+local_partitions(); inline cudf<->pylibcudf helpers (utils.cudf removed), re-exported and repointed lsh.py.

cugraph 26.08:

connected_components: symmetrize=False -> True (cugraph honors the flag literally; the one-directional dedup edge-list must be symmetrized).
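Why a one-directional edge list needs symmetrizing can be shown with a small pure-Python sketch. It does not use cugraph and says nothing about cugraph internals; `symmetrize` and `reachable` are hypothetical helpers written only for this illustration. A traversal that follows edges strictly as given misses nodes that are linked only by an incoming edge.

```python
from collections import defaultdict, deque


def symmetrize(edges):
    """Return the edge list with every (src, dst) also present as (dst, src)."""
    seen = set()
    out = []
    for src, dst in edges:
        for edge in ((src, dst), (dst, src)):
            if edge not in seen:
                seen.add(edge)
                out.append(edge)
    return out


def reachable(edges, start):
    """Breadth-first search that follows edges only in the direction given."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited


# Documents 1 and 3 are both duplicates of 2, stored one-directionally.
edges = [(1, 2), (3, 2)]
one_way = reachable(edges, 1)               # misses node 3
two_way = reachable(symmetrize(edges), 1)   # finds the whole duplicate group
```

With the edges taken literally, node 3 is never reached from node 1; after symmetrizing, all three documents land in one group.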

Tests: test_kmeans _fit->fit mock; test_minhash values_host->to_numpy.

Build/test via main docker/Dockerfile (CURATOR_EXTRA=all --all-groups); full pytest cpu+gpu together.

Checklist

I am familiar with the Contributing Guide.
New or Existing tests cover these changes.
The documentation is up to date with these changes.

….6.1

Extends the dynamo/ray/vLLM-cu129 nightly baseline to also track RAPIDS, transformers and data-designer at their newest releases, so a weekly benchmark surfaces upstream breakage early. Migrates Curator to the new APIs and clears accumulated CVE constraint/override tech-debt.

Dependencies (pyproject.toml, uv.lock):

- RAPIDS cudf/cuml/cugraph/raft/rmm/rapidsmpf -> 26.08 nightly (a*) from the rapids-nightly index; transitive nightly libs listed explicitly so prerelease="if-necessary-or-explicit" stays scoped (no stray PyPI betas).
- Add cudf-streaming-cu12 (partition_and_pack/unpack_and_concat moved here out of rapidsmpf in 26.08).
- transformers>=5,<6 override (defeats nemo-toolkit[asr]'s 4.57 pin), huggingface-hub>=1.5,<2, packaging>=25, pandas>=3 overrides.
- data-designer 0.5.5 -> 0.6.1.
- Drop the huggingface-hub<1.0 override and the numpy<=2.2 / protobuf<7 caps.
- Remove all 12 CVE constraint floors (verified redundant: the nightly stack already resolves at/above every CVE fix).

transformers 5:

- batch_encode_plus -> __call__ (text/models/tokenizer.py, text/embedders/vllm.py, text/io/writer/megatron_tokenizer.py).
- data_designer: add __deepcopy__ so Xenna pipeline_spec deepcopy survives hf-hub>=1.0 caching an unpickleable DuckDBPyConnection.

cuml 26.08:

- semantic dedup KMeans -> cuml.cluster.kmeans_mg.KMeansMG (single-GPU KMeans dropped handle=; private _fit(multigpu=True) removed -> KMeansMG.fit()).

rapidsmpf 26.08 (deduplication/shuffle_utils/rapidsmpf_shuffler.py):

- imports -> memory.{buffer,buffer_resource,spill} + integrations.ray.RapidsMPFActor.
- BufferResource(memory_limits={DEVICE:int}, statistics=...); Statistics(enable=) (dropped mr); direct Shuffler(comm,0,nparts,br); insert_finished() once; wait()+local_partitions(); inline cudf<->pylibcudf helpers (utils.cudf removed), re-exported and repointed lsh.py.

cugraph 26.08:

- connected_components: symmetrize=False -> True (cugraph honors the flag literally; the one-directional dedup edge-list must be symmetrized).

Tests: test_kmeans _fit->fit mock; test_minhash values_host->to_numpy.

Build/test via main docker/Dockerfile (CURATOR_EXTRA=all --all-groups); full pytest cpu+gpu together.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Praateek <praateekm@gmail.com>

copy-pr-bot · 2026-06-11T00:03:40Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

…refresh nightlies

Follow-up to the nightly bump.

docker/Dockerfile:

- Stub ray/dashboard/client/build (its own layer, after uv sync). The ray nightly wheel omits the prebuilt dashboard frontend, so the dashboard process died with FrontendNotFoundError and its HTTP/API server never registered, breaking every ray.util.state call (cosmos-xenna uses it) with "Could not read 'dashboard' from GCS". This was blocking ALL xenna pipeline e2e tests (semantic dedup, data-designer, nemotron-cc NDD). No-op on stable wheels that ship client/build.

data_designer:

- Add __getstate__/__setstate__ (mirror of the __deepcopy__ added in the bump) so Ray can pickle the stage to its actors. The live DataDesigner caches an unpickleable duckdb.DuckDBPyConnection under hf-hub>=1.0; rebuild it on unpickle via __post_init__. Synthetic/NDD suite is green (70/70) with this + the dashboard fix.

pyproject.toml (from all-extras-cu129):

- Route the ray nightly wheel via [tool.uv.sources] per (python, arch) for cp311/12/13 instead of an inline URL in dependencies; the dependency stays a clean ray[default,data]>=2.55.1 (PyPI fallback for non-x86_64).

uv.lock:

- Re-locked with the ray-source change plus a targeted refresh of the nightly packages only (cudf a633->a634, libcudf->a635, rapidsmpf a37->a38, ...). ai-dynamo held at dev20260608: its latest nightlies require an exact ai-dynamo-runtime==<same> that isn't published, so the refresh upgrades ai-dynamo only and lets uv backtrack to the latest consistent pair.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Praateek <praateekm@gmail.com>
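The pickling pattern described in the data_designer item above (drop the live connection in `__getstate__`, rebuild it through `__post_init__` in `__setstate__`) might look roughly like the sketch below. It is an illustration under assumptions, not the actual stage: `PicklableStage` is a hypothetical dataclass, `sqlite3.Connection` stands in for `duckdb.DuckDBPyConnection`, and the standard `pickle` module stands in for Ray's serialization.

```python
import pickle
import sqlite3
from dataclasses import dataclass, field


@dataclass
class PicklableStage:
    """Hypothetical stage that owns a live handle which cannot be pickled."""

    model_name: str
    # Excluded from __init__ and comparisons; created in __post_init__.
    _conn: sqlite3.Connection = field(init=False, repr=False, compare=False)

    def __post_init__(self) -> None:
        # Assumption: sqlite3 stands in for the cached DuckDB connection.
        self._conn = sqlite3.connect(":memory:")

    def __getstate__(self) -> dict:
        # Ship only the plain configuration, never the live handle.
        state = self.__dict__.copy()
        state.pop("_conn", None)
        return state

    def __setstate__(self, state: dict) -> None:
        self.__dict__.update(state)
        # Rebuild the handle on the receiving side.
        self.__post_init__()


original = PicklableStage("demo")
restored = pickle.loads(pickle.dumps(original))
```

The restored object carries the same configuration but opens its own connection, which is what a remote worker process needs.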

praateekmahajan · 2026-06-11T00:11:14Z

/ok to test faf4108

copy-pr-bot Bot temporarily deployed to public June 11, 2026 00:11 Inactive

copy-pr-bot Bot temporarily deployed to test June 11, 2026 00:11 Inactive