Dynamo Server Fixes + Nemotron Parsing PDF Benchmark changes by praateekmahajan · Pull Request #2050 · NVIDIA-NeMo/Curator

praateekmahajan · 2026-06-05T02:13:36Z

Description

Usage

# Add snippet demonstrating usage

Checklist

I am familiar with the Contributing Guide.
New or Existing tests cover these changes.
The documentation is up to date with these changes.

copy-pr-bot · 2026-06-05T02:13:39Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

greptile-apps · 2026-06-05T02:18:16Z

Greptile Summary

This PR extends Dynamo PDF parsing to support a "server mode" by wiring three related changes: a subprocess_env passthrough on DynamoServerConfig, --no-<flag> emission for boolean-False engine kwargs, and conditional omission of --kv-events-config when the hybrid KV cache manager is explicitly enabled. It also enriches per-task metrics in the Nemotron-Parse inference stage and updates the benchmark to aggregate token/page throughput from those task stats.

Dynamo backend: adds subprocess_env: dict[str, str] to DynamoServerConfig and merges it into base_env before worker launch, enabling callers to inject arbitrary env vars (e.g. DYN_TCP_REQUEST_TIMEOUT) across all Dynamo subprocesses.
CLI flag generation: engine_kwargs_to_cli_flags now emits --no-<flag> for boolean False values (previously silently dropped), enabling explicit opt-out for vLLM BooleanOptionalAction arguments like --no-disable-hybrid-kv-cache-manager.
KV events / HMA compatibility: _launch_vllm_worker skips --kv-events-config when disable_hybrid_kv_cache_manager: False is explicitly set, avoiding a vLLM incompatibility.
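The CLI flag generation change can be illustrated with a minimal sketch. This is not the code in infra.py: the function name `sketch_kwargs_to_cli_flags`, the underscore-to-hyphen conversion, and the handling of `None` are assumptions made for illustration. Only the `--no-<flag>` emission for boolean False values is taken from the summary above.

```python
def sketch_kwargs_to_cli_flags(engine_kwargs: dict[str, object]) -> list[str]:
    """Hypothetical stand-in for engine_kwargs_to_cli_flags (illustration only)."""
    flags: list[str] = []
    for key, value in engine_kwargs.items():
        name = key.replace("_", "-")  # assumed naming convention
        if value is True:
            flags.append(f"--{name}")
        elif value is False:
            # Behavior described in this PR: False is no longer dropped.
            flags.append(f"--no-{name}")
        elif value is None:
            continue  # assumption: unset values produce no flag
        else:
            flags.extend([f"--{name}", str(value)])
    return flags


print(sketch_kwargs_to_cli_flags({"disable_hybrid_kv_cache_manager": False}))
```

Under these assumptions, a caller opts out explicitly by passing `False` instead of omitting the key.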

Confidence Score: 4/5

Safe to merge; changes are well-tested and localized to metrics collection, CLI flag generation, and a new env-passthrough knob.

The core logic changes are each covered by new unit tests. The main quality concerns are that engine_kwargs_to_cli_flags emits --no-<flag> for any boolean False kwarg unconditionally, and the benchmark metric prefix is hardcoded to the stage default name, making both brittle in edge cases.

Files to review closely: nemo_curator/core/serve/dynamo/infra.py (unconditional --no-<flag> behavior) and benchmarking/scripts/nemotron_parse_pdf_benchmark.py (hardcoded metric prefix).

Important Files Changed

Filename	Overview
nemo_curator/core/serve/dynamo/infra.py	Adds `--no-<flag>` emission for boolean False engine kwargs; applies unconditionally to all booleans rather than restricting to known BooleanOptionalAction flags.
nemo_curator/core/serve/dynamo/backend.py	Merges `subprocess_env` into `base_env`; reserved keys `ETCD_ENDPOINTS` and `NATS_SERVER` are silently overridden if present in user-supplied env.
nemo_curator/core/serve/dynamo/vllm.py	Skips `--kv-events-config` when `disable_hybrid_kv_cache_manager: False` is set explicitly; adds `explicit_hybrid_kv_cache_manager_enabled` helper with correct `is False` identity check.
nemo_curator/stages/interleaved/pdf/nemotron_parse/inference.py	Adds per-task vLLM metrics (tokens, pages, retries, timing) via `_vllm_metrics_from_outputs`; refactors `_infer_vllm` to return raw outputs and retry count alongside decoded texts.
benchmarking/scripts/nemotron_parse_pdf_benchmark.py	Replaces Parquet-based metrics with token/page throughput aggregated from task metrics; hardcodes a metric prefix that couples tightly to the stage default name field.
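
The `is False` identity check noted for vllm.py matters because a missing key or a falsy non-boolean value should not count as an explicit opt-in. A minimal sketch, assuming the engine kwargs arrive as a plain dict; the function name `hybrid_kv_explicitly_enabled` is hypothetical and stands in for the PR's `explicit_hybrid_kv_cache_manager_enabled` helper, whose actual signature is not shown here:

```python
def hybrid_kv_explicitly_enabled(engine_kwargs: dict[str, object]) -> bool:
    """True only when the caller passed disable_hybrid_kv_cache_manager=False."""
    # `is False` rejects None (key absent) and other falsy values such as 0,
    # which a plain `not value` test would wrongly treat as an opt-in.
    return engine_kwargs.get("disable_hybrid_kv_cache_manager") is False


for kwargs in ({}, {"disable_hybrid_kv_cache_manager": 0},
               {"disable_hybrid_kv_cache_manager": False}):
    print(kwargs, hybrid_kv_explicitly_enabled(kwargs))
```

Only the last of the three cases would cause the worker launch to skip `--kv-events-config`.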

Sequence Diagram

sequenceDiagram
    participant Caller
    participant DynamoBackend
    participant vllm_worker as _launch_vllm_worker
    participant dynamo_vllm as dynamo.vllm subprocess

    Caller->>DynamoBackend: "serve(backend_cfg) subprocess_env={DYN_TCP_REQUEST_TIMEOUT:180}"
    DynamoBackend->>DynamoBackend: merge subprocess_env + ETCD_ENDPOINTS + NATS_SERVER
    DynamoBackend->>vllm_worker: launch(model_config, base_env)
    vllm_worker->>vllm_worker: explicit_hybrid_kv_cache_manager_enabled?
    alt disable_hybrid_kv_cache_manager is False
        vllm_worker-->>dynamo_vllm: skip kv-events-config, emit no-disable-hybrid-kv-cache-manager
    else normal path
        vllm_worker-->>dynamo_vllm: kv-events-config enabled or disabled
    end
    dynamo_vllm-->>DynamoBackend: worker ready

Comments Outside Diff (1)

nemo_curator/core/serve/dynamo/infra.py, lines 107-127

--no-<flag> emitted for all False booleans, not only BooleanOptionalAction args

The docstring says this only applies to vLLM BooleanOptionalAction arguments, but the implementation applies the --no-<flag> pattern to every boolean False value unconditionally. If a caller passes a kwarg like {"disable_log_stats": False} where the underlying vLLM flag only accepts the positive form (no --no-disable-log-stats variant), the subprocess will receive an unrecognized argument and fail to start. This is a correctness contract that the implementation currently cannot enforce — callers must remember which kwargs are safe to set to False.
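
One way to enforce the contract would be an allowlist of kwargs known to have a `--no-` form. The sketch below is an assumption, not part of the PR: `NEGATABLE_FLAGS` and `bool_kwarg_to_flags` are hypothetical names, and the allowlist contains only the one flag this PR mentions. False values outside the allowlist are dropped, which matches the behavior before this change.

```python
# Hypothetical allowlist; a real one would be derived from vLLM's argument parser.
NEGATABLE_FLAGS = frozenset({"disable_hybrid_kv_cache_manager"})


def bool_kwarg_to_flags(
    key: str, value: bool, negatable: frozenset[str] = NEGATABLE_FLAGS
) -> list[str]:
    """Emit --no-<flag> only for kwargs known to accept a negated form."""
    name = key.replace("_", "-")
    if value:
        return [f"--{name}"]
    if key in negatable:
        return [f"--no-{name}"]
    return []  # unknown False kwargs are dropped rather than passed through


print(bool_kwarg_to_flags("disable_hybrid_kv_cache_manager", False))
print(bool_kwarg_to_flags("disable_log_stats", False))
```

Raising an error instead of dropping the value would be stricter; which is preferable depends on how callers are expected to use `False`.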

Reviews (1): Last reviewed commit: "Support Dynamo PDF parsing server mode"

greptile-apps · 2026-06-05T02:18:20Z

+        base_env = {
+            **backend_cfg.subprocess_env,
+            "ETCD_ENDPOINTS": etcd_endpoint,
+            "NATS_SERVER": nats_url,
+        }


Reserved keys in subprocess_env are silently overridden

ETCD_ENDPOINTS and NATS_SERVER are placed after the spread of backend_cfg.subprocess_env, so any user-supplied values for those two keys are silently discarded. Since subprocess_env is a pass-through for custom env vars (e.g., DYN_TCP_REQUEST_TIMEOUT), a user who inadvertently includes one of these reserved keys will observe it having no effect and may spend time debugging. A guard that raises (or at minimum warns) when a reserved key is present in subprocess_env would make the failure mode explicit.
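
A possible shape for such a guard is sketched below. The names `RESERVED_ENV_KEYS` and `build_base_env` are hypothetical, and the example endpoint values are placeholders; only the two reserved keys and the merge order come from the diff above.

```python
RESERVED_ENV_KEYS = frozenset({"ETCD_ENDPOINTS", "NATS_SERVER"})


def build_base_env(
    subprocess_env: dict[str, str], etcd_endpoint: str, nats_url: str
) -> dict[str, str]:
    """Merge user env vars with the reserved keys, rejecting collisions."""
    clashes = sorted(RESERVED_ENV_KEYS.intersection(subprocess_env))
    if clashes:
        raise ValueError(f"subprocess_env must not set reserved keys: {clashes}")
    return {
        **subprocess_env,
        "ETCD_ENDPOINTS": etcd_endpoint,
        "NATS_SERVER": nats_url,
    }


env = build_base_env(
    {"DYN_TCP_REQUEST_TIMEOUT": "180"}, "http://etcd:2379", "nats://nats:4222"
)
print(env)
```

Swapping the `raise` for a logged warning would keep the current override behavior while still making it visible.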

greptile-apps · 2026-06-05T02:18:22Z

+    metric_prefix = "task_nemotron_parse_inference_custom"
+
+    num_valid_pages = task_metrics.get(f"{metric_prefix}.num_valid_pages_sum", 0.0)
+    total_output_tokens = task_metrics.get(f"{metric_prefix}.total_output_tokens_sum", 0.0)


Hardcoded metric prefix assumes the stage's name field is unchanged

metric_prefix = "task_nemotron_parse_inference_custom" encodes the stage name ("nemotron_parse_inference") directly. If a user ever instantiates NemotronParseInferenceStage with a custom name=, or if the class's default is later renamed, task_metrics.get(...) will return 0.0 for every key and the benchmark will silently report zero throughput without any error. Deriving the prefix from the stage object (or a shared constant) would be more robust.
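
A rough sketch of deriving the prefix and failing loudly on a missing key follows. The `task_<stage name>_custom` pattern is inferred from the single hardcoded string above and may not hold in general; `task_metric_prefix` and `read_metric` are hypothetical helpers, not part of the benchmark script.

```python
def task_metric_prefix(stage_name: str) -> str:
    """Assumed pattern, inferred from 'task_nemotron_parse_inference_custom'."""
    return f"task_{stage_name}_custom"


def read_metric(task_metrics: dict[str, float], stage_name: str, metric: str) -> float:
    """Look up an aggregated metric, raising instead of defaulting to 0.0."""
    key = f"{task_metric_prefix(stage_name)}.{metric}_sum"
    if key not in task_metrics:
        raise KeyError(f"metric {key!r} not found; was the stage renamed?")
    return task_metrics[key]


metrics = {"task_nemotron_parse_inference_custom.num_valid_pages_sum": 12.0}
print(read_metric(metrics, "nemotron_parse_inference", "num_valid_pages"))
```

Passing the stage's actual name attribute as `stage_name` would keep the benchmark in step with a renamed stage, and the `KeyError` replaces the silent zero-throughput report.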

Support Dynamo PDF parsing server mode

94ad641

praateekmahajan requested a review from a team as a code owner June 5, 2026 02:13

praateekmahajan requested review from ayushdg and removed request for a team June 5, 2026 02:13

praateekmahajan changed the title from "Support Dynamo PDF parsing server mode" to "Dynamo Server Fixes + Nemotron Parsing PDF Benchmark changes" Jun 5, 2026

greptile-apps Bot reviewed Jun 5, 2026

Dynamo Server Fixes + Nemotron Parsing PDF Benchmark changes #2050
praateekmahajan wants to merge 1 commit into NVIDIA-NeMo:main from praateekmahajan:praateek/pdf-parsing-dynamo
