[Runtime] Add NVIDIA-Nemotron-3-Super-120B-A12B-FP8 runtime by TJ5 · Pull Request #610 · ome-projects/ome

TJ5 · 2026-05-12T20:02:54Z

What this PR does

Adds OME configuration for serving NVIDIA Nemotron 3 Super 120B A12B FP8 with a 1M context window:
Adds the nvidia-nemotron-3-super-120b-a12b-fp8 ClusterBaseModel pointing to hf://nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8.
Adds the vllm-nvidia-nemotron-3-super-120b-a12b-fp8 ClusterServingRuntime with SMG router + vLLM settings for NemotronHForCausalLM, FP8 KV cache, 4-way tensor parallelism, H100 scheduling, chunked prefill, Nemotron v3 reasoning parsing, and Qwen3 Coder tool-call parsing.
Registers the model and runtime in the kustomizations.
Adds a sample InferenceService for the NVIDIA Nemotron namespace.
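
As a rough illustration of how the runtime settings listed above could map onto a vLLM launch command, here is a minimal sketch. It is not the args list from this PR's ClusterServingRuntime. The helper name build_vllm_args, the /mnt/models path, and the default values are assumptions for illustration. The reasoning parser name and the exact token count for the 1M context window are not stated in this description, so both are left as optional parameters rather than guessed.

```python
from typing import List, Optional


def build_vllm_args(
    model_path: str,
    tensor_parallel_size: int = 4,           # "4-way tensor parallelism" from the PR text
    max_model_len: Optional[int] = None,     # exact value for "1M context" not given here
    reasoning_parser: Optional[str] = None,  # parser name not given here; pass it explicitly
) -> List[str]:
    """Assemble a `vllm serve` command line (hypothetical helper, not part of OME)."""
    args = [
        "vllm", "serve", model_path,
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--kv-cache-dtype", "fp8",            # FP8 KV cache
        "--enable-chunked-prefill",           # chunked prefill
        "--enable-auto-tool-choice",
        "--tool-call-parser", "qwen3_coder",  # Qwen3 Coder tool-call parsing
    ]
    if max_model_len is not None:
        args += ["--max-model-len", str(max_model_len)]
    if reasoning_parser is not None:
        args += ["--reasoning-parser", reasoning_parser]
    return args


if __name__ == "__main__":
    # /mnt/models is an assumed mount path, used only for this example.
    print(" ".join(build_vllm_args("/mnt/models")))
```

The actual flags and values live in config/runtimes/vllm/nvidia/nvidia-nemotron-3-super-120b-a12b-fp8-rt.yaml and may differ from this sketch.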

Why we need it

Enables serving for NVIDIA-Nemotron-3-Super-120B-A12B-FP8.

Fixes #

How to test

Checklist

Tests added/updated (if applicable)
Docs updated (if applicable)
make test passes locally

init

98c4c88

github-actions bot added the runtime (Runtime configuration changes), models (Model configuration changes), and config (Configuration changes) labels on May 12, 2026

TJ5 marked this pull request as ready for review May 12, 2026 20:15

TJ5 requested review from CatherineSue, XinyueZhang369 and slin1237 as code owners May 12, 2026 20:15

YouNeedCryDear reviewed May 12, 2026

Comment thread config/runtimes/vllm/nvidia/nvidia-nemotron-3-super-120b-a12b-fp8-rt.yaml

TJ5 added 3 commits May 12, 2026 14:58

feedback

baa1df5

use tp2 for fp8

0f5d64e

Revert "use tp2 for fp8"

f9ad9e6

This reverts commit 0f5d64e.

TJ5 wants to merge 4 commits into ome-projects:main from TJ5:nvidia-nemotron-super-fp8-runtime

2 participants
