Reduce downstream Megatron patching for RL use cases #4590

@sbhavani

Description

Some Megatron Core features are difficult to use from external RL training loops without copying or monkey-patching GPTModel.forward, the GPT postprocess step, the MTP postprocess step, or the 1F1B schedule plan.

This usually happens when the training loop owns data or semantics that Megatron Core should not model directly: selected-token labels, loss masks, packed sequence metadata, old/reference logprobs, KL/entropy terms, or custom fused logprob/loss computation.

Current downstream symptoms

  • veRL patches/copies GPT/MTP postprocess logic:
    • verl/models/mcore/mtp_patch.py
    • verl/models/mcore/model_forward_fused.py
    • verl/models/mcore/model_forward_1f1b_overlap.py
  • NeMo RL has patch_gpt_model_forward_for_linear_ce_fusion(...) in nemo_rl/distributed/model_utils.py, which monkey-patches GPTModel.forward to return selected-token logprobs from hidden states and output weights; an unfused sketch of that computation follows this list.
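
For orientation, this is the quantity such patches compute: per-token logprobs of the selected tokens, taken directly from decoder hidden states and the (possibly tied) output embedding weight. This is an illustrative, unfused sketch, not the NeMo RL or veRL code; real patches use fused kernels precisely to avoid materializing the full-vocab logits tensor spelled out here.

```python
import torch

def selected_token_logprobs(hidden_states: torch.Tensor,
                            output_weight: torch.Tensor,
                            target_tokens: torch.Tensor) -> torch.Tensor:
    """Unfused reference: logprobs of selected tokens from hidden states.

    hidden_states: [seq, batch, hidden]   decoder output
    output_weight: [vocab, hidden]        (possibly tied) output embedding
    target_tokens: [seq, batch]           tokens whose logprobs are needed
    """
    # Fused implementations never materialize this [seq, batch, vocab]
    # tensor; it appears here only to make the computation explicit.
    logits = torch.einsum('sbh,vh->sbv', hidden_states, output_weight)
    logprobs = torch.log_softmax(logits.float(), dim=-1)
    return torch.gather(logprobs, -1, target_tokens.unsqueeze(-1)).squeeze(-1)
```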

This indicates that external training loops need a stable, objective-neutral extension point at the GPT postprocess boundary.

Proposed direction

Add a small, optional, keyword-only GPT output/postprocess hook. The hook should run after decoder hidden states are available and before the default output-layer logits/loss path, so that PPO/GRPO/RL-specific arguments never need to be added to GPTModel.forward.
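
A minimal sketch of the shape this could take, using a toy module in place of GPTModel. The hook name output_processor and its signature are illustrative only, not a committed API:

```python
from typing import Callable, Optional
import torch

class TinyGPT(torch.nn.Module):
    """Toy stand-in for GPTModel; illustrates hook placement only."""

    def __init__(self, vocab: int, hidden: int):
        super().__init__()
        self.decoder = torch.nn.Embedding(vocab, hidden)  # stands in for the transformer stack
        self.output_layer = torch.nn.Linear(hidden, vocab, bias=False)

    def forward(self, input_ids: torch.Tensor, *,
                labels: Optional[torch.Tensor] = None,
                output_processor: Optional[Callable] = None):
        hidden_states = self.decoder(input_ids)  # decoder path unchanged
        if output_processor is not None:
            # Runs after hidden states, before the default logits/loss path.
            # The RL loop binds labels, loss masks, packed-sequence metadata,
            # old/reference logprobs, etc. into the callable via a closure,
            # so forward never grows objective-specific arguments.
            return output_processor(hidden_states, self.output_layer.weight)
        logits = self.output_layer(hidden_states)  # default path unchanged
        if labels is None:
            return logits
        return torch.nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), labels.view(-1))
```

A training loop would bind its own tensors into the callable (e.g. with functools.partial), keeping all objective semantics downstream while Megatron Core stays objective-neutral.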

Schedule-plan support

Thread the same optional processor/context through build_schedule_plan and the 1F1B schedule-plan PostProcessNode.
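
A sketch of the threading idea; the real build_schedule_plan and PostProcessNode have different shapes, so treat these names and signatures as placeholders:

```python
from typing import Callable, Optional
import torch

class PostProcessNode:
    """Placeholder for the 1F1B schedule-plan postprocess node."""

    def __init__(self, output_weight: torch.Tensor,
                 output_processor: Optional[Callable] = None):
        self.output_weight = output_weight
        self.output_processor = output_processor  # same hook, threaded through the plan

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        if self.output_processor is not None:
            # Identical contract to the GPTModel.forward hook, so fused
            # logprob/loss logic also works under the overlapped schedule.
            return self.output_processor(hidden_states, self.output_weight)
        return hidden_states @ self.output_weight.t()  # default logits path
```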

MTP follow-up

Handle MTP separately if needed. First investigate whether MTP can expose a narrow callable for custom loss/logprob computation while Megatron Core continues to own MTP shifting, packed-sequence handling, scaling, and logging behavior.
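
If that investigation pans out, the contract could be as narrow as a per-position loss/logprob callable. The names below are hypothetical, intended only to show how small the delegated surface could be:

```python
from typing import Callable
import torch

# Hypothetical contract: Megatron Core keeps token shifting, packed-sequence
# handling, loss scaling, and logging; only this computation is delegated.
MTPLossFn = Callable[[torch.Tensor, torch.Tensor, torch.Tensor], torch.Tensor]
# (mtp_logits [seq, batch, vocab], shifted_labels [seq, batch],
#  loss_mask [seq, batch]) -> per-token loss [seq, batch]

def default_mtp_loss(logits: torch.Tensor, labels: torch.Tensor,
                     loss_mask: torch.Tensor) -> torch.Tensor:
    """Reference behavior a custom RL callable would replace."""
    per_token = torch.nn.functional.cross_entropy(
        logits.permute(1, 2, 0),   # [batch, vocab, seq]
        labels.t(),                # [batch, seq]
        reduction='none').t()      # back to [seq, batch]
    return per_token * loss_mask
```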
