
Test pr2 #1506

Open

HAOCHENYE wants to merge 2 commits into InternLM:main from HAOCHENYE:test-pr2

Conversation

@HAOCHENYE
Collaborator

No description provided.

ghstack-source-id: 3b810e7
Pull-Request: InternLM#1484
@HAOCHENYE
Collaborator Author

@claude



# Like Black, automatically detect the appropriate line ending.
line-ending = "auto"
111

Claude: Critical: This line (111) appears to be a test artifact from the first commit. It will break TOML parsing since it's not valid TOML syntax. This must be removed before merging.

Suggested change
111

router_logits: dict[str, torch.Tensor] | None = None
balancing_loss: torch.Tensor | None = None
z_loss: torch.Tensor | None = None
tokens_per_expert_global: torch.Tensor

Claude: Critical: tokens_per_expert_global is declared as required (no default value), but the original TypedDict had it as NotRequired[torch.Tensor].

In _micro_batch_forward (line ~463), tokens_per_expert_global is only added to the output dict inside the if all_router_logits: block. If that condition is false, constructing MoEModelOutputs(**output) will raise a Pydantic ValidationError because this required field is missing.

This should be optional to match the original behavior:

Suggested change
tokens_per_expert_global: torch.Tensor
tokens_per_expert_global: torch.Tensor | None = None

Comment on lines +87 to +89
def free(self):
super().free()
self.router_logits = None

Claude: Warning: MoEModelOutputs.free() only clears router_logits but not balancing_loss, z_loss, or tokens_per_expert_global. For consistency with ModelOutputs.free() (which clears all its tensor fields), all MoE-specific tensor fields should also be freed:

Suggested change
def free(self):
super().free()
self.router_logits = None
def free(self):
super().free()
self.router_logits = None
self.balancing_loss = None
self.z_loss = None
self.tokens_per_expert_global = None

Comment on lines +213 to +214
# TODO: Only for avoid BC. Should be removed later.
def __contains__(self, key):

Claude: Warning: __contains__ uses model_fields_set which only tracks fields explicitly passed during construction. This works correctly for the current code patterns (where fields are conditionally added to the output dict before construction), but note this is semantically different from checking hasattr or checking if a value is non-None.

For example, MoEModelOutputs(loss=loss, extra_info=info, tokens_per_expert_global=tpe, balancing_loss=None) would report "balancing_loss" in output as True (because it was explicitly passed), even though the value is None. This may be surprising to callers who expect in to indicate a meaningful value is present. Consider documenting this behavior or checking getattr(self, key) is not None instead.

Comment on lines 198 to +201

# TODO: yanhuida, should be removed
if not isinstance(extra_info, ModelForwardExtraLogInfo):
extra_info = ModelForwardExtraLogInfo(extra_info)

Claude: Warning: This TODO-guarded workaround wraps extra_info in ModelForwardExtraLogInfo if it isn't already one. Two concerns:

  1. This introduces a circular-style dependency: loss/base_loss_ctx.py now imports from model/utils/misc.py. If extra_info can arrive as a plain dict here, the root cause should be fixed at the source (where extra_info is produced), not patched at the consumer.
  2. The # TODO: yanhuida, should be removed comment doesn't explain when or under what condition this should be removed, making it easy to forget. Consider adding a reference to a tracking issue.

Comment on lines 11 to +13
types: [submitted]
pull_request_target:
types: [opened, synchronize]

Claude: Warning — Security: pull_request_target runs in the context of the base branch and has access to repository secrets, unlike the safer pull_request event. While the Claude Code action uses restricted allowed_tools, the pull_request_target trigger is a known attack vector for PRs from untrusted forks — the PR author could craft malicious content in the PR title, body, or diff that influences the Claude agent's behavior in a context where secrets are available.

Consider whether pull_request (which does NOT have access to secrets) is sufficient for code review, or add explicit if: github.event.pull_request.head.repo.full_name == github.repository guards to limit this to same-repo PRs only.
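One possible shape for that guard (a sketch only; `claude-review` is an assumed job name, adapt it to the actual workflow):

```yaml
# Sketch: keep pull_request_target but refuse to run for fork PRs,
# so untrusted heads never execute in a secrets-bearing context.
jobs:
  claude-review:
    if: github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
```

For same-repo PRs the condition is true and the job runs as before; for fork PRs the job is skipped entirely.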

prompt: |
REPO: ${{ github.repository }}
PR NUMBER: ${{ github.event.pull_request.number }}
PR NUMBER: ${{ github.event.pull_request.number || github.event.issue.number}}

Claude: Nit: Missing space before closing }}:

Suggested change
PR NUMBER: ${{ github.event.pull_request.number || github.event.issue.number}}
PR NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}

@claude

claude bot commented Feb 28, 2026

Claude: ## Summary

This PR converts ModelOutputs from TypedDict to Pydantic BaseModel and adds __getitem__/__contains__ for backward compatibility with dict-style access used throughout the codebase.

Issues

Critical

  • [pyproject.toml:133] 111 appended to the file — test artifact that will break TOML parsing.
  • [xtuner/v1/model/moe/moe.py:85] tokens_per_expert_global changed from NotRequired to required (no default). Will raise ValidationError when the field is not provided in _micro_batch_forward.

Warning

  • [xtuner/v1/model/moe/moe.py:87-89] MoEModelOutputs.free() only clears router_logits but not balancing_loss, z_loss, or tokens_per_expert_global.
  • [xtuner/v1/model/base.py:213-214] __contains__ uses model_fields_set — semantically different from None-checking; may surprise callers.
  • [xtuner/v1/loss/base_loss_ctx.py:198-201] TODO workaround should be fixed at source, not patched at consumer.
  • [.github/workflows/claude.yml:11-13] pull_request_target has access to secrets — security risk for fork PRs.

Nit

  • [.github/workflows/claude.yml:44] Missing space before }}.

Verdict

REQUEST_CHANGES
