target/mips: don't pass cpu_gpr[0] to TCG ops for $zero writes #6

Merged: lacraig2 merged 1 commit into main from fix/mips-zero-reg-hypercall-guard on Jun 10, 2026

@lacraig2

Contributor

Summary

The Penguin guest-hypercall hook in target/mips/tcg/translate.c replaced two upstream guards that treated writes to $zero as NOPs. On MIPS cpu_gpr[0] is NULL ($zero is special-cased and never allocated as a TCG global — see mips_tcg_init()), so both changed paths now hand a NULL TCGv to the code generator:

  • gen_cond_move() — the if (rd == 0) return; NOP guard was narrowed to only the movz $0,$0,$0 hypercall trigger, so movz/movn $0, rs, rt falls through to tcg_gen_movcond_tl(..., cpu_gpr[0], ...).
  • gen_cp0() / OPC_MFC0 — the if (rt == 0) return; NOP guard was removed, so mfc0 $0, rd falls through to gen_mfc0(ctx, cpu_gpr[0], ...).

A NULL TCGv does not fault at translation time; it produces a garbage temp index, so the translated block silently corrupts unrelated TCG temps / guest registers. The result is intermittent wild control-flow transfers in the guest — observed as random SIGSEGVs in large dynamically-linked programs (e.g. CPython), with the trigger being the guest kernel's privileged mfc0 $0 CP0 hazard reads.

Diagnosis

Captured guest core dumps (mipsel and mips64el) showed a consistent signature: PC jumped to an unmapped address in the main executable's region, ra clobbered to page-aligned garbage, while t9/gp were still valid. The crash never reproduces under upstream qemu-user (which retains the guards). Only MIPS is affected: RISC-V and LoongArch route their hypercall results through dest_gpr()/gen_set_gpr() helpers that already handle the zero register, whereas MIPS indexes cpu_gpr[] directly.
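The difference between helper-based access and direct indexing can be illustrated with a small standalone model. This is only a sketch: `ToyReg`, `toy_direct()`, `toy_dest()`, `toy_set()` and `toy_get()` are hypothetical names invented for illustration, not the QEMU helpers or their signatures. It assumes only what is stated above: the slot for the zero register is NULL when indexed directly, while a helper-style accessor never hands back NULL and discards writes to register 0.

```c
#include <stddef.h>

/* Toy stand-in for a register-backed TCG value; the real TCGv is opaque. */
typedef struct { long value; } ToyReg;

static ToyReg toy_regs[32];
static ToyReg toy_scratch;   /* throwaway destination for $zero writes */

/* Direct indexing (hypothetical model of cpu_gpr[reg]): slot 0 is NULL. */
static ToyReg *toy_direct(int reg)
{
    return reg == 0 ? NULL : &toy_regs[reg];
}

/* Helper-style destination: never NULL; register 0 maps to a scratch
 * value whose contents are simply ignored afterwards. */
static ToyReg *toy_dest(int reg)
{
    return reg == 0 ? &toy_scratch : &toy_regs[reg];
}

/* Helper-style store: writes to register 0 are dropped. */
static void toy_set(int reg, long v)
{
    if (reg != 0) {
        toy_regs[reg].value = v;
    }
}

/* Register 0 always reads as zero. */
static long toy_get(int reg)
{
    return reg == 0 ? 0 : toy_regs[reg].value;
}
```

In this model a caller that uses `toy_dest()`/`toy_set()` can never pass a NULL destination onward, whereas a caller that uses `toy_direct()` has to guard `reg == 0` itself.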

Fix

Restore the NOP guards after the hypercall check, which matches the pre-hook upstream behavior. The change is 18 lines and touches only MIPS.
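The guard ordering described in the fix can be sketched as a standalone toy decoder for the conditional-move path. This is not the actual translate.c code: `ModelReg`, `ModelEmit`, `model_gpr()`, `model_cond_move_narrowed()` and `model_cond_move_fixed()` are hypothetical names, and the only behavior assumed is what the summary states (the hypercall trigger is `movz $0,$0,$0`, and the slot for register 0 is NULL).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a TCG global register. */
typedef struct { long value; } ModelReg;

static ModelReg model_regs[32];

/* What the toy translator would emit for one instruction. */
typedef enum {
    MODEL_NOP,        /* nothing emitted */
    MODEL_HYPERCALL,  /* hypercall hook emitted */
    MODEL_MOVE,       /* conditional move into a valid destination */
    MODEL_NULL_DEST   /* a NULL destination would reach the code generator */
} ModelEmit;

/* Models direct indexing of the register array: slot 0 is NULL. */
static ModelReg *model_gpr(int reg)
{
    return reg == 0 ? NULL : &model_regs[reg];
}

/* Narrowed guard: only the all-zero trigger is caught, so any other
 * instruction with rd == 0 falls through with a NULL destination. */
static ModelEmit model_cond_move_narrowed(int rd, int rs, int rt)
{
    if (rd == 0 && rs == 0 && rt == 0) {
        return MODEL_HYPERCALL;
    }
    return model_gpr(rd) ? MODEL_MOVE : MODEL_NULL_DEST;
}

/* Fixed ordering: hypercall check first, then the restored NOP guard. */
static ModelEmit model_cond_move_fixed(int rd, int rs, int rt)
{
    if (rd == 0 && rs == 0 && rt == 0) {
        return MODEL_HYPERCALL;
    }
    if (rd == 0) {
        return MODEL_NOP;            /* write to $zero: treat as NOP */
    }
    assert(model_gpr(rd) != NULL);   /* holds for every rd != 0 */
    return MODEL_MOVE;
}
```

In this model, `model_cond_move_narrowed(0, 4, 5)` yields `MODEL_NULL_DEST` while `model_cond_move_fixed(0, 4, 5)` yields `MODEL_NOP`, and both still recognize the all-zero encoding as the hypercall trigger. Placing the NOP guard after the trigger check is what keeps the hook reachable.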

The Penguin guest-hypercall hook replaced two upstream guards that treated
writes to $zero as NOPs:

  - gen_cond_move(): the 'if (rd == 0) return;' NOP guard was narrowed to
    only the movz $0,$0,$0 hypercall trigger, so movz/movn $0, rs, rt now
    falls through to tcg_gen_movcond_tl(..., cpu_gpr[0], ...).
  - gen_cp0()/OPC_MFC0: the 'if (rt == 0) return;' NOP guard was removed, so
    mfc0 $0, rd now falls through to gen_mfc0(ctx, cpu_gpr[0], ...).

On MIPS cpu_gpr[0] is NULL ($zero is special-cased and never allocated as a
TCG global), so both paths hand a NULL TCGv to the code generator. That does
not fault at translation time; it yields a garbage temp index and silently
corrupts unrelated TCG temps / guest registers, producing intermittent wild
control-flow transfers in the guest (observed as random SIGSEGVs in large
dynamically-linked programs such as CPython, triggered by the kernel's
privileged 'mfc0 $0' CP0 hazard reads).

Restore the NOP guards after the hypercall check. RISC-V and LoongArch route
their hypercall results through dest_gpr()/gen_set_gpr() helpers that already
handle the zero register, so only MIPS (which indexes cpu_gpr[] directly) is
affected.
@lacraig2 lacraig2 merged commit 268e68e into main Jun 10, 2026
1 check passed
@lacraig2 lacraig2 deleted the fix/mips-zero-reg-hypercall-guard branch June 10, 2026 01:06