MojoShader vs_3_0 → GLSL: flattened fallback ternary computes inversesqrt(0)=Inf, then 0 * Inf = NaN poisons the selected (else) value
Component: MojoShader (HLSL bytecode → GLSL translator) as shipped in MonoGame DesktopGL.
Severity: Correctness. A normalize() whose argument comes from a "length>0 ? normalized : fallback" ternary returns NaN on OpenGL, silently culling geometry. The byte-identical DirectX (vs_4_0) build is correct.
Status: Reproduced with a 4-line minimal standalone shader (below), root-caused down to the generated GLSL.
1. TL;DR
This extremely common HLSL idiom — normalize a vector but guard the zero-length case with a ternary —
float2 d = seed * flag;                        // can be the zero vector
float len = length(d);
float2 dir = (len > 1e-6) ? (d / len) : seed;  // guard div-by-zero
float2 n = normalize(dir);
returns n = (NaN, NaN) on the OpenGL (vs_3_0/MojoShader) backend whenever the else arm is
taken (i.e. when d is the zero vector), even though dir and n are perfectly finite on
DirectX (vs_4_0) from the same .fx.
Root cause (visible in the generated GLSL, §5): MojoShader flattens the ternary into a
branchless select that evaluates both arms unconditionally. The then arm d / len is emitted as d * inversesqrt(dot(d,d)). When d == 0, dot(d,d) == 0, so inversesqrt(0) == +Inf, and 0 * Inf == NaN. The select is mask*then + (1-mask)*else; with mask == 0 it should yield the else value, but IEEE-754 makes 0 * NaN == NaN, so the NaN from the unused then arm poisons the
result. DirectX/fxc does not hit this (it guards or predicates the reciprocal).
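The arithmetic described above can be replayed on the CPU. The following is a minimal Python sketch, not MojoShader's actual output: `rsqrt` and `flattened_select` are hypothetical helper names, and the select uses the mask*then + (1-mask)*else form quoted above. Python floats are IEEE-754 doubles, so the Inf/NaN rules are the same ones the shader hits.

```python
import math

def rsqrt(x: float) -> float:
    """GPU-style inversesqrt (assumed behavior): +Inf at 0 instead of raising."""
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)

def flattened_select(mask: float, then_val: float, else_val: float) -> float:
    """Branchless select: both arms are multiplied, even the unselected one."""
    return mask * then_val + (1.0 - mask) * else_val

seed = (0.6, 0.8)
flag = 0.0

d = (seed[0] * flag, seed[1] * flag)                 # zero vector
len_d = math.sqrt(d[0] * d[0] + d[1] * d[1])         # 0.0
inv_len = rsqrt(d[0] * d[0] + d[1] * d[1])           # inversesqrt(0) == +Inf
then_arm = (d[0] * inv_len, d[1] * inv_len)          # 0 * Inf == NaN
mask = 1.0 if len_d > 1e-6 else 0.0                  # 0.0: else arm selected

# Flattened lowering: 0 * NaN is still NaN, so the dead arm poisons the result.
dir_flat = tuple(flattened_select(mask, t, e) for t, e in zip(then_arm, seed))

# A real branch never touches the dead arm and returns the fallback unchanged.
dir_branch = then_arm if mask != 0.0 else seed

print(dir_flat, dir_branch)
```

Here `dir_flat` has NaN in both components while `dir_branch` is the untouched `seed`, which is the GL versus DX difference reported in this issue.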
2. Environment
| Item | Value |
| --- | --- |
| MonoGame.Framework.DesktopGL | 3.8.0.1641 (NuGet) |
| MonoGame.Framework.WindowsDX (control) | 3.8.0.1641 (NuGet) |
| dotnet-mgfxc (effect compiler) | 3.8.0.1641 |
| GL profile | vs_3_0 / ps_3_0 |
| DX profile (same .fx) | vs_4_0 / ps_4_0 |
| OS | Windows 11 |
Reproduces independently of GPU/driver: it is a translator codegen issue (the length() of the same
normalized vector reads back correct on the very same draw; only the components are NaN).
3. Minimal reproduction (complete, self-contained)
All files are in repro/. The trigger is 4 lines of HLSL; the rest is a 60-line MonoGame
harness that renders one triangle and reads back the center pixel.
3.1 repro.fx (the shader — full file)
#if OPENGL
#define VS_SHADERMODEL vs_3_0
#define PS_SHADERMODEL ps_3_0
#else
#define VS_SHADERMODEL vs_4_0
#define PS_SHADERMODEL ps_4_0
#endif

struct VSIn  { float3 pos : POSITION0; float2 seed : TEXCOORD0; float flag : TEXCOORD1; };
struct VSOut { float4 pos : SV_POSITION; float4 color : COLOR0; };

VSOut MainVS(VSIn input)
{
    VSOut o;
    o.pos = float4(input.pos, 1.0);
    // THE TRIGGER. flag==0 -> dA is the zero vector -> lenA==0 -> dir takes the ELSE arm.
    float2 dA = input.seed * input.flag;
    float lenA = length(dA);
    float2 dirA = (lenA > 1e-6) ? (dA / lenA) : input.seed;
    float2 B1 = normalize(dirA);
    // Encode B1 so the rendered pixel reveals it: R=B1.x, G=B1.y, B=length(B1). NaN -> 0.
    o.color = float4(saturate(B1.x * 0.5 + 0.5), saturate(B1.y * 0.5 + 0.5), saturate(length(B1)), 1.0);
    return o;
}

float4 MainPS(VSOut input) : COLOR0 { return input.color; }

technique T { pass P0 {
    VertexShader = compile VS_SHADERMODEL MainVS();
    PixelShader = compile PS_SHADERMODEL MainPS();
} }
3.2 Build
3.3 Run (the harness, repro/Program.cs)
Renders one full-screen triangle whose 3 vertices all use flag=0, seed=(0.6,0.8), reads the
center pixel of a 64×64 RenderTarget, and decodes it. Two project files build the same Program.cs
against the two backends (repro/harness_gl/Repro.GL.csproj, repro/harness_dx/Repro.DX.csproj):
dotnet run --project repro/harness_dx/Repro.DX.csproj # control
dotnet run --project repro/harness_gl/Repro.GL.csproj # the bug
3.4 Observed output (verbatim)
==================== DX ====================
[REPRO] backend = DirectX (vs_4_0 / fxc)
[REPRO] center pixel RGBA = (204,230,255,255)
[REPRO] R -> B1.x = 0.600 (finite)
[REPRO] G -> B1.y = 0.804 (finite)
[REPRO] B -> length = 1.000 (==1 (normalize ran OK))
[REPRO] RESULT: no bug (components finite)
==================== GL ====================
[REPRO] backend = OpenGL (vs_3_0 / MojoShader)
[REPRO] center pixel RGBA = (0,0,255,255)
[REPRO] R -> B1.x = -1.000 (ZERO => NaN)
[REPRO] G -> B1.y = -1.000 (ZERO => NaN)
[REPRO] B -> length = 1.000 (==1 (normalize ran OK))
[REPRO] RESULT: BUG REPRODUCED (length==1 but B1.x/B1.y read as NaN)
DX: B1 = (0.60, 0.80), which is correct (the normalized seed).
GL: B1 = (NaN, NaN), but length(B1) == 1.0. Both components are NaN; the length (a
separate dot/sqrt instruction sequence over the same register) is correct.
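The harness's decode step is not shown in this report. As a rough sketch of how the logged values follow from the pixel bytes, the snippet below inverts the shader's `saturate(c * 0.5 + 0.5)` encoding for 8-bit channels. The function names are hypothetical and the real `Program.cs` may differ in detail.

```python
def decode_component(byte_value: int) -> float:
    """Invert saturate(c * 0.5 + 0.5) for an 8-bit channel (assumed encoding)."""
    return (byte_value / 255.0) * 2.0 - 1.0

def decode_length(byte_value: int) -> float:
    """The blue channel stores saturate(length(B1)) directly."""
    return byte_value / 255.0

dx_pixel = (204, 230, 255, 255)   # center pixel logged on DirectX
gl_pixel = (0, 0, 255, 255)       # center pixel logged on OpenGL

dx_b1 = (decode_component(dx_pixel[0]), decode_component(dx_pixel[1]))
gl_b1 = (decode_component(gl_pixel[0]), decode_component(gl_pixel[1]))

print("DX B1 = (%.3f, %.3f), length = %.3f" % (dx_b1 + (decode_length(dx_pixel[2]),)))
print("GL B1 = (%.3f, %.3f), length = %.3f" % (gl_b1 + (decode_length(gl_pixel[2]),)))
```

A channel byte of 0 decodes to -1.000. Since the expected components are 0.6 and 0.8, the log treats a zero byte as the NaN marker, consistent with the shader comment "NaN -> 0".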
4. The exact characterization
| Quantity | DX (vs_4_0) | GL (vs_3_0/MojoShader) |
| --- | --- | --- |
| dir (ternary result), else arm taken | finite (== seed) | finite for length, NaN for components |
| B1.x | 0.600 | NaN |
| B1.y | 0.804 | NaN |
| length(B1) | 1.000 | 1.000 |
The "length correct but components NaN" split is the tell: the NaN is produced in the
ternary-select instruction sequence (which both arms feed), not in the application's data.
5. Root cause in the generated GLSL (smoking gun)
mgfxc /Profile:OpenGL embeds the MojoShader GLSL in the .mgfxo. The full vertex shader for the
minimal repro is attached as repro_gl_vertex.glsl.txt. The relevant
lines (register names are MojoShader's; vs_v1=seed, vs_v2=flag):
The select on the line marked *** is the classic flattened-ternary mistake: result = mask*then + else, with mask == 0. Because the then value is NaN (from the
unconditionally-evaluated 0 * inversesqrt(0)), and IEEE-754 defines 0 * NaN = NaN, the result is NaN even though the mask selected the else arm. The application's guard
(len > 1e-6 ? … : …) is exactly meant to avoid the div-by-zero, but MojoShader's branchless
lowering evaluates the guarded arm anyway and then lets its NaN leak through the multiply-select.
(The same pattern at larger scale is in our real strip/ribbon shader — attached mojoshader_strip_vs.glsl.txt, 7 inversesqrt in one flattened VS
— where one ribbon segment whose previous-neighbour edge is degenerate vanishes on GL; that is what
led us here.)
6. Why DirectX is fine
fxc compiling the same HLSL to vs_4_0 does not produce the 0 * Inf hazard — it predicates the
reciprocal / uses a guarded path, so the then arm is not evaluated to Inf when len == 0. So the
defect is specifically in the MojoShader vs_3_0 → GLSL lowering of the conditional select over a rcp/rsqrt-bearing expression, not in the HLSL or the application.
7. Suggested fix (translator side)
The select lowering mask*then + else is unsafe for non-finite then/else. Options:
- Use a NaN-safe select when lowering ternaries. mix(else, then, mask) is not enough: it is unsafe for the same reason, because it still multiplies the dead arm. A true branch, or a mask != 0 ? then : else that does not multiply the dead arm, is needed (e.g. emit an actual if, or use a bitwise select).
- Lower normalize and x / length(x) with a guarded reciprocal (len > 0 ? 1/len : 0) so the dead arm cannot produce Inf/NaN in the first place.
Either option removes the whole class of bug (it is the same root as our second, earlier instance of "a VS value
that should be finite comes back NaN/garbage on GL only").
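Both options can be checked with a small CPU-side emulation. This is a sketch under stated assumptions, not translator code: `multiply_select`, `branch_select` and `guarded_rcp_len` are hypothetical names, and the multiply-select uses the mask*then + (1-mask)*else form from the TL;DR.

```python
import math

def multiply_select(mask: float, then_val: float, else_val: float) -> float:
    """Unsafe lowering: the dead arm is still multiplied by the mask."""
    return mask * then_val + (1.0 - mask) * else_val

def branch_select(mask: float, then_val: float, else_val: float) -> float:
    """Option 1: a real conditional that never reads the dead arm arithmetically."""
    return then_val if mask != 0.0 else else_val

def guarded_rcp_len(len_sq: float) -> float:
    """Option 2: reciprocal length that yields 0 instead of Inf for a zero vector."""
    return 1.0 / math.sqrt(len_sq) if len_sq > 0.0 else 0.0

seed_x = 0.6
mask = 0.0                       # else arm selected
dead_arm = math.nan              # what the unguarded then arm evaluates to

unsafe = multiply_select(mask, dead_arm, seed_x)        # NaN leaks through
option1 = branch_select(mask, dead_arm, seed_x)         # 0.6

d_x = 0.0                                               # component of the zero vector
safe_then = d_x * guarded_rcp_len(0.0)                  # 0 * 0 == 0, finite
option2 = multiply_select(mask, safe_then, seed_x)      # 0.6 even with multiply-select

print(unsafe, option1, option2)
```

Option 1 fixes the select itself, so it also covers dead arms that go non-finite for other reasons. Option 2 leaves the select alone and only keeps this particular arm finite.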
8. Application-side workaround we use
Because shader-side rewrites on MojoShader proved unreliable, we precompute the affected vertex
values on the CPU and pass them through a trivial passthrough vertex shader on the GL backend
(we already do CPU-side vertex replication), bypassing the MojoShader codegen path for those values.
A pure-HLSL mitigation that may work: avoid the x/length(x) form inside a ternary — e.g. float2 dir = normalize(lenA > 1e-6 ? dA : seed); (normalize once, after the select) — but this
depends on how MojoShader lowers that and we have not exhaustively verified it across our shaders.
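To illustrate why normalizing after the select could help, here is a CPU-side sketch. It assumes the select is still lowered as a multiply-select and that normalize becomes v * inversesqrt(dot(v,v)); neither has been confirmed for this form, and the helper names are hypothetical.

```python
import math

def rsqrt(x: float) -> float:
    """GPU-style inversesqrt (assumed behavior): +Inf at 0 instead of raising."""
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)

def multiply_select(mask: float, then_val: float, else_val: float) -> float:
    return mask * then_val + (1.0 - mask) * else_val

def normalize2(v):
    inv_len = rsqrt(v[0] * v[0] + v[1] * v[1])
    return (v[0] * inv_len, v[1] * inv_len)

seed = (0.6, 0.8)
dA = (0.0, 0.0)                  # zero vector, as in the repro
mask = 0.0                       # lenA > 1e-6 is false

# Both arms (dA and seed) are finite, so the multiply-select cannot produce NaN.
selected = tuple(multiply_select(mask, a, s) for a, s in zip(dA, seed))

# Normalize once, after the select.
result = normalize2(selected)
print(result)
```

In this emulation the result is finite and close to (0.6, 0.8). The caveat is that normalize still returns NaN if the fallback itself is the zero vector, so the fallback must be guaranteed non-zero.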
9. Where to report
https://github.com/MonoGame/MonoGame/issues
(precedent: MojoShader codegen issues are filed here, e.g. MonoGame#1813.)
MonoGame's MojoShader fork — https://github.com/MonoGame/mojoshader — has no Issues tab.
https://github.com/icculus/mojoshader/issues
(precedent: issue #10, "[FXC] saturate() on vectors is broken in FXC debug mode, and mojoshader doesn't work around it". MojoShader already special-cases per-component vector codegen
for an FXC quirk; this is the same lowering area.)
Recommendation: file in MonoGame/MonoGame first, cross-link icculus/mojoshader.
10. Attachments (all under docs/bugreports/)
- repro/: the complete buildable repro: repro.fx, Program.cs, harness_gl/Repro.GL.csproj, harness_dx/Repro.DX.csproj, and the two compiled *.mgfxo.
- repro_gl_vertex.glsl.txt: the MojoShader-generated GLSL of the minimal repro VS (the smoking gun in §5).
- mojoshader_strip_vs.glsl.txt: the generated GLSL of the real strip shader where we first hit this (larger, same pattern).