Skip to content

Optimize switch code generation: preserve union-level cases from desugar to backends#1612

Open
Eugeny48 wants to merge 12 commits into
masterfrom
java-switch-merge-field-accessors
Open

Optimize switch code generation: preserve union-level cases from desugar to backends#1612
Eugeny48 wants to merge 12 commits into
masterfrom
java-switch-merge-field-accessors

Conversation

@Eugeny48
Copy link
Copy Markdown
Contributor

@Eugeny48 Eugeny48 commented May 14, 2026

Problem

When a Flow switch uses union names in cases, the desugarer previously expanded each union case into individual struct cases, duplicating the case body for every struct in the union. For nested switches over large unions, this caused O(n×m) code explosion — the same body was repeated dozens of times in generated Java and JavaScript output.

Example

import server/ssql/ssql_i;

main() {
	t : STableUnion = STable(["field"], []);
	q : SSql = t;

	switch (q) {
		SValue(): {
			switch (q) {
				SSimpleValue(): {
					println("The query is a simple value: " + q.structname);
				}
				STableUnion(): {
					println("The query is a table with fields: [" + strGlue(q.fieldNames, ", ") + "]");
				}
				default: {
					println("The query is a value, but not simple: " + toString(q));
				}
			}
		}
		default: {
			println("The query is not a value: " + q.structname);
		}
	}

	println(t.fieldNames);
}

flowc translates this flow code to such JS code

function main(){
  var t=c$F3(["field"],[]),q=t;
    (function(){var s$;switch(q._id){
    case 272:{s$=(function(){var s$;switch(q._id){
        case 272:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 198:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 248:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 181:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 228:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 284:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 152:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 214:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 211:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 289:{s$=console.log(HaxeRuntime.toString((("The query is a table with fields: ["+A9__strGlue(q.fieldNames,", "))+"]"), true));break;}
        case 290:{s$=console.log(HaxeRuntime.toString((("The query is a table with fields: ["+A9__strGlue(q.fieldNames,", "))+"]"), true));break;}
        default:{s$=console.log(HaxeRuntime.toString(("The query is a value, but not simple: "+A9__toString(q)), true));break;}
        }return s$;}());break;}
    case 198:{s$=(function(){var s$;switch(q._id){
        case 272:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 198:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 248:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 181:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 228:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 284:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 152:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 214:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 211:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 289:{s$=console.log(HaxeRuntime.toString((("The query is a table with fields: ["+A9__strGlue(q.fieldNames,", "))+"]"), true));break;}
        case 290:{s$=console.log(HaxeRuntime.toString((("The query is a table with fields: ["+A9__strGlue(q.fieldNames,", "))+"]"), true));break;}
        default:{s$=console.log(HaxeRuntime.toString(("The query is a value, but not simple: "+A9__toString(q)), true));break;}
        }return s$;}());break;}
...

Solution

Introduce FiCaseUnion — a new IR node that preserves union-level switch cases throughout the compilation pipeline, instead of expanding them eagerly during desugaring.

FiSwitchCase ::= FiCase, FiCaseUnion;
FiCase(struct: string, argNames: [string], body: FiExp, start: int);
FiCaseUnion(union: string, structs: [string], body: FiExp, start: int);

Commits

  1. Java backend: merge switch cases with shared fields using Field_ accessor interfaces — When multiple struct cases in a switch access the same fields, merge them into a single case using Field_ accessor interfaces instead of duplicating the body.

  2. Introduce FiCaseUnion — Core infrastructure: new FiSwitchCase union type, desugar produces single FiCaseUnion per union case (body desugared once), type checker handles union cases, all 50+ backend and manipulation files adapted using fiExpandSwitchCases helper for backward compatibility.

  3. Java backend: native FiCaseUnion support — Emits Java fall-through case labels for each struct in the union, with a single shared body. Uses Field_ accessor interfaces (via generalizeVarTypeFiTypeFlow()) for field access across struct types.

  4. JavaScript backend: native FiCaseUnion support — Emits JS fall-through case labels similarly. Adds fiJsSwitchCaseLabels helper. Adapts isIf optimization and >=50 case heuristic for FiCaseUnion.

  5. Update flowc.jar with FiCaseUnion support

  6. Fix overlapping union/struct cases in switches — Switches can have both a union case (e.g. CharacterStyle()) and a struct case for a member of that union (e.g. SetRTL()). Added dedup in solve_expectations and struct subtraction in fc2fi to handle overlaps.

  7. Fix type checker bugs for FiCaseUnion switch cases — Fix union case type parameter instantiation in desugar, fix names2unions expansion for union names, fix bestunion lookup for union cases.

  8. Fix type checker: common parent union checks and union-struct narrowing — When unnamed unions contain structs from a common parent union, suppress false type errors. Fix FUnion ≤ FStruct and FStruct ≤ FUnnamedUnion containment paths.

  9. Fix type checker: detect incompatible type annotations as errors — Type annotations like x : [SubType] = expr where expr has type [SuperType] now produce errors instead of silently accepting the incompatible narrowing.

  10. Fix type checker: tyvar contamination through polymorphic calls and union case switch type — Fix union case switch to use unionType instead of varType. Remove gl2 guard in funifyBounded to prevent type variable contamination through grow_right in non-bounded FUnifyLeft path.

  11. Fix type checker: restore gl1 guard for bounded FUnifyLeft path — The gl1 guard in funifyBounded is needed for the bounded+FUnifyLeft path (fixes Maybe/Some narrowing in switch cases). Only gl2 removal was correct.

  12. Update flowc.jar

Key design decisions

  • Clean semantic separation: FiCase = single struct with field bindings; FiCaseUnion = union with list of structs, no field bindings, body uses union-typed variable
  • Backward compatible: backends that don't need native support call fiExpandSwitchCases() to get [FiCase] as before
  • No expand-then-merge: removed the old fiMergeExpandedUnionCases pass that did wasteful double work (expand all cases, then try to re-merge identical bodies)

The same example after the PR

function main(){
  var t=c$F3(["field"],[]),q=t;
    (function(){var s$;switch(q._id){
    case 272:
    case 198:
    case 248:
    case 181:
    case 228:
    case 284:
    case 152:
    case 214:
    case 211:
    case 282:
    case 289:
    case 290:{s$=(function(){var s$;switch(q._id){
        case 272:
        case 198:
        case 248:
        case 181:
        case 228:
        case 284:
        case 152:
        case 214:
        case 211:{s$=console.log(HaxeRuntime.toString(("The query is a simple value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
        case 289:
        case 290:{s$=console.log(HaxeRuntime.toString((("The query is a table with fields: ["+A9__strGlue(q.fieldNames,", "))+"]"), true));break;}
        default:{s$=console.log(HaxeRuntime.toString(("The query is a value, but not simple: "+A9__toString(q)), true));break;}
        }return s$;}());break;}
    default:{s$=console.log(HaxeRuntime.toString(("The query is not a value: "+(HaxeRuntime.getStructName(q._id))), true));break;}
    }return s$;}());
    return console.log(HaxeRuntime.toString(t.fieldNames, true));
}

Code size reduction

educator.js compiled with flag debug=1: 61.0 MB → 59.4 MB (−1.6 MB, −2.7%).

Adding native FiCaseUnion support to other backends

Currently only Java and JS backends emit compact fall-through code for FiCaseUnion. All other backends (C++, bytecode, D, Lisp, ML, Nim, Wasm, Wise) use fiExpandSwitchCases() to expand union cases back to individual struct cases before code generation — they produce correct but unoptimized output.

To add native support to another backend, follow the Java/JS pattern:

  1. In the FiSwitch code generation, iterate [FiSwitchCase] directly instead of calling fiExpandSwitchCases
  2. Handle FiCaseUnion by emitting fall-through case labels for each struct in FiCaseUnion.structs, with the body emitted once
  3. Handle FiCase as before (single struct with field bindings)
  4. For field access in FiCaseUnion bodies: the switch variable has union type, so use the target language's equivalent of interface-based dispatch (Java uses Field_ accessor interfaces; other backends may use casts to a common base type or dynamic dispatch)

…ssor interfaces

When Flow union types are expanded into individual struct cases during desugaring,
the Java backend previously duplicated the body for each struct case, causing O(n*m)
code explosion for nested switches.

The existing optimization merged consecutive cases with identical bodies using Java's
fall-through, but cases accessing shared fields (e.g., q.fieldNames on STable and
STableLazy) could not merge because codegen used struct-specific casts driven by
per-case narrowed types.

This change generalizes the switch variable's type to FiTypeFlow() in merged case
bodies, which makes the codegen emit Field_ accessor interfaces (e.g.,
((Field_fieldNames)x).get_fieldNames()) instead of struct-specific field access
(e.g., ((Struct_STable)x).f_fieldNames). This enables merging of all cases with
structurally equal bodies, regardless of field access patterns.

Changes:
- Remove fiExpUsesVarTypeDependent check from canMergeSwitchCases
- Add generalizeVarInFiExp() to replace variable types in FiExp trees
- Apply type generalization in merged-group codegen path
@Eugeny48 Eugeny48 marked this pull request as draft May 14, 2026 17:56
Eugeny48 added 4 commits May 15, 2026 15:38
Instead of expanding union cases into per-struct cases during desugaring
(causing O(n*m) code explosion for nested switches), introduce
FiSwitchCase ::= FiCase, FiCaseUnion union type. Desugar now produces a
single FiCaseUnion per union case with the body desugared once.

Key changes:
- fiexp.flow: FiSwitchCase ::= FiCase, FiCaseUnion
- fi_helpers.flow: 8 helper functions + fiExpandSwitchCases for backends
- desugar.flow: single FcCase with union name instead of N struct cases
- fc2fi.flow: thread FcGlobalNames, produce FiCaseUnion for union names
- typechecker.flow: handle union cases in type checking
- All backends use fiExpandSwitchCases to expand at code generation time
- All manipulation passes updated to use FiSwitchCase helpers
- Removed old expand-then-merge pass (fiMergeExpandedUnionCases)
Instead of expanding FiCaseUnion to individual struct cases, emit Java
fall-through case labels for each struct in the union. Uses
generalizeVarType to set FiTypeFlow() on the switch variable so field
accesses use Field_ accessor interfaces instead of struct-specific casts.

Removes old groupMergeableSwitchCases/canMergeSwitchCases dead code.
… labels

Instead of expanding FiCaseUnion to individual struct cases, emit JS
fall-through case labels for each struct in the union. Adds
fiJsSwitchCaseLabels and fiGetStructIdByName helpers.

The isIf optimization and >=50 case heuristic are adapted to account
for FiCaseUnion (totalStructCases counts expanded structs).
@Eugeny48 Eugeny48 marked this pull request as ready for review May 15, 2026 12:56
@Eugeny48 Eugeny48 requested a review from alstrup May 15, 2026 12:56
When a switch has both a union case (e.g. CharacterStyle()) and explicit
struct cases for members of that union (e.g. SetRTL(__)), two fixes:

1. solve_expectations: deduplicate expanded names in exhaustiveness check
2. fc2fi: FiCaseUnion.structs excludes structs claimed by explicit FiCase
   entries or earlier FiCaseUnion entries in the same switch
@Eugeny48 Eugeny48 marked this pull request as draft May 15, 2026 14:09
@Eugeny48 Eugeny48 changed the title Java backend: merge switch cases with shared fields using Field_ accessor interfaces Optimize switch code generation: preserve union-level cases from desugar to backends May 15, 2026
Bug 1: Union names in unnamed union matching - expand union case names
to leaf struct names before names2unions; sub-union expansion in
namedOverlap computation in funify_cases.flow.

Bug 2: Missing type parameter constraints for polymorphic union cases -
add FcLessOrEqual(varType, un) constraint to link fresh tyvars to
concrete type params.

Bug 3: Non-polymorphic union constraint guard - restrict Bug 2 fix to
polymorphic unions only (ntv > 0) to avoid false errors on concrete
struct-typed switch variables.

Bug 4: Bounded type resolution for same-name unions - in funifyBounded,
FUnifyLeft branches used original lower bound, causing makeFBoundedEnv
to collapse Maybe<SubUnion> to Maybe<SubMember> instead of widening.
Fix uses grown lower bound only for same-name FUnion types.
@alstrup
Copy link
Copy Markdown
Contributor

alstrup commented May 18, 2026

@Eugeny48 If it works, then LGTM.

@Eugeny48
Copy link
Copy Markdown
Contributor Author

@Eugeny48 If it works, then LGTM.

It does not work yet, it require one more fix in type checker. I'm working on it.

Eugeny48 added 3 commits May 19, 2026 10:04
In funify_cases.flow:
- When a struct is not in a named union but shares a common parent union,
  expand to unnamed union instead of erroring (e.g. IsEqualMathsStrict +
  MathRoundingOption both in IsEqualMathsOption)
- When comparing union+ vs struct- where the struct is a member, return
  the union to preserve type range instead of narrowing to just the struct
- Suppress containment errors between unnamed unions whose members share
  a common parent union (partial unions from array/if-else type param
  propagation that will resolve during finalization)
- Add unnamedUnionsShareCommonUnion helper using struct2unions map

In combine_types.flow:
- Fix copy-paste bug in union-vs-union path: was checking pInN twice,
  now correctly checks both pInN and nInP
- Add common parent union checks for struct-vs-union, union-vs-struct,
  and union-vs-union paths in combinePositiveAndNegative
Type annotations like 'x : [SubType] = expr' where expr has type
[SuperType] were silently accepted as unchecked downcasts. Now the
type checker detects these as errors using post-solve annotation
checking with contamination cross-referencing to avoid false positives
from shared tyvar widening.

New FcCheckAnnotation constraint is recorded alongside FcLessOrEqual
for annotated let-bindings. After type solving, checkAnnotationConstraints
expands both annotation and expression types to leaf struct sets and
verifies the expression set is a subset of the annotation set. When
multiple annotations share a tyvar (e.g. lambda return type used at
different call sites), cross-referencing identifies contamination from
other annotations and suppresses false positives.
…nion case switch type

Bug 6 fix: Remove gl1/gl2 guards in funifyBounded's boundedLeft and !boundedLeft
FUnifyLeft paths. Tree<int,StructA> was widened to Tree<int,MyUnion> because Tree
is FUnion in FType, matching the guard. The guards were intended for Bug 4 but
never activated (Bug 4's lower bound is FStruct, always hitting the default case).

Switch union case fix: Use unionType (switch annotation) instead of varType
(source variable type) for FcLessOrEqual constraint on union cases, consistent
with how struct cases already work.
@Eugeny48 Eugeny48 force-pushed the java-switch-merge-field-accessors branch from 34f5122 to b3364ba Compare May 20, 2026 07:30
Eugeny48 added 2 commits May 20, 2026 19:59
The previous commit (b3364ba) removed both gl1 and gl2 guards in
funifyBounded. Removing gl2 correctly fixes tyvar contamination for
the non-bounded path (Bug 6). But removing gl1 breaks Bug 4: switch
result types like Maybe<TacticResult> fail to widen to
Maybe<GuidanceResult> when the constraint requires it.

Restore gl1 for the boundedLeft + FUnifyLeft path: when both the
lower bound and the grown result are FUnion with the same name, use
the grown type to preserve widened type params. Otherwise keep the
original lower bound to avoid container type contamination.
@Eugeny48 Eugeny48 force-pushed the java-switch-merge-field-accessors branch from 2177e83 to 8a057ba Compare May 20, 2026 17:05
@Eugeny48 Eugeny48 marked this pull request as ready for review May 20, 2026 17:26
@Eugeny48
Copy link
Copy Markdown
Contributor Author

@alstrup Now PR is ready. The type checker is updated. It’s still not perfect, just a bit better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants