GraalVM Support for Valhalla#2

Draft
MichaelHaas99 wants to merge 579 commits into
masterfrom
mh/GR-57655

@MichaelHaas99

Owner

No description provided.

…s a dead node for a node that was already process.
… case a placeholder is deleted at a call target occurring multiple times, the chained placeholder will occur in the call target. It shouldn't be deleted.
…We shouldn't rely on an InlineType node as it can be canonicalized to constant null.
… increased to correctly pop the return address in x64.
…stamp as the original node to not lose any information. Add comment.
@MichaelHaas99

MichaelHaas99 commented Jan 21, 2026

Owner Author

We need a way to propagate through the graph the Scalarization nodes that are created for method arguments during inlining or inserted during merge processing in PEA. If that is not easily possible, I think we should move to the implementation of C2 and remove all changes from PEA. My solution would be:

Value objects from outside a compilation unit can enter the graph through the following nodes, when the node's stamp is a value object type:
Parameter (non OSR) nodes
LoadStatic node
LoadField node
Invoke node

They are non-larval and can be scalarized. Value objects within a compilation unit are normally virtual during PEA but will be materialized in case the constructor is not inlined. So value objects can also be scalarized if they are the receiver of an Invoke node with a constructor call. If all constructors are inlined, the object is virtual anyway during PEA.
By scalarization during PEA I mean that we alias the corresponding node with a virtual alias. The field values are loaded directly below the definition of the node.
When we load and merge a virtual object, we should think about adjusting the scalarization depth automatically in the function of the closure. Adjusting the scalarization depth means creating a new virtual object with a similar object state but a different depth, such that all virtual nodes of a certain type have the same depth.
Scalarization happens in PEReadEliminationClosure.
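
As a rough illustration of the rule above, the following sketch classifies which node kinds may be scalarized on sight. The enum and class names are hypothetical and are not Graal's node hierarchy; the only thing taken from the list above is which kinds count as known non-larval.

```java
/** Hypothetical node kinds; the names mirror the list above but this is not Graal's node hierarchy. */
enum ToyNodeKind { PARAMETER, LOAD_STATIC, LOAD_FIELD, INVOKE, OTHER }

final class ScalarizationCandidates {
    /** True when a node of this kind carries a value object stamp and is therefore known to be non-larval. */
    static boolean canScalarize(ToyNodeKind kind, boolean hasValueObjectStamp) {
        if (!hasValueObjectStamp) {
            return false; // only value objects are scalarized
        }
        switch (kind) {
            case PARAMETER:   // non-OSR parameter
            case LOAD_STATIC:
            case LOAD_FIELD:
            case INVOKE:
                return true;
            default:
                return false; // not covered by the list; this sketch stays conservative
        }
    }
}
```

OSR arguments are deliberately left out of this sketch, since they need the separate larval tracking described next.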

OSR arguments are a special case, as they may still be larval.
In Graal they are marked with the EntryProxy node and replaced by an OSRLocal node after parsing. So how can we scalarize them, or in other words, how do we know that they are non-larval?
We associate the OSRLocal node with an alias whose object state is materialized and tracks the larval state. Whenever we encounter a node that requires the value to be non-larval, and we have not encountered a constructor call or StoreField node in between (so the object state still indicates larval), we know that the OSR node must have been non-larval from the beginning. As the Scalarization node is floating, we can set the scalarized values in all object states, because we scalarize directly at the beginning.
When we reach a StoreField node we set the field to initialized in the object state.
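
A minimal sketch of the larval tracking described above, assuming one boolean per field. `LarvalTrackingState` and its methods are hypothetical names, not existing PEA classes; the real object state would carry more than this.

```java
import java.util.Arrays;

/** Toy stand-in for the per-object state kept for an OSR local (hypothetical, not a Graal class). */
final class LarvalTrackingState {
    private final boolean[] fieldInitialized;

    /** Starts fully larval: no field has been stored yet. */
    LarvalTrackingState(int fieldCount) {
        this.fieldInitialized = new boolean[fieldCount];
    }

    /** Models reaching a StoreField node for the given field index. */
    void onStoreField(int fieldIndex) {
        fieldInitialized[fieldIndex] = true;
    }

    /** Models a constructor call, or a use that requires the object to be non-larval. */
    void markNonLarval() {
        Arrays.fill(fieldInitialized, true);
    }

    /** The object is larval as long as at least one field is not initialized. */
    boolean isLarval() {
        for (boolean initialized : fieldInitialized) {
            if (!initialized) {
                return true;
            }
        }
        return false;
    }
}
```

Once `isLarval()` turns false, the value could be scalarized directly at that point.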

Scalarization nodes should be made floating and anchored at the earliest possible position. We should not rely on location identities, as this is not necessary and leaving them out allows GVN. Value objects are immutable, so it does not make any difference when a field is read, as long as the object is non-larval.
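
To illustrate why floating helps GVN, here is a toy value-numbering table. It is not Graal's GVN, and `ToyValueNumbering` and `FloatingFieldRead` are made-up names; the point is only that a node without a fixed position or memory state is identified by its inputs alone, so duplicates collapse.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy value numbering (hypothetical, not Graal's GVN): identical floating nodes collapse to one. */
final class ToyValueNumbering {
    /** A floating field read: no memory state or fixed position, so equality depends only on its inputs. */
    record FloatingFieldRead(String object, String field) {}

    private final Map<FloatingFieldRead, Integer> table = new HashMap<>();

    /** Returns the id of an existing equal node, or registers the node under a new id. */
    int unique(FloatingFieldRead node) {
        Integer existing = table.get(node);
        if (existing != null) {
            return existing;
        }
        int id = table.size();
        table.put(node, id);
        return id;
    }

    int nodeCount() {
        return table.size();
    }
}
```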

As a consequence the input to a Scalarization node inserted during parsing should always be virtual. We don't need any local aliases as discussed below, as we can just update the materialized state and insert the field values.

Progress can be found in the following branch:
https://github.com/MichaelHaas99/graal/tree/mh/GR-57655-PEA

--------------------------------old stuff----------------------------------
As PEA runs on the full method, the only time an OSR parameter is larval is when it points to an AllocatedObject node, i.e. when it was created within the compilation unit. In this case we create a new virtual object with a materialized object state pointing to the OSR parameter, and track the larval state. An EntryProxy node is GVNed, so we shouldn't have the problem of multiple OSR arguments actually representing the same input. The start state can either be all fields uninitialized or all fields initialized. In case the input is virtual and the object state is non-larval, we know that all fields are initialized. This is because in the JVM new instances need to be initialized with a constructor call. So in case the input is larval, no constructor call has happened yet. When the last store or constructor call makes it non-larval, we can scalarize it directly.

--------------------------------old stuff----------------------------------
OSR argument nodes that have the same input need to share the array which tracks the larval state in the object state.
They can be scalarized as:
obj and val input to LoadField or argument of Invoke node
val input to StoreField or StoreStatic node
In Graal this shouldn't be a problem as we always parse the whole graph and in the OnStackReplacementPhase extract the stuff we are interested in. So in case the EntryProxy node has a virtual input which is non-larval, we can scalarize the EntryProxy node.

--------------------------------old stuff----------------------------------
I propose that whenever PEA encounters one of the nodes above, it scalarizes the inputs of this node, then processes it, and also scalarizes the result of the node.
As Invoke nodes can be inlined and removed before PEA, we need to insert nodes during parsing that indicate to PEA that there was an Invoke before and the arguments can be scalarized.

As the Invoke node may get inlined we need a new node special for OSR arguments that indicates that it can be scalarized at this position.
One way would be to introduce a new node called Scalarizeable. This node has multiple nodes of type value class as input, indicating that they are non-larval and can be scalarized. This type of scalarization is an optimization: we don't need the field values yet, but maybe later on. Without PEA the node will just be removed during lowering. It has no usages and should remove an input if that input has exactly one usage, the Scalarizeable node itself, so as not to keep nodes alive unnecessarily. Due to inlining the Invoke node disappears before PEA. An Invoke node would tell us that the argument nodes are non-larval. All other places where we know that the node is non-larval can be detected implicitly by the node type during PEA and need no Scalarizeable node.
The existing Scalarization node should be used when at a certain bytecode we always immediately need the field values of a node, e.g. for passing a scalarized method argument. The Scalarization node should not be used in combination with an InlineType node anymore.

All in all this helps us keep the compilation for economy mode fast and simple. All the magic of further optimizations happens during PEA. By local alias we mean the following: when we do merge processing during PEA, we need to merge the local aliases as well, just as we merge object states.
We insert

a Scalarizeable node at the following position:

  • method arguments not scalarized for the caller and callee

a Scalarization node for:

  • method arguments scalarized for a caller and callee
  • storing a value object in a flat field
  • storing a value object in flat array element

an InlineType node for:

  • the values from a flat field or element load
  • scalarized method arguments

PEA should do the following actions on the following nodes (new nodes e.g. for scalarization, are inserted after the currently processed node):
InlineType node:

  • scalarize value class oop fields recursively (by inserting Scalarization nodes)
  • replace with a virtual object

Scalarization node:

  • if the input is virtual, do what we already do
  • else (in PEReadEliminationClosure) scalarize value class oop fields recursively, create a virtual object and set it as local alias to the input

Scalarizeable node:

  • if the input is virtual, set it as local alias
  • else (in PEReadEliminationClosure) scalarize value class oop fields recursively, create a virtual object and set it as local alias to the input

If node with IsNull node:

  • scalarize value object input operand

Invoke node with init call:

  • scalarize value object receiver after call

Invoke with non-scalarized value object return

  • scalarize return

LoadField node with value class field:

  • if object is virtual, load the field and adjust scalarization depth of field value
  • else scalarize the object and the loaded value, but do not remove the LoadField node

StoreFieldNode:

  • if value is not virtual, scalarize

When we scalarize inputs of the nodes above we should do that before processing this node with virtualize.

A further optimization is to make Scalarization nodes part of the ReadElimination phase. In case a LoadField is above, a MultiValue node can be replaced with it.
Update the FloatingRead phase to make Scalarization nodes floating as well, so that they can be GVNed. The LocationIdentity of Scalarization nodes should be immutable so that there is no anti-dependency.

--------------------------------old stuff----------------------------------
One way to solve this is to insert marker nodes during parsing. These marker nodes, e.g. called Scalarizeable, have no usages but indicate to PEA that the value object can be scalarized recursively (what we already do during merging). In other words, they are inserted where we know the value object is non-larval. This should only be useful to detect method arguments of earlier inlined methods. We should also make the Scalarization node floating during the FloatingRead phase. The input to the Scalarizeable node will be replaced by the new virtual object, but is only valid locally. When we do merge processing during PEA, we need to merge the local aliases as well, like we also merge object states. Scalarization nodes can be made floating as long as they don't float to a position where the object is larval; scheduling them at different positions does not make a difference for non-larval memory.
The difference between a Scalarizeable node and a Scalarization node is that the Scalarizeable node does nothing without PEA, while the Scalarization node will at least load the fields it needs and forward them to its projection nodes. During PEA they should both scalarize recursively.
This in general will also reduce code size.

E.g. for the following code (suppose read elimination is not executed), we currently get non-optimal code, because the scalarized version is not propagated.

int notInlined(ValueClass scalarizedObject) { }

int lateInlined(ValueClass nonScalarizedObject) {
    notInlined(nonScalarizedObject);
    return nonScalarizedObject.field1;
}

int demo(ValueClass nonScalarizedObject) {
    lateInlined(nonScalarizedObject);
    return nonScalarizedObject.field1;
}

Non-optimal code:
int demo(ValueClass nonScalarizedObject) {
    Object o = scalarize(nonScalarizedObject);
    notInlined(o);
    return nonScalarizedObject.field1;
}

Optimal code should be:
int demo(ValueClass nonScalarizedObject) {
    Object o = scalarize(nonScalarizedObject);
    notInlined(o);
    return o.field1;
}
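
The propagation wanted for the demo above could be sketched with a toy alias table like the one below. `ToyAliasTable` is a hypothetical stand-in and not PEA's real data structure: once a node has a scalarized alias, later field loads are answered from the alias instead of going back to memory.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy alias table (hypothetical, not PEA's real data structure). */
final class ToyAliasTable {
    /** Scalarized form of a value object: its field values by name. */
    record Scalarized(Map<String, Integer> fields) {}

    private final Map<String, Scalarized> aliases = new HashMap<>();
    private int memoryLoads = 0;

    /** Models scalarize(node): remember the field values as an alias of the node. */
    void scalarize(String node, Map<String, Integer> fieldValues) {
        aliases.put(node, new Scalarized(Map.copyOf(fieldValues)));
    }

    /** Models a LoadField: use the alias when one exists, otherwise count a real load from memory. */
    int loadField(String node, String field, Map<String, Integer> heapFields) {
        Scalarized alias = aliases.get(node);
        if (alias != null) {
            return alias.fields().get(field);
        }
        memoryLoads++;
        return heapFields.get(field);
    }

    int memoryLoads() {
        return memoryLoads;
    }
}
```

In terms of the demo, the final `return nonScalarizedObject.field1` would hit the alias created by `scalarize(nonScalarizedObject)` and so behave like `return o.field1`.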

}

@Override
public boolean handleLoadIndexed(GraphBuilderContext b, ValueNode array, ValueNode index, GuardingNode boundsCheck, JavaKind elementKind) {

Owner Author

Create a new node, similar to the LoadFlatField node, which encapsulates the load indexed logic, instead of already inserting null checks etc. here, which is normally done during lowering in JIT mode.

* deleting this node.
*/
@NodeInfo(cycles = CYCLES_UNKNOWN, cyclesRationale = "We don't know statically how many, and which, objects we are gonna scalarize.", size = SIZE_UNKNOWN, sizeRationale = "We don't know statically how much code for which scalarization has to be generated.")
public class ScalarizationNode extends FixedWithNextNode implements MemoryAccess, Virtualizable, MultiValue, IterableNodeType, Simplifiable {

Owner Author

Make this node floating and without a memory access. Scalarization of value objects accesses immutable data, so it does not matter when we load the fields, as long as the value object is non-larval. Making this node floating allows it to be GVNed and reduces code size.

Owner Author

else if (node instanceof OSRLocal) {
  // alias node with an object state whose materialized value points to the node
} else if (node instanceof ParameterNode) {
  // scalarize
} else if (node instanceof Invoke invoke && StampTool.isInlineType(invoke.getReceiver()) && invoke.getTargetMethod().isConstructor()) {
  // NewInstance node should have been virtualized
  assert getAlias(invoke.getReceiver()) instanceof VirtualInstanceNode;
  // scalarize receiver and set entries in existing object state, anchor the Scalarization node to the Invoke node
}

@MichaelHaas99 MichaelHaas99 Feb 2, 2026

Owner Author

if (StampTool.isNullableInlineType(node)) {
  // scalarize
  // Although the scalarized values are inserted below this node, we process in reverse post-order,
  // so we are allowed to create a virtual object and alias this node with the new virtual object.
}

@MichaelHaas99 MichaelHaas99 Feb 2, 2026

Owner Author

if (StampTool.isNullableInlineType(node)) {
  if (getAlias(node) instanceof VirtualInstanceNode) {
    // set field initialized in object state
    // if object state becomes non-larval, scalarize
  }
}

@MichaelHaas99 MichaelHaas99 Feb 2, 2026

Owner Author

else if (node instanceof LoadFieldNode load) {
  if (getAlias(load.object()) instanceof VirtualInstanceNode) {
    // scalarize load.object()
    // call virtualize
  }
} else if (node instanceof StoreFieldNode store) {
  // scalarize store.value()
} else if (node instanceof IfNode ifNode && ifNode.condition() instanceof IsNullNode isNullNode && StampTool.isNullableInlineTypeNode(isNullNode.getValue())) {
  // scalarize isNullNode.getValue()
}

}

private boolean mergeObjectStates(int resultObject, int[] sourceObjects, PartialEscapeBlockState<?>[] states, int currentScalarizationDepth,
List<JavaType> visited) {

Owner Author

No need for a visited list as in C2. At some point we will terminate: value object fields are final, so we can't end up in an endless recursion.
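
A small sketch of the termination argument, using a Java record as a stand-in for a value class (an assumption made for illustration; `Link` and `RecursiveScalarizer` are hypothetical names). Because all fields are final, an instance can normally only reference instances that were fully constructed before it, so the reference chain is finite and the recursion ends without a visited list.

```java
/** Hypothetical value-like class; a record's fields are final, like value object fields. */
record Link(int payload, Link next) {}

final class RecursiveScalarizer {
    /** Counts the objects reached by recursively following non-null nested fields. */
    static int countScalarizedObjects(Link object) {
        if (object == null) {
            return 0; // null field: nothing further to scalarize
        }
        // No visited set: the chain of 'next' references cannot loop back to 'object',
        // since 'next' had to exist before 'object' was constructed.
        return 1 + countScalarizedObjects(object.next());
    }
}
```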
