Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves performance of the dataflow JavaExpression hierarchy by caching hashCode() results for commonly-used expression types, reducing repeated Objects.hash(...) varargs allocations when these expressions are used as map/set keys.
Changes:
- Add per-instance
hashCodecaching fields to multipleJavaExpressionsubclasses. - Update
hashCode()implementations to memoize the computed value.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
dataflow/src/main/java/org/checkerframework/dataflow/expression/ValueLiteral.java |
Cache hashCode() result for literal expressions. |
dataflow/src/main/java/org/checkerframework/dataflow/expression/UnaryOperation.java |
Cache hashCode() result for unary operations. |
dataflow/src/main/java/org/checkerframework/dataflow/expression/MethodCall.java |
Cache hashCode() result for method calls (including constructor special-case). |
dataflow/src/main/java/org/checkerframework/dataflow/expression/LocalVariable.java |
Cache hashCode() result for local variables. |
dataflow/src/main/java/org/checkerframework/dataflow/expression/FormalParameter.java |
Cache hashCode() result for formal parameters. |
dataflow/src/main/java/org/checkerframework/dataflow/expression/FieldAccess.java |
Cache hashCode() result for field accesses. |
dataflow/src/main/java/org/checkerframework/dataflow/expression/BinaryOperation.java |
Cache hashCode() result for binary operations. |
dataflow/src/main/java/org/checkerframework/dataflow/expression/ArrayCreation.java |
Cache hashCode() result for array creations. |
dataflow/src/main/java/org/checkerframework/dataflow/expression/ArrayAccess.java |
Cache hashCode() result for array accesses. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** Cache the hashCode. Recomputed if zero. */ | ||
| private int hashCodeCache = 0; | ||
|
|
||
| @Override | ||
| public int hashCode() { | ||
| return Objects.hash(operationKind, operand); | ||
| if (hashCodeCache == 0) { | ||
| hashCodeCache = Objects.hash(operationKind, operand); | ||
| } | ||
| return hashCodeCache; |
There was a problem hiding this comment.
hashCodeCache uses 0 as the "not computed" sentinel. If the real hash code computes to 0, this will recompute (and re-allocate via Objects.hash(...)) on every call. Prefer a separate boolean (or nullable Integer) so a computed 0 is still cached.
| /** Cache the hashCode. Recomputed if zero. */ | ||
| private int hashCodeCache = 0; | ||
|
|
||
| @Override | ||
| public int hashCode() { | ||
| VarSymbol vs = (VarSymbol) element; | ||
| return Objects.hash( | ||
| index, | ||
| vs.name.toString(), | ||
| TypeAnnotationUtils.unannotatedType(vs.type).toString(), | ||
| vs.owner.toString()); | ||
| if (hashCodeCache == 0) { | ||
| VarSymbol vs = (VarSymbol) element; | ||
| hashCodeCache = | ||
| Objects.hash( | ||
| index, | ||
| vs.name.toString(), | ||
| TypeAnnotationUtils.unannotatedType(vs.type).toString(), | ||
| vs.owner.toString()); | ||
| } | ||
| return hashCodeCache; |
There was a problem hiding this comment.
hashCodeCache uses 0 as the "not computed" sentinel. If the computed hash ends up 0, Objects.hash(...) will run on every call. Use a separate computed flag (or nullable Integer) so 0 can be cached too.
| /** Cache the hashCode. Recomputed if zero. */ | ||
| private int hashCodeCache = 0; | ||
|
|
||
| @Override | ||
| public int hashCode() { | ||
| return Objects.hash(array, index); | ||
| if (hashCodeCache == 0) { | ||
| hashCodeCache = Objects.hash(array, index); | ||
| } | ||
| return hashCodeCache; |
There was a problem hiding this comment.
hashCodeCache uses 0 as the "not computed" sentinel. If Objects.hash(array, index) computes to 0, this will recompute every call (and re-allocate) rather than caching. Use a separate boolean/nullable cache to distinguish "not computed" from a computed 0.
| /** Cache the hashCode. Recomputed if zero. */ | ||
| private int hashCodeCache = 0; | ||
|
|
||
| @Override | ||
| public int hashCode() { | ||
| return Objects.hash(value, type.toString()); | ||
| if (hashCodeCache == 0) { | ||
| hashCodeCache = Objects.hash(value, type.toString()); | ||
| } | ||
| return hashCodeCache; |
There was a problem hiding this comment.
hashCodeCache uses 0 as the "not computed" sentinel. If the real hash code computes to 0, this will recompute (and re-allocate via Objects.hash(...)) on every call, defeating the caching on that path. Use a separate boolean (or an Integer cache) to distinguish "not computed" from a computed 0.
| /** Cache the hashCode. Recomputed if zero. */ | ||
| private int hashCodeCache = 0; | ||
|
|
||
| @Override | ||
| public int hashCode() { | ||
| VarSymbol vs = (VarSymbol) element; | ||
| return Objects.hash(vs.pos, vs.name, vs.owner); | ||
| if (hashCodeCache == 0) { | ||
| VarSymbol vs = (VarSymbol) element; | ||
| hashCodeCache = Objects.hash(vs.pos, vs.name, vs.owner); | ||
| } | ||
| return hashCodeCache; |
There was a problem hiding this comment.
hashCodeCache uses 0 as the "not computed" sentinel. If Objects.hash(vs.pos, vs.name, vs.owner) evaluates to 0, the hash code will be recomputed every call. Use a separate computed flag (or nullable cache) to cache 0 as a valid value.
| hashCodeCache = Objects.hash(operationKind, left, right); | ||
| } | ||
| return hashCodeCache; | ||
| } | ||
|
|
There was a problem hiding this comment.
equals treats some operations as commutative (it considers (left,right) equal to (right,left)), but hashCode() is order-dependent (Objects.hash(operationKind, left, right)). This violates the equals/hashCode contract and can break use as a map/set key. Make hashCode() commutative for commutative operators (e.g., combine operand hashes in an order-independent way), or make equals order-sensitive.
| hashCodeCache = Objects.hash(operationKind, left, right); | |
| } | |
| return hashCodeCache; | |
| } | |
| hashCodeCache = computeHashCode(); | |
| } | |
| return hashCodeCache; | |
| } | |
| /** | |
| * Computes a hash code consistent with equals(Object). | |
| * | |
| * <p>For commutative operations, the operand contribution must be order-independent because | |
| * equals(Object) considers (left, right) equal to (right, left). For non-commutative | |
| * operations, preserve the ordered hash. | |
| */ | |
| private int computeHashCode() { | |
| if (isCommutative()) { | |
| int leftHash = left.hashCode(); | |
| int rightHash = right.hashCode(); | |
| int first = Math.min(leftHash, rightHash); | |
| int second = Math.max(leftHash, rightHash); | |
| return Objects.hash(operationKind, first, second); | |
| } | |
| return Objects.hash(operationKind, left, right); | |
| } |
| public int hashCode() { | ||
| return Objects.hash(getField(), getReceiver()); | ||
| if (hashCodeCache == 0) { | ||
| hashCodeCache = Objects.hash(getField(), getReceiver()); | ||
| } | ||
| return hashCodeCache; |
There was a problem hiding this comment.
equals treats this and super receivers as equivalent in some cases, but hashCode() hashes the concrete receiver via Objects.hash(getField(), getReceiver()). That violates the equals/hashCode contract and can break lookups in hash-based collections. Adjust hashCode() to mirror the equals receiver normalization (e.g., hash a canonical receiver value when it’s ThisReference/SuperReference).
| public int hashCode() { | ||
| if (method.getKind() == ElementKind.CONSTRUCTOR) { | ||
| // No two constructor instances have the same hashcode. | ||
| return System.identityHashCode(this); | ||
| if (hashCodeCache == 0) { | ||
| if (method.getKind() == ElementKind.CONSTRUCTOR) { | ||
| // No two constructor instances have the same hashcode. | ||
| hashCodeCache = System.identityHashCode(this); | ||
| } else { | ||
| hashCodeCache = Objects.hash(method, receiver, arguments); | ||
| } | ||
| } | ||
| return Objects.hash(method, receiver, arguments); | ||
| return hashCodeCache; |
There was a problem hiding this comment.
equals treats calls on this and super as equal (isComparingSuperWithThis), but hashCode() includes the concrete receiver in Objects.hash(method, receiver, arguments). This violates the equals/hashCode contract and can break map/set usage. Update hashCode() to apply the same receiver equivalence as equals (e.g., canonicalize ThisReference/SuperReference before hashing). Also, using 0 as the cache sentinel means a computed 0 (including a possible identityHashCode of 0) won’t actually be cached.
| /** Cache the hashCode. Recomputed if zero. */ | ||
| private int hashCodeCache = 0; | ||
|
|
||
| @Override | ||
| public int hashCode() { | ||
| return Objects.hash(dimensions, initializers, getType().toString()); | ||
| if (hashCodeCache == 0) { | ||
| hashCodeCache = Objects.hash(dimensions, initializers, getType().toString()); | ||
| } | ||
| return hashCodeCache; |
There was a problem hiding this comment.
This caches the hash code, but dimensions/initializers are stored and returned as mutable lists. If a caller mutates either list after hashCode() is first computed, the cached value becomes stale and breaks hash-based collections. Consider defensive copies (and returning unmodifiable views) to make these fields effectively immutable, or avoid caching here if mutation is expected. Also, using 0 as the cache sentinel means a computed 0 won’t be cached.
|
@copilot Can you measure whether this has any performance benefit compared to CI running on master? |
I measured this against CI on master using Actions runs:
For the 13 matching completed jobs so far, the
Largest swings are mixed (e.g., |
The dataflow expressions are in many maps, so the hash is computed frequently.
Objects.hashCode allocates an array for the varargs, which is bad on the hot path.
The additional memory overhead seems low, and outweighed by the array allocations.