Skip to content

investigation: repeated NewClassTree visits in Nullness checker#1345

Closed
zyf265600 wants to merge 7 commits intoeisop:masterfrom
zyf265600:performance-optimize-001
Closed

investigation: repeated NewClassTree visits in Nullness checker#1345
zyf265600 wants to merge 7 commits intoeisop:masterfrom
zyf265600:performance-optimize-001

Conversation

@zyf265600
Copy link
Copy Markdown

@zyf265600 zyf265600 commented Jul 18, 2025

Summary
This PR started as an attempt to optimize repeated NewClassTree processing in the Nullness checker. During the investigation we instrumented and benchmarked the visitNewClass path, including an additional if (type.hasEffectiveAnnotation(NONNULL)) fast-path.

Benchmark setup

  • A jtreg-based performance harness (NewClassPerf.java) was created.

  • It generates synthetic Java sources with thousands of new expressions (objects, generics, arrays, anonymous classes).

  • Two variants were compared:

    • Variant A: fast-path enabled (default behavior with the extra if).
    • Variant B: fast-path disabled (-Dcf.skipNonnullFastPath=true).
  • Runs were repeated with both AB and BA interleaved protocols to cancel order bias, and also with separate warmup runs.

Results
Across multiple runs (10×400 groups, fixed 2 GB heap), Variant B was consistently as fast or slightly faster (≈ 2–3% median). The extra if check does not bring measurable improvement. In fact, addMissingAnnotation already checks for existing annotations, so the fast-path only adds overhead.

Conclusion
After extensive exploration, we conclude that:

  • The observed repeated visitNewClass calls are expected, since multiple sub-checkers traverse the same AST.
  • This is not a defect but normal framework behavior.
  • There is no straightforward safe optimization to eliminate these visits.
  • Therefore this PR does not provide a performance benefit and can be closed.

Cache NewClassTree processing in NullnessNoInitAnnotatedTypeFactory
to prevent duplicate type annotation calculations.
Comment thread docs/examples/NewObject.java Outdated
Copy link
Copy Markdown
Member

@wmdietl wmdietl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for starting to look into performance improvements!
On the highest level, it would be useful to add a better description of your motivation to the PR description. How many invocations of this method did you observe and why does caching it improve performance? Is it sound to cache the result or could a later execution compute a different result?

For this particular case, it is rather odd that computing the type of a new class tree is slow, as they are always non-null.

Your added test case should illustrate the performance gain, ideally using a jtreg test with timeout, on a complicated case, see e.g. these tests.

A few more comments inline.

private final ExecutableElement mapGet =
TreeUtils.getMethod("java.util.Map", "get", 1, processingEnv);

// High-level optimization: cache entire addComputedTypeAnnotations processing flow
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Javadocs should document what is being cached, in particular if the key/value is a String. What format is the String in?
For a cache you would also expect to see the value it is mapped to, so why is a Set enough?

Who cleans up this cache? There are many allocations in large programs and we don't want a Set that becomes arbitrarily large.
Maybe it should be cleared every time a new root is set? See here.

@zyf265600
Copy link
Copy Markdown
Author

zyf265600 commented Aug 13, 2025

Thanks for the review!

Motivation (measured on a tiny example):
public class NewObject {
void test() {
Object nn = new Object();
}
}

Observed NewClassTree visits in a single compile:
NullnessNoInit: 14
KeyFor: 12
Init-FieldAccess: 22
Total: 48

Reason: multiple subcheckers traverse the same AST; AnnotatedTypeFactory intentionally does not cache NewClassTree to avoid phase/context mismatches.

Soundness: Global/phase-crossing caching for NewClassTree is not sound (later executions can differ due to dataflow refinement, initialization state, and type-argument inference). I removed the earlier cross-call cache.

What I changed (safe optimizations, semantics unchanged):
Early-return if the type already has @nonnull in NullnessPropagationTreeAnnotator.visitNewClass.
Only add @nonnull when missing (arrays too) in visitNewArray.

@zyf265600 zyf265600 requested a review from wmdietl August 13, 2025 01:52
Copy link
Copy Markdown
Member

@wmdietl wmdietl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into performance issues. Please also adapt the title of your PRs or open new PRs if they change substantially.

if (type.hasEffectiveAnnotation(NONNULL)) {
return null;
}
type.addMissingAnnotation(NONNULL);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addMissingAnnotation already checks whether the annotation is present, so this change doesn't achieve anything.

Do you actually see any performance differences?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed discussion.

I added a small perf harness to measure the effect of the extra hasEffectiveAnnotation(NONNULL) fast-path.

  • Test file: checker/jtreg/nullness/perf/NewClassPerf.java
    This generates a synthetic source with thousands of new expressions and compares two variants:

    • A = fast-path enabled (extra if)
    • B = fast-path disabled (-Dcf.skipNonnullFastPath=true)
  • How to run:

    ../jtreg/bin/jtreg \
      -workDir:checker/build/jtreg/perf/work \
      -reportDir:checker/build/jtreg/perf/report \
      -verbose:summary -samevm '-keywords:!ignore' \
      "-javacoptions:-classpath checker/dist/checker.jar" \
      "-vmoptions:-classpath checker/dist/checker.jar -Xms2g -Xmx2g \
        -Dperf.runs=10 -Dperf.groups=400" \
      -timeoutFactor:4 \
      checker/jtreg/nullness/perf/NewClassPerf.java
  • Results (10×400 runs, fixed 2 GB heap, JDK 17, macOS arm64):
    Across both AB and BA interleaved protocols, variant B was consistently as fast or slightly faster (≈2–3% median). In other words, the new if does not bring a measurable benefit, as you mentioned addMissingAnnotation already checks for existing annotations, so the extra condition only adds overhead.

Final conclusion: After all these investigations, it shows that the repeated visitNewClass calls are expected: multiple sub-checkers walk the same AST. This is not a defect, and at this point we don’t see a straightforward or sound optimization to reduce it. I’ll therefore retitle this PR as an investigation and propose to close it.

@zyf265600 zyf265600 force-pushed the performance-optimize-001 branch from d684a53 to da837a3 Compare September 11, 2025 04:54
@zyf265600 zyf265600 changed the title perf: add tree processing cache to avoid redundant computations investigation: repeated NewClassTree visits in Nullness checker Sep 11, 2025
@zyf265600 zyf265600 force-pushed the performance-optimize-001 branch from df7497d to 5bf54a6 Compare September 11, 2025 05:29
Copy link
Copy Markdown
Member

@wmdietl wmdietl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough experiment!
So even though the particular change isn't working out, would it be worth it to extract the test harness? Did you start the code from scratch? Would be a waste to not remember how you did this.

If you could split it up into a "Driver" and a "CodeGenerator" then we could run experiments in the future by providing different generators and flags for comparison.

@thisisalexandercook What do you think?

Comment thread checker/jtreg/nullness/perf/NewClassPerf.java Outdated
@thisisalexandercook
Copy link
Copy Markdown
Collaborator

A "CodeGenerator" and "Driver" based on this could be a really nice abstraction to add to our test suite. I could see it being useful even when implementing a new flagged feature to AB test the impact it has (RE: our discussions around implementing scoped checking that actually removes out of scope checks). Maybe you could even test performance across JDKs with this.

Another thought: it might be interesting to use this to compare different tools (e.g., NullAway vs. CF) that perform similar tasks. From that perspective, this type of harness might be better suited to live outside of CF, perhaps in the javac-diagnostics-wrapper. That’s outside the scope of what the wrapper currently does, but maybe something to consider.

@wmdietl
Copy link
Copy Markdown
Member

wmdietl commented Mar 20, 2026

#1424 contains the benchmarking suite. This PR can be closed.

@wmdietl wmdietl closed this Mar 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants