Add event-level comparison tool for productions by GernotMaier · Pull Request #2181 · gammasim/simtools

GernotMaier · 2026-05-11T14:35:17Z

Add a new application to compare multiple simulation productions at the event level using reduced event-data inputs. Also calculate KS/Chi2 statistics to compare distributions.

This adds a lot of plots; some examples are attached (gammas at 300 GeV).

Plotting adds a lot of code. Not sure how to make this more maintainable.

v0 plots (ignore; kept for archival reasons; incorrect scatter radius): plots.tar.gz
plots_v2.tar.gz

Copilot

Pull request overview

Adds a new CLI application to compare multiple simulation productions at the event level using reduced event-data inputs, plus supporting metric-collection utilities, plotting routines, documentation, and tests.

Changes:

- Introduce the `simtools-compare-productions` application, which parses production descriptors and dispatches to an event-level comparison implementation.
- Add event-level metric aggregation (`production_comparison`) and a new visualization module generating multi-production comparison plots.
- Add unit/integration tests and documentation entries for the new application and modules.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `src/simtools/applications/compare_productions.py` | New CLI entry point for production comparisons; currently implements the `events` comparison level. |
| `src/simtools/sim_events/production_comparison.py` | New utilities to parse production inputs and aggregate event-level metrics (including per-telescope-type subsets). |
| `src/simtools/visualization/plot_event_level_production_comparison.py` | New plotting module producing event-level comparison figures across productions and per telescope type. |
| `tests/unit_tests/applications/test_compare_productions_on_event_level.py` | Unit test for application orchestration from parsed args → metrics collection → plotting call. |
| `tests/unit_tests/sim_events/test_production_comparison.py` | Unit tests for production argument parsing and metric aggregation. |
| `tests/unit_tests/visualization/test_plot_event_level_production_comparison.py` | Unit tests covering plot outputs and several helper/branch behaviors. |
| `tests/integration_tests/config/compare_productions_on_event_level_run.yml` | Integration workflow config to run the new application and assert selected plot outputs exist. |
| `pyproject.toml` | Registers `simtools-compare-productions` console script entry point. |
| `docs/source/user-guide/applications/simtools-compare-productions.rst` | User-guide page for the new application (autodoc). |
| `docs/source/user-guide/applications.md` | Adds the new application to the user-guide toctree. |
| `docs/source/api-reference/sim_events.md` | Adds API reference entry for `production_comparison`. |
| `docs/source/api-reference/visualization.md` | Adds API reference entry for `plot_event_level_production_comparison`. |
| `docs/source/conf.py` | Extends nitpick ignore list for `collections.Counter` references in autodoc. |
| `docs/changes/2181.feature.md` | Changelog fragment for the new feature. |

Comments suppressed due to low confidence (1)

src/simtools/visualization/plot_event_level_production_comparison.py:639

_plot_triggered_vs_quantity() is effectively unreachable in normal runs because _TRIGGERED_FRACTION_QUANTITIES is initialized as an empty set. If triggered-fraction-vs-quantity plots are intended outputs, populate this set (e.g., with the supported quantity names) and update tests/integration expectations; otherwise remove the dead branch and the test that relies on patching this constant.

_HISTOGRAM_STYLE_QUANTITIES = {"energy"}
_TRIGGERED_FRACTION_QUANTITIES = set()
_SPECIAL_TRIGGER_SUBSETS = {"single_telescope", "mixed_type"}
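
To illustrate why the branch is unreachable, here is a minimal sketch. Only the two constant names come from the snippet above; the dispatcher `plots_for_quantity` and the plot-kind labels it returns are hypothetical and are not taken from the PR.

```python
# Constants as quoted in the review comment.
_HISTOGRAM_STYLE_QUANTITIES = {"energy"}
_TRIGGERED_FRACTION_QUANTITIES = set()


def plots_for_quantity(quantity, triggered_fraction_quantities=_TRIGGERED_FRACTION_QUANTITIES):
    """Hypothetical dispatcher: list the plot kinds produced for one quantity."""
    kinds = []
    if quantity in _HISTOGRAM_STYLE_QUANTITIES:
        kinds.append("histogram")
    # Membership in an empty set is always False, so this branch never runs
    # unless the set is populated (or patched in a test).
    if quantity in triggered_fraction_quantities:
        kinds.append("triggered_vs_quantity")
    return kinds


print(plots_for_quantity("energy"))
print(plots_for_quantity("energy", triggered_fraction_quantities={"energy"}))
```

With the default empty set, only the histogram kind is listed; the triggered-fraction kind appears only once the set contains the quantity name.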

orelgueta · 2026-05-13T11:29:16Z

@orelgueta look at the plots attached here and check what is missing or redundant.

Event level comparison statistics

tobiaskleiner

Thanks @GernotMaier, I have added a few comments.

tobiaskleiner · 2026-05-19T13:23:37Z

+        if len(patterns) == 0:
+            raise ValueError(f"Production '{label}' has no event_data_file pattern.")
+
+        resolved_files = [str(path) for path in resolve_file_patterns(patterns)]


Quite a lot of (untested) code here in this application. Consider moving it to modules and adding tests.

Agree. Fixed.
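
One way such logic can be made testable is to move it into a small module-level function. The sketch below is only an assumption about how that could look: `resolve_production_files` is a hypothetical name, the `.h5` file names are made up, and the standard-library `glob` stands in for `resolve_file_patterns`, whose exact behaviour is not shown in this thread.

```python
import glob
import tempfile
from pathlib import Path


def resolve_production_files(label, patterns):
    """Hypothetical helper: expand the file patterns of one production."""
    if len(patterns) == 0:
        raise ValueError(f"Production '{label}' has no event_data_file pattern.")
    resolved = []
    for pattern in patterns:
        # glob.glob is a stand-in for resolve_file_patterns (assumption).
        resolved.extend(sorted(glob.glob(pattern)))
    return resolved


# Usage: the function can be exercised without running the CLI.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "run1.h5").touch()
    (Path(tmp) / "run2.h5").touch()
    files = resolve_production_files("prod_a", [str(Path(tmp) / "*.h5")])
    names = [Path(f).name for f in files]
    print(names)
```

Keeping the function free of argument parsing means a unit test only needs a temporary directory and a list of patterns.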

tobiaskleiner · 2026-05-19T13:25:55Z

+    @property
+    def trigger_fraction(self):
+        """Return triggered/simulated fraction."""
+        if self.simulated_event_count <= 0:


Can it go negative?

Probably not. But in that case testing for zero and testing for <= 0 are equivalent?
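
A minimal sketch of the guard under discussion. The class name `EventCounts`, the field `triggered_event_count`, and the fallback value `0.0` are assumptions for illustration; the diff above shows only the property name, its docstring, and the `<= 0` check.

```python
from dataclasses import dataclass


@dataclass
class EventCounts:
    """Hypothetical container; only trigger_fraction mirrors the diff."""

    simulated_event_count: int
    triggered_event_count: int

    @property
    def trigger_fraction(self):
        """Return triggered/simulated fraction."""
        # For a count that is never negative, `<= 0` and `== 0` select the
        # same cases; `<= 0` additionally covers corrupted (negative) input.
        if self.simulated_event_count <= 0:
            return 0.0  # assumed fallback; this line is not shown in the diff
        return self.triggered_event_count / self.simulated_event_count


print(EventCounts(200, 50).trigger_fraction)
print(EventCounts(0, 0).trigger_fraction)
```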

tobiaskleiner · 2026-05-19T13:32:22Z

+    total2 = float(np.sum(counts2))
+    if total1 <= 0 or total2 <= 0:
+        return None, None, False, "insufficient_data"
+    support_mask = counts1 > 0


Please check that this is intended. You drop all bins later for comparison where the baseline histogram is 0, which I think skews the chi^2. Maybe better to merge empty bins in this case?
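
The effect can be made concrete with a small sketch. The bin counts and the helper `dropped_fraction` are hypothetical; the function only measures how many events of the second histogram fall into bins that a `counts1 > 0` mask would discard.

```python
def dropped_fraction(counts1, counts2):
    """Fraction of counts2 lying in bins where the baseline counts1 is empty."""
    total2 = sum(counts2)
    if total2 <= 0:
        return None
    dropped = sum(c2 for c1, c2 in zip(counts1, counts2) if c1 == 0)
    return dropped / total2


baseline = [0, 0, 40, 60]  # hypothetical bin counts
other = [25, 25, 20, 30]

print(dropped_fraction(baseline, other))
```

In this made-up case half of the second histogram's events are ignored, while the remaining bins have the same shape in both histograms (40:60 and 20:30), so a test restricted to those bins would see no difference.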

Very good point! I have now spent almost two hours on it:

- Merging empty bins introduces a lot of complicated code and makes things very hard to understand.
- I considered replacing zero bins with almost-zero bins, but that would distort the Chi2 statistics.

My conclusion is to remove the Chi2 statistics completely, as they are hard to interpret in these cases. We will have to discuss which methods are best for the comparison; I have opened an issue on this: #2192
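
For context, the KS statistic mentioned in the PR description operates on unbinned samples, so the empty-bin question does not arise for it. The sketch below is a plain-Python illustration of the two-sample KS statistic and is not the implementation used in this PR; in practice `scipy.stats.ks_2samp` returns both the statistic and a p-value.

```python
def ks_statistic(sample1, sample2):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    if not sample1 or not sample2:
        raise ValueError("Both samples must be non-empty.")
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    i = j = 0
    d_max = 0.0
    while i < n1 and j < n2:
        x = min(s1[i], s2[j])
        # Advance both empirical CDFs past every value equal to x (handles ties).
        while i < n1 and s1[i] == x:
            i += 1
        while j < n2 and s2[j] == x:
            j += 1
        d_max = max(d_max, abs(i / n1 - j / n2))
    return d_max


print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

Identical samples give 0 and fully separated samples give 1; whether KS alone is sufficient for comparing productions is the open question tracked in the linked issue.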

ctao-sonarqube · 2026-05-19T15:37:15Z

Quality Gate passed

Issues
0 New issues
0 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
97.4% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube

GernotMaier · 2026-05-19T15:59:37Z

@tobiaskleiner - thanks a lot for the review! You are absolutely right about the Chi2 tests; they need much more thought, so I have moved them into an issue. Let me know if you are fine with this or if further changes are required.

GernotMaier added 3 commits May 11, 2026 16:33

Add event-level production comparison CLI

9e070e3

Remove unrelated planning files from PR

6caa7c8

Add changelog for PR 2181

500d4a5

GernotMaier self-assigned this May 11, 2026

GernotMaier added 6 commits May 12, 2026 11:31

Merge branch 'main' into event-level-comparision

ec4a513

improved plots

8990ae1

compare productions

0e8060a

unit tests

05fe632

complexity

4229f04

missed application

7ad1a01

github-code-quality Bot found potential problems May 13, 2026

Comment thread src/simtools/visualization/plot_event_level_production_comparison.py Fixed

GernotMaier added 6 commits May 13, 2026 11:02

simplify

8d6a0d2

sonar

176a0b9

simplification

b050ae7

sonar and sphinx, my best friends

e453194

obsolete test

5f27531

sonar

44d6b3b

GernotMaier requested a review from Copilot May 13, 2026 10:05

unit test

831c6f7

Copilot started reviewing on behalf of GernotMaier May 13, 2026 10:05

Copilot AI reviewed May 13, 2026

GernotMaier added 2 commits May 13, 2026 12:22

Merge branch 'main' into event-level-comparision

3f5d84e

copilot

de39536

GernotMaier added 6 commits May 13, 2026 13:29

Merge branch 'main' into event-level-comparision

d19cb81

report comparision statistics [skip ci]

e2b0342

chi2/n [skip ci]

3bfdc5f

fixed binning

8841720

Merge pull request #2186 from gammasim/event-level-comparison-statistics

52446c8

Event level comparison statistics

statistics module

dd275df

GernotMaier added 5 commits May 18, 2026 11:34

unit tests

ca3582f

unit tests

b1971aa

sonar

167c1b2

sonar

8f8c29a

docs

765e76f

GernotMaier marked this pull request as ready for review May 18, 2026 09:55

This was linked to issues May 18, 2026

Long integration tests for array triggers #1868

Open

Event-level comparison tools for long-integration tests #2179

Open

GernotMaier requested a review from tobiaskleiner May 18, 2026 14:14

tobiaskleiner reviewed May 19, 2026

GernotMaier added 3 commits May 19, 2026 17:04

remove chi2s; unit tests

f4d6422

unit tests

0348eeb

Merge branch 'main' into event-level-comparision

64b82c2
