
Fix merge peaks field detection #1006

Draft
cfuselli wants to merge 7 commits into master from
fix-merge-peaks-field-detection

Conversation

@cfuselli
Member

This pull request improves the handling of the optional waveform fields (data_top and data_start) in the peak merging and waveform storage logic. The main changes ensure that the code only attempts to access or store these fields if they actually exist in the input dtype, making the code more robust and flexible across different data types.

Handling of optional waveform fields:

  • In merge_peaks and _merge_peaks (strax/processing/peak_merging.py), added logic to check for the existence of the data_top and data_start fields in the input dtype and to propagate this information through the merging process. This prevents errors when these fields are missing.
  • Modified the waveform merging and downsampling logic to only process data_top and data_start if those fields exist, avoiding unnecessary or invalid operations.
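The field-existence check has to happen in plain Python, before the numba-compiled helper runs, because dtype introspection is not possible inside compiled code. A minimal sketch of this detection step (the function name and structure here are illustrative, not the actual strax code):

```python
import numpy as np

def detect_optional_fields(peaks):
    """Return (has_data_top, has_data_start) for a structured peaks array.

    This must run outside numba-compiled code: dtype introspection and
    string comparisons on field names are not supported inside @numba.njit.
    """
    names = peaks.dtype.names or ()
    return "data_top" in names, "data_start" in names

# Example: a dtype that has data_top but not data_start
peaks = np.zeros(3, dtype=[("data", "f4", 200), ("data_top", "f4", 200)])
has_top, has_start = detect_optional_fields(peaks)
```

The resulting booleans can then be passed as plain arguments into the compiled inner function, which never needs to look at the dtype itself.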

Waveform storage improvements:

  • Updated store_downsampled_waveform (strax/processing/peak_building.py) to use a safer and clearer approach for storing the data_start field, avoiding direct length checks on potentially missing fields and always storing up to 200 samples or the available length.

cfuselli and others added 7 commits February 13, 2026 17:27
Previously, _merge_peaks() hardcoded both store_data_top=True and
store_data_start=True when calling store_downsampled_waveform(), which
caused crashes when peaks had one field but not the other.

This fix:
- Checks which optional waveform fields exist in merge_peaks() (before
  numba compilation)
- Passes has_data_top and has_data_start flags to _merge_peaks()
- Conditionally fills buffer_top only if data_top exists
- Passes detected flags to store_downsampled_waveform()

Benefits:
- Fixes crash when peaks are missing the data_start field
- Makes fields truly independent (can use data_top without data_start)
- Backward compatible with existing code
- Enables RAM optimization in straxen online DAQ processing
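The flag-passing pattern described above can be sketched as follows. Names are illustrative, and the @numba.njit decorator is omitted so the sketch runs as plain Python; in strax the inner function is numba-compiled, which is exactly why the flags must arrive as arguments rather than being derived from the dtype inside:

```python
import numpy as np

def _merge_peaks_sketch(peaks, has_data_top, has_data_start):
    """Inner merging loop; in strax this helper is numba-compiled."""
    for p in peaks:
        # The main waveform field is always present.
        p["data"][:] = 1.0
        # Optional fields are only touched when the caller says they exist,
        # so each flag is truly independent of the other.
        if has_data_top:
            p["data_top"][:] = 1.0
        if has_data_start:
            p["data_start"][:] = 1.0

def merge_peaks_sketch(peaks):
    # Detect optional fields once, before entering compiled code.
    names = peaks.dtype.names or ()
    _merge_peaks_sketch(
        peaks,
        has_data_top="data_top" in names,
        has_data_start="data_start" in names,
    )
    return peaks
```

With this split, a dtype that carries data_top but not data_start (or vice versa) merges without error, which is the backward-compatible behavior the commit describes.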
The has_data_top and has_data_start flags are already passed as parameters
from the outer merge_peaks() function. The dtype checks inside _merge_peaks()
were leftover from the first attempt and cause numba compilation errors
since string operations are not allowed in numba-compiled code.
The inner condition 'if p_length > len(p["data_start"])' was trying to
access p["data_start"] even when the field doesn't exist, causing a
numba compilation error.

Simplified the logic to just store min(p_length, available_space) samples,
which works whether the field is larger or smaller than p_length.
Cannot call len(p['data_start']) inside a numba-compiled function because numba
compiles for the specific dtype: if the dtype doesn't have a data_start field,
compilation fails even though the code is inside 'if store_data_start:'.

Solution: Use hardcoded 200 (standard data_start size in strax) instead
of accessing the field to get its length.
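A sketch of the resulting storage logic, assuming 200 is the standard data_start length as the commit states (the function name is illustrative). The key point is that the field's length is never queried at runtime; the constant is compared against the peak length instead:

```python
import numpy as np

DATA_START_LENGTH = 200  # standard data_start size in strax, per the commit

def store_data_start_sketch(p, wv_buffer):
    """Store the first samples of the raw waveform into p['data_start'].

    Uses the hardcoded constant instead of len(p['data_start']): in a
    numba-compiled function, calling len() on the field requires numba to
    type-check the field access for the concrete dtype, which fails when
    the dtype lacks data_start. min() handles both short and long peaks.
    """
    n_store = min(p["length"], DATA_START_LENGTH)
    p["data_start"][:n_store] = wv_buffer[:n_store]
```

For a peak shorter than 200 samples only p["length"] samples are copied; for a longer peak the copy is capped at the buffer size, matching the "store up to 200 samples or the available length" behavior described above.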
