Skip to content

Feature/parameter component enhancement#1008

Merged
sandorkertesz merged 22 commits into
developfrom
feature/parameter-component-enhancement
Jun 12, 2026
Merged

Feature/parameter component enhancement#1008
sandorkertesz merged 22 commits into
developfrom
feature/parameter-component-enhancement

Conversation

@pawel-wolff

@pawel-wolff pawel-wolff commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Description

Field

This PR extends parameter component of a Field by adding the following keys:

  • chem: Optional[str] (replaces chem_variable)
  • chem_long_name : Optional[str]
  • wavelength: Optional[Union[int, float]]
  • wavelength_bounds: Optional[Union[tuple[int, int], tuple[float, float]]]
  • wavelength_units: Optional["Units"]
  • wave_direction: Optional[float]
  • wave_direction_index: Optional[int]
  • wave_direction_bounds: Optional[tuple[float, float]]
  • wave_direction_units: Optional["Units"]
  • wave_frequency: Optional[float]
  • wave_frequency_index: Optional[int]
  • wave_frequency_bounds: Optional[tuple[float, float]]
  • wave_frequency_units: Optional["Units"]

The accessors to the keys representing physical quantities (wavelength, wavelength_bounds, wave_direction, wave_frequency, etc.) have now an optional argument units; if provided, a unit conversion is performed:

f.parameter.wavelength() == 550 # in native units f.parameter.wavelength_units() == "nm"
f.parameter.wavelength("um") == 0.55

Note that the following is not yet possible

f.get("parameter.wavelength[um]")

but can be considered in the future.

The unit conversion is made via earthkit-utils, where the appropriate feature was added in the version 1.0.0rc2. For this reason the required version of earthkit-utils was changed.

Xarray engine

Four new built-in dimensions, related to the above Field metadata keys, were added to the Xarray engine:

  • chem_variable
  • wavelength
  • wave_direction
  • wave_frequency

CF-compliant attributes of the corresponding coordinate variables were proposed (src/earthkit/data/xr_engine/profiles/defaults.yaml)

Coordinate sorting was improved: a fallback on sorting string representations is now the very last option, behind sorting homogenous types; see


The previous implementation was wrong on sorting float coordinates, because float were converted to str early (e.g. "5.0" > "11.0").

Mapping parameter keys from GRIB

The metadata key chem is mapped from the GRIB key parameter.chemShortName (previously this was chemShortName which can have "unknown" value instead of None when undefined).

For mapping the other keys from GRIB see: src/earthkit/data/field/grib/parameter.py

Mapping parameter keys to GRIB

Currently a limited set of parameter keys is mapped back to GRIB after a user modification (via Field.set(), for example):

  • variable -> shortName
  • chem -> chemShortName
  • If wavelength_bounds is set, then their values (in meters) are mapped to (firstWavelength, secondWavelength) and typeOfWavelengthInterval is set to 2 (which means "Between first and second limit."). If only wavelength is set, then its value (in meters) is mapped to firstWavelength. (The complexity of this mapping comes from the fact that a more direct GRIB key mars.wavelength is read-only).
  • wave_direction_index + 1 -> directionNumber
  • wave_frequency_index + 1 -> frequencyNumber

Other keys are not mapped to GRIB for the moment.

Mapping parameter keys from xarray and mars

Will be addressed in another PR.

Docs, tests

Documentation, example notebooks and tests were updated, accordingly.

Contributor Declaration

By opening this pull request, I affirm the following:

  • All authors agree to the Contributor License Agreement.
  • The code follows the project's coding standards.
  • I have performed self-review and added comments where needed.
  • I have added or updated tests to verify that my changes are effective and functional.
  • I have run all existing tests and confirmed they pass.

@pawel-wolff pawel-wolff requested a review from sandorkertesz June 2, 2026 14:34
@codecov-commenter

codecov-commenter commented Jun 2, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 97.07113% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.44%. Comparing base (00757ef) to head (669a426).

Files with missing lines Patch % Lines
tests/grib/test_grib_parameter.py 92.96% 4 Missing and 5 partials ⚠️
tests/xr_engine/test_xr_engine_dims.py 91.66% 4 Missing and 4 partials ⚠️
tests/field/test_parameter_component.py 98.32% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1008      +/-   ##
===========================================
+ Coverage    81.80%   82.44%   +0.63%     
===========================================
  Files          236      237       +1     
  Lines        15922    16620     +698     
  Branches       768      801      +33     
===========================================
+ Hits         13025    13702     +677     
- Misses        2671     2681      +10     
- Partials       226      237      +11     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sandorkertesz

sandorkertesz commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

@pawel-wolff, thanks for this improvement. Please consider to following issues/questions:

  • in Parameter wavelength, wave_direction and wave_frequency have no units
  • there is no code to write the new keys back to GRIB if they are modified in the field with set(). This applies to the Xarray to GRIB conversion too.
  • is there an extra cost of getting these extra keys every time a Parameter component is built?

…f Field

For GRIB data, "chem_variable" key maps to the GRIB-key "parameter.chemShortName" which does not use "unknown" value, contrary to "chemShortName"
…variables added

Docs on xr_engine dimensions updated and how-to notebooks on chem/optical/2d wave spectra dimensions added.
…tFieldList.unique_values: it now applies the plain `sorted()` whenever all items are either:

* int/float, or
* datetime.date (which includes datetime.datetime), or
* datetime.time, or
* datetime.timedelta.
Otherwise, it falls back to `sorted(*, key=str)`
Tests added

`chem_variable` attributes updated
How-to notebook on extra_dims updated

xr_engine tests for new dimensions added
…s moved from `create_parameter()` to `SimpleFieldComponent.__init_subclass__()` hook

Validation of keys for a component `__init__()` factored out into `SimpleFieldComponent._create_component()`

TODO: use the above pattern for other components
…th_bounds, wave_direction_index, wave_direction_bounds, wave_frequency_index, wave_frequency_bounds
…tional argument `units`

earthkit-utils>=1.0.0rc2

Tests updated
…, wavelength, wavelength_bounds, wave_direction_index, wave_frequency_index

Tests added
@pawel-wolff pawel-wolff force-pushed the feature/parameter-component-enhancement branch from df5bcdb to 0da557d Compare June 10, 2026 17:35
@pawel-wolff

pawel-wolff commented Jun 10, 2026

Copy link
Copy Markdown
Contributor Author

@sandorkertesz , thanks for the comments.

  • in Parameter wavelength, wave_direction and wave_frequency have no units

Now each of these keys have a corresponding key returning a units object, for example: wavelength_units. Moreover, it is now possible to access f.parameter.wavelength(units="um") to get a wavelength in, say, "um". If desired, this pattern can be reused in Vertical component as well (for level).

  • there is no code to write the new keys back to GRIB if they are modified in the field with set(). This applies to the Xarray to GRIB conversion too.

A reasonable support for mapping the new keys back to GRIB is now implemented.

  • is there an extra cost of getting these extra keys every time a Parameter component is built?

With

fl = ekd.from_source("file", "a-lots-of-small-GRIB-messages.grib").to_fieldlist()

the following code

for f in fl:
    f.parameter  # this triggers building a Parameter component for a field f

run on GRIB data containing regular fields (no chemical/optical/waves) is less than 10% slower comparing to develop branch (i.e. no new parameter keys). This seems to be the cost of querying for a few more GRIB keys (it's already a bit optimised - e.g. if directionNumber is not found in a GRIB message, no other related keys are accessed).

def __init_subclass__(cls):
super().__init_subclass__()
sig = inspect.signature(cls.__init__)
cls._ALLOWED_CREATE_KEYS = tuple(key for key in sig.parameters.keys() if key not in {"self", "args", "kwargs"})

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not work e.g. for ForecastTime where we can use more keys than the constructor has.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the problem...

This was just an attempt to reduce (in the future) the code duplication within the field/component subpackage (by providing a generic validator for keys passed to a concrete component class __init__, instead of passing an explicit list of keys to SimpleFieldComponent._normalise_create_kwargs() and SimpleFieldComponent._normalise_set_kwargs()).

But now I see that it would not be useful for TimeBase subclasses, at least. So I will revert the changes and go back to the original design.

chem: str = None,
chem_long_name: str = None,
) -> None:
Parameter.__init__(self, variable=variable, standard_name=standard_name, long_name=long_name, units=units)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not it be super().init(?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will change to super().

@sandorkertesz

Copy link
Copy Markdown
Collaborator

@pawel-wolff, this is a massive improvement! Many thanks!

@sandorkertesz sandorkertesz merged commit 6e783d9 into develop Jun 12, 2026
148 checks passed
@sandorkertesz sandorkertesz deleted the feature/parameter-component-enhancement branch June 12, 2026 07:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants