Merged
58 commits
545e65d
wip: first rough draft of a GribJumpSource
andreas-grafberger May 2, 2025
134086f
wip: experimental tests for easier development
andreas-grafberger May 2, 2025
bb467a2
format changes using pre-commit hooks and fix small bug
andreas-grafberger May 5, 2025
47d0f05
add prototype for the GribJumpSource based on SimpleFieldList
andreas-grafberger May 5, 2025
824421c
tests: add a few more simple tests and add NO_GRIBJUMP flag for pytest
andreas-grafberger May 5, 2025
5d2bb85
tidy: small cleanup, improve variable naming and fix type hints
andreas-grafberger May 5, 2025
6fab49d
Merge branch 'develop' into feature/gribjump
andreas-grafberger May 14, 2025
8a4fe31
use original type in request dictionaries to make .sel more intuitive
andreas-grafberger May 15, 2025
ee15824
Merge remote-tracking branch 'origin/develop' into feature/gribjump
andreas-grafberger May 16, 2025
86249d8
use SimpleFieldList.to_xarray method for GribJumpSource.
andreas-grafberger May 19, 2025
2f8c1b0
assign grid index to each value in to_xarray
andreas-grafberger Jun 3, 2025
0f34de5
Merge branch 'develop' into feature/gribjump
andreas-grafberger Jun 3, 2025
e274ae7
tidy: add some more error handling and improve tests
andreas-grafberger Jun 3, 2025
711da54
refactor: introduce ExtractionRequest wrapper that combines pygribjum…
andreas-grafberger Jun 30, 2025
9c31181
feat(test): modify (now failing) test to expect latitude and longitud…
andreas-grafberger Jun 30, 2025
162877d
feat: wip: allow reference lat/lons to be loaded from an fdb referenc…
andreas-grafberger Jun 30, 2025
08194b2
refactor: move hardcoded test fixtures into pytest fixtures
andreas-grafberger Jun 30, 2025
941269a
test: add failing test showing bug with geography for gridded extracts
andreas-grafberger Jun 30, 2025
1016cef
docs: add notebook draft with example usage of gribjump source
andreas-grafberger Jun 30, 2025
f660a11
tidy: move validation that extract request share the same ranges
andreas-grafberger Jul 1, 2025
617cb4e
docs: add documentation for gribjump source
andreas-grafberger Jul 4, 2025
767f2a6
docs: small fixes of markdown syntax
andreas-grafberger Jul 4, 2025
5e6e475
feat: wip experiment to verify gridspec of reference field
andreas-grafberger Jul 4, 2025
e477a6f
fix: force flattened array in xarray dataset creation
andreas-grafberger Jul 16, 2025
305b251
Merge remote-tracking branch 'origin/develop' into feature/gribjump
andreas-grafberger Jul 16, 2025
bc4ffc4
refactor: tidy up the metadata enrichment a bit
andreas-grafberger Jul 16, 2025
f8f4630
refactor: create ExtractionRequestCollection
andreas-grafberger Jul 16, 2025
7b1548e
refactor: use FDBRetriever to load reference metadata
andreas-grafberger Jul 16, 2025
90a6d6e
docs: add example for masks and indices to notebook
andreas-grafberger Jul 17, 2025
de5a8fc
feat: enforce that masks are 1D boolean arrays
andreas-grafberger Jul 17, 2025
e5447d4
refactor: simplify by condensing request splitting utilities into one…
andreas-grafberger Jul 17, 2025
c0b8a3a
feat: remove verifiation functionality for now, to be added later
andreas-grafberger Jul 17, 2025
3821bb3
tidy: small renamings and docstrings
andreas-grafberger Jul 17, 2025
ab3a831
tidy: comments
andreas-grafberger Jul 17, 2025
5f80b6d
fix: type hint and name
andreas-grafberger Jul 17, 2025
741dbe5
feat: convert masks to ranges once for significant speedups
andreas-grafberger Jul 17, 2025
3e803de
test: add another test for mask_to_ranges
andreas-grafberger Jul 17, 2025
82b5577
tidy: small comment/docstring changes
andreas-grafberger Jul 17, 2025
7e398f4
tidy: make warning about missing validation in docs more explicit
andreas-grafberger Jul 17, 2025
7eb1dac
docs: improve wording of warning
andreas-grafberger Jul 17, 2025
1d543c8
Merge remote-tracking branch 'origin/develop' into feature/gribjump
andreas-grafberger Jul 24, 2025
e373e14
fix: allow fdb and gribjump to be configured via FDB5_CONFIG and GRIB…
andreas-grafberger Jul 25, 2025
90b1f95
chore: update docstring
andreas-grafberger Jul 25, 2025
d6b506a
feat: pass log context to gribjump
andreas-grafberger Jul 25, 2025
a806a71
refactor: simplify gribjump log context
andreas-grafberger Jul 25, 2025
23afbfc
test: add t_gribjump.grib test data with expver xxxx
andreas-grafberger Jul 25, 2025
91f3f3e
Merge remote-tracking branch 'origin/develop' into feature/gribjump
andreas-grafberger Aug 28, 2025
8203df0
add pygribjump as an optional dependency
andreas-grafberger Aug 28, 2025
e158e2b
Merge remote-tracking branch 'origin/develop' into feature/gribjump
andreas-grafberger Sep 16, 2025
85d9513
docs: clarify gribjump install instructions and dependency handling
andreas-grafberger Sep 16, 2025
e3a68d4
docs: move warning before parameters section
andreas-grafberger Sep 18, 2025
c03c3ee
docs: clarify parameter description and types
andreas-grafberger Sep 18, 2025
972b97b
add pyfdb as a gribjump group dependency and update docs
andreas-grafberger Sep 18, 2025
8d031d3
last docs and typo fixes
andreas-grafberger Sep 18, 2025
be5269f
Merge remote-tracking branch 'origin/develop' into feature/gribjump
andreas-grafberger Sep 18, 2025
f2c5f0e
docs: change notebook to also set FDB_HOME
andreas-grafberger Sep 18, 2025
88bca26
docs: reference gribjump example notebook in missing locations
andreas-grafberger Sep 19, 2025
c5fa528
Merge remote-tracking branch 'origin/develop' into feature/gribjump
andreas-grafberger Sep 23, 2025
972 changes: 972 additions & 0 deletions docs/examples/gribjump.ipynb

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions docs/examples/index.rst
@@ -30,6 +30,7 @@ Data sources
polytope_feature.ipynb
s3.ipynb
wekeo.ipynb
gribjump.ipynb

GRIB
++++++
81 changes: 81 additions & 0 deletions docs/guide/sources.rst
@@ -66,6 +66,8 @@ We can get data from a given source by using :func:`from_source`:
- retrieve data from `WEkEO`_ using the WEkEO grammar
* - :ref:`data-sources-wekeocds`
- retrieve `CDS <https://cds.climate.copernicus.eu/>`_ data stored on `WEkEO`_ using the `cdsapi`_ grammar
* - :ref:`data-sources-gribjump`
- retrieve data from the `FDB (Fields DataBase)`_ using the `gribjump`_ library
* - :ref:`data-sources-zarr`
- load data from a `Zarr <https://zarr.readthedocs.io/en/stable/>`_ store

@@ -1231,6 +1233,85 @@ wekeocds
- :ref:`/examples/wekeo.ipynb`


.. _data-sources-gribjump:

gribjump
--------

.. py:function:: from_source("gribjump", request, *, ranges=None, mask=None, indices=None, fetch_coords_from_fdb=False, fdb_kwargs=None, **kwargs)
:noindex:

The ``gribjump`` source enables fast retrieval of GRIB message subsets from the `FDB (Fields DataBase)`_ using the `gribjump <https://github.com/ecmwf/gribjump/>`_ library.
Both `pygribjump <https://pypi.org/project/pygribjump/>`_ and `pyfdb`_ must be installed. The `pygribjump`_ package uses `findlibs <https://github.com/ecmwf/findlibs>`_ to locate an installation of the `gribjump`_ library.
If the library is not available on your system, you can install it via the `gribjumplib <https://pypi.org/project/gribjumplib/>`_ wheel from PyPI.
Installing ``gribjumplib`` from PyPI will also automatically install `fdb5lib <https://pypi.org/project/fdb5lib/>`_ and other dependencies, which may take priority over any existing installations on your system.

.. warning::
⚠️ This source is **experimental** and may change in future versions without
warning. It performs **no validation** that the specified grid indices,
masks, or ranges correspond to the fields' actual underlying grids.
**Incorrect usage may silently return wrong data points.**
The provided ranges or masks might correspond to unexpected points on the
grid. This source is also currently **not thread-safe**.

Exactly one of the parameters ``ranges``, ``mask`` or ``indices`` must be specified at a time.

:param request: the FDB request as a dictionary. GribJump requires strict value formatting
(e.g., hdates as "YYYYMMDD", not "YYYY-MM-DD"). Format errors may result in "DataNotFound" errors.
:type request: dict
:param ranges: a list of tuples specifying the ranges of 1D grid indices to retrieve in the form
[(start1, end1), (start2, end2), ...]. Ranges are half-open: the start index is included and the end index is excluded.
:type ranges: list[tuple[int, int]], optional
:param mask: a 1D boolean mask specifying which grid points to retrieve
:type mask: numpy.ndarray, optional
:param indices: a 1D array of grid indices to retrieve
:type indices: numpy.ndarray, optional
:param fetch_coords_from_fdb: if ``True``, loads the first field's metadata from
the FDB to extract the coordinates at the specified indices. If ``False``, the
coordinates are not loaded and no separate FDB request is made.
Default is ``False``. Please note that no validation is performed to
ensure that all fields in the requests share the same grid.
:type fetch_coords_from_fdb: bool, optional
:param fdb_kwargs: only used when ``fetch_coords_from_fdb=True``. A dict of
keyword arguments passed to the ``pyfdb.FDB`` constructor. This allows
specifying the FDB configuration, user configuration, etc. If not provided,
the default configuration is used. These arguments are only passed to the
FDB when fetching coordinates and are not used by GribJump for the
extraction itself.
:type fdb_kwargs: dict, optional
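
The three selection parameters describe the same thing in different forms: a set of positions in the flattened 1D grid. The sketch below uses only NumPy to show how a boolean mask relates to half-open ranges and to an index array. The helpers here are illustrative and written for this note; they are not the implementation used in this pull request, even though the commit history mentions a function with a similar name.

```python
import numpy as np


def mask_to_ranges(mask):
    """Convert a 1D boolean mask into half-open (start, end) index ranges."""
    mask = np.asarray(mask)
    if mask.ndim != 1 or mask.dtype != bool:
        raise ValueError("mask must be a 1D boolean array")
    # Pad with False on both sides so every run of True values
    # produces one rising edge and one falling edge.
    padded = np.concatenate(([False], mask, [False])).astype(np.int8)
    edges = np.flatnonzero(np.diff(padded))
    # Edges alternate: run start (inclusive), run end (exclusive).
    return [(int(s), int(e)) for s, e in zip(edges[::2], edges[1::2])]


def ranges_to_indices(ranges):
    """Expand half-open (start, end) ranges into a flat array of grid indices."""
    if not ranges:
        return np.array([], dtype=int)
    return np.concatenate([np.arange(s, e) for s, e in ranges])


# A mask selecting grid points 0..9 and 20..29 on a 40-point grid.
mask = np.zeros(40, dtype=bool)
mask[0:10] = True
mask[20:30] = True

ranges = mask_to_ranges(mask)     # half-open ranges
indices = np.flatnonzero(mask)    # the same selection as an index array
```

Under these assumptions, passing ``mask``, ``ranges`` or ``indices`` would select the same grid points; only one of them should be given per call.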


The following example retrieves a subset of each matching GRIB message in the FDB using index ranges:

.. code-block:: python

    import earthkit.data as ekd
    import numpy as np

    request = {
        "class": "od",
        "type": "fc",
        "stream": "oper",
        "expver": "0001",
        "repres": "gg",
        "levtype": "sfc",
        "param": "2t",
        "date": "20250703",
        "time": 0,
        "step": list(range(0, 24, 6)),
        "domain": "g",
    }

    # Half-open index ranges: grid points 0..9 and 20..29
    ranges = [(0, 10), (20, 30)]

    source = ekd.from_source("gribjump", request, ranges=ranges)
    ds = source.to_xarray()

Further examples:

- :ref:`/examples/gribjump.ipynb`
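
The ``request`` parameter description notes that GribJump expects strictly formatted values, such as dates written as "YYYYMMDD". A small normalisation step before building the request can help avoid "DataNotFound" errors caused by formatting. The helper below is a hypothetical convenience, not part of earthkit-data or pygribjump, and it only covers date-like values.

```python
from datetime import date, datetime


def to_fdb_date(value):
    """Return a date-like value as a compact 'YYYYMMDD' string."""
    if isinstance(value, date):  # also covers datetime, a subclass of date
        return value.strftime("%Y%m%d")
    compact = str(value).strip().replace("-", "")
    if len(compact) != 8 or not compact.isdigit():
        raise ValueError(f"cannot interpret {value!r} as a YYYYMMDD date")
    # strptime raises ValueError for impossible dates such as 20250231.
    datetime.strptime(compact, "%Y%m%d")
    return compact


request_date = to_fdb_date("2025-07-03")
```

The resulting string could then be used as the ``"date"`` (or ``"hdate"``) value of the request dictionary.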



.. _data-sources-zarr:

7 changes: 7 additions & 0 deletions docs/install.rst
@@ -51,6 +52,7 @@ Alternatively, you can install the following components:
- covjsonkit: provides access to CoverageJSON data served by the :ref:`data-sources-polytope` source
- s3: provides access to non-public :ref:`s3 <data-sources-s3>` buckets (new in version *0.11.0*)
- geotiff: adds GeoTIFF support (new in version *0.11.0*). Please note that this is not included in the ``[all]`` option and has to be invoked separately.
- gribjump: provides access to the :ref:`data-sources-gribjump` source
- zarr: provides access to the :ref:`data-sources-zarr` source (new in version *0.15.0*). Please note that this is not included in the ``[all]`` option and has to be invoked separately.

E.g. to add :ref:`data-sources-mars` support you can use:
@@ -85,3 +86,9 @@ FDB
+++++

For FDB (Fields DataBase) access FDB5 must be installed on the system. See the `FDB documentation <https://fields-database.readthedocs.io/en/latest/>`_ for details.


GribJump
++++++++++++

For FDB access with GribJump, both FDB5 and GribJump must be installed on the system. See the `GribJump project <https://github.com/ecmwf/gribjump>`_ for details.
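
As a quick sanity check before using the source, the snippet below reports which of the Python packages are not importable in the current environment. It assumes the import names ``pygribjump`` and ``pyfdb`` match the package names, and it checks only the Python packages, not whether the native GribJump and FDB5 libraries can be located.

```python
from importlib.util import find_spec


def missing_modules(names=("pygribjump", "pyfdb")):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing optional dependencies:", ", ".join(missing))
else:
    print("pygribjump and pyfdb are importable")
```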
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -47,7 +47,7 @@ dependencies = [
"xarray>=0.19",
]
optional-dependencies.all = [
"earthkit-data[cds,covjsonkit,ecmwf-opendata,fdb,geo,geopandas,mars,odb,polytope,projection,s3,wekeo]",
"earthkit-data[cds,covjsonkit,ecmwf-opendata,fdb,geo,geopandas,gribjump,mars,odb,polytope,projection,s3,wekeo]",
]
optional-dependencies.cds = [ "cdsapi>=0.7.2" ]
optional-dependencies.ci = [ "numpy" ]
@@ -70,6 +70,7 @@ optional-dependencies.fdb = [ "pyfdb>=0.1" ]
optional-dependencies.geo = [ "earthkit-geo>=0.2" ]
optional-dependencies.geopandas = [ "geopandas" ]
optional-dependencies.geotiff = [ "pyproj", "rasterio", "rioxarray" ]
optional-dependencies.gribjump = [ "pyfdb>=0.1", "pygribjump" ]
optional-dependencies.mars = [ "ecmwf-api-client>=1.6.1" ]
optional-dependencies.odb = [ "pyodc" ]
optional-dependencies.polytope = [ "polytope-client>=0.7.6" ]