Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
104820b
Add snakemake rules for habitat maps
mdales Feb 5, 2026
265dabf
Update food map scripts for snakemake and newer yirgacheffe
mdales Feb 5, 2026
aa01940
WiP: fractional food map but hitting alignment with diff issues
mdales Feb 10, 2026
8e6758b
WiP: do crop diff on demand
mdales Feb 10, 2026
6894dc6
WiP: improvements to food map generation.
Feb 11, 2026
cee07a0
Updates to get fractional pipeline
mdales Feb 25, 2026
318b9ff
Remove old area map script
mdales Feb 25, 2026
930ac50
Update run.sh for new fractional maps
mdales Feb 25, 2026
60222be
Improve performance for combine stage
mdales Feb 26, 2026
9087be8
Update generate food script
mdales Feb 26, 2026
29bd930
End to end run complete
mdales Mar 19, 2026
0a21f91
Fix linter/mypy warnings
mdales Mar 20, 2026
7fefa8b
WiP snakemake improvements
mdales Apr 17, 2026
83557c6
Add more snakemake files
mdales Apr 17, 2026
10ee844
Lock down aoh and yirgacheffe dependancies
mdales Apr 17, 2026
8fc5f41
First end to end with snakemake
mdales Apr 24, 2026
3430fb5
Snakefmt run
mdales Apr 24, 2026
eb6ee7b
Fix bug in fractional coverage when both farm types go up
mdales Apr 27, 2026
46e122c
Fix snakemake reformatting mistake
mdales Apr 28, 2026
bd9fd2b
Remove deprecated shell scripts
mdales Apr 28, 2026
6436165
Improve handling of preserve codes
mdales Apr 29, 2026
14ef006
Make the add_land_cover flow a little more obvious
mdales Apr 29, 2026
dac3901
Simplify snakemake rules
mdales Apr 30, 2026
c5e0eca
Remember to tap when warping
mdales May 15, 2026
2eb31b6
Add additional restore scenarios and land cover analysis
mdales May 19, 2026
1e1a44e
Fix diff map logic
mdales Jun 4, 2026
5a466d3
Updated snakefmt
mdales Jun 4, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 11 additions & 6 deletions .github/workflows/python-package.yml
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ on:
jobs:
build:
runs-on: ubuntu-latest
container: ghcr.io/osgeo/gdal:ubuntu-small-3.11.0
container: ghcr.io/osgeo/gdal:ubuntu-small-3.12.2
strategy:
fail-fast: false
matrix:
Expand All @@ -23,28 +23,33 @@ jobs:
run: |
apt-get update -qqy
apt-get install -y git python3-pip libpq5 libpq-dev shellcheck

- uses: actions/checkout@v4
with:
submodules: 'true'

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install gdal[numpy]==3.11.0
python -m pip install gdal[numpy]==3.12.2
python -m pip install -r requirements.txt

- name: Lint with pylint
run: |
python3 -m pylint deltap prepare_layers prepare_species usecases utils local

- name: Type checking
run: |
python3 -m mypy deltap prepare_layers prepare_species usecases utils local

- name: Tests
run: |
python3 -m pytest ./tests
- name: Script checks
run: |
shellcheck ./scripts/run.sh
shellcheck ./scripts/generate_food_map.sh

- name: Snakemake format check
run: snakefmt --check workflow/
84 changes: 83 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,89 @@ This repository implements the LIFE extinction risk methodology as published in

## Running the code

The methodology is explained in more detail in [method.md](method.md), but there is also a script set up to just do an entire run of the pipeline in `./scripts/run.sh`.
The methodology is explained in more detail in [method.md](method.md). The pipeline is implemented as a [Snakemake](https://snakemake.readthedocs.io/) workflow in `workflow/`.

### Prerequisites

The following tools must be available on `PATH`:

- `reclaimer` — downloads inputs from Zenodo ([quantifyearth/reclaimer](https://github.com/quantifyearth/reclaimer))
- `aoh-calc`, `aoh-habitat-process`, `aoh-collate-data`, `aoh-species-richness`, `aoh-endemism`, `aoh-validate-prevalence` — from the `aoh` Python package
- `gdalwarp` — part of GDAL

The following environment variables must be set:

| Variable | Purpose |
|----------|---------|
| `DATADIR` | Root directory for all input/output data |
| `DB_HOST`, `DB_NAME`, `DB_USER`, `DB_PASSWORD` | PostgreSQL credentials for IUCN species database |

For GBIF occurrence validation (optional):

| Variable | Purpose |
|----------|---------|
| `GBIF_USERNAME`, `GBIF_EMAIL`, `GBIF_PASSWORD` | GBIF account credentials |

### Running the pipeline

```bash
# Full pipeline: delta P maps for all scenarios + model validation
snakemake --cores N all

# Delta P maps only, without validation
snakemake --cores N life

# Species richness and endemism maps (not included in 'all')
snakemake --cores N summaries

# Model validation only
snakemake --cores N validation

# GBIF occurrence validation (expensive — downloads gigabytes of data)
snakemake --cores N occurrence_validation
```

Other useful targets for incremental runs:

```bash
snakemake --cores N prepare # Download and warp all base layers
snakemake --cores N species_data # Extract species data from the database
snakemake --cores N aohs # Generate all AOH rasters and collated CSVs
```

### Precious layers

The habitat and elevation layers are expensive to generate and are treated as **precious**: they will only be rebuilt if their output files are explicitly deleted, even if upstream inputs or code have changed. This applies to:

- `habitat_layers/current/` — food-enhanced current habitat map at ~1.67km resolution
- `habitat_layers/{scenario}/` — per-scenario fractional habitat maps
- `habitat_layers/pnv/` — potential natural vegetation fractional maps
- `elevation-max.tif`, `elevation-min.tif` — warped elevation layers

To force a rebuild of any of these, delete the relevant `.sentinel` file (or the output file itself) before re-running.

### Configuration

Pipeline parameters are in `config/config.yaml`:

- `taxa` — taxonomic classes to process (default: AMPHIBIA, AVES, MAMMALIA, REPTILIA)
- `scenarios` — habitat change scenarios to evaluate (default: arable, restore)
- `curve` — extinction curve exponent for delta P (default: `"0.25"`)
- `pixel_scale` — output raster resolution in degrees (default: ~5 arc-seconds)

### Inspecting the pipeline graph

The rule graph can be generated with:

```bash
snakemake --forceall --rulegraph 2>/dev/null | dot -Tsvg > graph.svg
```

Note that rules whose inputs are determined by the species-extraction checkpoint (`generate_aoh`, `calculate_delta_p`) will not appear in the static rule graph — this is a known snakemake limitation with checkpoint-based expansion. To see the complete job graph after species data has been extracted:

```bash
snakemake --forceall --dag 2>/dev/null | unflatten -f -l 5 | dot -Tsvg > dag.svg
```

## Credits

Expand Down
52 changes: 52 additions & 0 deletions config/config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
# LIFE Pipeline Configuration
# ===========================

# Taxonomic classes to process
taxa:
- AMPHIBIA
- AVES
- MAMMALIA
- REPTILIA

# Pixel scale
pixel_scale: 0.016666666666667

# Hyde projection data
hyde_projection: 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'
hyde_pixel_scale: 0.08333333333333333

# Scenarios to run - accepted values are
# - arable
# - restore
# - restore_all
# - restore_agriculture
scenarios:
- arable
- restore
- restore_all
- restore_agriculture

# Z-curve value for delta P calculation
curve: "0.25"

# Projection for species data extraction
projection: "EPSG:4326"

# Zenodo configuration for downloading raw habitat
zenodo:
jung_habitat:
zenodo_id: 4058819
filename: "iucn_habitatclassification_composite_lvl2_ver004.zip"
jung_habitat_updates:
zenodo_id: 4058819
filename: "lvl2_changemasks_ver004.zip"
jung_pnv:
zenodo_id: 4038749
filename: "pnv_lvl1_004.zip"
elevation:
zenodo_id: 5719984
filename: "dem-100m-esri54017.tif"

# Optional inputs
optional_inputs:
species_overrides: "overrides.csv"
68 changes: 33 additions & 35 deletions deltap/delta_p_scaled.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,67 +4,65 @@
from pathlib import Path

import pandas as pd
import yirgacheffe.operators as yo
from yirgacheffe.layers import RasterLayer
import yirgacheffe as yg
from snakemake_argparse_bridge import snakemake_compatible # type: ignore

SCALE = 1e6

def delta_p_scaled_area(
input_path: Path,
diff_area_map_path: Path,
totals_path: Path,
species_totals_path: Path,
output_path: Path,
):
os.makedirs(output_path.parent, exist_ok=True)

per_taxa = [
RasterLayer.layer_from_file(os.path.join(input_path, x))
for x in sorted(input_path.glob("*.tif"))
yg.read_raster(x) for x in sorted(input_path.glob("*.tif"))
]
if not per_taxa:
sys.exit(f"Failed to find any per-taxa maps in {input_path}")

area_restore = RasterLayer.layer_from_file(diff_area_map_path)
species_total_counts = pd.read_csv(species_totals_path)

total_counts = pd.read_csv(totals_path)
with yg.read_raster(diff_area_map_path) as diff_area:
diff_area_rescaled = yg.where(diff_area < SCALE, float('nan'), diff_area / SCALE)

area_restore_filter = yo.where(area_restore < SCALE, float('nan'), area_restore) / SCALE

with RasterLayer.empty_raster_layer_like(
area_restore,
filename=output_path,
nodata=float('nan'),
bands=len(per_taxa) + 1
) as result:

species_count = int(total_counts[total_counts.taxa=="all"]["count"].values[0])

result._dataset.GetRasterBand(1).SetDescription("all") # pylint: disable=W0212
summed_layer = per_taxa[0]
for layer in per_taxa[1:]:
summed_layer = summed_layer + layer

scaled_filtered_layer = yo.where(
area_restore_filter != 0,
((summed_layer / area_restore_filter) * -1.0) / species_count,
# Process all species in total
total_species_count = int(species_total_counts[species_total_counts.taxa=="all"]["count"].values[0])
summed_layer = yg.sum(per_taxa)
all_scaled_filtered_layer = yg.where(
diff_area_rescaled != 0,
((summed_layer / diff_area_rescaled) * -1.0) / total_species_count,
float('nan')
)
scaled_filtered_layer.parallel_save(result, band=1)

for idx, inlayer in enumerate(per_taxa):
assert inlayer.name
bands = [all_scaled_filtered_layer]
labels = ["all"]

# Now per taxa
for inlayer in per_taxa:
# get the taxa from the filename
_, name = os.path.split(inlayer.name)
taxa = name[:-4]
species_count = int(total_counts[total_counts.taxa==taxa]["count"].values[0])
result._dataset.GetRasterBand(idx + 2).SetDescription(taxa) # pylint: disable=W0212
scaled_filtered_layer = yo.where(
area_restore_filter != 0,
((inlayer / area_restore_filter) * -1.0) / species_count,
labels.append(taxa)

taxa_species_count = int(species_total_counts[species_total_counts.taxa==taxa]["count"].values[0])
scaled_filtered_layer = yg.where(
diff_area_rescaled != 0,
((inlayer / diff_area_rescaled) * -1.0) / taxa_species_count,
float('nan')
)
scaled_filtered_layer.parallel_save(result, band=idx + 2)
bands.append(scaled_filtered_layer)

yg.to_geotiff(output_path, bands, labels, nodata=float('nan'))

@snakemake_compatible(mapping={
"input_path": "params.input_dir",
"diff_area_map_path": "input.diffmap",
"totals_path": "input.totals",
"output_path": "output.final",
})
def main() -> None:
parser = argparse.ArgumentParser(description="Scale final results for publication.")
parser.add_argument(
Expand Down
Loading
Loading