Solution for the Kaggle competition
UrbanFloodBench: Flood Modelling
A graph-aware XGBoost pipeline for urban flood prediction using hydraulic context features, flow proxies, and mass-balance approximations.
Developed by team NSEOverflow
```bash
conda env create -f environment.yml
conda activate flood
```

Download the UrbanFloodBench dataset and the model checkpoints.
After downloading:
- Extract the archive
- Place the extracted folders in the project root (`./`)
- Run the metadata setup script: `bash scripts/populate_model_metadata.sh`
- Place the model checkpoints in `saved_models/`
Your directory should look like this:
```
FloodGraphFlow-XGB
...
├── saved_models
│   ├── model1_best.pkl
│   └── model2_best.pkl
├── Models
│   ├── Model_1
│   │   ├── train
│   │   │   ├── events.csv
│   │   │   ├── events_split_seed42
│   │   │   │   ├── train_split.csv
│   │   │   │   └── val_split.csv
│   │   │   └── events_hardholdout_seed42
│   │   ├── test
│   │   │   └── events.csv
│   │   ├── processed
│   │   │   └── csv_features_stats.yaml
│   │   └── model1_node_pca.joblib
│   └── Model_2
...
└── README.md
```
The models for Cities 1 and 2 are trained with the following commands:

```bash
# Train Model_1
bash scripts/train_model1_best.sh

# Train Model_2
bash scripts/train_model2_best.sh
```
Model inference can be done with the following commands:

```bash
# Model_1 test predictions
python scripts/run_floodgraphflow_xgb.py \
    --config configs/model1_best.yaml \
    --backend xgboost_cpu \
    --load_model_path saved_models/model1_best.pkl \
    --dump_test_predictions predictions/model1_test_predictions.parquet

# Model_2 test predictions
python scripts/run_floodgraphflow_xgb.py \
    --config configs/model2_best.yaml \
    --backend xgboost_cpu \
    --load_model_path saved_models/model2_best.pkl \
    --dump_test_predictions predictions/model2_test_predictions.parquet

# Merge for submission
python scripts/merge_xgb_submission.py \
    --sample sample_submission.parquet \
    --model1 predictions/model1_test_predictions.parquet \
    --model2 predictions/model2_test_predictions.parquet \
    --output submissions/floodgraphflow_xgb_submission.parquet
```
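The merge step can be sketched roughly as below. This is a hypothetical illustration of what `merge_xgb_submission.py` does, not its actual implementation; the column names `row_id` and `prediction` are assumptions about the submission schema.

```python
# Hypothetical sketch of the submission-merge step: fill the sample
# submission with per-city predictions. Column names ("row_id",
# "prediction") are assumed, not the script's real interface.
import pandas as pd

def merge_predictions(sample: pd.DataFrame,
                      model1: pd.DataFrame,
                      model2: pd.DataFrame) -> pd.DataFrame:
    """Fill the sample submission with predictions from both city models."""
    preds = pd.concat([model1, model2], ignore_index=True)
    merged = sample.drop(columns=["prediction"]).merge(
        preds[["row_id", "prediction"]], on="row_id", how="left"
    )
    # Rows not covered by either model fall back to a neutral default.
    merged["prediction"] = merged["prediction"].fillna(0.0)
    return merged
```

A left merge on the sample submission guarantees the output keeps exactly the rows (and row order) the competition expects.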
Note: `sample_submission.parquet` should be downloaded from the competition website.
We use a graph-aware stacked XGBoost pipeline rather than a single end-to-end sequence model.
Flood behavior in this task is not purely local: each node depends on upstream inflow, downstream blockage, storage, and boundary conditions. A single flat regressor struggled to represent these different hydraulic regimes.
By combining graph-derived features, physics-inspired hydraulic proxy features, auxiliary flow surrogates (`qnet`, `qin`, `qout`), and a two-stage regime-aware predictor, the model captures propagation, retention, and delayed drainage more reliably than local rainfall and water-level features alone.
At a high level, the model works in four stages.
Stage 1 — Graph-aware feature construction
We build node-level temporal features from rainfall, water level, and static network attributes, then augment them with graph features, boundary indicators, and mass-deficit storage proxies that encode coarse hydraulic structure.
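The graph-derived part of Stage 1 can be sketched as a neighbor aggregation over the drainage network. This is a minimal illustration, assuming a simple adjacency-list graph; names like `hop1` are illustrative, not the pipeline's real feature names.

```python
# Minimal sketch of graph-aware feature construction: aggregate the
# hydraulic state of each node's neighbours over the drainage graph.
# The adjacency-list representation and feature names are assumptions.
import numpy as np

def neighbor_mean(values: np.ndarray, adjacency: dict) -> np.ndarray:
    """Mean of a node-level signal over each node's hop-1 neighbours."""
    out = np.zeros_like(values, dtype=float)
    for node, neighbors in adjacency.items():
        # Isolated nodes fall back to their own value.
        out[node] = values[neighbors].mean() if neighbors else values[node]
    return out

# Toy drainage chain: 0 -> 1 -> 2, where node 2 is an outlet.
adj = {0: [1], 1: [0, 2], 2: [1]}
levels = np.array([1.0, 2.0, 4.0])
hop1 = neighbor_mean(levels, adj)   # hop-1 neighbourhood aggregate
hop2 = neighbor_mean(hop1, adj)     # reapplying approximates hop-2 context
```

Reapplying the aggregation widens the receptive field, which is one plausible way to obtain hop-2 style features such as those described below.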
Stage 2 — Auxiliary flow prediction
We train Stage-A models to predict latent hydraulic quantities such as net flow, inflow, and outflow. Their out-of-fold predictions are fed back into the feature set.
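The out-of-fold feedback loop can be sketched as follows; a plain least-squares model stands in for the auxiliary XGBoost regressor, and the resulting OOF column would be appended to the Stage-B feature matrix. Function and variable names here are illustrative.

```python
# Sketch of out-of-fold (OOF) stacking for the Stage-A flow surrogates:
# each row receives a prediction from a model that never saw that row,
# avoiding target leakage when the column is fed back as a feature.
import numpy as np

def oof_predictions(X: np.ndarray, y: np.ndarray,
                    n_folds: int = 5, seed: int = 42) -> np.ndarray:
    """Return leakage-free predictions for every row via K-fold fitting."""
    rng = np.random.default_rng(seed)
    fold_of = rng.integers(0, n_folds, size=len(y))
    oof = np.empty_like(y, dtype=float)
    Xb = np.hstack([X, np.ones((len(y), 1))])       # add intercept column
    for k in range(n_folds):
        train, hold = fold_of != k, fold_of == k
        # Simple linear stand-in for the Stage-A XGBoost model.
        coef, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
        oof[hold] = Xb[hold] @ coef                 # predict held-out rows only
    return oof
```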
Stage 3 — Regime-aware prediction
A two-stage XGBoost predictor combines regime classification with conditional regression, allowing the model to treat calm, rising, and storage-dominated states differently.
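One way to combine the two stages is a probability-weighted blend of conditional regressors, sketched below. The regime labels and the soft-blend formulation are assumptions for the example; the actual pipeline may gate or blend differently.

```python
# Illustrative sketch of a two-stage regime-aware predictor: a regime
# classifier yields per-regime probabilities, and per-regime regressor
# outputs are blended by those probabilities. Regime names are assumed.
import numpy as np

REGIMES = ("calm", "rising", "storage")   # hypothetical regime labels

def regime_blend(proba: np.ndarray, regime_preds: np.ndarray) -> np.ndarray:
    """Probability-weighted blend of conditional regressor outputs.

    proba:        (n_samples, n_regimes) classifier probabilities
    regime_preds: (n_samples, n_regimes) one prediction per regime regressor
    """
    assert proba.shape == regime_preds.shape
    return (proba * regime_preds).sum(axis=1)

# One sample strongly in the "rising" regime, one clearly "calm".
proba = np.array([[0.1, 0.8, 0.1],
                  [0.9, 0.05, 0.05]])
preds = np.array([[0.0, 2.0, 5.0],
                  [0.0, 2.0, 5.0]])
blended = regime_blend(proba, preds)
```

The soft blend lets uncertain samples borrow from several regime experts instead of committing to a single hard regime assignment.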
Stage 4 — Final submission assembly
The full pipeline is trained separately for `Model_1` and `Model_2`, and their predictions are merged into the final submission.
- **Normalization:** z-score normalization applied to all static and dynamic node/edge features.
- **Stabilization of heavy-tailed features:**
  - Clipping of extreme values
  - `log1p`/`asinh` transforms for hydraulic ratios and interaction terms
- **Edge-case handling:**
  - Dedicated treatment of zero-area endpoint nodes to maintain stable feature distributions
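The normalization and stabilization steps above can be sketched in a few lines; the clipping threshold and the choice of `asinh` over `log1p` here are assumptions for illustration.

```python
# Minimal sketch of the preprocessing described above: z-score
# normalization, clipping of extreme values, and an asinh transform to
# compress heavy tails. The 5-sigma clip threshold is an assumption.
import numpy as np

def preprocess(x: np.ndarray, clip_sigma: float = 5.0) -> np.ndarray:
    """Z-score, clip to +/- clip_sigma, then compress tails with asinh."""
    z = (x - x.mean()) / (x.std() + 1e-9)     # z-score normalization
    z = np.clip(z, -clip_sigma, clip_sigma)   # clip extreme values
    return np.arcsinh(z)                      # heavy-tail compression
```

`asinh` behaves like the identity near zero but logarithmically in the tails, which keeps small hydraulic ratios undistorted while taming outliers.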
All hyperparameters were tuned using Optuna.
- Main regressor: 800 trees, learning rate 0.03, max depth 8
- Regime classifier: 600 trees, learning rate 0.03, max depth 6
- Event settings: quantile = 0.88, horizon = 24
The final model used a total of 262 features.
The strongest feature families in the final submission were:

- `fe_graph_pulse`, `fe_graph_hop2_features`, `fe_level_imbalance_features`: summarize nearby hydraulic state over the drainage graph rather than using only the local node.
- `qnet_stack`, `qnet_phys_baseline_feature`, `fe_qhat_graph2`, `fe_qhat_graph2_hop2`: an auxiliary model is trained to predict net flow, and its predictions are fed back into the main model. These features expose directional transport structure that is hard to recover from local predictors alone.
- `qinout_stack`: directional inflow/outflow stack features.
- `fe_basin_mass_deficit_features`: the main mass-balance-like engineered features.
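A mass-deficit storage proxy can be sketched as a running inflow/outflow imbalance. This is a conceptual illustration only; the toy series below are stand-ins, not the pipeline's actual `qin`/`qout` surrogates.

```python
# Conceptual sketch of a basin mass-deficit feature: cumulative inflow
# minus cumulative outflow approximates stored volume, floored at zero
# since a basin cannot hold negative water.
import numpy as np

def mass_deficit(qin: np.ndarray, qout: np.ndarray) -> np.ndarray:
    """Running inflow-outflow imbalance as a storage proxy."""
    deficit = np.cumsum(qin - qout)
    return np.maximum(deficit, 0.0)   # no negative storage

qin = np.array([1.0, 2.0, 1.0, 0.0])    # toy inflow series
qout = np.array([0.5, 0.5, 1.5, 1.5])   # toy outflow series
storage = mass_deficit(qin, qout)
```

The resulting series rises while inflow exceeds outflow and drains back toward zero afterwards, which is the retention/delayed-drainage signal the mass-balance features are meant to capture.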
A detailed list of all included features is provided in `FEATURES.md`.
| Model | Addition | City 2 Score |
|---|---|---|
| A | Baseline XGBoost | 0.203494 |
| B | A + pruned feature set + graph / qhat graph-neighbor features | 0.141594 |
| C | B + auxiliary peak_within_24 target | 0.138205 |
| D | C + basin / storage mass-deficit framing | 0.084896 |
| E | D + node priors + downstream lockup + subcatchment mass-deficit | 0.079190 |
| F | E + twi_spi + multiscale mass mismatch + HAND proxy features | 0.077998 |
| G | F + phase-MoE pilot | 0.077271 |
| H | G + pruneA regime cleanup | 0.076822 |
| I | H + edge-aware downstream features | 0.076526 |
| J | I + node drop priors | 0.075713 |
| K | J + drain-regime priors | 0.074033 |
| L | K + endpoint boundary features | 0.074011 |
| M | L + upstream historical EMA features | 0.065236 |
| N | M + qin/qout/qnet historical EMA features | 0.056407 |
| O | N + surcharge expert + deep-storage expert | 0.051369 |
| P | O + split directional qnet history EMA | 0.048904 |
*Note: Model M was used as the final submission, since models N, O, and P performed worse on the public Kaggle leaderboard.*