38 changes: 38 additions & 0 deletions docs/source/datasets/fpa_fod_tabular.rst
@@ -0,0 +1,38 @@
FPA FOD — Tabular Classification Dataset
========================================

What it is
----------

This dataset provides a tabular supervised-learning view derived from FPA FOD wildfire records.
One sample corresponds to one incident (or one processed record).

Data source and licensing
-------------------------

The raw FPA FOD dataset is large and must be obtained by the user. PyHazards does not ship the raw data
due to licensing/size constraints.

How to provide data
-------------------

1. Obtain the FPA FOD SQLite database yourself; PyHazards does not bundle it.
2. Place it at the path expected by the dataset builder (see the project README and scripts).
3. Run the preprocessing/build step to produce the processed dataset artifacts.
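
Before running the build step, it can help to confirm that the SQLite file is where you think it is and that it opens. The following is a minimal sketch using only the standard library; the helper name ``list_sqlite_tables`` is hypothetical and is not part of PyHazards, and the expected path and table names are defined by the project README/scripts, not by this example.

```python
import sqlite3
from pathlib import Path


def list_sqlite_tables(db_path):
    """Return the table names in a SQLite file, or raise if the file is missing."""
    path = Path(db_path)
    if not path.is_file():
        raise FileNotFoundError(f"FPA FOD SQLite file not found: {path}")
    # Open read-only so this check can never modify the raw data.
    uri = path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling ``list_sqlite_tables(...)`` with your own path either lists the tables in the file or fails early with a clear error, which is cheaper than discovering a wrong path halfway through preprocessing.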

Returned tensors (contract)
---------------------------

A single sample returns:

- ``x``: float tensor of shape ``[input_dim]``
- ``y``: integer class id (scalar)

A batch of ``B`` samples returns:

- ``x``: ``[B, input_dim]``
- ``y``: ``[B]``
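
The contract above can be illustrated with a small, framework-free sketch. Plain nested lists stand in for tensors here, and both ``shape_of`` and ``collate_tabular`` are hypothetical helpers written for this example; they are not PyHazards functions.

```python
def shape_of(value):
    """Return the shape of a rectangular nested list (a scalar has shape [])."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return shape


def collate_tabular(samples):
    """Stack (x, y) samples into a batch: x -> [B, input_dim], y -> [B]."""
    xs = [list(x) for x, _ in samples]
    ys = [int(y) for _, y in samples]
    return xs, ys


# Two samples with input_dim = 3, so B = 2.
samples = [([0.1, 0.2, 0.3], 0), ([0.4, 0.5, 0.6], 2)]
x, y = collate_tabular(samples)
assert shape_of(x) == [2, 3]
assert shape_of(y) == [2]
```

A check like this is useful in a unit test because it fails on a shape mismatch before any model code runs.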

Micro dataset for CI
--------------------

For CI and testing, set ``micro=True`` to use deterministic synthetic data that matches the schema and
shapes of the real dataset, so tests run without the raw FPA FOD SQLite database.
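
To make "deterministic synthetic data" concrete, here is one way such a micro dataset could be built. This is a sketch under assumptions: the class ``MicroTabular`` and its parameters (``n_samples``, ``input_dim``, ``n_classes``, ``seed``) are invented for illustration and do not describe how PyHazards implements ``micro=True``.

```python
import random


class MicroTabular:
    """Hypothetical stand-in for a micro dataset: seeded, synthetic, tiny."""

    def __init__(self, n_samples=8, input_dim=4, n_classes=3, seed=0):
        # A local RNG keeps the data reproducible without touching global state.
        rng = random.Random(seed)
        self.x = [[rng.random() for _ in range(input_dim)] for _ in range(n_samples)]
        self.y = [rng.randrange(n_classes) for _ in range(n_samples)]

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


a, b = MicroTabular(seed=0), MicroTabular(seed=0)
assert len(a) == 8
assert a[0] == b[0]  # same seed, same data
```

Seeding a private generator rather than the global one means the test data stays identical regardless of what other tests do with ``random``.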
31 changes: 31 additions & 0 deletions docs/source/datasets/fpa_fod_weekly.rst
@@ -0,0 +1,31 @@
FPA FOD — Weekly Forecasting Dataset
====================================

What it is
----------

This dataset provides weekly sequences for forecasting wildfire activity.
One sample corresponds to a sequence of weekly feature vectors.

Data source and licensing
-------------------------

The raw FPA FOD dataset must be obtained by the user. PyHazards does not ship the raw data due to
licensing/size constraints.

Returned tensors (contract)
---------------------------

A single sample returns:

- ``x``: float tensor of shape ``[T, input_dim]``, where ``T`` is the number of weekly steps
- ``y``: float tensor of shape ``[out_dim]`` (or scalar)

A batch of ``B`` samples returns:

- ``x``: ``[B, T, input_dim]``
- ``y``: ``[B, out_dim]``
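
One common way to obtain samples of this shape is a sliding window over a weekly series. The sketch below assumes the target is the week immediately following each window; that choice, and the helper name ``make_weekly_windows``, are assumptions for illustration and may not match how the PyHazards builder defines its targets.

```python
def make_weekly_windows(weekly_features, weekly_targets, T):
    """Build (x, y) pairs: x is T consecutive weeks, y is the following week's target."""
    if len(weekly_features) != len(weekly_targets):
        raise ValueError("features and targets must cover the same weeks")
    samples = []
    for start in range(len(weekly_features) - T):
        x = weekly_features[start:start + T]  # shape [T, input_dim]
        y = weekly_targets[start + T]         # shape [out_dim]
        samples.append((x, y))
    return samples


# Six weeks of data with input_dim = 2 and out_dim = 1.
features = [[float(w), float(w) * 0.5] for w in range(6)]
targets = [[float(w)] for w in range(6)]
samples = make_weekly_windows(features, targets, T=4)

# Stacking the samples gives x: [B, T, input_dim] and y: [B, out_dim].
batch_x = [x for x, _ in samples]
batch_y = [y for _, y in samples]
```

With six weeks and ``T=4`` this yields two windows, so the stacked batch has ``B = 2``.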

Micro dataset for CI
--------------------

For CI and testing, set ``micro=True`` to use deterministic synthetic data (small ``B`` and ``T``) to
validate shapes, dtypes, and a trainer smoke run without requiring the raw FPA FOD SQLite database.
12 changes: 12 additions & 0 deletions pyhazards/datasets/__init__.py
@@ -14,3 +14,15 @@
"GraphTemporalDataset",
"graph_collate",
]
from .wildfire_fpa_fod import FPAFODWildfireTabular, FPAFODWildfireWeekly

# Register the datasets so that the framework's load_dataset(...) can find them by name.
register_dataset(
name="wildfire_fpa_fod_tabular",
builder=lambda **kwargs: FPAFODWildfireTabular(**kwargs),
)

register_dataset(
name="wildfire_fpa_fod_weekly",
builder=lambda **kwargs: FPAFODWildfireWeekly(**kwargs),
)
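
For readers unfamiliar with the pattern, the registration above follows a name-to-builder registry. The sketch below is a self-contained illustration of that idea only: it reuses the names ``register_dataset`` and ``load_dataset`` from the diff, but the bodies, the duplicate-name check, and ``ToyDataset`` are assumptions and may differ from the actual PyHazards implementation.

```python
_REGISTRY = {}


def register_dataset(name, builder):
    """Map a dataset name to a callable that builds it."""
    if name in _REGISTRY:
        raise ValueError(f"dataset {name!r} is already registered")
    _REGISTRY[name] = builder


def load_dataset(name, **kwargs):
    """Look up a builder by name and call it with the given keyword arguments."""
    try:
        builder = _REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown dataset {name!r}; known: {sorted(_REGISTRY)}") from None
    return builder(**kwargs)


class ToyDataset:
    """Hypothetical dataset used only to exercise the registry."""

    def __init__(self, micro=False):
        self.micro = micro


# A class is itself callable, so it can serve as the builder directly.
register_dataset(name="toy", builder=ToyDataset)
ds = load_dataset("toy", micro=True)
```

In a registry of this shape, passing the class itself behaves the same as wrapping it in ``lambda **kwargs: ToyDataset(**kwargs)``, since both forward the keyword arguments to the constructor.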