10 changes: 10 additions & 0 deletions .agents/GUIDELINES.md
@@ -0,0 +1,10 @@
# Agent Notes

- Prefer explanatory comments in numerical kernels, coordinate math, and callback boundaries.
- Target roughly one meaningful comment for every 5-10 lines in dense array code.
- Comment the reason for a layout transform, cached constant, broadcast shape, or numerical safeguard.
- Do not add filler comments that simply restate the next line.
- Write tests and follow red-green TDD.
- Ensure all methods, functions, modules, and classes have a docstring. Private objects can have a single-line docstring; public objects should have a full NumPy-style docstring with doctest examples.
- All lines must be covered by tests; delete unreachable edge case code or test it using public API and minimal monkey patching.
- Keep Markdown prose unwrapped. Do not hard-wrap paragraphs or list items in `.md` files; let editors soft-wrap them. Code blocks and formats that require line breaks are exceptions.
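
As an illustration of the commenting and docstring guidelines above, here is a minimal sketch of a public function with a NumPy-style docstring, a doctest example, and comments that give the reason for a broadcast shape and a numerical safeguard. The function name `safe_standardize` and its `eps` parameter are hypothetical and are not part of dasjax; plain NumPy stands in for a JAX kernel.

```python
import numpy as np


def safe_standardize(data, axis=-1, eps=1e-12):
    """
    Standardize an array along one axis (hypothetical example, not dasjax API).

    Parameters
    ----------
    data : array_like
        Input samples.
    axis : int, optional
        Axis along which the mean and standard deviation are computed.
    eps : float, optional
        Lower bound applied to the standard deviation before dividing.

    Returns
    -------
    numpy.ndarray
        Array with the same shape as ``data``.

    Examples
    --------
    >>> import numpy as np
    >>> out = safe_standardize(np.array([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]))
    >>> out.shape
    (2, 3)
    >>> bool(np.all(out[1] == 0.0))
    True
    """
    data = np.asarray(data, dtype=float)
    # keepdims=True keeps the reduced axis as length 1 so the statistics
    # broadcast against the original array without an explicit reshape.
    mean = data.mean(axis=axis, keepdims=True)
    std = data.std(axis=axis, keepdims=True)
    # Constant traces have zero spread; clamping avoids a 0/0 that would
    # otherwise fill the output with NaN.
    return (data - mean) / np.maximum(std, eps)
```

Neither comment restates the line below it; each records why the line is written that way, which is the distinction the guidelines draw.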
7 changes: 0 additions & 7 deletions .agents/README.md

This file was deleted.

67 changes: 67 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,67 @@
name: Documentation

on:
  push:
    branches:
      - main
  pull_request:
  workflow_dispatch:

permissions:
  contents: read

concurrency:
  group: pages-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pages: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-tags: "true"
          fetch-depth: "0"

      - if: github.event_name != 'pull_request'
        uses: actions/configure-pages@v5

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install documentation dependencies
        run: python -m pip install -e ".[docs]"

      - name: Generate API reference
        run: python scripts/build_api_docs.py

      - name: Generate benchmark reference
        run: python scripts/build_benchmark_docs.py

      - name: Build documentation
        run: zensical build --clean

      - name: Upload Pages artifact
        if: github.event_name != 'pull_request'
        uses: actions/upload-pages-artifact@v4
        with:
          path: site

  deploy:
    if: github.event_name != 'pull_request'
    needs: build
    runs-on: ubuntu-latest
    permissions:
      pages: write
      id-token: write
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
3 changes: 3 additions & 0 deletions .gitignore
@@ -74,6 +74,9 @@ target/

# docs
docs/api/*
docs/benchmarks/*
site/
.benchmarks/
_autosummary
.quarto/
docs/site_libs
82 changes: 41 additions & 41 deletions README.md
@@ -1,6 +1,8 @@
# dasjax

An experimental package for accelerating [DASCore](dascore.org) with [JAX](https://github.com/jax-ml/jax).
![dasjax logo](https://raw.githubusercontent.com/dasdae/dasjax/main/docs/static/dasjax_logo.png)

An experimental package for accelerating [DASCore](https://dascore.org) with [JAX](https://github.com/jax-ml/jax).

## Installation

@@ -10,11 +12,11 @@ python -m pip install -e ".[dev]"

## Usage

`dasjax`'s main feature is the ability to create compiled DAS pipelines that can run on CPU, GPU, or TPU. These also perform kernel fusions for increased efficiency.
`dasjax`'s main feature is the ability to create compiled DAS pipelines that can run on CPU, GPU, or TPU. These pipelines fuse adjacent JAX-backed operations where possible and cache metadata planning for repeated calls with the same static patch boundary.

### Compiled pipeline

Use `JaxPatchPipeline` when you want to compile a reusable sequence once and run it across many compatible patches.
Use `JaxPatchPipeline` when you want to build a reusable callable once and run it across many compatible patches.

```python
import dascore as dc
@@ -38,62 +40,60 @@ print(out.shape)

## Development

### Three-Tier Architecture
### Architecture

`dasjax` is organized as a small three-tier stack:
`dasjax` is organized around one core operation model:

1. Pipeline layer:
`src/dasjax/pipeline.py` records operation chains and compiles reusable patch transforms. This is the main user-facing API.
2. Operation layer:
`src/dasjax/operations/` defines the operation registry, execution policies, validation rules, eager patch implementations, and compiled leaf transforms.
3. Kernel layer:
`src/dasjax/kernels/` contains the array-level JAX and callback-backed kernels that actually do the numerical work, grouped by domain (`basic`, `signal`, `filters`, `spectral`).
1. Pipeline layer: `src/dasjax/pipeline.py` records operation chains, plans metadata boundaries, and compiles reusable patch transforms. This is the main user-facing API.
2. Operation layer: `src/dasjax/core.py` defines `PatchOperation`, `PatchBoundary`, `PatchPyTree`, and registry helpers. Registered operation classes live under `src/dasjax/operations/`, grouped by DASCore-style domains.
3. Kernel layer: `src/dasjax/kernels/` contains the array-level JAX kernels that actually do the numerical work, grouped by domain (`basic`, `signal`, `filters`, `spectral`).

This split keeps the package easier to extend: add or update numerical behavior in the kernel layer, describe how it plugs into compiled execution in the operation layer, and expose it through the pipeline layer.
Operation authors use `bind(boundary)` for Python-side metadata planning, `kernel(patch_tree)` for JAX-side data transforms, and `update_boundary(boundary)` for static metadata changes.
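
The division of labor between `bind`, `kernel`, and `update_boundary` can be sketched with stand-in classes. This is only an illustration under stated assumptions: `Boundary` and `Decimate` below are hypothetical, do not reproduce dasjax's `PatchBoundary` or `PatchOperation`, and use nested lists in place of JAX arrays so the example runs with the standard library alone.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Boundary:
    """Hypothetical stand-in for static patch metadata."""

    dims: tuple
    shape: tuple


@dataclass(frozen=True)
class Decimate:
    """Hypothetical operation that keeps every `factor`-th sample along `dim`."""

    dim: str
    factor: int
    axis: Optional[int] = None  # unknown until bind() has seen a boundary

    def bind(self, boundary):
        # Python-side planning: resolve the dimension name to an axis index
        # once, so the data transform never has to look at metadata.
        return replace(self, axis=boundary.dims.index(self.dim))

    def kernel(self, data):
        # Data-side transform on a 2D nested list (a real kernel would
        # operate on JAX arrays instead).
        if self.axis == 0:
            return data[:: self.factor]
        return [row[:: self.factor] for row in data]

    def update_boundary(self, boundary):
        # Static metadata change: only the decimated axis length shrinks.
        shape = list(boundary.shape)
        shape[self.axis] = -(-shape[self.axis] // self.factor)  # ceil division
        return replace(boundary, shape=tuple(shape))


boundary = Boundary(dims=("distance", "time"), shape=(2, 5))
bound = Decimate(dim="time", factor=2).bind(boundary)
out = bound.kernel([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]])
new_boundary = bound.update_boundary(boundary)
```

In this sketch the output shape is known from `update_boundary` alone, without running `kernel`, which is the property that lets metadata planning happen outside the compiled data path.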


### Roadmap
### Operation Coverage

The table below tracks what is missing and roughly how much effort each addition requires.
`dasjax` currently registers 72 pipeline operations. Most operations use native JAX kernels; a smaller set of DASCore-compatible numeric transforms still use host callbacks where a fully static JAX kernel is not practical yet. The current operation set includes:

#### Near-term — straightforward pure-JAX array ops
- Elementwise math and masks: `abs`, `clip`, `real`, `imag`, `angle`, `conj`, `exp`, `log`, `log10`, `log2`, `is_finite`, `isinf`, `isnan`, `fillna`, `where`, and scalar arithmetic operations.
- Reductions and aggregation: `aggregate`, `all`, `any`, `max`, `mean`, `median`, `min`, `std`, and `sum`.
- Coordinate-aware array transforms: `flip`, `roll`, `pad`, `taper`, `taper_range`, `detrend`, `standardize`, `differentiate`, and `integrate`.
- Spectral and signal operations: `dft`, `idft`, `stft`, `istft`, `hilbert`, `envelope`, `phase_weighted_stack`, `whiten`, `fbe`, and `correlate_shift`.
- Filters, mutes, and DAS-domain operations: `pass_filter`, `gaussian_filter`, `hampel_filter`, `median_filter`, `notch_filter`, `savgol_filter`, `sobel_filter`, `slope_filter`, `wiener_filter`, `line_mute`, `slope_mute`, `correlate`, `decimate`, `interpolate`, `resample`, `dispersion_phase_shift`, `tau_p`, `velocity_to_strain_rate`, `velocity_to_strain_rate_edgeless`, and `radians_to_strain`.

Implemented in the current package:
Remaining DASCore patch methods are mostly metadata, selection, convenience, or data-dependent shape operations. `rolling` returns a roller object rather than a patch, and `dropna` has data-dependent output shape, so neither fits the current static compiled-pipeline model directly.

- `real`, `imag`, `angle`, `conj`
- `flip`, `roll`, `pad`
- `standardize`, `differentiate`, `integrate`
- `dft`, `idft`
- `hilbert`, `envelope`
- `taper`, `taper_range`
- `whiten`
## Performance Notes

#### Medium-term — moderate effort or shape-changing
- The intended fast path is to build a `JaxPatchPipeline`, call `.compile()` once, and reuse the returned callable. Patch-specific metadata binding and JIT segment creation happen lazily on the first call for a static boundary, then cached plans and segment runners are reused for subsequent calls with matching dims, dynamic coordinate values, coordinate units, and attrs.
- Equivalent pipeline definitions reuse cached compiled callables automatically.
- Callback-backed operations preserve DASCore compatibility but execute their operation body on the host, so they generally do not benefit as much from JAX fusion as native kernels.
- Benchmarks live under `benchmarks/` and compare compiled `dasjax` pipelines against equivalent DASCore operation chains.
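
The lazy planning and reuse described in the first note above can be illustrated with a small cache keyed by static metadata. This is a sketch of the general pattern, not dasjax's implementation: `BoundaryKey`, `CachedPipeline`, and `plan_count` are hypothetical names, and plain Python callables stand in for JIT-compiled segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundaryKey:
    """Hypothetical hashable summary of a patch's static metadata."""

    dims: tuple
    coord_units: tuple
    attrs: tuple


class CachedPipeline:
    """Hypothetical callable that plans once per static boundary."""

    def __init__(self, steps):
        self.steps = tuple(steps)
        self._plans = {}
        self.plan_count = 0  # exposed only so the caching is observable

    def _plan(self, key):
        # Stand-in for metadata binding and JIT segment creation.
        self.plan_count += 1
        steps = self.steps

        def run(data):
            for step in steps:
                data = step(data)
            return data

        return run

    def __call__(self, key, data):
        # Planning happens lazily on the first call for a given boundary;
        # later calls with an equal key reuse the stored runner.
        if key not in self._plans:
            self._plans[key] = self._plan(key)
        return self._plans[key](data)


pipeline = CachedPipeline(
    [lambda xs: [2 * x for x in xs], lambda xs: [x + 1 for x in xs]]
)
key = BoundaryKey(dims=("time",), coord_units=("s",), attrs=(("tag", "a"),))
first = pipeline(key, [1.0, 2.0, 3.0])
second = pipeline(key, [4.0, 5.0, 6.0])
plans_after_reuse = pipeline.plan_count

# A boundary that differs in any keyed field triggers a fresh plan.
other_key = BoundaryKey(dims=("time",), coord_units=("ms",), attrs=(("tag", "a"),))
third = pipeline(other_key, [1.0, 2.0])
```

The practical consequence is that the first call for each distinct boundary pays the planning cost, so benchmarks and production loops benefit from feeding patches with matching metadata to the same compiled callable.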

These need either more work in the kernel layer or are shape-changing (segmented pipeline execution, same mechanism as `fbe`).
## Documentation

| Method | Implementation notes |
|---|---|
| `notch_filter` | SOS filter; same pattern as `pass_filter` |
| `savgol_filter` | polynomial fitting per frame; JAX-doable |
| `rolling` | rolling-window reductions (mean, std, …); needs strided views |
| `correlate` | cross-correlation via `jnp.fft` |
| `stft` / `istft` | expose the STFT kernel already used by `fbe` |
| `decimate` | anti-aliased downsampling; shape-changing |
| `aggregate` / `mean` / `std` / `sum` | axis reductions; shape-changing |
Documentation is built with Zensical. The public API reference is generated at build time from the installed `dasjax` package, so run the API generation script before building or serving the site.

## Performance Notes
```bash
uv run python scripts/build_api_docs.py
uv run --extra docs zensical build --clean
```

- The intended fast path is to build a `JaxPatchPipeline`, call `.compile()` once, and reuse the returned callable across many patches of compatible shape and dtype.
- Equivalent pipeline definitions reuse cached compiled callables automatically.
- Benchmarks live under `benchmarks/` and compare compiled `dasjax` pipelines against equivalent DASCore operation chains.
For local preview, run:

```bash
uv run python scripts/build_api_docs.py
uv run --extra docs zensical serve
```

Generated files under `docs/api/` and `site/` are ignored by version control. GitHub Pages builds the same generated API docs and static site on pushes to `main`, then deploys the `site/` artifact through the `github-pages` environment.

## Development Guidelines

- Add new JAX patch methods by defining an array kernel in `src/dasjax/kernels/` and one operation spec in the relevant `src/dasjax/operations/` family module.
- The operation spec is the single source of truth for pipeline support, validation, and shared parity test cases.
- Add new JAX patch methods by defining an array kernel in `src/dasjax/kernels/` and one `PatchOperation` subclass in the appropriate `src/dasjax/operations/` module.
- The `PatchOperation` subclass is the single source of truth for pipeline support, metadata binding, and boundary updates.
- Every new patch method must be tested against a DASCore baseline across the shared mixed-patch fixture in `tests/conftest.py`.
- Prefer comparing internal operation behavior and compiled pipeline outputs against the closest native DASCore method or operator. If DASCore has no direct method, compare against an equivalent `Patch.update(...)` baseline.
- Method-equivalence assertions should check data closeness with `equal_nan=True` when needed and should also verify coordinate preservation.
- Compiled pipeline parity should come from the same declared operation cases rather than a separate hand-maintained test matrix.
- Compiled pipeline parity should compare `JaxPatchPipeline` output against DASCore baselines for each registered operation.
- Install Git hooks locally with `prek install`.
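
A method-equivalence assertion of the kind these guidelines describe might look like the sketch below. The helper name `assert_parity` and its arguments are hypothetical, and plain NumPy arrays and dictionaries stand in for DASCore patches and their coordinates; `numpy.testing.assert_allclose` and its `equal_nan` parameter are real NumPy API.

```python
import numpy as np


def assert_parity(candidate, baseline, candidate_coords, baseline_coords, rtol=1e-6):
    """Hypothetical helper: compare data closeness and coordinate preservation."""
    # equal_nan=True treats NaNs in matching positions as equal, so samples
    # that are NaN in both outputs do not cause a spurious failure.
    np.testing.assert_allclose(candidate, baseline, rtol=rtol, equal_nan=True)
    # Coordinates must survive unchanged: same names and identical values.
    assert candidate_coords.keys() == baseline_coords.keys()
    for name, values in baseline_coords.items():
        np.testing.assert_array_equal(candidate_coords[name], values)


baseline = np.array([1.0, np.nan, 3.0])
candidate = baseline * (1.0 + 1e-9)  # tiny relative difference, NaN preserved
coords = {"time": np.arange(3)}
assert_parity(candidate, baseline, coords, {"time": np.arange(3)})
```

A NaN that appears in only one of the two arrays still fails the check, so `equal_nan=True` relaxes the comparison only where both outputs agree that a sample is missing.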
2 changes: 1 addition & 1 deletion agents.md
@@ -2,7 +2,7 @@

This file gives AI/code agents a practical checklist for contributing safely to dasjax.

Keep Markdown prose unwrapped. Do not hard-wrap paragraphs in `.md` files unless a specific format requires it.
Keep Markdown prose unwrapped. Do not hard-wrap paragraphs or list items in `.md` files; let editors soft-wrap them. Code blocks and formats that require line breaks are exceptions.

## Scope and priorities

6 changes: 4 additions & 2 deletions benchmarks/readme.md
@@ -17,12 +17,14 @@ pytest benchmarks/test_pipeline_benchmarks.py

## Benchmark Structure

The first benchmark suite focuses on side-by-side comparisons between:
The benchmark suite focuses on side-by-side comparisons between:

- compiled `dasjax` pipelines
- equivalent DASCore-native operation chains
- individual compiled `dasjax` operations
- equivalent individual DASCore operations

Each comparison is exposed as a separate benchmark test per engine so CodSpeed output is easy to read.
Each comparison is exposed as a separate benchmark test per engine so CodSpeed output is easy to read. Pipeline benchmark groups use names like `scale_fbe`; individual operation benchmark groups use names like `operation_fbe`.

To export benchmark results for ratio comparisons, use:
