DASDAE · ahmadtourei · Dec 10, 2025 · Dec 8, 2025 · Dec 10, 2025
diff --git a/.DS_Store b/.DS_Store
diff --git a/README.md b/README.md
@@ -1,13 +1,13 @@
 # das-anomaly
 [![DOI](https://zenodo.org/badge/823391484.svg)](https://doi.org/10.5281/zenodo.12747212)
 [![Licence](https://www.gnu.org/graphics/lgplv3-88x31.png)](https://www.gnu.org/licenses/lgpl.html)
-[![codecov](https://codecov.io/gh/ahmadtourei/das-anomaly/branch/main/graph/badge.svg)](https://codecov.io/gh/ahmadtourei/das-anomaly)
+[![codecov](https://codecov.io/gh/dasdae/das-anomaly/branch/main/graph/badge.svg)](https://codecov.io/gh/dasdae/das-anomaly)
 
 _das-anomaly_ is an open-source Python package for unsupervised anomaly detection in distributed acoustic sensing (DAS) datasets using an autoencoder-based deep learning algorithm. It is being developed by Ahmad Tourei under the supervision of Dr. Eileen R. Martin at Colorado School of Mines. 
 
 If you use _das-anomaly_ in your work, please cite the following:
 
-> Ahmad Tourei. (2025). ahmadtourei/das-anomaly: latest (Concept). Zenodo. http://doi.org/10.5281/zenodo.12747212
+> Ahmad Tourei. (2025). DASDAE/das-anomaly: latest (Concept). Zenodo. http://doi.org/10.5281/zenodo.12747212
 
 
 ## Installation
@@ -77,7 +77,7 @@ The overall workflow for using the package is illustrated below:
 The main steps are:  
 1. Define constants and create a Spool of data: 
 
-Using the _config_user_ script in the das_anomaly directory, define the constants and directory paths for data, power spectral density (PSD) images, detected anomaly results, etc. You would complete adding the values as you go over the steps mentioned below. Then, using DASCore, create an index file for the [spool](https://dascore.org/tutorial/spool.html) of data first time reading the DAS data directory:
+Using the _config_user_ script in the das_anomaly directory, define the constants and directory paths for the data, power spectral density (PSD) images, detected anomaly results, etc. Fill in the remaining values and paths as you work through the steps below. Then, using DASCore, create an index file for the [spool](https://dascore.org/tutorial/spool.html) of data the first time you read the DAS data directory:
 
 ### Example
 ```python
@@ -99,17 +99,17 @@ To ensure all PSD images share the same colorbar scale (in RGB), determine an ap
 from das_anomaly.psd import PSDConfig, PSDGenerator
 from das_anomaly.settings import SETTINGS
 
-# path to one or a few background noise data 
+# path to one or a few background noise patches 
 bn_data_path = SETTINGS.BN_DATA_PATH
 cfg = PSDConfig(data_path=bn_data_path)
 gen = PSDGenerator(cfg)
-percentile = 90 # data dependent
+percentile = 90  # data dependent; choose by visual inspection
 clip_val = gen.run_get_psd_val(percentile=percentile)
 print(f"Mean {percentile}-percentile amplitude across all patches: {clip_val:.3e}")
 ```
 3. Generate PSD plots: 
 
-Use the `das_anomaly.psd` module and create PSD plots in RGB format and in plain mode (with no axes or colorbar). The `das_anomaly.psd.PSDGenerator reads DAS data, creates a spool using DASCore library, applies a detrend function to each patch of the chunked spool, and then average the energy over a desired time window and stack all channels together to create a spatial PSD with channels on the X-axis and frequency on the Y-axis. You can use MPI to distribute reading data and plotting PSDs over CPUs. 
+Use the `das_anomaly.psd` module to create PSD plots in RGB format and in plain mode (with no axes or colorbar). The `das_anomaly.psd.PSDGenerator` reads DAS data, creates a spool using the DASCore library, applies a detrend function to each patch of the chunked spool, averages the energy over a desired time window, and stacks all channels together to create a spatial PSD image with channels on the X-axis and frequency on the Y-axis. The work is embarrassingly parallel, so you can use MPI to distribute reading data and plotting PSDs over CPUs. 
 ### Example
 ```python
 from das_anomaly.psd import PSDConfig, PSDGenerator
@@ -121,23 +121,25 @@ PSDGenerator(cfg).run()
 PSDGenerator(cfg).run_parallel()
 ```
 Note: If you'd like to use PSDs for purposes other than training the model, the `hide_axes=False` will plot the PSD with axes and colorbar (default is True).
+
 ### Example
 ```python
 from das_anomaly.psd import PSDConfig, PSDGenerator
 
 cfg = PSDConfig(hide_axes=False)
 # serial processing with single processor:
 PSDGenerator(cfg).run()
-# parallel processing with multiple processors using MPI:
+# parallel processing with multiple processors using MPI (first make sure the package is installed with all the dependencies described above):
 PSDGenerator(cfg).run_parallel()
 ```
 4. Select and copy known anomaly PSD plots:
 
-From the generated PSD plots, identify and copy examples of known anomalies to the ANOMALY_IMAGES_PATH specified in the _config_user_ input script. These anomalies can include events such as earthquakes from an existing catalog, instrument noise, anthropogenic disturbances, etc. Including these examples helps improve thresholding during the detection process.
+From the generated PSD plots, visually identify and then copy examples of known anomalies to the ANOMALY_IMAGES_PATH specified in the _config_user_ input script. These anomalies can include events such as earthquakes from an existing catalog, instrument noise, anthropogenic disturbances, etc. Including these examples helps improve thresholding during the detection process.
 
 5. Train: 
 
 The `das_anomaly.train` module helps with randomly selecting train and test PSD images and training the model (with CPU or GPU) on anomaly-free PSD images. 
+
 ### Example
 ```python
 from das_anomaly.settings import SETTINGS
@@ -151,7 +153,7 @@ ImageSplitter(cfg).run()
 cfg = TrainAEConfig()
 AutoencoderTrainer(cfg).run()
 ```
-Note: Since the `TrainSplitConfig()` function randomly selects PSD images from the generated plots, you must ensure the training and testing datasets do not include obvious anomalies. If you have an excel sheet with time stamp of anomalies (such as a catalog), use the "exclude_known_events_from_training" in examples directory to exclude them. Or, manually inspect both the training and testing sets to ensure they do not contain apparent anomalies. Review their time- and frequency-domain representations, and remove any suspicious samples to maintain the quality of training.
+Note: Since the `TrainSplitConfig()` function randomly selects PSD images from the generated plots, you must ensure the training and testing datasets do not include obvious anomalies. If you have an Excel sheet with the time stamps of anomalies (such as a catalog), use the "exclude_known_events_from_training" example in the examples directory to exclude them. Otherwise, manually inspect both the training and testing sets to ensure they do not contain apparent anomalies. Review their time- and frequency-domain plots, and remove any suspicious samples to maintain the quality of training.
 
 6. Test and set thresholds: 
 
@@ -160,6 +162,7 @@ Using the _validate_and_plot_density_ and _thresholding_f_score_ jupyter noteboo
 7. Run the trained model: 
 
 The `das_anomaly.detect` module uses the trained model to detect anomalies in the PSD images and writes their information (e.g., time stamp). It also copies the detected anomaly to the RESULTS_PATH. MPI can be used to distribute PSDs over CPUs. Then, using the `das_anomaly.count` module, count the number of detected anomalies and display their details and file paths.
+
 ### Example
 ```python
 from das_anomaly.count.counter import CounterConfig, AnomalyCounter
@@ -181,6 +184,6 @@ print(anomalies) # prints info on number of anomalies and path to them
 Still under development. Use with caution.
 
 ## Contact
-Ahmad Tourei, Colorado School of Mines
-
-tourei@mines.edu | ahmadtourei@gmail.com
+Ahmad Tourei 
+Colorado School of Mines
+ahmadtourei@gmail.com
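
The README steps above describe detrending each patch, building a spatial PSD with channels on the X-axis and frequency on the Y-axis, and deriving a colorbar clip value from the mean percentile amplitude across patches. The following is a minimal NumPy sketch of that idea, not the `PSDGenerator` implementation: the function names `spatial_psd` and `mean_percentile_clip` are hypothetical, the patch is treated as a single time window, and the periodogram scaling is one of several common conventions.

```python
import numpy as np


def spatial_psd(strain_rate, sampling_rate):
    """Return (freqs, psd) for an array of shape (n_time, n_channels).

    psd has shape (n_freqs, n_channels): frequency along rows (Y-axis),
    channels along columns (X-axis). Hypothetical helper for illustration.
    """
    n_time = strain_rate.shape[0]
    t = np.arange(n_time)
    # Linear detrend: fit one least-squares line per channel and subtract it.
    slope, intercept = np.polyfit(t, strain_rate, deg=1)
    detrended = strain_rate - (np.outer(t, slope) + intercept)
    # Periodogram along the time axis (scaling conventions vary).
    spectrum = np.fft.rfft(detrended, axis=0)
    psd = np.abs(spectrum) ** 2 / (sampling_rate * n_time)
    freqs = np.fft.rfftfreq(n_time, d=1.0 / sampling_rate)
    return freqs, psd


def mean_percentile_clip(psds, percentile):
    """Mean of the per-patch percentile amplitudes (hypothetical helper)."""
    return float(np.mean([np.percentile(p, percentile) for p in psds]))


# Synthetic patch: a 50 Hz tone plus a linear trend and weak noise, 4 channels.
rng = np.random.default_rng(0)
sampling_rate = 1000.0
time = np.arange(1000) / sampling_rate
signal = np.sin(2 * np.pi * 50.0 * time)[:, None]
trend = (5.0 * time)[:, None]
patch = signal + trend + 0.01 * rng.standard_normal((1000, 4))

freqs, psd = spatial_psd(patch, sampling_rate)
peak_freq = freqs[np.argmax(psd[:, 0])]  # frequency of the strongest bin, channel 0
clip_val = mean_percentile_clip([psd, psd], percentile=90)
```

Using one shared clip value for every image keeps the RGB color scale comparable between patches, which is why the README computes it once from background noise before plotting.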
diff --git a/das_anomaly/utils.py b/das_anomaly/utils.py
@@ -1,5 +1,5 @@
 """
-Utility functions for anomaly detection in DAS datasets using autoencoders.
+Utility functions for the package.
 """
 
 from __future__ import annotations
@@ -10,13 +10,14 @@
 import matplotlib
 
 matplotlib.use("Agg")
+
 import matplotlib.pyplot as plt
 import numpy as np
 import scipy.fftpack as ft
 import tensorflow as tf
+from PIL import Image
 from matplotlib import gridspec
 from matplotlib.colors import LinearSegmentedColormap
-from PIL import Image
 from tensorflow.keras.layers import Conv2D, MaxPooling2D, UpSampling2D
 from tensorflow.keras.models import Sequential
 
@@ -151,7 +152,7 @@ def decoder(
 
 
 def density(encoder_model, batch_images, kde):
-    """Caulculate the density score."""
+    """Calculate the density score for a batch of PSDs."""
     # Flatten the encoder output because KDE from sklearn takes 1D vectors as input
     encoder_output_shape = encoder_model.output_shape
     out_vector_shape = (
@@ -269,7 +270,7 @@ def plot_spec(
     hide_axes=True,
     save_fig=True,
 ):
-    """Save the power spectral density (Channel-Frequency-Amplitude) plot."""
+    """Plot and/or save the spatial power spectral density (Channel-Frequency-Amplitude) image."""
     # Get the data
     strain_rate = patch_strain.transpose("time", "distance").data  # pragma: no cover
     # Get coords info
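
The `density` docstring and comment above describe flattening the encoder output and scoring it with a KDE. As a rough illustration of that scoring step, here is a self-contained NumPy sketch. It is an assumption-laden stand-in: a hand-written Gaussian KDE replaces the scikit-learn estimator used in the module, random arrays replace real encoder outputs, and the names `flatten_latent` and `gaussian_kde_log_density` are hypothetical.

```python
import numpy as np


def flatten_latent(encoded):
    """Flatten (n_images, h, w, c) encoder outputs to (n_images, h*w*c)."""
    return encoded.reshape(encoded.shape[0], -1)


def gaussian_kde_log_density(train_vectors, query_vectors, bandwidth):
    """Log-density of each query vector under a Gaussian KDE of train_vectors."""
    n_train, dim = train_vectors.shape
    # Pairwise squared distances, shape (n_query, n_train).
    diff = query_vectors[:, None, :] - train_vectors[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    log_kernel = -0.5 * sq_dist / bandwidth**2
    # Normalisation of an isotropic Gaussian kernel, averaged over n_train points.
    log_norm = -0.5 * dim * np.log(2 * np.pi * bandwidth**2) - np.log(n_train)
    # Log-sum-exp over the training points for numerical stability.
    row_max = log_kernel.max(axis=1, keepdims=True)
    log_sum = row_max[:, 0] + np.log(np.exp(log_kernel - row_max).sum(axis=1))
    return log_sum + log_norm


# Random stand-ins for encoder outputs of shape (n_images, 2, 2, 3).
rng = np.random.default_rng(0)
train_latent = flatten_latent(rng.standard_normal((200, 2, 2, 3)))
normal_latent = flatten_latent(rng.standard_normal((5, 2, 2, 3)))
anomalous_latent = flatten_latent(rng.standard_normal((5, 2, 2, 3)) + 6.0)

normal_scores = gaussian_kde_log_density(train_latent, normal_latent, bandwidth=1.0)
anomalous_scores = gaussian_kde_log_density(train_latent, anomalous_latent, bandwidth=1.0)
```

Latent vectors far from the training distribution receive a much lower log-density, which is the quantity a threshold can then be applied to.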

diff --git a/examples/bash_jobs/detect.sh b/examples/bash_jobs/detect.sh
@@ -0,0 +1,22 @@
+#!/bin/bash
+#SBATCH --ntasks=1
+#SBATCH -t 24:00:00
+#SBATCH -A YOUR_ACCOUNT
+#SBATCH --mem-per-cpu=128G
+
+# Print start time
+echo "Job started at: $(date)"
+
+# Load modules and environment
+source activate dasanomaly
+
+# Run the script
+python << EOF
+from das_anomaly.detect import AnomalyDetector, DetectConfig
+
+cfg = DetectConfig()
+AnomalyDetector(cfg).run()
+EOF
+
+# Print end time
+echo "Job ended at: $(date)"
diff --git a/examples/bash_jobs/detect_mpi.sh b/examples/bash_jobs/detect_mpi.sh
@@ -11,7 +11,7 @@ echo "Job started at: $(date)"
 module load openmpi/gcc/64/4.1.5
 source activate dasanomaly
 
-# Recommended in a SLURM environment: use srun, not mpirun
+# Run with MPI 
 mpirun -n $SLURM_NTASKS python -u detect_parallel.py
 
 # Print end time

diff --git a/examples/bash_jobs/psd_mpi.sh b/examples/bash_jobs/psd_mpi.sh
@@ -11,7 +11,7 @@ echo "Job started at: $(date)"
 module load openmpi/gcc/64/4.1.5
 source activate dasanomaly
 
-# Recommended in a SLURM environment: use srun, not mpirun
+# Run with MPI 
 mpirun -n $SLURM_NTASKS python -u psd_parallel.py
 
 # Print end time
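
The job scripts above launch one Python process per MPI task. How `psd_parallel.py` and `detect_parallel.py` divide the work is not shown in this diff, so the following is only a sketch of one common pattern for embarrassingly parallel jobs: each rank takes a round-robin share of a sorted file list. The helper `paths_for_rank` and the file names are hypothetical, and the ranks are simulated in a loop; with mpi4py, `rank` and `size` would typically come from `MPI.COMM_WORLD.Get_rank()` and `MPI.COMM_WORLD.Get_size()`.

```python
def paths_for_rank(paths, rank, size):
    """Return the round-robin share of the sorted paths for one rank."""
    # Sorting first gives every rank the same ordering, so shares never overlap.
    return sorted(paths)[rank::size]


# Hypothetical file names; ranks are simulated here instead of using MPI.
all_paths = [f"patch_{i:03d}.h5" for i in range(10)]
n_ranks = 4
shares = [paths_for_rank(all_paths, r, n_ranks) for r in range(n_ranks)]
```

Because no rank needs data from another, no communication is required beyond launching the processes.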

diff --git a/examples/exclude_known_events_from_training/find_event_time_from_excel_Sheet.ipynb b/examples/exclude_known_events_from_training/find_event_time_from_excel_Sheet.ipynb
@@ -11,9 +11,15 @@
     "from pathlib import Path\n",
     "import re\n",
     "\n",
-    "from das_anomaly.settings import SETTINGS\n",
-    "\n",
-    "\n",
+    "from das_anomaly.settings import SETTINGS"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
     "# Replace 'events.csv' with the path to your CSV file\n",
     "file_path = 'events.csv'\n",
     "\n",

diff --git a/examples/hyperparameter_tuning.ipynb b/examples/hyperparameter_tuning.ipynb
diff --git a/examples/plot_psd.ipynb b/examples/plot_psd.ipynb