7 changes: 7 additions & 0 deletions .gitignore
@@ -5,3 +5,10 @@
node_modules/
.env
.env.local

# EEG data -- large binary files, never committed
derivatives/
*.set
*.fdt
*.bdf
*.edf
42 changes: 42 additions & 0 deletions .rules/ci_cd.md
@@ -0,0 +1,42 @@
# CI/CD Workflow Standards

## Current Pipelines

### Typo Check (`typos.yml`)
Runs on every push/PR. Catches spelling errors in all Markdown and text files.
Config: `.typos.toml` (custom overrides for technical terms).

## Adding New Workflows

### Triggers
- `on: [push, pull_request]` for quality gates
- `on: { push: { branches: [main] } }` for deploy/publish steps
- Always pin action versions: `actions/checkout@v4` (not `@master`)
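
The pinning rule can be checked mechanically. A minimal sketch in Python (the practicum language per `.rules/python.md`), assuming a simple line-based scan is good enough; `unpinned_actions`, `USES_RE`, and `FLOATING_REFS` are hypothetical names, not part of this repo, and the scan does not parse YAML:

```python
from __future__ import annotations

import re

# Matches lines such as "- uses: owner/repo@ref" or "  uses: owner/repo@ref".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([\w./-]+)@([\w.-]+)\s*(?:#.*)?$")

# Branch names treated as "floating" (unpinned) refs in this sketch.
FLOATING_REFS = {"master", "main"}


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return every `action@ref` in the text whose ref is a floating branch name."""
    found: list[str] = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match and match.group(2) in FLOATING_REFS:
            found.append(f"{match.group(1)}@{match.group(2)}")
    return found
```

Called on the text of a workflow file, it returns an empty list when every `uses:` line carries a tag such as `@v4`.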

### Pipeline Order (fail fast, cheap first)
1. Lint/typo check
2. Link validation (broken URLs)
3. Build (if applicable)
4. Deploy (main branch only)

## Markdown/Content Checks to Add

```yaml
# Example: broken link check
- name: Check links
  uses: lycheeverse/lychee-action@v1
  with:
    args: --verbose --no-progress '**/*.md'
```

## Key Practices
- Never commit secrets; use GitHub Secrets
- Deploy (osc-docs publish) only from the protected `main` branch
- Document required environment setup in session READMEs

## Week 4 Reference
Week 4 of the course covers CI/CD in depth. Use this repo's workflows
as live examples during that session.

---
*Every workflow failure is a production bug prevented.*
27 changes: 27 additions & 0 deletions .rules/python.md
@@ -0,0 +1,27 @@
# Python Standards (Practicum Code)

## Environment
- **Package Manager:** UV only (not pip, conda, or virtualenv)
- **Config:** `pyproject.toml`

## Quick Reference
```bash
uv init my-analysis && cd my-analysis
uv add numpy pandas mne
uv run python analysis.py
uv run pytest
```

## Style
- Formatter: `ruff format`
- Linter: `ruff check --fix`
- Type hints on all public functions

## Never Do This
- Never `pip install`; use `uv add`
- Never use `os.path`; use `pathlib.Path`
- Never bare `except:` or silent `pass`
- Never commit `.env` or hardcoded credentials
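
A minimal sketch of code that follows these rules (type hints, `pathlib.Path`, and a narrow `except` instead of a bare one). The function names are hypothetical; the `preproc/` folder and `*_preproc.set` pattern are borrowed from the Phase 1 README for illustration only:

```python
from __future__ import annotations

from pathlib import Path


def find_preproc_sets(deriv_root: Path) -> list[Path]:
    """Return sorted paths of *_preproc.set files under deriv_root/preproc."""
    preproc_dir = deriv_root / "preproc"
    if not preproc_dir.is_dir():
        return []
    return sorted(preproc_dir.glob("*_preproc.set"))


def read_text_or_none(path: Path) -> str | None:
    """Read a text file; return None only for the one expected failure."""
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        # Named exception, handled explicitly: no bare `except:`, no silent `pass`.
        return None
```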

---
*UV for everything. Ruff for style. Real data for tests.*
3 changes: 3 additions & 0 deletions CLAUDE.md
@@ -70,12 +70,15 @@ The live demos use HBN EEG data to analyze neural responses to movie shot change
- `.context/ideas.md` -- Pedagogical decisions, content design
- `.context/research.md` -- Technical investigations, tool evaluations
- `.context/scratch_history.md` -- Failed attempts, lessons learned
- `.context/publishing.md` -- Step-by-step workflow for publishing sessions to courses.osc.earth

## Rules
- `.rules/git.md` -- Version control standards
- `.rules/documentation.md` -- Content and documentation standards
- `.rules/code_review.md` -- PR review process
- `.rules/self_improve.md` -- Evolving course standards from experience
- `.rules/ci_cd.md` -- GitHub Actions workflow standards (see week 4)
- `.rules/python.md` -- Python/UV standards for practicum code

---
Check .context/plan.md for what to work on next.
33 changes: 33 additions & 0 deletions sessions/week-03/practicum/config/cfg_r3mini.m
@@ -0,0 +1,33 @@
function cfg = cfg_r3mini()
% Configuration for HBN R3-mini (100 Hz, 20 subjects, development run).
% Switch to cfg_r3full.m once the pipeline validates end-to-end on mini.

% Data paths -- update bids_root to your local R3-mini copy.
% Download: aws s3 sync s3://fcp-indi/data/Projects/HBN/BIDS_EEG/cmi_bids_R3 \
% <bids_root> --exclude "*" --include "sub-NDAR*/ses-HBNsite*/eeg/*ThePresent*"
% (Select 20 subjects from participants.tsv with EEG availability flag = 1.)
cfg.bids_root = fullfile(getenv('HOME'), 'data', 'HBN', 'R3-mini');
cfg.deriv_root = fullfile(fileparts(mfilename('fullpath')), '..', '..', '..', 'derivatives');

% Task
cfg.task = 'ThePresent';

% Sampling rate -- R3-mini is already downsampled to 100 Hz.
% Full R3 is 500 Hz; resample before this pipeline if using full data.
cfg.srate = 100;

% Preprocessing parameters
cfg.highpass_hz = 1; % Hz; FIR highpass via pop_eegfiltnew
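cfg.cleanline_hz = [60 120 180]; % US line noise harmonics; adjust for 50 Hz countries
% Note: with cfg.srate = 100 the Nyquist frequency is 50 Hz, so all three
% targets lie above it; revisit this list for R3-mini.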

% clean_rawdata channel rejection thresholds (ASR and window rejection off).
% ChannelCriterion: correlation with neighboring channels (0.85 is conservative).
% FlatlineCriterion: max flat-line duration in seconds before channel is dropped.
% LineNoiseCriterion: max line-noise Z-score (applied after CleanLine; catches residual).
cfg.chan_criterion = 0.85;
cfg.flatline_criterion = 5;
cfg.linenoise_criterion = 4;

% subjects: cell array of subject IDs to process, or {} to use all found in bids_root.
% Explicit list is preferred for reproducibility on R3-mini.
cfg.subjects = {};
63 changes: 63 additions & 0 deletions sessions/week-03/practicum/phase1/README.md
@@ -0,0 +1,63 @@
# Phase 1: BIDS Import and Preprocessing

Part of the HBN ERSP practicum (see `../project_brief.md`).

## What this phase does

Imports the HBN R3-mini BIDS dataset and runs four preprocessing steps, producing one cleaned EEG set per subject in `derivatives/preproc/`.

| Step | Function | Operation |
|------|----------|-----------|
| 1 | `p1_import_bids` | `pop_importbids` -- loads BIDS, attaches channel locations and events |
| 2 | `p1_highpass` | `pop_eegfiltnew` -- 1 Hz FIR highpass to remove DC offset and slow drift |
| 3 | `p1_cleanline` | `pop_cleanline` -- removes 60/120/180 Hz line noise |
| 4 | `p1_channel_reject` | `clean_rawdata` -- drops flat/noisy channels (ASR off) |

## Prerequisites

- EEGLAB 2024+ on MATLAB path
- Plugins: `EEG-BIDS` (provides `pop_importbids`), `Biosig`, `CleanLine`, `clean_rawdata` (ships with EEGLAB 2024)
- matlab-mcp-tools configured if driving from Claude Code
- R3-mini data downloaded to `~/data/HBN/R3-mini` (or update `config/cfg_r3mini.m`)

## Running

**Interactive (MATLAB command window):**
```matlab
addpath(genpath('sessions/week-03/practicum'));
run_phase1('r3mini');
```

**Command line (from repo root):**
```bash
matlab -nodisplay -nosplash \
-r "addpath(genpath('sessions/week-03/practicum')); run_phase1('r3mini'); exit"
```

## Configuration

Edit `config/cfg_r3mini.m` to set:
- `cfg.bids_root` -- path to your local R3-mini copy
- `cfg.subjects` -- list specific subject IDs or leave `{}` for all found
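
As a quick sanity check on these settings, the sketch below filters a list of line-noise targets down to those below the Nyquist frequency, `srate / 2`. It is a hypothetical Python helper written to the conventions in `.rules/python.md`, not part of the MATLAB pipeline; the function name is an assumption, and the numbers come from `cfg.srate` and `cfg.cleanline_hz` in `config/cfg_r3mini.m`.

```python
from __future__ import annotations


def representable_line_freqs(srate_hz: float, line_freqs_hz: list[float]) -> list[float]:
    """Keep only the frequencies strictly below the Nyquist frequency (srate / 2)."""
    nyquist_hz = srate_hz / 2.0
    return [f for f in line_freqs_hz if f < nyquist_hz]


# Values taken from cfg_r3mini.m: 100 Hz sampling rate, targets at 60/120/180 Hz.
r3mini_targets = representable_line_freqs(100.0, [60.0, 120.0, 180.0])

# Full R3 is described in the config comments as 500 Hz.
r3full_targets = representable_line_freqs(500.0, [60.0, 120.0, 180.0])
```

With the R3-mini values the result is an empty list, because the Nyquist frequency is 50 Hz; at 500 Hz all three targets pass. What the pipeline should do in the first case is a design decision this sketch does not settle.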

## Outputs

```
derivatives/
└── preproc/
    ├── <subject>_preproc.set   (one per subject, not committed to git)
    └── phase1_report.mat       (channel retention counts; also ignored, since it lives under derivatives/)
```

`derivatives/` is listed in `.gitignore`. Data files are never committed.

## Acceptance criteria (closes #11)

- [ ] `run_phase1('r3mini')` completes without error on R3-mini
- [ ] All subjects retain >90% of channels (note: `p1_channel_reject` prints its warning only below 80%)
- [ ] `phase1_report.mat` saved with per-subject channel counts
- [ ] No data files committed to git
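
The retention criterion can also be checked outside MATLAB. A minimal sketch, assuming the per-subject counts from `phase1_report.mat` have already been loaded into plain Python values (loading the `.mat` file is not shown). `ChannelReport` and `subjects_below_retention` are hypothetical names; the fields mirror `subject`, `n_orig`, and `n_kept` in the report struct.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelReport:
    """One row of the Phase 1 report (field names mirror the MATLAB struct)."""

    subject: str
    n_orig: int
    n_kept: int


def subjects_below_retention(
    reports: list[ChannelReport], min_fraction: float = 0.90
) -> list[str]:
    """Return subjects that do NOT retain strictly more than min_fraction of channels."""
    failing: list[str] = []
    for r in reports:
        # A subject with no original channels cannot satisfy the criterion.
        if r.n_orig <= 0 or r.n_kept / r.n_orig <= min_fraction:
            failing.append(r.subject)
    return failing
```

An empty return value means every subject meets the ">90% of channels" criterion. The comparison is strict, so a subject at exactly 90% is reported as failing.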

## Deviations from reference pipeline

The reference (`study_handy_scripts.m`) runs highpass before CleanLine; that order is preserved here. The one deviation concerns `clean_rawdata`: the reference uses it with default ASR settings, whereas this pipeline disables ASR (`BurstCriterion = 'off'`) because ASR modifies the continuous signal in ways that can bias the ICA decomposition in Phase 2. Only channel-level rejection is applied.
49 changes: 49 additions & 0 deletions sessions/week-03/practicum/phase1/p1_channel_reject.m
@@ -0,0 +1,49 @@
function [ALLEEG, report] = p1_channel_reject(ALLEEG, cfg)
% Reject bad channels using clean_rawdata (channel-level criteria only).
%
% ASR (artifact subspace reconstruction) and window rejection are explicitly
% disabled -- those steps would alter the continuous signal in ways that
% interact with ICA (Phase 2). Channel rejection here only removes electrodes
% that are flat, noisy, or poorly correlated with neighbors.
%
% Thresholds from cfg (see cfg_r3mini.m for values and justification):
% ChannelCriterion: minimum correlation with neighbor channels
% FlatlineCriterion: max seconds of flat signal before rejection
% LineNoiseCriterion: residual line-noise Z-score after CleanLine
%
% Returns report: struct array with subject ID and channel counts.
%
% Requires: clean_rawdata plugin (EEGLAB 2024+ ships it by default).

report = struct('subject', {}, 'n_orig', {}, 'n_kept', {}, 'n_rejected', {});

for i = 1:length(ALLEEG)
    n_orig = ALLEEG(i).nbchan;

    ALLEEG(i) = clean_rawdata(ALLEEG(i), ...
        'FlatlineCriterion', cfg.flatline_criterion, ...
        'ChannelCriterion', cfg.chan_criterion, ...
        'LineNoiseCriterion', cfg.linenoise_criterion, ...
        'Highpass', 'off', ... % already done in p1_highpass
        'BurstCriterion', 'off', ... % ASR off
        'WindowCriterion', 'off', ... % window rejection off
        'BurstRejection', 'off', ...
        'Distance', 'Euclidian');

    n_kept = ALLEEG(i).nbchan;
    n_rej = n_orig - n_kept;

    report(i).subject = ALLEEG(i).subject;
    report(i).n_orig = n_orig;
    report(i).n_kept = n_kept;
    report(i).n_rejected = n_rej;

    fprintf('[p1_channel_reject] Subject %s: kept %d/%d channels (%d rejected).\n', ...
        ALLEEG(i).subject, n_kept, n_orig, n_rej);

    if n_kept / n_orig < 0.80
        warning('[p1_channel_reject] Subject %s: less than 80%% channels retained -- inspect manually.', ...
            ALLEEG(i).subject);
    end
end
end
38 changes: 38 additions & 0 deletions sessions/week-03/practicum/phase1/p1_cleanline.m
@@ -0,0 +1,38 @@
function ALLEEG = p1_cleanline(ALLEEG, cfg)
% Remove line noise at US power-line harmonics using the CleanLine plugin.
%
% Targets cfg.cleanline_hz (default [60 120 180] Hz for 60 Hz countries).
% CleanLine fits and subtracts sinusoidal components without affecting
% broadband signal, which is preferable to a notch filter for ERSP analysis.
%
% Key parameters justified:
% winsize/winstep = 4 s: balances frequency resolution and stationarity.
% bandwidth = 2 Hz: narrow enough not to smear adjacent bands.
% p = 0.01: conservative detection threshold; reduces false removals.
% scanforlines = 1: lets CleanLine search nearby frequencies in case
% the actual line drifts slightly from the nominal value.
%
% Requires: CleanLine plugin (EEGLAB plugin manager or manual install).

cleanline_opts = { ...
    'bandwidth', 2, ...
    'chanlist', [], ... % [] = all channels
    'computepower', 0, ...
    'linefreqs', cfg.cleanline_hz, ...
    'normSpectrum', 0, ...
    'p', 0.01, ...
    'pad', 2, ...
    'plotfigures', 0, ...
    'scanforlines', 1, ...
    'sigtype', 'Channels', ...
    'tau', 100, ...
    'verb', 0, ...
    'winsize', 4, ...
    'winstep', 4 };

for i = 1:length(ALLEEG)
    ALLEEG(i) = pop_cleanline(ALLEEG(i), cleanline_opts{:});
    fprintf('[p1_cleanline] Subject %s: line noise removed at %s Hz.\n', ...
        ALLEEG(i).subject, num2str(cfg.cleanline_hz));
end
end
16 changes: 16 additions & 0 deletions sessions/week-03/practicum/phase1/p1_highpass.m
@@ -0,0 +1,16 @@
function ALLEEG = p1_highpass(ALLEEG, cfg)
% Apply a zero-phase FIR highpass filter to each dataset in ALLEEG.
%
% Uses pop_eegfiltnew (EEGLAB's built-in wrapper around firfilt).
% Cutoff: cfg.highpass_hz (default 1 Hz).
% 1 Hz removes slow DC drift without distorting the 0-500 ms epoch window.
% Filter order is set automatically by pop_eegfiltnew based on cutoff and srate.
%
% Requires: EEGLAB 2024+.

for i = 1:length(ALLEEG)
    ALLEEG(i) = pop_eegfiltnew(ALLEEG(i), cfg.highpass_hz, []);
    fprintf('[p1_highpass] Subject %s: highpass %.1f Hz applied.\n', ...
        ALLEEG(i).subject, cfg.highpass_hz);
end
end
26 changes: 26 additions & 0 deletions sessions/week-03/practicum/phase1/p1_import_bids.m
@@ -0,0 +1,26 @@
function ALLEEG = p1_import_bids(cfg)
% Import HBN BIDS dataset for one task using pop_importbids.
%
% Returns ALLEEG (array of EEG structs), one per subject/session.
% Raw sets are written to <cfg.deriv_root>/raw/ for checkpoint recovery.
%
% Requires: EEGLAB 2024+ with the EEG-BIDS plugin (provides pop_importbids) and Biosig.

out_dir = fullfile(cfg.deriv_root, 'raw');
if ~exist(out_dir, 'dir'), mkdir(out_dir); end

% pop_importbids names the task filter 'bidstask' (not 'task').
import_opts = { ...
    'outputdir', out_dir, ...
    'bidstask', cfg.task, ...
    'bidsevent', 'on', ...
    'bidschanloc', 'on' };

% Restrict to an explicit subject list only when one is configured.
if ~isempty(cfg.subjects)
    import_opts = [import_opts, {'subjects', cfg.subjects}];
end

[~, ALLEEG] = pop_importbids(cfg.bids_root, import_opts{:});

fprintf('[p1_import_bids] Imported %d dataset(s) for task %s.\n', ...
    length(ALLEEG), cfg.task);
end