random feedback improvements by rvandewater · Pull Request #189 · rvandewater/YAIB

rvandewater · 2026-06-05T08:39:22Z

Summary by CodeRabbit

Bug Fixes
- Fixed a SLURM script syntax error and corrected a CLI help typo.
Chores
- CI workflows now run push/pull_request on more branches; E2E tests adjusted to run on selected Python versions and updated setup/log message; removed pinned dependencies from requirements; loosened several dev/runtime dependency pins.
Documentation
- Updated Python version badge and contribution link in the README.
Refactor
- Adjusted imputation mask initialization, training dataset defaults and SHAP output name, residual block iteration behavior, and minor dataset init cleanup.

coderabbitai · 2026-06-05T08:40:02Z

Need an answer fast? Review this PR in Change Stack to ask focused questions about the PR or a changed range.

Warning

Review limit reached

@rvandewater, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 4 minutes and 20 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ca3c7c6d-3773-4c29-b0bd-32bade998c7a

📥 Commits

Reviewing files that changed from the base of the PR and between f781114 and ac71d4a.

📒 Files selected for processing (2)

icu_benchmarks/imputation/diffwave.py
icu_benchmarks/models/train.py

Walkthrough

Aligns CI branch triggers across workflows, updates e2e Python install and log messaging, fixes a SLURM echo typo, removes an unused amputation mask init, adjusts DiffWave residual-loop, sets training dataset defaults and renames test SHAP output, and updates README/CLI/loader/dependency strings.

Changes

Codebase maintenance and pipeline updates

Layer / File(s)	Summary
GitHub Actions workflow trigger and e2e setup fixes `.github/workflows/code-quality.yml`, `.github/workflows/e2e-integration.yml`, `.github/workflows/python-build.yml`	`push` and `pull_request` branch filters are aligned to `main`, `master`, `release/*`, and `development`. E2E test matrix drops 3.13, the E2E job installs `${{ matrix.python-version }}`, and an E2E log-path message was updated.
Model training dataset defaults and SHAP filename `icu_benchmarks/models/train.py`	Set non-empty default `dataset_names` when None; use `dataset_names.get(..., "default")` for lookups; rename test SHAP output to `test_shap_values.parquet`.
DiffWave residual loop change `icu_benchmarks/imputation/diffwave.py`	Introduce nested loop in `Residual_group.forward` that resets `h` each inner iteration and changes residual/skip accumulation semantics.
Imputation mask initialization removal `icu_benchmarks/imputation/amputations.py`	Remove unused `mask = torch.zeros_like(data)` so mask is produced by mechanism-specific logic.
SLURM script, docs, loader, and CLI fixes `experiments/slurm_base_template.sh`, `README.md`, `icu_benchmarks/data/loader.py`, `icu_benchmarks/run_utils.py`	Fix `echi` → `echo`; update Python badge to 3.13 and CONTRIBUTING link casing; adjust deprecation-warning whitespace; fix `--verbose` help spelling.
pyproject dependency loosenings `pyproject.toml`	Relax `pyarrow` and `numba` from exact pins to minimum-version constraints and loosen `coverage` dev pin.

sequenceDiagram
  participant GitHub
  participant E2EIntegration
  participant PythonRuntime
  participant CodeQuality
  GitHub->>CodeQuality: trigger push/pull_request (main, master, release/*, development)
  GitHub->>E2EIntegration: trigger push/pull_request (main, master, release/*, development)
  E2EIntegration->>PythonRuntime: run `uv python install ${{ matrix.python-version }}`
  E2EIntegration->>E2EIntegration: verify log path `yaib_logs/mimic_demo`

🎯 4 (Complex) | ⏱️ ~45 minutes

🐰
Paws tapped keys to tidy the trail,
Branches aligned and scripts regale,
Masks wake fresh, loops dance anew,
Deps and docs gleam through and through,
A tidy hop — commit complete!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'random feedback improvements' is too vague and does not clearly summarize the actual changes in the pull request, which include workflow updates, bug fixes, dependency management changes, and code refactoring across multiple files.	Replace with a more specific title that describes the main changes, such as 'Update CI workflows, fix bugs, and refactor code' or identify the primary objective of this changeset.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch coderabbit_feedback

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (2)

icu_benchmarks/models/train.py (1)
263-263: ⚡ Quick win

Consider renaming for consistency.

The test SHAP values now use the pattern test_shap_values.parquet (split prefix), while train SHAP values still use shap_values_train.parquet (split suffix). For consistency and maintainability, both should follow the same naming convention.
📝 Proposed fix to align naming convention
-            with (log_dir / "shap_values_train.parquet").open("wb") as f:
+            with (log_dir / "train_shap_values.parquet").open("wb") as f:
                 shaps_train.write_parquet(f)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@icu_benchmarks/models/train.py` at line 263, Rename the train SHAP output
filename to match the split-prefix convention used by tests (change
"shap_values_train.parquet" to "train_shap_values.parquet"); update the write
site that uses log_dir / "shap_values_train.parquet" and any other reads/writes
that reference "shap_values_train.parquet" so they use
"train_shap_values.parquet" instead (search for the exact string
"shap_values_train.parquet" and replace with "train_shap_values.parquet"),
ensuring consistency with the existing "test_shap_values.parquet" pattern.
experiments/slurm_base_template.sh (1)
23-23: 💤 Low value

Consider quoting the variable expansion.

Shellcheck suggests double-quoting ${TASK} to prevent globbing and word splitting. While the current usage is safe given the expected values (e.g., "BinaryClassification"), quoting is a defensive best practice.

Note that lines 21-22 and 24-27 also use unquoted variables in echo statements, so applying this change here alone would be inconsistent. Consider whether to quote all variable expansions in logging statements throughout the file.
🛡️ Proposed fix to add defensive quoting
-echo "Task type:" ${TASK}
+echo "Task type:" "${TASK}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@experiments/slurm_base_template.sh` at line 23, Quote variable expansions in
logging echo statements to prevent globbing and word-splitting: change echo
"Task type:" ${TASK} to use "${TASK}" and apply the same defensive quoting to
the other echo usages that expand variables (the nearby echo statements at the
same block that reference TASK, and the other unquoted expansions on lines
around it). Locate the echo calls in this script (references to TASK and other
env vars) and wrap their variable expansions in double quotes consistently.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@experiments/slurm_base_template.sh`:
- Line 23: Quote variable expansions in logging echo statements to prevent
globbing and word-splitting: change echo "Task type:" ${TASK} to use "${TASK}"
and apply the same defensive quoting to the other echo usages that expand
variables (the nearby echo statements at the same block that reference TASK, and
the other unquoted expansions on lines around it). Locate the echo calls in this
script (references to TASK and other env vars) and wrap their variable
expansions in double quotes consistently.

In `@icu_benchmarks/models/train.py`:
- Line 263: Rename the train SHAP output filename to match the split-prefix
convention used by tests (change "shap_values_train.parquet" to
"train_shap_values.parquet"); update the write site that uses log_dir /
"shap_values_train.parquet" and any other reads/writes that reference
"shap_values_train.parquet" so they use "train_shap_values.parquet" instead
(search for the exact string "shap_values_train.parquet" and replace with
"train_shap_values.parquet"), ensuring consistency with the existing
"test_shap_values.parquet" pattern.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d40bc120-0a2c-4288-a774-1343cc6ff757

📥 Commits

Reviewing files that changed from the base of the PR and between a69c586 and 10314b9.

📒 Files selected for processing (7)

.github/workflows/code-quality.yml
.github/workflows/e2e-integration.yml
.github/workflows/python-build.yml
experiments/slurm_base_template.sh
icu_benchmarks/imputation/amputations.py
icu_benchmarks/models/train.py
requirements.txt

💤 Files with no reviewable changes (2)

requirements.txt
icu_benchmarks/imputation/amputations.py

Fix duplicated nested loop and state reset in residual stack. Lines 388-393 currently re-run the residual stack inside a nested loop, reset h to noise each outer iteration, and double-count skip_n. This breaks the intended iterative residual flow.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@icu_benchmarks/imputation/diffwave.py`:
- Around line 387-395: The forward loop over residual layers was corrupted:
remove the nested/repeated loops that reuse variable `n`, stop resetting `h` to
`noise` each iteration, and ensure skip connections are accumulated exactly once
per residual block; iterate once for i in range(self.num_res_layers) (or
similar), on each iteration call self.residual_blocks[i]((h, conditional,
diffusion_step_embed)) to get (h, skip_n), set h to the returned h (not noise),
and do skip += skip_n only once per block before returning skip * math.sqrt(1.0
/ self.num_res_layers) to restore the standard DiffWave residual chain behavior
using the existing attributes `num_res_layers`, `residual_blocks`, `skip`, `h`,
`noise`, `diffusion_step_embed`, and `conditional`.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f92b23d1-126e-4512-abfe-67a4cf0f1e43

📥 Commits

Reviewing files that changed from the base of the PR and between 10314b9 and f17020f.

📒 Files selected for processing (3)

README.md
icu_benchmarks/imputation/diffwave.py
icu_benchmarks/models/train.py

✅ Files skipped from review due to trivial changes (1)

README.md

🚧 Files skipped from review as they are similar to previous changes (1)

icu_benchmarks/models/train.py

random feedback improvements

10314b9

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

rvandewater added 3 commits June 5, 2026 10:46

diffwave

afbfc40

Fix duplicated nested loop and state reset in residual stack. Lines 388-393 currently re-run the residual stack inside a nested loop, reset h to noise each outer iteration, and double-count skip_n. This breaks the intended iterative residual flow.

readme typo

852cf6a

default datasets

f17020f

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

Comment thread icu_benchmarks/imputation/diffwave.py

rvandewater added 10 commits June 5, 2026 11:06

minor fixes

0ee1e0e

update coverage for python 3.13

f9ca998

update coverage for python 3.13

5a2e069

update for python 3.13

02f35ca

update for python 3.13

a9304a5

change back the numpy version

886d4c6

openblas for python 3.13

1915b08

catboost

c3e8926

remove python 3.13 due to major dependency issues with numpy 1.26/2.0

f781114

fixes

ac71d4a

rvandewater merged commit 2b51e1b into development Jun 6, 2026
5 checks passed

rvandewater deleted the coderabbit_feedback branch June 6, 2026 10:06

coderabbitai Bot mentioned this pull request Jun 8, 2026

Merge pull request #169 from rvandewater/development #190

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

random feedback improvements#189

random feedback improvements#189
rvandewater merged 14 commits into
developmentfrom
coderabbit_feedback

rvandewater commented Jun 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 5, 2026 •

edited

Loading

Review limit reached

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rvandewater commented Jun 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rvandewater commented Jun 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 5, 2026 •

edited

Loading