fix(deps): resolve 19 dependabot security vulnerabilities#153
fix(deps): resolve 19 dependabot security vulnerabilities#153cagataycali wants to merge 8 commits into
Conversation
Review notes (automated review)The pyproject changes are tiny (3 ceiling bumps), but the uv.lock diff is +1943/−1004 — that's the part to scrutinize, since it's where the actual security fixes (and any drive-by changes) are landing. Questions before approving
SuggestionSplitting this into two PRs would make both easier to review:
If the lock-file delta is purely resolver churn (no new direct deps, no major bumps that aren't already explained by the pyproject ceiling moves), CI is green. No code changes, so no functional regressions to test, but a CHANGELOG / release-note line would help users tracking why their resolver picks Pillow 12. |
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Dependency hygiene PR: bumps three runtime/dev caps in pyproject.toml (Pillow <12 → <13, pytest <9 → <10, pytest-cov <6 → <8) and regenerates uv.lock to pull the fixed transitive versions (Pillow 12.2.0, cryptography 48.0.0, pytest 9.0.3, python-multipart 0.0.28, gitpython 3.1.50, urllib3 2.7.0). Closes 19 of 20 open Dependabot alerts; one (diffusers GHSA-98h9-4798-4q5v) is deferred because lerobot 0.5.1 caps diffusers<0.36.
The constraint changes follow the AGENTS.md >=1.0 deps → cap-major convention. Validation (hatch run lint, hatch run test → 1671 passed) is included in the description, and I confirmed against PyPI that pytest-cov 7.1.0 (resolved) declares pytest>=7, so the pytest 9 / pytest-cov 7 combination is compatible.
What's good
- Scope discipline: 2 files, no source-code changes, exactly the diff a security-fix PR should be.
- Constraint changes propagated consistently to all three places they appear in
pyproject.toml([project.dependencies],devextra,[tool.hatch.envs.default]). - PR description has a clear before/after table, a section explicitly calling out the one alert that could not be fixed, and the validation commands run.
- Caps are bumped one major above current latest (
<13.0.0for Pillow 12.x,<10.0.0for pytest 9.x), matching the existing pattern.
Concerns
- Deferred alert has no tracking issue. PR description says "Tracking upstream issue needed" for diffusers GHSA-98h9-4798-4q5v but no follow-up issue exists in the repo (verified via
gh issue list --search diffusers). Per AGENTS.md ("ALWAYS use the project board to track work… create GitHub issues and add them to this board with Status + Priority set"), this should be filed and added to project board PVT_kwDOD151Fs4BSRJP before merge — otherwise the deferred alert becomes invisible the moment this PR closes. - No CHANGELOG entry. Pillow 11→12 and pytest 8→9 are major-version bumps in the toolchain. Downstream consumers who pin
strands-robots[dev]or who indirectly resolve through the runtime Pillow constraint may need to react. A one-line entry under the existingUnreleasedsection inCHANGELOG.mdwould land this in the right place. pytest-covrow missing from PR description's transitive table. The lock resolves it to 7.1.0 (from a previous 5.x or 6.x); worth listing alongside the other transitive bumps for completeness.
Verification suggestions
# Confirm all 19 alerts close on merge — sanity-check the GHSAs against the lock:
grep -E 'name = "(pillow|cryptography|pytest|python-multipart|gitpython|urllib3)"' uv.lock -A1
# Smoke-test that pytest-cov 7 + pytest 9 reporting still works end-to-end:
hatch run test -- -k registry --cov-report=term-missing | tail -20
# Confirm Pillow 12 wheels resolve on the python versions we declare (3.12, 3.13):
uv pip compile pyproject.toml --extra dev --python-version 3.13 2>&1 | grep -i pillow| "numpy>=1.21.0,<3.0.0", | ||
| "opencv-python-headless>=4.5.0,<5.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=8.0.0,<13.0.0", |
There was a problem hiding this comment.
Pillow's constraint is duplicated here (line 40, runtime) and at line 124 ([tool.hatch.envs.default]). Both were updated in this PR, so the lockstep is correct today, but every future Pillow bump now needs two edits. Consider following the same pattern as pytest-timeout (only declared in the dev extra and not redeclared in the hatch env) — [tool.hatch.envs.default] with features = ["all"] already pulls runtime Pillow via the [project.dependencies] block, so the line-124 redeclaration is redundant. Not blocking for a security PR, but worth a follow-up cleanup.
| "pytest>=6.0,<10.0.0", | ||
| "pytest-cov>=4.0.0,<8.0.0", | ||
| "ruff>=0.4.0,<1.0.0", | ||
| "mypy>=1.0.0,<2.0.0", |
There was a problem hiding this comment.
pytest cap bumped to <10.0.0 is consistent with the AGENTS.md >=1.0 cap-major convention, and pytest-cov 7.1.0 (the version uv.lock resolves) declares pytest>=7, so the combination is compatible.
One thing worth confirming before merge: pytest-timeout>=2.0.0,<3.0.0 (line 100) is not declared in [tool.hatch.envs.default].dependencies (lines 119-128), which means hatch resolves it transitively from the dev extra. With the pytest 8→9 bump, pytest-timeout 2.4.0 is what uv.lock resolves — please double-check that hatch run test-integ (which uses --timeout=300, line 132) still works on a clean env. The unit-test pass count in the PR description is reassuring but test-integ exercises the timeout path that unit tests don't.
| "ruff>=0.4.0,<1.0.0", | ||
| "mypy>=1.0.0,<2.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=8.0.0,<13.0.0", |
There was a problem hiding this comment.
PR description's transitive table omits pytest-cov — the lock now resolves it to 7.1.0 (a major bump from the prior pin). Worth adding to the description for audit completeness, since pytest-cov 7 has a different reporting format than 5/6 and CI artifacts produced by --cov-report=term-missing (line 232) may look slightly different downstream.
There was a problem hiding this comment.
Fixed in d478d12. Added pytest-cov row to the transitive table in the PR description: pytest-cov | 5.x | 7.1.0 | (tooling bump, not CVE).
This comment was generated by an AI agent (strands-agents).
Upgrades direct + transitive dependencies to address open Dependabot alerts: Direct (pyproject.toml): - Pillow constraint: <12.0.0 -> <13.0.0 (allows 12.2.0+) - pytest constraint: <9.0.0 -> <10.0.0 (allows 9.0.3 fix) - pytest-cov constraint: <6.0.0 -> <8.0.0 Transitive (uv.lock regenerated via uv lock --upgrade): - pillow 11.3.0 -> 12.2.0 (#1, strands-labs#3, strands-labs#8, strands-labs#9, strands-labs#10, strands-labs#11, strands-labs#17, strands-labs#18) - cryptography -> 48.0.0 (strands-labs#2: GHSA-p423-j2cm-9vmq) - pytest 8.4.2 -> 9.0.3 (strands-labs#4: GHSA-6w46-j5rx-g56g) - python-multipart 0.0.22 -> 0.0.28 (strands-labs#5, strands-labs#13) - gitpython 3.1.46 -> 3.1.50 (strands-labs#6, strands-labs#7, strands-labs#12, strands-labs#14, strands-labs#16) - urllib3 2.6.3 -> 2.7.0 (strands-labs#19, strands-labs#20) Remaining (1/20): - diffusers 0.35.2 (strands-labs#15: GHSA-98h9-4798-4q5v) blocked by upstream lerobot 0.5.1 cap 'diffusers<0.36.0,>=0.27.2'. Requires lerobot to bump first; not actionable in this repo. Validation: - hatch run lint: pass (139 files) - hatch run test: 1671 passed, 1 skipped
3acebec to
aa7564d
Compare
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
This PR widens three constraints in pyproject.toml (Pillow <13, pytest <10, pytest-cov <8) and regenerates uv.lock with uv lock --upgrade to absorb the version bumps that close 19 of 20 open Dependabot alerts. The pyproject delta is a clean +6/-6 and matches the AGENTS.md convention of capping at the next major for >=1.0 deps. CI evidence (1671 passed, 1 skipped, lint clean) is reasonable for a pure-deps change.
What's good
- The PR description is the gold standard for a security-bump PR: per-package before/after table, CVE/GHSA mapping, explicit acknowledgement of the one alert that can't be closed (#15 / diffusers), and the upstream blocker explained.
- Constraint changes follow
AGENTS.md > Key Conventions > Dependency bounds(cap major for>=1.0deps). - Both copies of the dev pins (
[project.optional-dependencies] devand[tool.hatch.envs.default.dependencies]) are kept in sync.
Concerns
-
Scope claim doesn't match the lockfile diff. The PR states "No source-code changes - pure dependency hygiene", but the regenerated
uv.lockadds 42 new packages and removes 1 (pandas) - among the new ones:mujoco,transformers(5.8.1, a major-version jump),libero,robomimic,robosuite,numba,llvmlite,tensorboard,tensorboardx,thop,hydra-core,omegaconf,h5py. The previousuv.lockwas clearly generated against a smaller feature set (nosim-mujoco, nobenchmark-libero, nomesh-iot); this PR silently re-resolves with the full extras matrix declared inpyproject.toml. That's likely a strict improvement (the prior lock was incomplete), but it means a reviewer cannot meaningfully audit "only the 19 CVE-closing bumps" - they have to trust 42 new transitive locks too. Recommend either (a) calling this out explicitly in the PR body so it's not surprising, or (b) splitting the lockfile-completeness fix from the security bumps. -
Diffusers (#15, GHSA-98h9-4798-4q5v) follow-up is not yet captured. The PR body says "Tracking upstream issue needed" - per
AGENTS.md > Project Dashboard("Never track work only in local markdown - the board is the source of truth"), there should be a tracking issue on https://github.com/orgs/strands-labs/projects/2 referenced from this PR before merge so the unresolved alert doesn't slip. -
No CHANGELOG entry.
CHANGELOG.mdexists in the repo and the project is pre-1.0 (hatch-vcsproduces dev versions); a one-line entry under an Unreleased / next-version heading documenting the pytest 8->9 jump (which can break tests in downstream consumers' overlays) and the Pillow / cryptography / urllib3 bumps would help users understand why their lockfile changed.
Verification suggestions
# Spot-check: do the headline CVE-closing versions actually land in the resolved lock?
grep -A1 -E '^name = "(pillow|cryptography|pytest|python-multipart|gitpython|urllib3)"' uv.lock
# Confirm the lerobot 0.5.0 -> 0.5.1 bump is intentional and doesn't change the public API surface used by strands_robots/policies/lerobot_local/
diff <(pip show lerobot | grep Version) <(echo 'Version: 0.5.1')
# transformers 5.x compat already has explicit handling at
# strands_robots/policies/lerobot_local/policy.py:309-313
# Worth running the lerobot integration tests on a GPU box once before merge:
hatch run test-integ -k lerobot_local
# Confirm benchmark-libero install path resolves end-to-end now that libero is in the lock:
pip install -e '.[benchmark-libero]'| "numpy>=1.21.0,<3.0.0", | ||
| "opencv-python-headless>=4.5.0,<5.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=8.0.0,<13.0.0", |
There was a problem hiding this comment.
Widening the upper bound to <13.0.0 lets uv lock pick up Pillow 12.2.0 (good), but the lower bound >=8.0.0 is unchanged. Downstream consumers who install strands-robots without using this repo's uv.lock (i.e. anyone doing pip install strands-robots into their own env) are still allowed to resolve Pillow 8.x / 9.x / 10.x / 11.x - all of which contain the CVEs this PR is closing in the lock (#1, #3, #8-#11, #17, #18). For a security-framed PR, consider also bumping the floor (e.g. >=10.3.0 - the first version with the buffer-overflow fix from CVE-2024-28219) so the constraint actually carries the security guarantee for library consumers, not just for this repo's reproducible env.
There was a problem hiding this comment.
Fixed in d478d12. Bumped Pillow floor from >=8.0.0 to >=10.3.0 at both constraint sites (line 40 runtime + line 115 hatch env). This carries CVE-2024-28219 for downstream pip install strands-robots consumers.
Also added a CHANGELOG entry under ## Unreleased - #153 documenting all constraint changes, the pytest 9 API removals, and the lockfile scope expansion.
This comment was generated by an AI agent (strands-agents). The code changes were validated with hatch run lint (clean) and hatch run test (2339 passed).
| "pytest>=6.0,<9.0.0", | ||
| "pytest-cov>=4.0.0,<6.0.0", | ||
| "pytest>=6.0,<10.0.0", | ||
| "pytest-cov>=4.0.0,<8.0.0", |
There was a problem hiding this comment.
pytest 9.0.x was released only days ago; bumping the cap to <10.0.0 and pytest-cov to <8.0.0 simultaneously is a fine direction, but the resolved lock pins a brand-new pytest 9.0.3 + pytest-cov 7.1.0 pair that very few downstream projects have validated yet. Two suggestions: (a) note in the PR body that this constitutes a tooling jump (pytest 8->9 has removed/renamed several APIs - the nose-style setup/teardown, pytest.mark.skipif evaluation timing, and --strict aliasing changes) and surface it in CHANGELOG.md; (b) since the dev pins are duplicated between [project.optional-dependencies] dev (here) and [tool.hatch.envs.default.dependencies] (lines 111-112), consider following up by sourcing one from the other to prevent future drift - the current PR keeps them in sync correctly, but the duplication is a known foot-gun.
| "ruff>=0.4.0,<1.0.0", | ||
| "mypy>=1.0.0,<2.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=8.0.0,<13.0.0", |
There was a problem hiding this comment.
Same constraint widening applied here as in [project.optional-dependencies] dev - good, this keeps the two pin sites consistent. Mentioning for completeness rather than as a blocker: this duplication itself is the risk pattern (same dep pinned in two places, easy to update one and forget the other in a future Dependabot PR). Worth a follow-up issue on the project board to consolidate, but not in this PR's scope.
There was a problem hiding this comment.
Agreed - the duplicate pin sites are a maintenance risk. Noted as a follow-up in the PR body ("Known follow-up: consolidate the duplicate Pillow/pytest pin sites"). Will file a tracking issue on the project board after this PR merges.
This comment was generated by an AI agent (strands-agents).
| "opencv-python-headless>=4.5.0,<5.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=8.0.0,<13.0.0", | ||
| ] |
There was a problem hiding this comment.
The uv.lock regeneration in this PR pulls in 42 new transitive packages that weren't in the previous lock (most notably transformers 5.8.1 - a major-version bump - plus mujoco, libero, robomimic, robosuite, numba, tensorboard, hydra-core, omegaconf). The previous uv.lock appears to have been resolved against a partial extras set (no sim-mujoco, no benchmark-libero, no mesh-iot); this PR's uv lock --upgrade re-resolves with the full matrix. transformers 5.x already has explicit compat code at strands_robots/policies/lerobot_local/policy.py:309-313, so the jump is probably fine, but it deserves a line in the PR body so reviewers know they're approving more than the 19 CVE-closing bumps.
… (addresses review feedback on security guarantee for consumers)
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Dependency-only PR that bumps Pillow, pytest, pytest-cov constraints in pyproject.toml and regenerates uv.lock against the full extras matrix to close 19/20 open Dependabot alerts. The R1 round addressed the most important review feedback (raise the Pillow floor to >=10.3.0 so consumers who pip install strands-robots without this repo's lockfile still pick up the CVE-2024-28219 fix) and added a CHANGELOG entry. The remaining diffusers alert (GHSA-98h9-4798-4q5v) is correctly identified as upstream-blocked by lerobot 0.5.1's diffusers cap.
What's good
- Both pin sites (
[project.dependencies]and[tool.hatch.envs.default]) updated in lockstep — the duplicate-pin foot-gun called out in the PR description was not tripped. - Floor bumped to
>=10.3.0rather than left at>=8.0.0, providing a real security guarantee to downstreampip installusers (not just lockfile users). - CHANGELOG entry documents both the constraint changes and the lockfile-scope expansion (42 new transitives from the full extras matrix), which would otherwise be a silent supply-chain surface change.
- Constraint widening matches the AGENTS.md
>=1.0deps-cap-major rule forPillowandpytest. - pytest 9.x nose-style
setup/teardownremoval is a non-issue here — verified that tests use only the supportedsetup_method/teardown_methodxunit form (tests/mesh/test_transport.py:333,351).
Concerns
- No regression test covering the security floors. Per AGENTS.md > Review Learnings (#85) > 'Pin regression tests for reviewed fixes', a behavioural fix should ship with a test that fails on the pre-fix code. A small
test_pyproject_security_floors.pythat assertsPillow>=10.3.0,pytest>=6.0, and the absence ofPillow>=8.0.0would prevent a future careless edit from silently re-introducing the floor regression. (The PR description says "N/A — pure constraint change", but a constraint test is exactly the form of pin-test that catches this.) - Cryptography major-version jump is undocumented. The lock moves cryptography from a pre-46 floor to 48.0.0 — that's potentially three major versions in a single PR. The CHANGELOG mentions the version but not that it crosses major boundaries; for an audit trail this is worth calling out, especially since cryptography major versions historically remove deprecated APIs.
pytest-cov<8.0.0widens the ceiling by two majors (was<6.0.0, allows 6.x and 7.x and forbids 8.x). The conventional one-major bump (<7.0.0) would also satisfy the resolved 7.1.0 and stay in line with AGENTS.md's "cap major" rule. Worth narrowing unless 8.x compatibility is intended.- Diffusers follow-up has no tracking link. The PR description and CHANGELOG say "Remaining: diffusers GHSA-98h9-4798-4q5v" — per AGENTS.md > Project Dashboard rule ("the board is the source of truth"), this should be filed as an issue on https://github.com/orgs/strands-labs/projects/2 and the issue number referenced in the CHANGELOG so the alert isn't lost.
Verification suggestions
# Confirm constraint floors are honoured by a fresh resolution:
uv lock --upgrade --dry-run 2>&1 | grep -E '^(pillow|pytest|pytest-cov|cryptography|gitpython|urllib3|python-multipart) '
# Confirm the resolved lock satisfies the security floors:
python - <<'PY'
import tomllib, re
lock = open('uv.lock').read()
for pkg, floor in [('pillow', '10.3.0'), ('cryptography', '48.0.0'), ('pytest', '9.0.3')]:
m = re.search(rf'name = "{pkg}"\nversion = "([^"]+)"', lock)
print(f'{pkg}: {m.group(1) if m else "MISSING"} (floor: {floor})')
PY
# Confirm the test suite passes under the new resolution (PR description claims 2339 passed):
hatch run test| "numpy>=1.21.0,<3.0.0", | ||
| "opencv-python-headless>=4.5.0,<5.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=10.3.0,<13.0.0", |
There was a problem hiding this comment.
Floor bump to >=10.3.0 correctly carries CVE-2024-28219 for downstream pip install consumers — good R1 fix. One thing worth surfacing: Pillow 10.3.0 is the floor, not the resolved, so a consumer's resolver may still pin to 10.3.0 if another constraint pulls it down. Pillow has had subsequent CVE fixes in 10.4 / 11.x (e.g. GHSA-44wm-f244-xhp3 in 11.3.0). Consider documenting in the CHANGELOG that the floor is the minimum security guarantee and that the lockfile-pinned 12.2.0 is the recommended install path. This isn't blocking — it's a clarification so users don't read >=10.3.0 as "all known CVEs fixed".
| dev = [ | ||
| "pytest>=6.0,<9.0.0", | ||
| "pytest-cov>=4.0.0,<6.0.0", | ||
| "pytest>=6.0,<10.0.0", |
There was a problem hiding this comment.
pytest-cov>=4.0.0,<8.0.0 widens the ceiling by two major versions (skips both 6.x and 7.x major boundaries). AGENTS.md > Key Conventions #2 says >=1.0 deps cap major — i.e. one major above current. Since the resolved version is 7.1.0, <8.0.0 is correct, but the previous <6.0.0 to <8.0.0 jump effectively bypasses the convention's "one major at a time" intent. If 8.x compat is unverified (it doesn't exist yet), narrow to <8.0.0 is fine but please confirm that 6.x/7.x compatibility was actually exercised — easy to do by pinning pytest-cov==6.x.y in a one-shot resolve and rerunning hatch run test. Same comment applies to the duplicate at line 112.
|
|
||
| - Closes 19 of 20 open Dependabot alerts. Remaining: diffusers | ||
| GHSA-98h9-4798-4q5v (blocked by lerobot 0.5.1 cap `diffusers<0.36`). | ||
| ## Unreleased - #178 (LiberoOffScreenRenderEngine retired) |
There was a problem hiding this comment.
Per AGENTS.md > Project Dashboard rule ("never track work only in local markdown — the board is the source of truth"), this remaining alert should be filed as a tracked issue on Strands Labs - Robots and referenced here by issue number, e.g.
- Closes 19 of 20 open Dependabot alerts. Remaining: diffusers
GHSA-98h9-4798-4q5v (tracked in #XXX, blocked by lerobot 0.5.1
cap `diffusers<0.36`).The PR description acknowledges this as "Known follow-up" but the issue itself isn't filed yet. Filing it before merge prevents the alert from going stale.
| - Closes 19 of 20 open Dependabot alerts. Remaining: diffusers | ||
| GHSA-98h9-4798-4q5v (blocked by lerobot 0.5.1 cap `diffusers<0.36`). | ||
| ## Unreleased - #178 (LiberoOffScreenRenderEngine retired) | ||
|
|
There was a problem hiding this comment.
Missing blank line between this ## Unreleased - #153 block and the ## Unreleased - #178 block immediately below. Keep-a-Changelog conventions (referenced at the top of the file) want one blank line between top-level sections. Also note line 5 / 6 introduces a double blank line above the new ## Unreleased - #153 heading — should be a single blank line.
| - **Transitive lockfile** (`uv.lock`) regenerated with full extras matrix, | ||
| resolving cryptography 48.0.0, python-multipart 0.0.28, gitpython 3.1.50, | ||
| urllib3 2.7.0, and 42 previously-unresolved transitive packages (mujoco, | ||
| transformers 5.8.1, libero, robomimic, robosuite, numba, hydra-core, etc.). |
There was a problem hiding this comment.
The 42-new-transitive-packages note is a meaningful supply-chain disclosure — good that it's documented. One missing detail: cryptography's bump to 48.0.0 likely crosses multiple major-version boundaries (the old lock had a pre-46 floor per the PR description). For an audit trail it's worth saying explicitly "cryptography crossed N major versions" rather than just listing the resolved version, so downstream consumers grep'ing for breaking-API risk find this entry. Similar for transformers 4.x → 5.8.1 — the policy.py compat patch (strands_robots/policies/lerobot_local/policy.py:309-313) is a real behavioural change in the dep tree, not just a version bump.
…isclosure, diffusers tracking issue strands-labs#210 (addresses threads on CHANGELOG.md:28,29,21)
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Dependency-only bump that closes 19 of 20 open Dependabot alerts by raising the Pillow floor (security guarantee for non-lockfile consumers), bumping pytest/pytest-cov ceilings, and regenerating uv.lock with the full extras matrix (sim-mujoco, benchmark-libero, mesh-iot). The previous lockfile was resolved with a partial extras set, so the +1943/-1004 churn is mostly the 42 new transitive packages (mujoco, transformers 5.8.1, libero, robomimic, etc.) plus the targeted CVE bumps.
Verified locally: floor change is consistent across both pin sites ([project.dependencies] and [tool.hatch.envs.default]), the transformers 5.x compat shim referenced in the CHANGELOG actually exists at strands_robots/policies/lerobot_local/policy.py:308-321, follow-up issue #210 for the diffusers gap exists and is on the project board, and the repo has zero direct cryptography imports so the 46→48 major jump is transparent to this codebase. Tests use pytest's xUnit-style setup_method/teardown_method (not the nose-style names pytest 9 removed) and only --strict-markers, so the pytest 9.x bump is genuinely safe.
What's good
- Floor bump is justified per round-1 review feedback (security guarantee for
pip install strands-robotsconsumers, not just lockfile users). - Both pin sites updated in lockstep — no drift introduced by this PR.
- CHANGELOG entry documents the lockfile-scope expansion (42 new transitive packages) which is the only way a future reader can make sense of the diff size.
- 1/20 unfixed alert is openly tracked in #210 with a clear upstream blocker (
lerobot<0.6capsdiffusers<0.36). - Dependency-review and CodeQL checks pass on the head SHA.
Concerns
- The
uv.lockdiff (+1943/-1004) is not realistically line-reviewable; correctness rests on the resolver, thedependency-reviewcheck, and the green test suite. That is acceptable here, but worth flagging that any future reviewer doing a partial spot-check should rely on the alert-by-alert table in the PR description, not the raw lock diff. - Lockfile still pins
diffusers==0.35.2, so users installing viauv syncfrom this lockfile remain exposed to GHSA-98h9-4798-4q5v until #210 lands. The CHANGELOG notes "19 of 20 closed" but the wording could be sharper about who is and isn't protected by this PR. - AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes" applies even to constraint-only PRs: a minimal
tests/test_dependency_floors.pythat parsespyproject.tomland assertsPillow>=10.3.0would prevent silent floor-lowering during a futureuv lock --upgradecleanup. The R1 entry in the PR's review-rounds table waives this as N/A — see inline. - Constraint duplication between
[project.dependencies]/[project.optional-dependencies].devand[tool.hatch.envs.default]is acknowledged as a separate follow-up; not blocking, but it is the exact "two pin sites drift" pattern that motivated this PR's R1 round.
Verification suggestions
# Confirm both pin sites match (catches future drift)
grep -nE 'Pillow|pytest[^-]|pytest-cov' pyproject.toml
# Sanity-check the alert-by-alert claim against the lock
for pkg in pillow cryptography pytest pytest-cov python-multipart gitpython urllib3 diffusers; do
awk -v p="$pkg" '/^name = "'"$pkg"'"$/{getline; print p": "$0}' uv.lock
done
# Verify hatch environment resolves cleanly with the new constraints
hatch env prune && hatch run lint && hatch run test| aliasing; no impact on this repo's test suite. | ||
| - **pytest-cov** ceiling raised from `<6.0.0` to `<8.0.0` (allows 7.1.0). | ||
| - **Transitive lockfile** (`uv.lock`) regenerated with full extras matrix, | ||
| resolving cryptography 48.0.0 (crosses 3 major versions from prior lock; no deprecated API usage found in this repo), python-multipart 0.0.28, gitpython 3.1.50, |
There was a problem hiding this comment.
Minor accuracy nit: this says cryptography "crosses 3 major versions from prior lock" but the prior lock on main pins cryptography==46.0.6 and head pins 48.0.0 — that's 2 major versions (46→47→48), not 3. Suggest: crosses 2 major versions from prior lock. The follow-up claim ("no deprecated API usage found in this repo") checks out — grep -rn 'from cryptography\|import cryptography' strands_robots/ returns nothing, so the major-version jump is transparent here.
|
|
||
| - Closes 19 of 20 open Dependabot alerts. Remaining: diffusers | ||
| GHSA-98h9-4798-4q5v (tracked in #210, blocked by lerobot 0.5.1 | ||
| cap `diffusers<0.36`). |
There was a problem hiding this comment.
Worth being sharper about the consumer impact here: this PR's constraint changes do not raise the diffusers floor (lerobot caps <0.36), and the regenerated uv.lock pins diffusers==0.35.2, so anyone installing via this repo's lockfile remains exposed to GHSA-98h9-4798-4q5v until #210 lands. Suggest adding a single sentence: "Lockfile users remain exposed; mitigation tracked in #210." Otherwise the "19 of 20 closed" phrasing reads like the residual risk has been eliminated for lockfile consumers, which it hasn't.
| "numpy>=1.21.0,<3.0.0", | ||
| "opencv-python-headless>=4.5.0,<5.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=10.3.0,<13.0.0", |
There was a problem hiding this comment.
Per AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes": even though this is a constraint-only change, the security-floor guarantee for downstream pip install strands-robots consumers (the entire reason the floor was lifted from >=8.0.0 to >=10.3.0 in R1) has no test pinning it. A trivial tests/test_dependency_floors.py that parses this file with tomllib and asserts the Pillow floor is >=10.3.0 (and similarly for any future security-motivated floor) would prevent a routine uv lock --upgrade or constraint-cleanup PR from silently lowering it. The PR's review-rounds table waives this as "N/A — pure constraint change," but the floor is precisely the behavior R1 asked to harden, so it qualifies for a pin-test.
| "ruff>=0.4.0,<1.0.0", | ||
| "mypy>=1.0.0,<2.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=10.3.0,<13.0.0", |
There was a problem hiding this comment.
This is the second of two Pillow>=10.3.0,<13.0.0 pin sites (the other is line 40). The PR description acknowledges consolidating these as a follow-up; flagging here only because the very next constraint-bump PR is the most likely place this drifts (someone updates [project.dependencies] and forgets [tool.hatch.envs.default], or vice versa). A 5-line tests/test_pin_site_consistency.py that asserts the floor/ceiling for Pillow, pytest, and pytest-cov matches between the two sites would close this risk without waiting for the consolidation PR.
| - **Transitive lockfile** (`uv.lock`) regenerated with full extras matrix, | ||
| resolving cryptography 48.0.0 (crosses 3 major versions from prior lock; no deprecated API usage found in this repo), python-multipart 0.0.28, gitpython 3.1.50, | ||
| urllib3 2.7.0, and 42 previously-unresolved transitive packages (mujoco, | ||
| transformers 5.8.1 (explicit compat code at `strands_robots/policies/lerobot_local/policy.py:309-313`), libero, robomimic, robosuite, numba, hydra-core, etc.). |
There was a problem hiding this comment.
Inline parenthetical-in-parenthetical (transformers 5.8.1 (explicit compat code at ...:309-313)) makes this bullet hard to parse; the line is also long enough that it doesn't render as a clean Keep-a-Changelog bullet. The compat-code reference is genuine and useful (verified at strands_robots/policies/lerobot_local/policy.py:308-321) — suggest pulling it out into its own sub-bullet or footnote rather than nesting inside the package list. Non-blocking, purely a readability nit on a doc that'll be read by humans hunting for what changed at release time.
…ers exposure, transformers compat as sub-bullet - cryptography crosses 2 major versions (46.0.6 -> 48.0.0: 46->47->48), not 3. Also explicitly states the prior-lock base version so the count is auditable. - Diffusers GHSA-98h9-4798-4q5v: spell out that lockfile consumers remain exposed until strands-labs#210 lands. Names the two mitigation paths (float diffusers, or upgrade lerobot once upstream relaxes the <0.36 cap). - Transformers 5.x compat-code reference moved out of the parenthetical- in-parenthetical into its own sub-bullet for readability. Addresses CHANGELOG.md:18, :29, :20 review threads.
…nsistency Per AGENTS.md > Review Learnings strands-labs#85, behavioural fixes need a regression test that fails on pre-fix code. The R1 floor bump (Pillow >=8.0.0 -> >=10.3.0 for CVE-2024-28219) had no test pinning it; a routine 'uv lock --upgrade' or constraint-cleanup PR could silently lower the floor without review signal. Same risk for the dual pin sites - this PR's CHANGELOG already acknowledges site consolidation as a follow-up; in the meantime, test_pin_site_consistency closes the drift window. Two parametrized test classes: - test_security_floor_not_lowered: every dep in _SECURITY_FLOORS must declare a >= floor at or above the recorded minimum, in EVERY pin site. Adding a CVE-driven floor bump means appending one (name, floor, why) tuple. Lowering one means deleting the tuple in a reviewer-visible diff. - test_pin_site_consistency: every dep declared in both the runtime/dev block AND the hatch-env block must carry identical specs. Catches the Dependabot foot-gun where one site is updated and the other is missed. Verified pin behaviour with git stash round-trip: - Lowered 'Pillow>=10.3.0' -> 'Pillow>=8.0.0' in project.dependencies: 1 failed, 3 passed (security floor test fails with the floor delta surfaced in the assertion message; site-consistency tests still pass because both sites moved together). - Drifted ceiling on one Pillow site only ('<13' -> '<14'): 1 failed, 2 passed (site-consistency test names both sites and the delta in the assertion message). - Restored: 4 passed. Addresses pyproject.toml:40 (security floor pin) and pyproject.toml:115 (two-site consistency pin) review threads.
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Dependabot security sweep: bumps the Pillow, pytest, and pytest-cov constraints in pyproject.toml, regenerates uv.lock against the full extras matrix (closing 19/20 alerts; diffusers GHSA-98h9-4798-4q5v deferred to #210), adds a CHANGELOG.md entry, and lands tests/test_dependency_floors.py to pin the new floor against future regressions and surface drift between the runtime/dev block and the hatch env block. Scope, description, and CHANGELOG are all in good shape; the test scaffolding is the most reviewable surface area.
What's good
- AGENTS.md compliance: dep bound bumps follow the
>=1.0major-cap convention; CHANGELOG entry is present with bothChangedandSecuritysubsections; cryptography major delta is disclosed. - Regression-test pinning: a behavioural fix (raised floor + cross-site consistency) shipped alongside a test that fails on pre-fix state — exactly the PR-#85 "pin every reviewed fix" rule.
- Lockfile-scope honesty: PR description spells out that the previous
uv.lockwas resolved against a partial extras set, why the new one is +42 packages, and which of those are sensitive (transformers 5.x, with a citation to the in-repo compat shim). - Diffusers gap is documented, not hidden: the one Dependabot alert that can't be closed is named, linked to its tracking issue, and the mitigation path is spelled out.
Concerns
_DUAL_SITE_DEPSis a manual allowlist, not a property (see inline). The PR's own goal is "future Dependabot PR updates one site and forgets the other"; the test only fires forpillow / pytest / pytest-cov. A Dependabot bump onmypy,ruff,pyzmq, ormsgpack(each currently in two sites) would silently drift. Auto-detecting every dep that appears in ≥2 sites and asserting equality is a one-line replacement of the existing parametrize and would actually deliver the property the docstring claims.- CHANGELOG entry currently sits under
## Unreleased - #153stacked above## Unreleased - #178. TwoUnreleasedheadings is fine for keep-a-changelog but worth a sanity check that the release-cut script (if any) handles multipleUnreleasedblocks correctly. cryptography46→48 claim: PR description asserts "no deprecated API usage found in this repo" but the diff doesn't show how it was checked. A grep forcryptography.hazmat/cryptography.fernetin the working tree against the 47/48 deprecation notes would be a one-line confidence boost in the PR body.
Verification suggestions
# 1. Pin tests do what they claim — round-trip flip the floor and confirm fail.
hatch run pytest tests/test_dependency_floors.py -v
sed -i 's/Pillow>=10.3.0/Pillow>=10.0.0/' pyproject.toml
hatch run pytest tests/test_dependency_floors.py::test_security_floor_not_lowered -v # expect 1 fail
git checkout -- pyproject.toml
# 2. Confirm the lockfile actually closes the alerts it claims to.
uv pip compile --extra all pyproject.toml -o /tmp/resolved.txt
grep -E '^(pillow|cryptography|gitpython|urllib3|python-multipart|pytest)==' /tmp/resolved.txt
# 3. Smoke-test that hatch env still bootstraps cleanly post-pytest-9 bump.
hatch env prune && hatch run test -k 'not gpu' --collect-only | tail|
|
||
| # Deps that are intentionally pinned in BOTH the runtime/dev block AND the | ||
| # ``[tool.hatch.envs.default]`` block. Any drift between sites is a bug. | ||
| _DUAL_SITE_DEPS = ("pillow", "pytest", "pytest-cov") |
There was a problem hiding this comment.
Manual allowlist defeats the stated goal.
The module docstring (lines 14-17) and the PR description both promise that the test fires "if a future Dependabot PR updates one site and forgets the other." In practice this only checks the 3 names hard-coded here, but the current pyproject.toml already has 7 deps that live in two sites: mypy, msgpack, pyzmq, ruff, plus the 3 listed. A Dependabot PR that bumps mypy<2.0.0 to <3.0.0 in dev but not in tool.hatch.envs.default will silently pass.
Since constraint_index already produces the per-site map, the property version is a one-liner — no allowlist:
def test_pin_site_consistency_all(constraint_index):
drifted = {
name: sites for name, sites in constraint_index.items()
if len(sites) >= 2 and len(set(sites.values())) > 1
}
assert not drifted, f"Multi-site pins disagree: {drifted}"This matches what the docstring on line 14-17 already claims ("every dep that appears in more than one pin block must have identical floor + ceiling everywhere") and removes the foot-gun where adding a dep to two sites without remembering to update _DUAL_SITE_DEPS is itself a silent drift.
| out.append(int(m.group(1))) | ||
| else: | ||
| break | ||
| return tuple(out) or (0,) |
There was a problem hiding this comment.
_version_tuple is lossy on PEP 440 markers; can mask floor regressions.
The break on first non-numeric component means _version_tuple("10.3.0rc1") returns (10, 3, 0) — equal to _version_tuple("10.3.0") — so a future floor regression to Pillow>=10.3.0rc1 would pass the security-floor check despite being a pre-release that doesn't carry the CVE-2024-28219 fix.
Not exploitable today (no pre-release floor in tree), but this file's whole point is to be a future-proof guard. Either use packaging.version.Version (already available transitively via pip/setuptools) or document the limitation explicitly:
from packaging.version import Version
# ...
assert Version(floor) >= Version(min_floor), ...| try: | ||
| name, spec = _parse_constraint(raw) | ||
| except ValueError: | ||
| continue |
There was a problem hiding this comment.
Silent skip on unparseable entries hides bugs.
absorb() swallows ValueError from _parse_constraint and moves on. Today this gracefully skips entries like "strands-robots[mesh]" (self-references with extras), which is intentional, but it also silently swallows an entry like "Pillow >= 10.3.0" (extra space inside the spec — actually parses fine here, but a typo'd Pillow=>10.3.0 would skip and the security-floor test would then assert sites against an empty dict and fail with "pillow not found in any pyproject pin site" rather than the actual parse error).
Minor, but consider logging/raising on unexpected entries that aren't self-references:
except ValueError:
if not raw.lstrip().startswith("strands-robots["):
raise # don't hide real parse failures
continue| "pytest>=6.0,<9.0.0", | ||
| "pytest-cov>=4.0.0,<6.0.0", | ||
| "pytest>=6.0,<10.0.0", | ||
| "pytest-cov>=4.0.0,<8.0.0", |
There was a problem hiding this comment.
pytest 9.x major bump deserves a tests_integ/ smoke check.
The PR description correctly notes pytest 9 removed nose-style setup/teardown and changed --strict aliasing, and reports 2343 passed on the unit suite. That covers tests/, but [tool.hatch.envs.default.scripts].test-integ = "pytest tests_integ/ ..." (line 123) runs against a separate suite that the PR description doesn't mention exercising. If tests_integ/ uses any of the deprecated patterns, this PR will land green and break only on hardware/GPU runs.
Not a blocker — the bump is right — but worth confirming hatch run test-integ --collect-only still resolves cleanly before merge, and noting the result in the PR.
|
|
||
| - Closes 19 of 20 open Dependabot alerts. Remaining: diffusers | ||
| GHSA-98h9-4798-4q5v (tracked in #210, blocked by lerobot 0.5.1 | ||
| cap `diffusers<0.36`). Lockfile users remain exposed until #210 lands; |
There was a problem hiding this comment.
Diffusers GHSA exposure is documented but has no automated tripwire.
The entry says "Lockfile users remain exposed until #210 lands" — good disclosure. But there's no test or CI gate that fails (or even surfaces a notice) once lerobot relaxes its diffusers<0.36 cap. A reviewer six months from now will need to remember to check #210; a low-cost automated nudge is to add a parametrized entry to a _KNOWN_DEFERRED_VULNS table in tests/test_dependency_floors.py that asserts the relevant lerobot constraint is still <0.36 and prints a pytest.skip reason like "diffusers GHSA-98h9-4798-4q5v gated by lerobot<0.6 cap; revisit when #210 lands". Once lerobot bumps, the skip-reason changes and review attention follows. Optional, but cheap.
…format The sys.version_info < (3, 11) guard was dead code since pyproject.toml declares requires-python >= 3.12. Ruff UP036 correctly flags this as an outdated version block. Remove the guard and the now-unused sys import, then apply ruff format to satisfy format --check. Fixes the "call-test-lint / Test and Lint" CI failure on commit 1c3682c.
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
This PR resolves 19/20 open Dependabot alerts by bumping direct constraints in pyproject.toml (Pillow floor + ceiling, pytest ceiling, pytest-cov ceiling), regenerating uv.lock with the full extras matrix, and adding tests/test_dependency_floors.py to pin the security floor against silent regression. The remaining alert (diffusers GHSA-98h9-4798-4q5v) is blocked by lerobot 0.5.1's diffusers<0.36 cap and is tracked in #210.
Description accurately mirrors the diff. CHANGELOG entry is present and follows the project's Keep-a-Changelog convention. The new pin-regression test directly addresses AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes."
What's good
- Both pin sites (
[project.dependencies]+[tool.hatch.envs.default]) updated in lockstep, and the newtest_pin_site_consistencyenforces this going forward. - Pillow floor raised to
>=10.3.0carries the CVE-2024-28219 fix for consumers who pip-install without using the repo lockfile (the right call — lockfiles don't propagate to wheel users). _SECURITY_FLOORSis structured for additive growth: future CVE-triggered floor bumps just append a tuple, with rationale forced inline. Good ergonomics.- Tracking issue #210 exists and is open; CHANGELOG
### Securityblock names the GHSA + the upstream blocker. - Transformers 5.x compatibility shim already present in the repo (
strands_robots/policies/lerobot_local/policy.py:308-321), so the lockfile bump totransformers 5.8.1is unblocked.
Concerns
- Asymmetric exposure framing. The CHANGELOG and PR description correctly note that diffusers lockfile users remain exposed, but the same caveat applies in the opposite direction to cryptography / python-multipart / gitpython / urllib3: those are transitive-only fixes with no constraint floor anywhere in
pyproject.toml, so consumers runningpip install strands-robots(without usinguv.lock) can still resolve vulnerable versions. The "19 of 20 alerts closed" claim is true for lockfile users but overstated for wheel-install users. A one-line note in the CHANGELOG### Securityblock clarifying that only Pillow's fix has a constraint-level guarantee would close this gap. - Test surface is narrow by design. The new test pins one floor (Pillow). It cannot catch regression on the four transitive-only CVEs above because no constraint exists to inspect. This is fine — just worth noting that
test_security_floor_not_loweredis not a substitute for adding a direct dep floor when a transitive CVE recurs. - No regression test for the ceiling raises (Pillow
<13, pytest<10, pytest-cov<8). Less critical than floor regression since lowering a ceiling is a more obviously manual edit, but a future cleanup PR could narrow them silently.
Verification suggestions
# Confirm the pin test fires on the pre-fix state:
git show HEAD~N:pyproject.toml | sed 's/Pillow>=10.3.0/Pillow>=8.0.0/' > /tmp/pre.toml
# (then run the test against the simulated regression)
# Smoke-test the lockfile resolves cleanly with the full extras matrix:
uv sync --extra all --extra dev
# Confirm transformers 5.x shim still loads:
python -c 'from strands_robots.policies.lerobot_local.policy import LerobotLocalPolicy; print("ok")'
# Spot-check that no consumer-level Pillow regression is possible:
uv pip install --dry-run 'strands-robots' --resolution=lowest-direct| resolving cryptography 48.0.0 (crosses 2 major versions from prior lock at | ||
| 46.0.6: 46->47->48; no deprecated API usage found in this repo), | ||
| python-multipart 0.0.28, gitpython 3.1.50, urllib3 2.7.0, and 42 | ||
| previously-unresolved transitive packages (mujoco, transformers 5.8.1, |
There was a problem hiding this comment.
The ### Security block below explicitly flags diffusers lockfile exposure, but the same kind of caveat applies inversely to cryptography / python-multipart / gitpython / urllib3 listed here: these are transitive-only and no [project.dependencies] or extras floor was added, so a pip install strands-robots consumer who doesn't use uv.lock can still resolve vulnerable versions.
Consider adding a sentence in the ### Security block clarifying that only Pillow's fix is constraint-level (and therefore wheel-install consumers are protected by the floor); for the transitive ones, only uv.lock users get the fix. As-is, the "19 of 20 alerts closed" framing is accurate for lockfile users but overstated for wheel-install users.
| def _floor_of(spec: str) -> str | None: | ||
| """Extract the ``>=`` lower bound from a spec like ``>=10.3.0,<13.0.0``.""" | ||
| m = re.search(r">=\s*([A-Za-z0-9._*+-]+)", spec) | ||
| return m.group(1) if m else None |
There was a problem hiding this comment.
_floor_of only recognises >=. If a future floor uses ~=10.3.0 (compatible-release) or ==10.3.*, this returns None and the assertion at line 155-157 fires with a misleading "no >= floor" message even though the spec semantically does enforce a floor.
Not urgent (no current dep uses ~=), but consider broadening the regex to also accept ~= and == floors, or at minimum document in the docstring that _SECURITY_FLOORS entries must use a >= constraint.
| out.append(int(m.group(1))) | ||
| else: | ||
| break | ||
| return tuple(out) or (0,) |
There was a problem hiding this comment.
_version_tuple("10.3") returns (10, 3) and _version_tuple("10.3.0") returns (10, 3, 0). Python tuple comparison treats the shorter as less, so a hypothetical Pillow>=10.3 declaration would fail test_security_floor_not_lowered against a 10.3.0 minimum even though they're semantically equivalent.
Low likelihood (current spec is >=10.3.0), but pad with zeros to a fixed length before comparing, or use packaging.version.Version if available, to avoid a surprise failure on a future cosmetic edit.
| "10.3.0", | ||
| "CVE-2024-28219 buffer-overflow fix; raised in PR #153 R1.", | ||
| ), | ||
| ] |
There was a problem hiding this comment.
Only Pillow is enumerated here, but the PR's stated security wins also include transitive bumps for cryptography, python-multipart, gitpython, urllib3 — none of which have a direct constraint to pin. That's correct behaviour for this PR, but worth a short comment in this list explaining the policy: floors only get added when a CVE recurs in a direct dep, otherwise lockfile regeneration is the mitigation. Otherwise the next reviewer will reasonably ask "why isn't cryptography here?"
| "numpy>=1.21.0,<3.0.0", | ||
| "opencv-python-headless>=4.5.0,<5.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=10.3.0,<13.0.0", |
There was a problem hiding this comment.
Confirming the lockstep with line 116 (hatch env). The new test_pin_site_consistency enforces this pairing for pillow, pytest, pytest-cov, which is exactly the foot-gun called out in AGENTS.md > Review Learnings (#86) under "Don't conflate identity with schema." Good.
No change requested — noting for the human reviewer that the duplicate-pin-site issue is acknowledged in the PR description as a known follow-up.
Replace the manual _DUAL_SITE_DEPS allowlist (3 entries: pillow, pytest,
pytest-cov) with a property-style scan that catches drift on every dep
appearing in 2+ pin sites. The allowlist version missed 4 dual-site deps
already in pyproject.toml (mypy, msgpack, pyzmq, ruff): a Dependabot PR
that updated only one site for any of these would have silently passed.
Also fix two correctness issues in the helpers:
- _version_tuple now pads to a fixed (major, minor, patch) triple so
'10.3' compares equal to '10.3.0' instead of being treated as smaller
by tuple-shorter-is-less. Also documents the conservative pre-release
handling ('10.3.0rc1' -> (10, 3, 0)).
- absorb() now re-raises ValueError on entries that don't look like
self-references with extras ('strands-robots[mesh]'), so a typo'd
spec surfaces as a parse error instead of silently disappearing.
Documents in the module docstring why only Pillow is in _SECURITY_FLOORS
(transitive bumps have no direct constraint to pin; uv.lock regeneration
is the mitigation).
Verified: simulated mypy drift between [project.optional-dependencies.dev]
and [tool.hatch.envs.default.dependencies] now fails with a precise
diagnostic; allowlist version would have silently passed.
Tests: 2 passed (security_floor_not_lowered + multi_site_pins_consistent).
Lints: ruff check + format clean.
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Dependabot security sweep: 19/20 alerts closed via Pillow floor bump (>=10.3.0), pytest/pytest-cov ceiling bumps, and a full uv lock --upgrade against the complete extras matrix. Adds a 222-line repo-hygiene test (tests/test_dependency_floors.py) that pins the security floor and enforces multi-site spec consistency, and a CHANGELOG entry. The PR description, R1-R4 audit trail, and CHANGELOG are unusually thorough; #210 confirmed open as the diffusers tracker.
What's good
- Pin regression test added for the only behavioural change (the Pillow floor), in line with AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes".
- Multi-site consistency check is property-style (scans every dep in 2+ sites) rather than an allowlist, addressing the R4 finding correctly. Spot-check:
Pillow,pytest,pytest-cov,ruff,mypy,pyzmq,msgpackall appear in two sites today and match exactly post-bump. - CHANGELOG, pyproject, and lockfile updated in lockstep. Cryptography major-jump (46->48) audited via grep for direct usage in
strands_robots/-- none found, so no migration burden. - diffusers GHSA-98h9-4798-4q5v deferral is documented loudly with the lerobot upstream cap as the blocker and #210 as the tracker; lockfile-user exposure is called out plainly in the CHANGELOG.
- Round budget overrun (R4) is justified inline with a concrete bug rationale, not waved away.
Concerns
- Wheel-install consumers are silently underprotected for the transitive CVEs (cryptography, urllib3, gitpython, python-multipart). The PR description states this clearly; the CHANGELOG only implies it ("transitive lockfile regenerated"). A line in the CHANGELOG's
### Securityblock stating "pip install strands-robotsusers withoutuv.lockare NOT protected against the cryptography/urllib3/gitpython/python-multipart CVEs by a floor pin -- only direct-dep recurrences (Pillow) get a floor" would let downstream readers self-assess without re-reading the PR. _version_tuplesemantics are PEP-440-incorrect for pre-releases (see inline). Low practical risk today (no pre-release in_SECURITY_FLOORS), but the helper exists specifically to catch a future regression -- usingpackaging.version.Versionremoves the foot-gun for ~3 lines of code.- Pin-site duplication is acknowledged as a follow-up but no tracking issue is referenced. The new consistency test ratchets this in (changes now cost 2x edits per dep), which raises the cost of not consolidating. Worth filing on the project board now rather than letting it drift.
hatch run test-integnot exercised -- explicitly deferred (no GPU on CI), reasonable; lockfile resolves successfully which catches the most common failure mode.
Verification suggestions
# Confirm the new tests fail when a floor is silently lowered
sed -i.bak 's/Pillow>=10.3.0,<13.0.0/Pillow>=8.0.0,<13.0.0/' pyproject.toml
hatch run test tests/test_dependency_floors.py -v # expect FAIL on test_security_floor_not_lowered
mv pyproject.toml.bak pyproject.toml
# Confirm multi-site drift is caught
sed -i.bak 's/"pytest>=6.0,<10.0.0",/"pytest>=6.0,<11.0.0",/' pyproject.toml # bump only one of the two sites
hatch run test tests/test_dependency_floors.py::test_multi_site_pins_consistent -v # expect FAIL
mv pyproject.toml.bak pyproject.toml
# Sanity check lockfile resolves with full extras
uv lock --check
# Confirm pillow 12.2.0 + cryptography 48.0.0 actually install in the hatch env
hatch env prune && hatch run python -c 'import PIL, cryptography; print(PIL.__version__, cryptography.__version__)'| # tuple-shorter-is-less surprising the security-floor check. | ||
| while len(out) < 3: | ||
| out.append(0) | ||
| return (out[0], out[1], out[2]) |
There was a problem hiding this comment.
_version_tuple is PEP-440-unsafe for the security-floor direction. Per PEP 440, 10.3.0rc1 < 10.3.0, but here both produce (10, 3, 0) and compare equal. Concrete consequence: if a future PR sets _SECURITY_FLOORS = [("foo", "2.5.0", ...)] and pyproject declares foo>=2.5.0rc1, this test passes -- but pip install foo>=2.5.0rc1 happily resolves to the rc, which is before the CVE-fixed release. The docstring calls this "conservative", but it's actually permissive in the wrong direction for a security floor.
The fix is one import:
from packaging.version import Version
# ...
assert Version(floor) >= Version(min_floor), ...packaging is already a transitive dep (pulled in by pytest/setuptools), and Version handles pre-releases, post-releases, dev-releases, and local versions per PEP 440. This is also one of the deferred nits, but it's the only one that actually weakens the test's stated guarantee, so worth landing in this PR rather than as a follow-up.
| hatch_default = ( | ||
| pyproject.get("tool", {}).get("hatch", {}).get("envs", {}).get("default", {}).get("dependencies", []) or [] | ||
| ) | ||
| absorb("tool.hatch.envs.default.dependencies", hatch_default) |
There was a problem hiding this comment.
Scope of the multi-site scan is tool.hatch.envs.default.dependencies only. If a future PR adds a sibling env (e.g. [tool.hatch.envs.docs.dependencies] or [tool.hatch.envs.ci.dependencies]), drift in those sites will not be caught -- the property-style check silently downgrades to a 2-site check.
Minor today (only default exists), but the intent of this test is to be a permanent ratchet, so iterating over pyproject["tool"]["hatch"]["envs"] and absorbing every env's dependencies would future-proof it against the exact "forgot the second site" foot-gun the test exists to prevent. Either fix it now or add a comment that explicitly limits scope to default.
| "ruff>=0.4.0,<1.0.0", | ||
| "mypy>=1.0.0,<2.0.0", | ||
| "Pillow>=8.0.0,<12.0.0", | ||
| "Pillow>=10.3.0,<13.0.0", |
There was a problem hiding this comment.
The PR description and AGENTS.md both mention the duplicate Pillow/pytest pin sites as a known follow-up to consolidate, but no tracking issue is linked. With the new test_multi_site_pins_consistent ratcheting drift in, every future bump now costs 2x edits per dep. Recommend filing a tracking issue on the project board (PVT_kwDOD151Fs4BSRJP) before this PR merges, so the consolidation work doesn't quietly age out -- AGENTS.md > Project Dashboard explicitly requires "the board is the source of truth" for follow-ups.
| ### Security | ||
|
|
||
| - Closes 19 of 20 open Dependabot alerts. Remaining: diffusers | ||
| GHSA-98h9-4798-4q5v (tracked in #210, blocked by lerobot 0.5.1 |
There was a problem hiding this comment.
The Security block calls out diffusers exposure for lockfile users but does not call out the inverse exposure for wheel-install users on the transitive CVEs. The PR description states this clearly ("only lockfile (uv.lock) users get the fix. Wheel-install consumers (pip install strands-robots) are not protected by a constraint floor for these packages"). That sentence belongs in the CHANGELOG too -- a downstream user reading just CHANGELOG.md after pip install strands-robots upgrades cannot tell from the current text that they are still exposed to GHSA-p423-j2cm-9vmq (cryptography), GHSA-* (urllib3, gitpython, python-multipart) unless they switch to the lockfile install path.
…ess (addresses thread on test_dependency_floors.py:99)
yinsong1986
left a comment
There was a problem hiding this comment.
Summary
Dependabot vulnerability sweep: bumps Pillow floor (>=10.3.0 for CVE-2024-28219), bumps four ceilings to absorb fresh majors (Pillow<13, pytest<10, pytest-cov<8), and regenerates uv.lock against the full extras matrix — pulling cryptography 48, urllib3 2.7, gitpython 3.1.50, python-multipart 0.0.28, plus 42 previously-unresolved transitive packages. Closes 19/20 alerts; the remaining diffusers CVE is correctly identified as upstream-blocked by lerobot 0.5.1's cap and tracked in #210. Adds tests/test_dependency_floors.py as a regression pin.
The diff is well-scoped, the CHANGELOG is unusually honest about wheel-install vs lockfile exposure, and the R5-evolved test file (packaging.version.Version, property-style multi-site scan) is materially better than what most floor-test PRs land with. Round budget was exceeded (5/3) but each round produced a real fix; no scope creep into unrelated areas.
What's good
- Pin regression test in
tests/test_dependency_floors.pyexercises both the floor invariant and cross-site drift, with a dedicated PEP-440 pre-release test pinning the R5 fix. Matches AGENTS.md > Review Learnings (#85) > "Pin regression tests for reviewed fixes". - CHANGELOG entry explicitly distinguishes wheel-install consumers from lockfile users for transitive CVE coverage, and discloses the
cryptography46→48 major-version jump alongside a verification that no first-party code importscryptography. - Diffusers gap is named, attributed (lerobot 0.5.1 cap), tracked (#210), and the lockfile-user exposure window is called out — this is the right way to ship a partial security fix.
- Constraint changes are paired (line 40
[project.dependencies]+ line 115[tool.hatch.envs.default]) per AGENTS.md convention #2; the new test enforces this property going forward. - Validation evidence (
hatch run lintclean, 2339 tests passing) is stated up front. Verified locally:hatch run test tests/test_dependency_floors.py→ 3 passed, lockfile pillow=12.2.0 / cryptography=48.0.0 / pytest=9.0.3 / urllib3=2.7.0 / gitpython=3.1.50 / python-multipart=0.0.28 all match the CHANGELOG. - Ceiling raises are uniformly one-major-above-resolved (Pillow 12→<13, pytest 9→<10, pytest-cov 7→<8), consistent with AGENTS.md convention #2.
Concerns
None block merge. Inline notes are robustness improvements on the new test file (foot-guns a future contributor could trip) and one CHANGELOG fragility nit. The deferred follow-ups listed in PR Section 13 (~=/== operator support, _KNOWN_DEFERRED_VULNS tripwire, multi-env scan scope, pin-site consolidation) are appropriately deferred — file them as the single follow-up issue post-merge as the PR description commits to.
Verification suggestions
Beyond CI:
# 1. Floor regression — should fail loudly
sed -i 's/Pillow>=10.3.0,<13.0.0/Pillow>=10.0.0,<13.0.0/' pyproject.toml
hatch run test tests/test_dependency_floors.py::test_security_floor_not_lowered
git checkout pyproject.toml
# 2. Multi-site drift — should fail loudly
# (edit only one of the two Pillow entries, then run)
hatch run test tests/test_dependency_floors.py::test_multi_site_pins_consistent
git checkout pyproject.toml
# 3. Confirm diffusers cap gap is real (lerobot still caps it)
uv pip compile --extra all pyproject.toml 2>&1 | grep -E '^diffusers' # expect <0.36| _CONSTRAINT_RE = re.compile( | ||
| r"^\s*(?P<name>[A-Za-z0-9._-]+)\s*" | ||
| r"(?P<spec>(?:[<>=!~]=?\s*[A-Za-z0-9._*+-]+(?:\s*,\s*[<>=!~]=?\s*[A-Za-z0-9._*+-]+)*)?)\s*$" | ||
| ) |
There was a problem hiding this comment.
_CONSTRAINT_RE accepts the package name as [A-Za-z0-9._-]+ only, so it won't match a third-party spec with PEP 508 extras such as httpx[http2]>=0.27,<1.0. _parse_constraint will raise ValueError, and absorb() only swallows that error when the line starts with strands-robots[ — a future contributor adding any pkg-with-extras to pyproject.toml will get a confusing test failure rather than have the test cleanly index it.
No such dep exists today, so this isn't a current blocker. Cheap robustness fix: extend the name class to allow \[[^\]]*\] after the name and strip extras post-parse (name.split('[', 1)[0]), or generalize the special-case in absorb() to skip any <name>[...] self-style line. Either keeps the test from going off when an unrelated PR adds a pkg[extra] dep.
| every dep that lives in 2+ sites today and any added tomorrow. | ||
| """ | ||
| drifted = { | ||
| name: sites for name, sites in constraint_index.items() if len(sites) >= 2 and len(set(sites.values())) > 1 |
There was a problem hiding this comment.
Drift detection compares specs as strings (after whitespace strip), so semantically-equivalent specs that differ only in operator order will false-positive. Example future Dependabot PR landing:
Pillow<13.0.0,>=10.3.0 # in [project.dependencies]
Pillow>=10.3.0,<13.0.0 # in [tool.hatch.envs.default]
would flag as drift even though pip/uv treat them identically. Today every entry uses floor-then-ceiling order so this is latent, but the test's stated property is "identical spec everywhere it appears". A more robust check parses each spec to a packaging.specifiers.SpecifierSet and compares those, giving semantic equality for free — and packaging is already imported in this file.
Not blocking; flagging so the next reader doesn't waste an afternoon debugging a false-drift failure.
| previously-unresolved transitive packages (mujoco, transformers 5.8.1, | ||
| libero, robomimic, robosuite, numba, hydra-core, etc.). | ||
| - transformers 5.x: explicit compat code at | ||
| `strands_robots/policies/lerobot_local/policy.py:308-321`. |
There was a problem hiding this comment.
Line-number reference (policy.py:308-321) is fragile — any reformat or new import above that range silently invalidates the citation, and the CHANGELOG is the one place in the repo that doesn't get an automated check for this. Suggest referencing the symbol instead, e.g. "LerobotLocalPolicy._load_model (XVLA / Florence2 compat shim)". AGENTS.md uses file_path:line_number shorthand for code references but those live in code review threads with finite shelf-life; CHANGELOGs persist across many rebases.
Non-blocking nit.
| operator, broaden this helper at the same time as the floor is added. | ||
| """ | ||
| m = re.search(r">=\s*([A-Za-z0-9._*+-]+)", spec) | ||
| return m.group(1) if m else None |
There was a problem hiding this comment.
_floor_of deliberately rejects ~= and == (per the docstring), but the rejection mode is a silent None return and the assertion at line 177 just says cannot enforce CVE floor. If anyone ever switches Pillow to Pillow~=10.3 on the (mistaken) belief that ~= implies a security floor, the test fails with a message that doesn't tell them which operator is the problem.
One extra clause in the helper or assertion message (...use the >= operator for security floors; ~= and == are not recognised) would convert that into a self-documenting failure. Strictly cosmetic — the existing message is enough to grep _floor_of and find the docstring — but cheap to add now.
Summary
Closes 19 of 20 open Dependabot alerts on https://github.com/strands-labs/robots/security/dependabot.
Changes
Direct dependency constraints (pyproject.toml)
>=8.0.0,<12.0.0>=10.3.0,<13.0.0>=6.0,<9.0.0>=6.0,<10.0.0<6.0.0<8.0.0Transitive (
uv.lockregenerated viauv lock --upgrade)Lockfile scope note
The previous
uv.lockwas generated against a partial extras set (nosim-mujoco, nobenchmark-libero, nomesh-iot). This PR'suv lock --upgradere-resolves with the full extras matrix declared inpyproject.toml, which adds 42 previously-unresolved transitive packages (notablymujoco,transformers 5.8.1,libero,robomimic,robosuite,numba,llvmlite,tensorboard,hydra-core,omegaconf,h5py).transformers 5.xalready has explicit compat code atstrands_robots/policies/lerobot_local/policy.py:309-313.Not fixed (1/20)
>=0.38.0)Blocked by upstream: lerobot 0.5.1 caps
diffusers<0.36.0,>=0.27.2. Cannot upgrade until lerobot bumps.Tracked in sec(deps): upgrade diffusers to >=0.38.0 for GHSA-98h9-4798-4q5v #210. Lockfile users remain exposed to GHSA-98h9-4798-4q5v until sec(deps): upgrade diffusers to >=0.38.0 for GHSA-98h9-4798-4q5v #210 lands.
Validation
Notes
>=1.0deps cap major).>=10.3.0per review feedback: carries the CVE-2024-28219 buffer-overflow fix for downstream consumers who install viapip install strands-robotswithout using this repo's lockfile. Note: this is the minimum security guarantee; the lockfile-pinned 12.2.0 is the recommended install path.[project.dependencies]line 40 and hatch env[tool.hatch.envs.default]line 115) updated in lockstep.grep -rn 'from cryptography\|import cryptography' strands_robots/returns nothing).uv.lock) users get the fix. Wheel-install consumers (pip install strands-robots) are not protected by a constraint floor for these packages. Floors are only added for direct deps where a CVE recurs.Section 13 - Review Rounds
d478d12989b9567fd6449+1c3682ctests/test_dependency_floors.py::test_security_floor_not_lowered,tests/test_dependency_floors.py::test_pin_site_consistency9c45ebf_DUAL_SITE_DEPSallowlist (3 entries) silently missed 4 dual-site deps already in pyproject (mypy, msgpack, pyzmq, ruff). Replaced with property-style scan that catches every dep appearing in 2+ sites; also fixed_version_tuplepadding so10.3 == 10.3.0and re-raise on unexpected parse failures so typo'd specs surface instead of silently disappearing.6f4e25dtest_multi_site_pins_consistent(replaces 3-row parametrizedtest_pin_site_consistency); verified by simulated mypy drift between[project.optional-dependencies.dev]and[tool.hatch.envs.default]-- new test fails with precise diagnostic, allowlist version would have silently passed._version_tuplePEP-440-unsafe: pre-release floor10.3.0rc1compared equal to10.3.0, silently accepting a floor that may not carry the CVE fix. Replaced withpackaging.version.Versionfor correct pre-release/post-release/dev ordering. Also added CHANGELOG wheel-install exposure note for transitive CVEs.ade474ftests/test_dependency_floors.py::test_parse_version_rejects_prerelease_below_releaseRound Budget Status
Round budget exceeded (5/3). R5 addresses the reviewer's explicit request to land the
packaging.version.Versionfix in this PR rather than as a follow-up, since it is the only deferred nit that weakens the test's stated security guarantee (pre-release floors silently pass).Remaining review feedback (deferred to follow-up issue):
_floor_ofonly recognizes>=, not~=or==(documented as intentional in the helper docstring)_KNOWN_DEFERRED_VULNStripwire for diffusers cap detection (would surface lerobot relaxation automatically)hatch run test-integ --collect-onlyresolves (CI covers unit tests; integ needs GPU)tool.hatch.envs.default(add comment or iterate all envs)These will be filed as a single follow-up issue on the project board after this PR merges.