chore(ci): add security baseline — CodeQL, Dependency Review, LLM-input check, action pinning by sundargthb · Pull Request #185 · strands-labs/robots

sundargthb · 2026-05-20T20:27:20Z

Summary

Adds a minimal CI security baseline targeting this repo's actual risk surface:
LLM input flowing into subprocess / XML / paths, heavy optional-extras dependency
tree, and supply-chain risk on the PyPI publish path. Hard-fail gates are limited
to the things that genuinely warrant blocking a merge; everything else surfaces
in the Security tab or as PR annotations.

What's added

Workflow	Catches	Hard-fail?
CodeQL (Python, `security-and-quality` suite)	Taint flows from LLM-supplied input into `subprocess`, MJCF/XML, file paths — the class of bug surfaced manually in PR #85 (MJCF sanitization gap) and PR #90 (subprocess injection, path traversal). PR + weekly schedule.	No — informational, Security tab
Dependency Review	New deps with known high/critical CVEs. Comment-on-failure.	Yes (high/critical CVE on new dep)
LLM Input Safety Check	Pattern scan for `subprocess + f-string` and `name-variable-into-XML` in agent-callable code. Inline PR annotations on suspect lines.	No — annotation only
Dependabot	Weekly grouped updates for the ML stack, sim stack, dev tools, AND the GitHub Actions used by these workflows. Grouping prevents 8-PR weekly churn from torch/transformers/etc.	N/A (config)

Action references pinned to commit SHAs

All uses: references in existing workflows (test-lint.yml, pypi-publish-on-release.yml)
and the three new workflow files are now pinned to full 40-character commit SHAs,
with the version tag preserved as a trailing comment. pr-and-push.yml only
references a local reusable workflow so it needs no change.

Highest-impact pin: pypa/gh-action-pypi-publish. It uses release/v1 as a
moving release branch — exactly the supply-chain pattern that the
tj-actions/changed-files incident (March 2025) exploited. Pinning it to a
specific commit closes that risk on the PyPI publish path.

Doesn't pinning create maintenance burden?

No — Dependabot handles it. The package-ecosystem: github-actions block in
dependabot.yml means new action releases arrive as auto-generated PRs that
are already SHA-pinned to the new version. Manual SHA lookups happen once
(this PR); after that, every bump is a one-click Dependabot merge — same workflow
as the existing Python dependency updates.

Expected steady-state: 1-2 single-line Dependabot PRs per month for GitHub Actions,
each with a changelog link. No tag-tracking, no manual SHA copying.

AGENTS.md update

Adds a "Review Learnings (PR #92 - CI Security Baseline)" section capturing the
LLM-input-safety pattern (allowlist enumerable params, centralise in
validate_inputs(), reject shell metachars in paths, bind to 127.0.0.1) so
the autonomous coding pipeline starts writing safer code rather than waiting
for review. Also documents the action-pinning convention and what hard-fails
vs warns.

Repo settings to enable (admin action, not in this diff)

These live in Settings → Code security and analysis and need to be toggled
by a repo admin — they aren't configurable from a PR:

Secret scanning → Enable
Push protection → Enable (blocks pushes containing detected secrets)
Dependabot alerts → Enable
Dependabot security updates → Enable
Dependency graph → confirm Enabled (usually on by default for public repos)

Also in Settings → Actions → General → Workflow permissions:

Confirm "Read repository contents and packages permissions" is selected
(not "Read and write"). Workflows in this PR elevate security-events: write
/ pull-requests: write only where needed.

The CodeQL and Dependency Review workflows in this PR will surface findings in
the Security tab once these are on.

Verification

CodeQL runs cleanly on this PR. Findings (if any) appear in the Security tab.
Dependency Review is a no-op on this PR since it adds no runtime deps.
LLM-input-safety annotates the files it scans; PR shows whether annotations fire.
Dependabot config is validated by GitHub on push.

Why this set, why now

The PRs landed in the v0.4.0 prep cycle (#85, #86, #90) each had at least one
real bug in the LLM-input-to-subprocess / LLM-input-to-XML class that was
caught in human review. CodeQL + the input-safety annotations are aimed at
that specific bug class. The action pinning closes a known supply-chain
threat model on the PyPI publish path. Dependency Review + Dependabot give
the maintainer visibility into the optional-extras dep tree which currently
has none.

…ut check, action pinning

cagataycali

This is excellent work — exactly the CI hardening the repo needs before v0.4.0. Approving the design and content.

What I like

Action SHA pinning — especially pypa/gh-action-pypi-publish on the publish path. That's the highest-risk vector and the most impactful single change.
Dependabot grouping — the ML stack generates huge churn (torch, transformers, diffusers); grouping keeps it manageable.
LLM-input-safety annotations — novel and directly targets the bug class we hit in PRs #85, #86, #90. Non-blocking is the right call since the pattern scan can't know context.
AGENTS.md section — this is how you get an autonomous coding agent to stop repeating the same class of bug. Teaching the pattern via docs is high-leverage.

Minor note on the CodeQL failure

The Analyze (Python) check failed — this is expected if the repo hasn't had CodeQL's "default setup" disabled in Settings → Code security (having both default-setup and an advanced codeql.yml conflicts). A repo admin should:

Disable the default CodeQL setup in Settings if the advanced workflow is preferred, OR
Remove .github/workflows/codeql.yml and rely on the default setup.

The second commit (Remove CodeQL workflow) already tried to solve this — the third commit re-added it. Might need a quick clarification on which approach the maintainers prefer.

Verdict

✅ Design is sound, hard-fail gates are well-calibrated, no runtime code changes. The CodeQL config issue is a settings toggle, not a code problem.

🤖 AI agent review. Strands Agents. Feedback welcome!

cagataycali

Thorough and well-scoped security baseline. The threat model is clearly articulated (LLM-input → subprocess/XML, supply-chain on PyPI publish, optional-extras dep tree), and each workflow targets exactly those risks without over-reaching.

Highlights:

SHA-pinning + Dependabot maintenance strategy — the "pin once, auto-bump via Dependabot" pattern is the right answer to the maintenance burden concern. Most repos either don't pin (risky) or pin manually (unsustainable).
LLM Input Safety Check as annotation-only — smart to keep this as a signal rather than a gate. The pattern (subprocess + f-string, name-variable-into-XML) is the exact bug class we've hit in PRs #85 and #90.
Explicit "not added" section — prevents the inevitable "but why no Bandit/Trivy/Snyk" discussion. Shows the thought went into scoping, not just adding tools.
AGENTS.md update — feeding the learnings back into the autonomous pipeline so the agent writes safer code from the start.

CI is green. Ship it.

🤖 AI agent response. Strands Agents. Feedback welcome!

chore(ci): add security baseline — CodeQL, Dependency Review, LLM-inp…

18b8193

…ut check, action pinning

sundargthb requested a review from cagataycali May 20, 2026 20:27

Sundar Raghavan added 2 commits May 20, 2026 13:46

Remove CodeQL workflow; upstream uses default setup

1329da8

fix: adding codeql advanced yml

1b243bd

cagataycali approved these changes May 20, 2026

View reviewed changes

sundargthb merged commit 0bc8460 into strands-labs:main May 21, 2026
3 of 4 checks passed

sundargthb deleted the chore/ci-security-baseline branch May 21, 2026 07:03

yinsong1986 mentioned this pull request May 21, 2026

fix(eval): close #187 — propagate spec.instruction + per-episode policy.reset(seed) for ZMQ reproducibility #188

Merged

cagataycali mentioned this pull request May 22, 2026

strands-labs/robots roadmap | v0.3.9 → v0.4.0 #94

Open

cagataycali moved this to Done in Strands Labs - Robots May 23, 2026

cagataycali added this to Strands Labs - Robots May 23, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(ci): add security baseline — CodeQL, Dependency Review, LLM-input check, action pinning#185

chore(ci): add security baseline — CodeQL, Dependency Review, LLM-input check, action pinning#185
sundargthb merged 3 commits into
strands-labs:mainfrom
sundargthb:chore/ci-security-baseline

sundargthb commented May 20, 2026 •

edited

Loading

Uh oh!

cagataycali left a comment

Uh oh!

cagataycali left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

sundargthb commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's added

Action references pinned to commit SHAs

Doesn't pinning create maintenance burden?

AGENTS.md update

Repo settings to enable (admin action, not in this diff)

Verification

Why this set, why now

Uh oh!

cagataycali left a comment

Choose a reason for hiding this comment

What I like

Minor note on the CodeQL failure

Verdict

Uh oh!

cagataycali left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

sundargthb commented May 20, 2026 •

edited

Loading