chore(ci): add security baseline — CodeQL, Dependency Review, LLM-input check, action pinning#185
Conversation
…ut check, action pinning
cagataycali
left a comment
There was a problem hiding this comment.
This is excellent work — exactly the CI hardening the repo needs before v0.4.0. Approving the design and content.
What I like
- Action SHA pinning — especially
pypa/gh-action-pypi-publishon the publish path. That's the highest-risk vector and the most impactful single change. - Dependabot grouping — the ML stack generates huge churn (torch, transformers, diffusers); grouping keeps it manageable.
- LLM-input-safety annotations — novel and directly targets the bug class we hit in PRs #85, #86, #90. Non-blocking is the right call since the pattern scan can't know context.
- AGENTS.md section — this is how you get an autonomous coding agent to stop repeating the same class of bug. Teaching the pattern via docs is high-leverage.
Minor note on the CodeQL failure
The Analyze (Python) check failed — this is expected if the repo hasn't had CodeQL's "default setup" disabled in Settings → Code security (having both default-setup and an advanced codeql.yml conflicts). A repo admin should:
- Disable the default CodeQL setup in Settings if the advanced workflow is preferred, OR
- Remove
.github/workflows/codeql.ymland rely on the default setup.
The second commit (Remove CodeQL workflow) already tried to solve this — the third commit re-added it. Might need a quick clarification on which approach the maintainers prefer.
Verdict
✅ Design is sound, hard-fail gates are well-calibrated, no runtime code changes. The CodeQL config issue is a settings toggle, not a code problem.
🤖 AI agent review. Strands Agents. Feedback welcome!
cagataycali
left a comment
There was a problem hiding this comment.
Thorough and well-scoped security baseline. The threat model is clearly articulated (LLM-input → subprocess/XML, supply-chain on PyPI publish, optional-extras dep tree), and each workflow targets exactly those risks without over-reaching.
Highlights:
-
SHA-pinning + Dependabot maintenance strategy — the "pin once, auto-bump via Dependabot" pattern is the right answer to the maintenance burden concern. Most repos either don't pin (risky) or pin manually (unsustainable).
-
LLM Input Safety Check as annotation-only — smart to keep this as a signal rather than a gate. The pattern (
subprocess + f-string, name-variable-into-XML) is the exact bug class we've hit in PRs #85 and #90. -
Explicit "not added" section — prevents the inevitable "but why no Bandit/Trivy/Snyk" discussion. Shows the thought went into scoping, not just adding tools.
-
AGENTS.md update — feeding the learnings back into the autonomous pipeline so the agent writes safer code from the start.
CI is green. Ship it.
🤖 AI agent response. Strands Agents. Feedback welcome!
Summary
Adds a minimal CI security baseline targeting this repo's actual risk surface:
LLM input flowing into subprocess / XML / paths, heavy optional-extras dependency
tree, and supply-chain risk on the PyPI publish path. Hard-fail gates are limited
to the things that genuinely warrant blocking a merge; everything else surfaces
in the Security tab or as PR annotations.
What's added
security-and-qualitysuite)subprocess, MJCF/XML, file paths — the class of bug surfaced manually in PR #85 (MJCF sanitization gap) and PR #90 (subprocess injection, path traversal). PR + weekly schedule.subprocess + f-stringandname-variable-into-XMLin agent-callable code. Inline PR annotations on suspect lines.Action references pinned to commit SHAs
All
uses:references in existing workflows (test-lint.yml,pypi-publish-on-release.yml)and the three new workflow files are now pinned to full 40-character commit SHAs,
with the version tag preserved as a trailing comment.
pr-and-push.ymlonlyreferences a local reusable workflow so it needs no change.
Highest-impact pin:
pypa/gh-action-pypi-publish. It usesrelease/v1as amoving release branch — exactly the supply-chain pattern that the
tj-actions/changed-filesincident (March 2025) exploited. Pinning it to aspecific commit closes that risk on the PyPI publish path.
Doesn't pinning create maintenance burden?
No — Dependabot handles it. The
package-ecosystem: github-actionsblock independabot.ymlmeans new action releases arrive as auto-generated PRs thatare already SHA-pinned to the new version. Manual SHA lookups happen once
(this PR); after that, every bump is a one-click Dependabot merge — same workflow
as the existing Python dependency updates.
Expected steady-state: 1-2 single-line Dependabot PRs per month for GitHub Actions,
each with a changelog link. No tag-tracking, no manual SHA copying.
AGENTS.md update
Adds a "Review Learnings (PR #92 - CI Security Baseline)" section capturing the
LLM-input-safety pattern (allowlist enumerable params, centralise in
validate_inputs(), reject shell metachars in paths, bind to127.0.0.1) sothe autonomous coding pipeline starts writing safer code rather than waiting
for review. Also documents the action-pinning convention and what hard-fails
vs warns.
Repo settings to enable (admin action, not in this diff)
These live in Settings → Code security and analysis and need to be toggled
by a repo admin — they aren't configurable from a PR:
Also in Settings → Actions → General → Workflow permissions:
(not "Read and write"). Workflows in this PR elevate
security-events: write/
pull-requests: writeonly where needed.The CodeQL and Dependency Review workflows in this PR will surface findings in
the Security tab once these are on.
Verification
Why this set, why now
The PRs landed in the v0.4.0 prep cycle (#85, #86, #90) each had at least one
real bug in the LLM-input-to-subprocess / LLM-input-to-XML class that was
caught in human review. CodeQL + the input-safety annotations are aimed at
that specific bug class. The action pinning closes a known supply-chain
threat model on the PyPI publish path. Dependency Review + Dependabot give
the maintainer visibility into the optional-extras dep tree which currently
has none.