feat: cascade checker recognizes in-repo container-base.yml builds #5
Some crunchtools images are built and published by a workflow inside a
DIFFERENT repo than their name suggests:
- quay.io/crunchtools/acquacotta-base is built by container-base.yml
inside crunchtools/acquacotta
- quay.io/crunchtools/rotv-base is built by build-base.yml inside
crunchtools/rotv
Before this change, the checker flagged FROM lines pointing at those
images as broken (no separate repo of that name in the org). Augment
the checker with a pre-pass that scans every workflow file for image-
publication patterns (IMAGE_NAME: crunchtools/X env vars, plus literal
quay.io/crunchtools/X references). FROM targets resolve against the
union of (repo names + workflow-published image names).
Cuts the live-org false-positive WARN about acquacotta-base.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
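The pre-pass described in this commit message could look roughly like the sketch below. The regexes and the function name `published_images` are assumptions made for illustration; the actual patterns in validate-cascade.py are not shown in this PR.

```python
import re

# Hypothetical patterns (not taken from validate-cascade.py).
# Matches an env entry such as `IMAGE_NAME: crunchtools/acquacotta-base`.
IMAGE_NAME_ENV = re.compile(
    r"^[ \t]*IMAGE_NAME:[ \t]*['\"]?crunchtools/([a-z0-9]+(?:[._-]+[a-z0-9]+)*)",
    re.MULTILINE,
)
# Matches a literal reference such as `quay.io/crunchtools/acquacotta-base:latest`.
QUAY_LITERAL = re.compile(r"quay\.io/crunchtools/([a-z0-9]+(?:[._-]+[a-z0-9]+)*)")


def published_images(workflow_text: str) -> set:
    """Return the crunchtools image names a workflow file appears to publish."""
    names = set(IMAGE_NAME_ENV.findall(workflow_text))
    names.update(QUAY_LITERAL.findall(workflow_text))
    return names


sample = """\
env:
  IMAGE_NAME: crunchtools/acquacotta-base
jobs:
  build:
    steps:
      - run: podman push quay.io/crunchtools/acquacotta-base:latest
"""
found = published_images(sample)
```

Collecting names into a set means an image that is declared through both the env var and a literal reference is counted once.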
Code Review
This pull request updates validate-cascade.py to scan and cache GitHub workflow files across repositories to identify published crunchtools images, resolving false-positive broken image references. The review feedback highlights two key improvements: first, removing a redundant fetch_text fallback since all workflows are already cached, and handling .yaml extensions; second, filtering out non-file entries when scanning the workflows directory to prevent potential errors.
    txt = workflows_cache.get((r, wf))
    if txt is None:
        txt = fetch_text(args.org, r, f".github/workflows/{wf}", token)
    if txt is None:
Since workflows_cache is populated by scanning the entire .github/workflows directory for every repository, any existing workflow file is already cached. If a workflow file (like build.yml or container.yml) is not present in workflows_cache, it means the file does not exist in the repository.
Calling fetch_text here is redundant and will always result in a useless 404 API request. Since most repositories only have one of these two workflow files, this causes an unnecessary 404 request for almost every repository, which can easily trigger GitHub API rate limits and slow down execution.
We can safely remove the fallback to fetch_text and also handle the .yaml extension if the .yml lookup fails.
Suggested change:
    - txt = workflows_cache.get((r, wf))
    - if txt is None:
    -     txt = fetch_text(args.org, r, f".github/workflows/{wf}", token)
    - if txt is None:
    + txt = workflows_cache.get((r, wf))
    + if txt is None and wf.endswith(".yml"):
    +     txt = workflows_cache.get((r, wf[:-4] + ".yaml"))
    + if txt is None:
    +     continue
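As a self-contained illustration of the suggested lookup, here is a minimal sketch that assumes the cache is a plain dict keyed by `(repo, workflow_filename)`. The helper name `cached_workflow` and the sample cache contents are hypothetical.

```python
from typing import Dict, Optional, Tuple


def cached_workflow(
    cache: Dict[Tuple[str, str], str], repo: str, wf: str
) -> Optional[str]:
    """Look up a workflow in the cache, trying the .yaml spelling if .yml misses."""
    txt = cache.get((repo, wf))
    if txt is None and wf.endswith(".yml"):
        # Swap the 4-character ".yml" suffix for ".yaml" and retry.
        txt = cache.get((repo, wf[:-4] + ".yaml"))
    return txt  # None means the file is not in the repo; no API call is made


# Hypothetical cache contents for illustration only.
cache = {("rotv", "build.yaml"): "name: build"}
hit = cached_workflow(cache, "rotv", "build.yml")
miss = cached_workflow(cache, "rotv", "container.yml")
```

A cache miss returns `None` instead of triggering a network request, which is the behavior the review comment argues for.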
    for entry in wf_entries:
        name = entry.get("name", "")
        if not name.endswith((".yml", ".yaml")):
            continue
When scanning the .github/workflows directory, some entries might not be regular files (e.g., subdirectories, symlinks, or submodules). Calling fetch_text on a non-file entry will cause a TypeError or KeyError because the GitHub API response for a directory listing does not contain a content field.
To prevent potential crashes, explicitly filter out non-file entries by checking entry.get("type") == "file".
Suggested change:
    - for entry in wf_entries:
    -     name = entry.get("name", "")
    -     if not name.endswith((".yml", ".yaml")):
    -         continue
    + for entry in wf_entries:
    +     if entry.get("type") != "file":
    +         continue
    +     name = entry.get("name", "")
    +     if not name.endswith((".yml", ".yaml")):
    +         continue
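The filter can be exercised on its own with a hand-written directory listing. This is a sketch: the function name `workflow_file_names` and the sample entries are made up for illustration, though the `type` values (`file`, `dir`) follow the shape of a GitHub contents listing.

```python
from typing import Dict, List


def workflow_file_names(entries: List[Dict[str, str]]) -> List[str]:
    """Keep only regular files whose names end in .yml or .yaml."""
    names = []
    for entry in entries:
        if entry.get("type") != "file":
            continue  # skip directories, symlinks, and submodules
        name = entry.get("name", "")
        if name.endswith((".yml", ".yaml")):
            names.append(name)
    return names


# Hypothetical listing; the second entry is a directory whose name ends in .yml.
listing = [
    {"name": "build.yml", "type": "file"},
    {"name": "scripts.yml", "type": "dir"},
    {"name": "README.md", "type": "file"},
    {"name": "container-base.yaml", "type": "file"},
]
kept = workflow_file_names(listing)
```

Checking the entry type before the extension keeps a directory named like a workflow file from ever reaching the fetch step.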
Gatehouse AI code review is now gate #5 (between Gourmand and Container Build) for all MCP servers. Added to both the ordered gate list and the CI pipeline table.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Some crunchtools Quay images are built by a workflow inside a DIFFERENT repo than their name suggests (e.g. quay.io/crunchtools/acquacotta-base is built by container-base.yml inside crunchtools/acquacotta; rotv-base likewise).
Before this change, the checker treated those FROM targets as broken because no separate repo of that name existed. Now it scans every workflow file in every repo for image-publication patterns (IMAGE_NAME: crunchtools/X env vars + literal quay.io/crunchtools/X references) and resolves FROM lines against the union of repo-names + workflow-published-image-names.
Cuts the live-org false-positive WARN about acquacotta-base. Remaining WARNs are legitimate (over-dispatch from ubi10-core to rotv, where rotv's main Containerfile uses an ARG for its FROM; separate, smaller issue).
🤖 Generated with Claude Code
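The resolution step (FROM targets checked against the union of repo names and workflow-published image names) might be sketched as follows. The regexes, the function name `unresolved_from_targets`, and the choice to skip ARG-based FROM lines are assumptions for illustration, not the checker's actual code.

```python
import re
from typing import Iterable, List

# Hypothetical patterns; an optional --platform flag is skipped before the image.
FROM_LINE = re.compile(
    r"^[ \t]*FROM[ \t]+(?:--platform=\S+[ \t]+)?(\S+)",
    re.IGNORECASE | re.MULTILINE,
)
ORG_IMAGE = re.compile(r"^quay\.io/crunchtools/([a-z0-9]+(?:[._-]+[a-z0-9]+)*)")


def unresolved_from_targets(
    containerfile: str, repo_names: Iterable[str], published: Iterable[str]
) -> List[str]:
    """Return crunchtools image names in FROM lines that match neither set."""
    known = set(repo_names) | set(published)
    missing = []
    for target in FROM_LINE.findall(containerfile):
        if "$" in target:
            continue  # ARG-based FROM: not statically resolvable in this sketch
        match = ORG_IMAGE.match(target)
        if match and match.group(1) not in known:
            missing.append(match.group(1))
    return missing


containerfile = "FROM quay.io/crunchtools/acquacotta-base:latest\nRUN true\n"
repos = {"acquacotta", "rotv"}
before = unresolved_from_targets(containerfile, repos, set())
after = unresolved_from_targets(containerfile, repos, {"acquacotta-base"})
```

With only repo names, `acquacotta-base` is reported as unresolved; adding the workflow-published names clears it, which mirrors the false positive this PR removes.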