ci: add retry job to refresh cron workflows #393

Open
matheus1lva wants to merge 1 commit into main from ci/refresh-retry-jobs

@matheus1lva
Collaborator

Summary

  • Adds a refresh_retry job to each refresh workflow (lists, snapshot, timeseries, reports, and both historical variants) that runs only when the primary refresh job fails. This absorbs transient GitHub infra failures like action-download errors (e.g. Failed to download archive '.../oven-sh/setup-bun/...').
  • Consolidates Uptime Kuma reporting into a final notify job so success/failure is based on whether either attempt succeeded, instead of firing from the primary job that just failed.
  • Historical workflows get the retry job but no notify job, since they didn't have Uptime Kuma reporting before.
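
The job layout described above could look roughly like the following for one workflow. This is a minimal sketch and not the actual diff: the cron schedule, the `bun run refresh:lists` command, and the `UPTIME_KUMA_PUSH_URL` secret are hypothetical placeholders, and the notify step assumes an Uptime Kuma push-type monitor that accepts `status=up|down` as a query parameter.

```yaml
# Hypothetical sketch of a refresh workflow with a retry and a consolidated notify job.
name: refresh-lists
on:
  workflow_dispatch:
  schedule:
    - cron: "0 * * * *"  # assumed schedule, not taken from the PR

jobs:
  refresh:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - run: bun run refresh:lists  # hypothetical refresh command

  refresh_retry:
    needs: refresh
    # failure() is true only when a job listed in `needs` failed,
    # so this job is skipped whenever the primary attempt succeeds.
    if: ${{ failure() }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - run: bun run refresh:lists  # same hypothetical command as above

  notify:
    needs: [refresh, refresh_retry]
    # always() lets this job run even if an earlier job failed or was skipped.
    if: ${{ always() }}
    runs-on: ubuntu-latest
    steps:
      - name: Report to Uptime Kuma
        env:
          PUSH_URL: ${{ secrets.UPTIME_KUMA_PUSH_URL }}  # hypothetical secret name
          # Evaluates to the string "true" if either attempt succeeded.
          OK: ${{ needs.refresh.result == 'success' || needs.refresh_retry.result == 'success' }}
        run: |
          if [ "$OK" = "true" ]; then status=up; else status=down; fi
          curl -fsS "${PUSH_URL}?status=${status}&msg=refresh-lists"
```

Because only `notify` talks to Uptime Kuma, each workflow run sends a single status update regardless of how many attempts ran. Under this layout the historical workflows would carry the `refresh` and `refresh_retry` jobs only.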

Test plan

  • Trigger refresh-lists manually and confirm that on success the refresh_retry job is skipped
    ```
    gh workflow run refresh-lists.yml
    gh run list --workflow=refresh-lists.yml --limit 1
    ```
    Expected: single run with refresh + notify jobs, no retry job executed.

  • Simulate failure (temporarily break the refresh command on a test branch) and confirm refresh_retry runs
    ```
    gh workflow run refresh-lists.yml --ref <test-branch>
    gh run view
    ```
    Expected: refresh fails, refresh_retry runs, notify reports based on retry result.

  • Verify Uptime Kuma gets exactly one status update per workflow run (no duplicates from both attempts reporting)

🤖 Generated with Claude Code

Adds a `refresh_retry` job that runs on failure to handle transient
GitHub infra errors (e.g. action download failures). A final `notify`
job consolidates Uptime Kuma reporting based on either attempt's result.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@vercel

vercel Bot commented Apr 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| kong | Ready | Preview, Comment | Apr 20, 2026 4:41pm |

Request Review

Contributor

@murderteeth murderteeth left a comment

  • Why are we using retries? What evidence motivated this?
  • The original issue suggests it may be a timing issue. Is that theory wrong?
  • Why aren't historical workflows monitored?

@matheus1lva
Collaborator Author

At the time of the report it was a random GitHub failure: the job was exiting during the setup phase, not on a timeout or anything similar. But when I looked again, the reports workflow is failing, and apparently that one is a timing issue.

The initial intent remains, since there's nothing else to do about GitHub itself failing. Now I'm going to check the other ones.
