Add Kubernetes Job & CronJob for isolated and scheduled pipeline execution #89

Merged
fuzziecoder merged 1 commit into codex/fix-remaining-issues-and-raise-pr from codex/implement-containerized-task-execution
Feb 25, 2026

@fuzziecoder fuzziecoder commented Feb 25, 2026

Motivation

  • Enable secure, containerized stage execution and recurring pipeline runs using Kubernetes primitives, so that stages run in isolation from the API pods with explicit retry, timeout, and cleanup semantics.
  • Provide a production-friendly scheduling option compatible with autoscaling and node isolation to improve fault tolerance and operational safety.

Description

  • Added deploy/k8s/stage-execution-job.yaml, a Job template with backoffLimit, activeDeadlineSeconds, ttlSecondsAfterFinished, nodeSelector, tolerations, and resource requests/limits for isolated one-off stage execution.
  • Added deploy/k8s/pipeline-scheduler-cronjob.yaml, a CronJob template with schedule, concurrencyPolicy: Forbid, history retention, retry/deadline controls, node pinning, and resource constraints for scheduled pipelines.
  • Updated deploy/k8s/kustomization.yaml to include the new Job and CronJob so they are applied with kubectl apply -k deploy/k8s.
  • Expanded deploy/README.md with usage examples and operational guidance for job-based execution and CronJob scheduling.

Testing

  • Parsed all Kubernetes manifests with Python and yaml.safe_load_all; every file loaded without parse errors. This checks YAML syntax only, not the Kubernetes schema.
  • kubectl was not available in the environment, so kubectl kustomize deploy/k8s could not be run and the rendered output is unverified.
  • Verified that deploy/k8s/kustomization.yaml references the new stage-execution-job.yaml and pipeline-scheduler-cronjob.yaml, and that both new manifest files are syntactically valid YAML.
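
The parse check from the first bullet can be reproduced with a short script. This is a minimal sketch, assuming PyYAML is installed; the helper name count_manifest_documents is hypothetical and not part of the repository. It confirms YAML syntax only and says nothing about whether the Kubernetes API would accept the manifests.

```python
from pathlib import Path

import yaml  # PyYAML, a third-party package (assumed to be installed)


def count_manifest_documents(directory):
    """Parse every *.yaml file in `directory` and count its YAML documents.

    Returns {filename: document_count}. A syntax error in any file
    propagates as yaml.YAMLError, which is the signal we want here.
    """
    counts = {}
    for path in sorted(Path(directory).glob("*.yaml")):
        with path.open() as handle:
            # safe_load_all yields one object per '---'-separated document;
            # empty documents come back as None and are skipped.
            docs = [doc for doc in yaml.safe_load_all(handle) if doc is not None]
        counts[path.name] = len(docs)
    return counts
```

Calling count_manifest_documents("deploy/k8s") from the repository root would list each manifest with its document count, or raise on the first file that fails to parse.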


@fuzziecoder fuzziecoder self-assigned this Feb 25, 2026
@fuzziecoder fuzziecoder merged commit 68284bf into codex/fix-remaining-issues-and-raise-pr Feb 25, 2026
4 of 7 checks passed

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

Comment on lines +31 to +36
command: ["python", "-m", "backend.cli", "execute", "--pipeline", "/app/backend/examples/sample_pipeline.yaml"]
env:
  - name: PIPELINE_ID
    value: "sample-pipeline"
  - name: STAGE_ID
    value: "stage-1"

🔴 Stage execution Job runs entire pipeline instead of a single stage

The stage-execution-job.yaml is described as running "one-off isolated stage execution" and sets STAGE_ID and PIPELINE_ID environment variables, but the actual command invokes python -m backend.cli execute --pipeline /app/backend/examples/sample_pipeline.yaml which has no --stage flag. The CLI's execute subcommand only accepts --pipeline and --verbose (backend/cli.py:117-130), and the backend code never reads STAGE_ID or PIPELINE_ID from environment variables.

Root Cause

The container command does not pass the stage identifier to the CLI. The STAGE_ID env var is set but never consumed:

command: ["python", "-m", "backend.cli", "execute", "--pipeline", "/app/backend/examples/sample_pipeline.yaml"]
env:
  - name: STAGE_ID
    value: "stage-1"

Meanwhile, backend/cli.py:117-130 only defines --pipeline and --verbose arguments for the execute subcommand. There is no --stage argument and no code that reads os.environ['STAGE_ID'].

Impact: Every invocation of this Job will execute the entire pipeline (all 4 stages) rather than just the intended single stage (stage-1). This defeats the purpose of isolated stage execution and could cause unintended side effects (e.g., writing to a database via the write_data stage when only validate_data was intended).

Prompt for agents
The stage-execution-job.yaml sets STAGE_ID and PIPELINE_ID env vars but the CLI command does not use them. To fix this properly:

1. In backend/cli.py, add a --stage argument to the execute subcommand (around line 119-130) that accepts an optional stage ID to execute only that stage.
2. Update the execute_pipeline function (line 17) to accept and use the stage parameter, filtering execution to only the specified stage.
3. Update the command in deploy/k8s/stage-execution-job.yaml line 31 to pass the stage flag, e.g.:
   command: ["python", "-m", "backend.cli", "execute", "--pipeline", "/app/backend/examples/sample_pipeline.yaml", "--stage", "stage-1"]
4. Alternatively, if env-var-based configuration is preferred, update backend/cli.py to read STAGE_ID from os.environ as a fallback.
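
Steps 1, 2, and 4 above could look roughly like the sketch below. It is illustrative only: the real parser in backend/cli.py is not shown in this review, so build_parser, select_stages, and the assumption that each stage is a dict with an "id" key are all hypothetical. The --stage flag falls back to the STAGE_ID environment variable when it is omitted.

```python
import argparse
import os


def build_parser():
    """Hypothetical stand-in for the argument parser in backend/cli.py."""
    parser = argparse.ArgumentParser(prog="backend.cli")
    subparsers = parser.add_subparsers(dest="command", required=True)

    execute = subparsers.add_parser("execute", help="Execute a pipeline")
    execute.add_argument("--pipeline", required=True, help="Path to the pipeline YAML")
    execute.add_argument("--verbose", action="store_true")
    # Proposed flag: an explicit --stage wins; otherwise use STAGE_ID if set.
    # An empty STAGE_ID is treated the same as an unset one.
    execute.add_argument(
        "--stage",
        default=os.environ.get("STAGE_ID") or None,
        help="Run only this stage (defaults to $STAGE_ID)",
    )
    return parser


def select_stages(stages, stage_id):
    """Return the stages to run; `stages` is assumed to be a list of dicts with an 'id'."""
    if stage_id is None:
        return list(stages)  # no filter requested: run the whole pipeline
    selected = [stage for stage in stages if stage["id"] == stage_id]
    if not selected:
        # Fail loudly rather than silently running nothing or everything.
        raise ValueError(f"unknown stage: {stage_id!r}")
    return selected
```

Raising on an unknown stage ID is a deliberate choice in this sketch: a typo in the Job manifest then fails the container instead of quietly executing the full pipeline.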

- frontend.yaml
- worker.yaml
- autoscaling.yaml
- stage-execution-job.yaml

🔴 Job included in kustomization.yaml will run immediately on every kubectl apply -k

The stage-execution-job.yaml (a Job, not a CronJob) is listed in kustomization.yaml resources. This means every kubectl apply -k deploy/k8s will attempt to create this Job, causing it to execute immediately on first deployment.

Root Cause

Unlike Deployments and CronJobs, a Kubernetes Job is a run-once resource that starts executing as soon as it's created. Including it in kustomization.yaml:9 means:

  1. On first kubectl apply -k deploy/k8s, the Job is created and immediately starts running the pipeline — this is almost certainly unintended for what the README calls a "template" (deploy/README.md:49: "one-off isolated stage execution job template").
  2. On subsequent kubectl apply -k deploy/k8s invocations, if the Job spec hasn't changed, it's a no-op. But if the spec changes, the apply will fail because a Job's pod template (spec.template) is immutable after creation.

The README itself shows two separate usage patterns — kubectl apply -k deploy/k8s for the full stack (deploy/README.md:39) and kubectl apply -f deploy/k8s/stage-execution-job.yaml for one-off execution (deploy/README.md:58) — which are contradictory.

Impact: Every fresh cluster deployment will trigger an unintended pipeline execution. Additionally, updating the Job template will cause kubectl apply -k to fail on clusters where the Job already exists.

Suggested change
- - stage-execution-job.yaml
+ # stage-execution-job.yaml is a one-off template; apply it directly with:
+ # kubectl apply -f deploy/k8s/stage-execution-job.yaml
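
Once the Job is applied on its own, re-running it under the same name still hits the immutability problem described above. One possible workaround, not something this PR implements, is to give each run a uniquely named Job. The sketch below builds such a manifest as a plain dict. The function name build_stage_job, the stage-exec- name prefix, the container name, and the numeric values for backoffLimit, activeDeadlineSeconds, and ttlSecondsAfterFinished are all assumptions for illustration, and the IDs are assumed to already be valid in Kubernetes object names.

```python
import json
import time


def build_stage_job(pipeline_id, stage_id, image, command, run_suffix=None):
    """Build a batch/v1 Job manifest whose name is unique per run.

    `run_suffix` defaults to the current Unix time, so each call yields a
    new Job object instead of an update to an existing (immutable) one.
    """
    suffix = run_suffix or str(int(time.time()))
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"stage-exec-{pipeline_id}-{stage_id}-{suffix}"},
        "spec": {
            # Illustrative values only; the PR's actual settings are not shown here.
            "backoffLimit": 2,
            "activeDeadlineSeconds": 1800,
            "ttlSecondsAfterFinished": 3600,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "stage-runner",  # hypothetical container name
                            "image": image,
                            "command": list(command),
                            "env": [
                                {"name": "PIPELINE_ID", "value": pipeline_id},
                                {"name": "STAGE_ID", "value": stage_id},
                            ],
                        }
                    ],
                }
            },
        },
    }


def to_json(manifest):
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

Printing to_json(build_stage_job(...)) and piping the result into kubectl apply -f - would create a fresh Job on every run, and ttlSecondsAfterFinished would clean up finished ones.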