Add Kubernetes Job & CronJob for isolated and scheduled pipeline execution #89
Conversation
Note: CodeRabbit review skipped. Auto reviews are disabled on base/target branches other than the default branch.
Merged 68284bf into codex/fix-remaining-issues-and-raise-pr
```yaml
command: ["python", "-m", "backend.cli", "execute", "--pipeline", "/app/backend/examples/sample_pipeline.yaml"]
env:
  - name: PIPELINE_ID
    value: "sample-pipeline"
  - name: STAGE_ID
    value: "stage-1"
```
🔴 Stage execution Job runs entire pipeline instead of a single stage
The stage-execution-job.yaml is described as running "one-off isolated stage execution" and sets STAGE_ID and PIPELINE_ID environment variables, but the actual command invokes python -m backend.cli execute --pipeline /app/backend/examples/sample_pipeline.yaml which has no --stage flag. The CLI's execute subcommand only accepts --pipeline and --verbose (backend/cli.py:117-130), and the backend code never reads STAGE_ID or PIPELINE_ID from environment variables.
Root Cause
The container command does not pass the stage identifier to the CLI. The STAGE_ID env var is set but never consumed:
```yaml
command: ["python", "-m", "backend.cli", "execute", "--pipeline", "/app/backend/examples/sample_pipeline.yaml"]
env:
  - name: STAGE_ID
    value: "stage-1"
```

Meanwhile, backend/cli.py:117-130 only defines `--pipeline` and `--verbose` arguments for the execute subcommand. There is no `--stage` argument and no code that reads `os.environ['STAGE_ID']`.
Impact: Every invocation of this Job will execute the entire pipeline (all 4 stages) rather than just the intended single stage (stage-1). This defeats the purpose of isolated stage execution and could cause unintended side effects (e.g., writing to a database via the write_data stage when only validate_data was intended).
Prompt for agents
The stage-execution-job.yaml sets STAGE_ID and PIPELINE_ID env vars but the CLI command does not use them. To fix this properly:
1. In backend/cli.py, add a --stage argument to the execute subcommand (around line 119-130) that accepts an optional stage ID to execute only that stage.
2. Update the execute_pipeline function (line 17) to accept and use the stage parameter, filtering execution to only the specified stage.
3. Update the command in deploy/k8s/stage-execution-job.yaml line 31 to pass the stage flag, e.g.:
command: ["python", "-m", "backend.cli", "execute", "--pipeline", "/app/backend/examples/sample_pipeline.yaml", "--stage", "stage-1"]
4. Alternatively, if env-var-based configuration is preferred, update backend/cli.py to read STAGE_ID from os.environ as a fallback.
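Steps 1 and 4 above could be combined in one argparse change: accept `--stage` on the `execute` subcommand and fall back to the `STAGE_ID` env var. This is a hedged sketch; the parser structure and names mirror the review's description of backend/cli.py but are assumptions, not the actual code:

```python
import argparse
import os

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the execute subcommand with a new --stage flag (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="backend.cli")
    sub = parser.add_subparsers(dest="command")

    execute = sub.add_parser("execute")
    execute.add_argument("--pipeline", required=True, help="Path to pipeline YAML")
    execute.add_argument("--verbose", action="store_true")
    # New: run only one stage; the STAGE_ID env var serves as the fallback
    # so the existing Job manifest works without a command change.
    execute.add_argument("--stage", default=os.environ.get("STAGE_ID"),
                         help="Optional stage ID; execute only this stage")
    return parser
```

With this in place, `execute_pipeline` would receive `args.stage` and filter execution accordingly, and both the `--stage` flag and the env-var path behave consistently.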
```yaml
  - frontend.yaml
  - worker.yaml
  - autoscaling.yaml
  - stage-execution-job.yaml
```
🔴 Job included in kustomization.yaml will run immediately on every kubectl apply -k
The stage-execution-job.yaml (a Job, not a CronJob) is listed in kustomization.yaml resources. This means every kubectl apply -k deploy/k8s will attempt to create this Job, causing it to execute immediately on first deployment.
Root Cause
Unlike Deployments and CronJobs, a Kubernetes Job is a run-once resource that starts executing as soon as it's created. Including it in kustomization.yaml:9 means:
- On first `kubectl apply -k deploy/k8s`, the Job is created and immediately starts running the pipeline. This is almost certainly unintended for what the README calls a "template" (deploy/README.md:49: "one-off isolated stage execution job template").
- On subsequent `kubectl apply -k deploy/k8s` invocations, if the Job spec hasn't changed, it's a no-op. But if the spec changes, the apply will fail because Job specs are immutable after creation.
The README itself shows two separate usage patterns — kubectl apply -k deploy/k8s for the full stack (deploy/README.md:39) and kubectl apply -f deploy/k8s/stage-execution-job.yaml for one-off execution (deploy/README.md:58) — which are contradictory.
Impact: Every fresh cluster deployment will trigger an unintended pipeline execution. Additionally, updating the Job template will cause kubectl apply -k to fail on clusters where the Job already exists.
```diff
-  - stage-execution-job.yaml
+  # stage-execution-job.yaml is a one-off template; apply it directly with:
+  #   kubectl apply -f deploy/k8s/stage-execution-job.yaml
```
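Beyond removing the Job from kustomization.yaml, the immutable-spec failure on re-apply can be avoided with `metadata.generateName`, which creates a fresh Job object on every run. This is a sketch under assumptions (the image name and field values are illustrative, not from the PR); note that `generateName` requires `kubectl create -f`, since `kubectl apply` needs a fixed `name`:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  generateName: stage-execution-   # new Job object per run; use `kubectl create -f`, not `apply`
spec:
  ttlSecondsAfterFinished: 600     # garbage-collect finished Jobs automatically
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: stage
          image: pipeline-backend:latest   # illustrative image name
          command: ["python", "-m", "backend.cli", "execute",
                    "--pipeline", "/app/backend/examples/sample_pipeline.yaml"]
```

Each `kubectl create -f` then yields a uniquely named Job (e.g. `stage-execution-x7k2p`), so repeated runs never collide with an existing immutable spec.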
Motivation
Description
- `deploy/k8s/stage-execution-job.yaml`, a `Job` template with `backoffLimit`, `activeDeadlineSeconds`, `ttlSecondsAfterFinished`, `nodeSelector`, `tolerations`, and resource requests/limits for isolated one-off stage execution.
- `deploy/k8s/pipeline-scheduler-cronjob.yaml`, a `CronJob` template with `schedule`, `concurrencyPolicy: Forbid`, history retention, retry/deadline controls, node pinning, and resource constraints for scheduled pipelines.
- Updated `deploy/k8s/kustomization.yaml` to include the new Job and CronJob so they are applied with `kubectl apply -k deploy/k8s`.
- Updated `deploy/README.md` with usage examples and operational guidance for job-based execution and CronJob scheduling.
Testing
- Validated the manifests with `python` + `yaml.safe_load_all`; the validation run succeeded without parse errors.
- Attempted `kubectl kustomize deploy/k8s`, but `kubectl` was not available in the environment, so rendering via `kubectl` could not be executed here.
- Confirmed `deploy/k8s/kustomization.yaml` references the new `stage-execution-job.yaml` and `pipeline-scheduler-cronjob.yaml`, and that the new manifest files are syntactically valid YAML.
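For reference, the CronJob fields enumerated in the Description fit together roughly as follows. This is a hedged sketch, not the PR's actual manifest: the schedule, image, and resource values are illustrative assumptions:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pipeline-scheduler
spec:
  schedule: "0 2 * * *"            # illustrative: nightly at 02:00
  concurrencyPolicy: Forbid        # skip a run while the previous one is still active
  successfulJobsHistoryLimit: 3    # history retention
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 2              # retry control
      activeDeadlineSeconds: 3600  # deadline control
      template:
        spec:
          restartPolicy: Never
          nodeSelector:            # node pinning (illustrative label)
            workload: pipelines
          containers:
            - name: pipeline
              image: pipeline-backend:latest   # illustrative image
              command: ["python", "-m", "backend.cli", "execute",
                        "--pipeline", "/app/backend/examples/sample_pipeline.yaml"]
              resources:
                requests: {cpu: "250m", memory: "256Mi"}
                limits: {cpu: "500m", memory: "512Mi"}
```

`concurrencyPolicy: Forbid` is the key operational choice here: if a pipeline run overruns its schedule interval, the next run is skipped rather than stacked, which avoids concurrent writes from overlapping pipeline executions.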
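The `yaml.safe_load_all` validation described above can be sketched as a small script. This is an assumed reconstruction, not the PR's actual validation command; it requires the third-party PyYAML package, and the directory path is illustrative:

```python
from pathlib import Path

import yaml  # PyYAML (third-party); provides safe_load_all


def validate_manifests(directory: str) -> list[str]:
    """Parse every *.yaml file in `directory`; return the names that parse cleanly.

    Raises yaml.YAMLError on the first parse failure, mirroring the
    "succeeded without parse errors" check described in Testing.
    """
    ok = []
    for path in sorted(Path(directory).glob("*.yaml")):
        with path.open() as fh:
            # safe_load_all handles multi-document manifests separated by '---'
            list(yaml.safe_load_all(fh))
        ok.append(path.name)
    return ok


if __name__ == "__main__":
    print(validate_manifests("deploy/k8s"))  # illustrative path
```

A passing run simply lists every manifest name; a malformed file raises with the offending line and column, which is why this check catches syntax errors but not schema errors (unlike `kubectl kustomize`, which was unavailable).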