Replies: 7 comments 8 replies
-
|
How would Conda interoperate with a Docker container in this instance? If there were a Queue Environment using Conda, followed by a Docker Environment, would it install to the Docker container? (I think not.)
-
|
One more thought on this is what if the user wants more than 1 instance of a container? Given |
-
|
One thing worth considering: how would credentials, especially queue creds, be made available inside the container? The current sketch only bind-mounts |
-
|
Here's another older request from deadline-cloud about this feature: aws-deadline/deadline-cloud#321
-
|
For my container experiments, I've been directly mapping the session folder, along with the creds files that the worker agent generates. This has two positive side effects: 1) the creds are auto-rotated by the worker agent, and 2) paths are mapped the same, so the same command to set up credentials is shared with the container. The same goes for sharing job attachments! Would be a good extension into the OpenJD Sessions for Python, I think?
-
|
Following up with two shapes I've been thinking around for how the wrap hook
There are two proposals below: three hook points vs. one unified hook.
My take on this: go with Option A, three separate hooks, to keep the cognitive load simpler.
1. Summary of the two options
Option A: three separate hook points
Add three hooks to <EnvironmentActions> ::= the object:
onEnter: <Action>
+ onWrapEnter: <Action> # wraps inner envs' onEnter
+ onWrapTaskRun: <Action> # wraps tasks' onRun (the current RFC)
+ onWrapExit: <Action> # wraps inner envs' onExit
onExit: <Action>
Each hook gets a focused variable namespace:
Option B — one unified hook with OpenJD EXPR
| Variable | Type | Description |
|---|---|---|
| WrappedAction.Type | string | TASK_RUN \| ENV_ENTER \| ENV_EXIT |
| WrappedAction.Name | string \| null | Env name for enter/exit; null for task runs |
| WrappedAction.Command | string | The wrapped action's command |
| WrappedAction.Args | list[string] | The wrapped action's args |
| WrappedAction.Environment | list[string] | openjd_env-exported env vars at this point |
| WrappedAction.Timeout | int | Wrapped action's timeout, in seconds |
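To make the shape of that namespace concrete, here is a minimal Python sketch of the record a runtime might hand to the unified hook. The class name, the field names, and the validation rule are assumptions drawn only from the table above; nothing here is an existing openjd API.

```python
from dataclasses import dataclass
from typing import Optional

# The three action types listed in the table above.
ACTION_TYPES = ("TASK_RUN", "ENV_ENTER", "ENV_EXIT")


@dataclass(frozen=True)
class WrappedAction:
    """Hypothetical model of the WrappedAction.* variables (illustration only)."""

    type: str
    name: Optional[str]      # env name for enter/exit; None for task runs
    command: str
    args: list[str]
    environment: list[str]   # "NAME=value" strings exported via openjd_env
    timeout: int             # seconds

    def __post_init__(self) -> None:
        if self.type not in ACTION_TYPES:
            raise ValueError(f"unknown action type: {self.type!r}")
        # Assumed rule: Name is null exactly when the action is a task run.
        if (self.type == "TASK_RUN") != (self.name is None):
            raise ValueError("name must be None for TASK_RUN and set otherwise")


enter = WrappedAction("ENV_ENTER", "BlenderSetup", "pip", ["install", "x"], [], 300)
task = WrappedAction("TASK_RUN", None, "blender", ["--background"], [], 3600)
```

Whether Name should be strictly required for enter/exit actions is a guess here; the table only says it is null for task runs.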
The common escape hatch
Both options need a runOnHost: true flag on <Action> so that inner
actions can opt out of wrapping. This is necessary because some actions
can't run in the wrapped context and still work:
- Mounting an NFS/SMB share the container will then bind-mount
- Setting up a VPN tunnel or SSH forward for a license server
- License checkout from a FlexLM server whose client isn't baked into
the container image
- An onExit cleanup that must run even if the container crashed or
was OOM-killed
<Action> ::= the object:
command: string
args: list[string]
timeout: int
cancelation: <Cancelation>
+ runOnHost: bool # @optional, default false
When runOnHost: true is set, the runtime executes the action directly on
the host, bypassing any active wrap hook.
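A rough Python sketch of the decision this implies for a runtime: if a wrapper is active and the action has not opted out, prefix the action with the wrapper's command; otherwise run it as-is. The `Action` class, `resolve_command`, and the `my-container` ID are hypothetical names for illustration, not part of any openjd library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    """Hypothetical stand-in for the <Action> object with the proposed flag."""

    command: str
    args: list[str] = field(default_factory=list)
    run_on_host: bool = False  # mirrors the proposed runOnHost, default false


def resolve_command(action: Action, wrapper: Optional[list[str]]) -> list[str]:
    """Return the argv the runtime would launch for this action."""
    argv = [action.command, *action.args]
    if wrapper is None or action.run_on_host:
        return argv            # no wrapper active, or the action opted out
    return [*wrapper, *argv]   # forward the action through the wrapper


wrapper = ["docker", "container", "exec", "my-container"]
mount = Action("mount", ["-t", "nfs", "fileserver:/renders", "/mnt/renders"],
               run_on_host=True)
pip = Action("pip", ["install", "--quiet", "blender-batch==2.1"])

host_argv = resolve_command(mount, wrapper)
wrapped_argv = resolve_command(pip, wrapper)
```

In the actual proposal the wrapper is a templated action rather than a fixed argv prefix; the prefix form is only the simplest way to show the bypass.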
Motivation & use cases
Both options target the same gap the current RFC leaves open:
- Container setup isn't complete without wrapping lifecycle actions.
A Docker or Apptainer job env that starts a container in onEnter
and stops it in onExit wants inner step envs' setup
(pip install, conda activate, licensing, warm-up) to run inside
the container too. Wrapping only onRun leaves half the setup
running on the host, where it may execute against the wrong target.
- Instrumentation/tracing wrappers want coverage of the full session.
A wrap env that runs everything under strace, or inside a cgroup
for resource accounting, or behind a profiler, wants to cover
enter/exit too, not just tasks.
- Remote-execution wrappers. An env that SSHes into a remote host
and runs tasks there needs to run the inner env setup on the same
remote host, not locally.
- Escape hatches are a hard requirement. Without runOnHost: true,
there's no way to fetch credentials on the host, set up host-level
mounts the container will bind in, or guarantee cleanup runs even
when the wrapper itself is broken.
2. Option A walkthrough — three hooks with opt-out
Scenario: a studio-wide Docker job env wraps everything. A BlenderSetup
step env installs a Blender plugin inside the container. An NFSMount
step env mounts a render share on the host — it must not be wrapped.
specificationVersion: "environment-2023-09"
extensions:
  - WRAP_TASK_RUN
  - EXPR
environment:
  name: Docker
  script:
    actions:
      onEnter:
        command: "bash"
        args: ["{{Env.File.Enter}}"]
      # NEW: wraps inner env onEnter actions
      onWrapEnter:
        command: "bash"
        args: ["{{Env.File.WrapEnter}}"]
        timeout: "{{Env.Wrapped.Timeout}}"
      # Same as current RFC
      onWrapTaskRun:
        command: "bash"
        args: ["{{Env.File.WrapTaskRun}}"]
        timeout: "{{Env.Action.Timeout}}"
      # NEW: wraps inner env onExit actions
      onWrapExit:
        command: "bash"
        args: ["{{Env.File.WrapExit}}"]
        timeout: "{{Env.Wrapped.Timeout}}"
      onExit:
        command: "bash"
        args: ["{{Env.File.Exit}}"]
    embeddedFiles:
      - name: Enter
        filename: docker-env-enter.sh
        type: TEXT
        data: |
          #!/usr/bin/env bash
          set -euo pipefail
          DOCKER_CONTAINER_ID=$(docker container run --rm --detach \
            --mount 'type=bind,src={{Session.WorkingDirectory}},dst={{Session.WorkingDirectory}}' \
            '{{Param.ContainerImage}}' bash -c 'sleep infinity')
          echo "openjd_env: DOCKER_CONTAINER_ID=$DOCKER_CONTAINER_ID"
      # Wrap inner env onEnter: forward the inner env's command into the container.
      - name: WrapEnter
        filename: docker-wrap-enter.sh
        type: TEXT
        data: |
          #!/usr/bin/env bash
          set -euo pipefail
          echo "[Docker] Running onEnter for env '{{Env.Wrapped.Name}}' inside container"
          # docker exec options (-e) must come before the container ID.
          docker container exec \
            {{ repr_sh(flatten([['-e', e] for e in Env.Wrapped.Environment])) }} \
            "$DOCKER_CONTAINER_ID" \
            {{ repr_sh(Env.Wrapped.Command) }} \
            {{ repr_sh(Env.Wrapped.Args) }}
      # Wrap task onRun, as in the current RFC.
      - name: WrapTaskRun
        filename: docker-wrap-task-run.sh
        type: TEXT
        data: |
          #!/usr/bin/env bash
          set -euo pipefail
          docker container exec \
            {{ repr_sh(flatten([['-e', e] for e in Task.Environment])) }} \
            "$DOCKER_CONTAINER_ID" \
            {{ repr_sh(Task.Command) }} \
            {{ repr_sh(Task.Args) }}
      # Wrap inner env onExit.
      - name: WrapExit
        filename: docker-wrap-exit.sh
        type: TEXT
        data: |
          #!/usr/bin/env bash
          set -euo pipefail
          echo "[Docker] Running onExit for env '{{Env.Wrapped.Name}}' inside container"
          docker container exec \
            {{ repr_sh(flatten([['-e', e] for e in Env.Wrapped.Environment])) }} \
            "$DOCKER_CONTAINER_ID" \
            {{ repr_sh(Env.Wrapped.Command) }} \
            {{ repr_sh(Env.Wrapped.Args) }}
      - name: Exit
        filename: docker-env-exit.sh
        type: TEXT
        data: |
          #!/usr/bin/env bash
          set -euo pipefail
          docker container stop "$DOCKER_CONTAINER_ID" --timeout {{Env.Action.Timeout}}
Job template using it, with two inner step envs
specificationVersion: 'jobtemplate-2023-09'
name: Blender Render
steps:
  - name: Render
    stepEnvironments:
      # This env's onEnter and onExit run ON THE HOST (bypass the wrapper)
      # because mounting NFS inside the container would be pointless.
      - name: NFSMount
        script:
          actions:
            onEnter:
              command: mount
              args: ["-t", "nfs", "fileserver:/renders", "/mnt/renders"]
              runOnHost: true
            onExit:
              command: umount
              args: ["/mnt/renders"]
              runOnHost: true
      # This env's onEnter runs INSIDE the container: the Blender plugin
      # install needs to modify the container's Python env.
      - name: BlenderSetup
        script:
          actions:
            onEnter:
              command: pip
              args: ["install", "--quiet", "blender-batch==2.1"]
            onExit:
              command: pip
              args: ["uninstall", "--yes", "blender-batch"]
    parameterSpace:
      taskParameterDefinitions:
        - name: Frame
          type: INT
          range: "1-100"
    script:
      actions:
        # This runs INSIDE the container via onWrapTaskRun.
        onRun:
          command: blender
          args:
            - "--background"
            - "{{Param.SceneFile}}"
            - "--frame-set"
            - "{{Task.Param.Frame}}"
Execution order
Docker.onEnter → HOST (starts container)
NFSMount.onEnter → HOST (runOnHost: true)
BlenderSetup.onEnter → CONTAINER via Docker.onWrapEnter
[task 1] blender ... → CONTAINER via Docker.onWrapTaskRun
[task 2] blender ... → CONTAINER via Docker.onWrapTaskRun
...
BlenderSetup.onExit → CONTAINER via Docker.onWrapExit
NFSMount.onExit → HOST (runOnHost: true)
Docker.onExit → HOST (stops container)
Each hook has exactly one job. The schema tells you at a glance what the
env wraps. No expression logic required — works without the EXPR extension
for the shell-based case (though repr_sh() still matters for safety).
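The ordering above can be reproduced with a small Python sketch. `InnerEnv` and `execution_plan` are hypothetical names, and the sketch simplifies by treating runOnHost as a per-env flag, whereas the proposal sets it per action. The assumption it encodes is that inner envs enter in declaration order and exit in reverse.

```python
from dataclasses import dataclass


@dataclass
class InnerEnv:
    name: str
    run_on_host: bool = False  # simplification: one flag for both onEnter and onExit


def execution_plan(wrapper: str, inner_envs: list[InnerEnv],
                   tasks: list[str]) -> list[tuple[str, str]]:
    """Return (action, where-it-runs) pairs in session order for Option A."""

    def where(env: InnerEnv, hook: str) -> str:
        return "HOST" if env.run_on_host else f"CONTAINER via {wrapper}.{hook}"

    plan = [(f"{wrapper}.onEnter", "HOST")]
    plan += [(f"{e.name}.onEnter", where(e, "onWrapEnter")) for e in inner_envs]
    plan += [(t, f"CONTAINER via {wrapper}.onWrapTaskRun") for t in tasks]
    # Environments exit in the reverse of the order they were entered.
    plan += [(f"{e.name}.onExit", where(e, "onWrapExit"))
             for e in reversed(inner_envs)]
    plan.append((f"{wrapper}.onExit", "HOST"))
    return plan


plan = execution_plan(
    "Docker",
    [InnerEnv("NFSMount", run_on_host=True), InnerEnv("BlenderSetup")],
    ["[task 1] blender", "[task 2] blender"],
)
for action, location in plan:
    print(f"{action:24} -> {location}")
```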
3. Option B walkthrough — one hook with if/else and opt-out
Same scenario, same behavior — but expressed as a single onWrapAction
that branches on WrappedAction.Type:
specificationVersion: "environment-2023-09"
extensions:
  - WRAP_TASK_RUN
  - EXPR
environment:
  name: Docker
  script:
    actions:
      onEnter:
        command: "bash"
        args: ["{{Env.File.Enter}}"]
      # NEW: single hook, branches inside the script
      onWrapAction:
        command: "bash"
        args: ["{{Env.File.WrapAction}}"]
        timeout: "{{WrappedAction.Timeout}}"
      onExit:
        command: "bash"
        args: ["{{Env.File.Exit}}"]
    embeddedFiles:
      - name: Enter
        filename: docker-env-enter.sh
        type: TEXT
        data: |
          #!/usr/bin/env bash
          set -euo pipefail
          DOCKER_CONTAINER_ID=$(docker container run --rm --detach \
            --mount 'type=bind,src={{Session.WorkingDirectory}},dst={{Session.WorkingDirectory}}' \
            '{{Param.ContainerImage}}' bash -c 'sleep infinity')
          echo "openjd_env: DOCKER_CONTAINER_ID=$DOCKER_CONTAINER_ID"
      # One wrap script. Branches on WrappedAction.Type for action-type-specific
      # behavior, and forwards everything else into the container the same way.
      - name: WrapAction
        filename: docker-wrap-action.sh
        type: TEXT
        data: |
          #!/usr/bin/env bash
          set -euo pipefail
          # EXPR-driven log line. Null-coalesce Name since task runs have no env name.
          echo {{ repr_sh(
            "[Docker] " + WrappedAction.Type
            + (" for env '" + WrappedAction.Name + "'" if WrappedAction.Name else "")
          ) }}
          # The forwarding itself is the same shape for all three action types.
          # If you wanted to SUPPRESS ENV_EXIT when the container is already gone,
          # or to run ENV_ENTER with --user root inside the container, you'd
          # branch here using EXPR conditionals.
          # docker exec options (--user, -e) must come before the container ID.
          docker container exec \
            {{ '--user root' if WrappedAction.Type == 'ENV_ENTER' else '' }} \
            {{ repr_sh(flatten([['-e', e] for e in WrappedAction.Environment])) }} \
            "$DOCKER_CONTAINER_ID" \
            {{ repr_sh(WrappedAction.Command) }} \
            {{ repr_sh(WrappedAction.Args) }}
      - name: Exit
        filename: docker-env-exit.sh
        type: TEXT
        data: |
          #!/usr/bin/env bash
          set -euo pipefail
          docker container stop "$DOCKER_CONTAINER_ID" --timeout {{Env.Action.Timeout}}
Job template: identical to Option A
The inner step envs don't care which wrap model the outer env uses. The
same runOnHost: true opt-out works identically:
steps:
  - name: Render
    stepEnvironments:
      - name: NFSMount
        script:
          actions:
            onEnter:
              command: mount
              args: ["-t", "nfs", "fileserver:/renders", "/mnt/renders"]
              runOnHost: true # bypasses onWrapAction
            onExit:
              command: umount
              args: ["/mnt/renders"]
              runOnHost: true # bypasses onWrapAction
      - name: BlenderSetup
        script:
          actions:
            onEnter:
              command: pip
              args: ["install", "--quiet", "blender-batch==2.1"]
              # No runOnHost → runs via Docker.onWrapAction with Type=ENV_ENTER
            onExit:
              command: pip
              args: ["uninstall", "--yes", "blender-batch"]
              # No runOnHost → runs via Docker.onWrapAction with Type=ENV_EXIT
    parameterSpace:
      taskParameterDefinitions:
        - name: Frame
          type: INT
          range: "1-100"
    script:
      actions:
        onRun:
          command: blender
          args: ["--background", "{{Param.SceneFile}}", "--frame-set", "{{Task.Param.Frame}}"]
          # No runOnHost → runs via Docker.onWrapAction with Type=TASK_RUN
Execution order: identical to Option A
Docker.onEnter → HOST (starts container)
NFSMount.onEnter → HOST (runOnHost: true)
BlenderSetup.onEnter → CONTAINER via Docker.onWrapAction (Type=ENV_ENTER)
[task 1] blender ... → CONTAINER via Docker.onWrapAction (Type=TASK_RUN)
[task 2] blender ... → CONTAINER via Docker.onWrapAction (Type=TASK_RUN)
...
BlenderSetup.onExit → CONTAINER via Docker.onWrapAction (Type=ENV_EXIT)
NFSMount.onExit → HOST (runOnHost: true)
Docker.onExit → HOST (stops container)
Where Option B is unique, and uses EXPR to select at runtime
This pattern lets the wrapper suppress inner actions it doesn't want
to forward. You can't express that in Option A without ad-hoc fields:
onWrapAction:
  command: bash
  args:
    - "-c"
    - >-
      {{ "docker exec $DOCKER_CONTAINER_ID "
      + repr_sh(WrappedAction.Command) + " " + repr_sh(WrappedAction.Args)
      if WrappedAction.Type == "TASK_RUN"
      else "true" }}
Here the wrapper only forwards task runs into the container and no-ops
everything else. That's a legit pattern for envs that manage their own
lifecycle and only care about isolating the hot path.
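For readers less used to EXPR, the same selection logic written as plain Python might look like the sketch below. `wrap_action` is a hypothetical helper, and `shlex.join` stands in for repr_sh(), whose exact quoting rules come from RFC 0006 and may differ.

```python
import shlex


def wrap_action(action_type: str, command: str, args: list[str]) -> str:
    """Build the shell line the wrapper would hand to `bash -c`."""
    if action_type != "TASK_RUN":
        # Suppress enter/exit actions: `true` is a shell no-op that exits 0.
        return "true"
    # $DOCKER_CONTAINER_ID is deliberately left unquoted for the shell to expand.
    return "docker exec $DOCKER_CONTAINER_ID " + shlex.join([command, *args])


task_line = wrap_action("TASK_RUN", "blender", ["--background", "my scene.blend"])
enter_line = wrap_action("ENV_ENTER", "pip", ["install", "blender-batch==2.1"])
print(task_line)
print(enter_line)
```

Note that suppressing an action this way still reports success to the runtime, so an inner env whose onEnter was skipped has no way to know it never ran.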
Tradeoffs in one paragraph
Option A has a clearer schema (you can see at a glance what the env
wraps), a cleaner variable namespace (Task.* vs Env.Wrapped.*), and
doesn't require the EXPR extension for the common case. But template
authors end up writing three nearly-identical docker exec blocks.
Option B is DRY and more expressive (can suppress or transform per
action type), but it requires EXPR for anything non-trivial, silently
wraps everything from inner envs, and has a slightly less obvious debug
story ("which branch ran?"). Both need the runOnHost: true escape
hatch — that's table stakes either way.
David's suggestion: I think Option A is the right starting point because the
schema is explicit about what's wrapped, and a future extension could
always add onWrapAction as a shorthand for "use the same script for
all three hooks." Going the other direction — starting with the
unified hook and then retrofitting per-phase hooks — is much harder.
Thoughts?
Security implications
The security model is the same for both options. The main risks are:
- Shell injection. Unchanged from the current RFC: wrap scripts
must use repr_sh() (or repr_cmd()/repr_pwsh() on Windows) from
RFC 0006 to safely reconstruct the wrapped command line. This applies
to wrapped enter/exit commands just as much as to task commands.
- Command length doubling. Unchanged: still a "double-penalty"
problem, but now it can happen for enter/exit as well as task runs.
- Credential leakage into the wrapped context. New concern. If an
inner env's onEnter writes secrets to
{{Session.WorkingDirectory}} expecting host-only access, those
secrets may now land inside the container filesystem, where a
different security boundary applies. This is exactly what
runOnHost: true is for: the author of a credential-fetching env
declares it to avoid the leak.
- Privilege shifts. Wrapped enter/exit actions now run under the
wrapper's identity, bind mounts, and network namespace, which may
differ from what the env author expected. Document this loudly in the
security section.
- Silent wrapping (Option B specific). With a single unified hook,
inner env authors may not realize their lifecycle code will run
inside a container. Option A's schema makes the wrapping explicit:
if you see onWrapEnter, you know enter is wrapped. Option B relies
on runtime behavior that isn't visible in the inner env's schema.
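To illustrate the shell-injection point, here is a small Python sketch that rebuilds a wrapped command line with every token quoted, then checks that a shell-style parse gives back exactly the original tokens. `shlex.quote`/`shlex.join` are only an analogy for repr_sh(); `docker_exec_line` and the sample values are made up for the example.

```python
import shlex


def docker_exec_line(container_id: str, env: list[str],
                     command: str, args: list[str]) -> str:
    """Quote every token so hostile arguments stay literal data."""
    # Equivalent of flatten([['-e', e] for e in env]) in the templates.
    env_flags = [token for e in env for token in ("-e", e)]
    argv = ["docker", "container", "exec", *env_flags, container_id, command, *args]
    return shlex.join(argv)


hostile_args = ["$(rm -rf /)", "a;b", "it's"]
line = docker_exec_line("abc123", ["SCENE=my scene.blend"], "echo", hostile_args)
print(line)

# Round trip: parsing the quoted line yields the original tokens unchanged,
# so nothing in the arguments was interpreted as shell syntax.
tokens = shlex.split(line)
```

Naive string concatenation of the same arguments would let the `$(...)` and `;` tokens execute on the host, which is the failure mode the first bullet warns about.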
-
|
Following the RFC process, I have posted #132 for comments. Please take a look if you have any comments outstanding or want to help with polishing. Next, we will be moving to RFC final comments.
-
Background
Environment templates can accept parameters for a job, and use them to prepare the environment variables, file system contents, or other aspects of the context in which a job runs. They can be included inside job templates at the job and step level, or can be added externally to a job with the openjd-cli --environment option or with scheduler support such as a Deadline Cloud queue environment.
Some existing example use cases for context:
To follow the Open Job Description Design Tenets, a solution for containers would interoperate with the example Conda and Rez use cases. Job templates should be portable in a way to run them, unmodified, with either a Conda, Rez, Docker, or Apptainer environment template that provides the software environment to run in.
The RFC Idea
The bash-in-docker example, and the example that runs a background daemon process and then communicates with it in each task, are not far from starting a Docker container in the background and then running each task in that container. The missing ingredient is a way to let the environment template say how to run the task inside of the container without having to modify the job template it's running.
The idea for this RFC is to extend the <Environment> with a third session action, onWrapTaskRun. The runtime would provide this action with enough context to run the task's onRun action itself. That context includes all the properties of the <Action>: the command, args, timeout, and cancelation method. Because all the context for running the task lives inside the session working directory, the container commands would need to create a bind mount that maps the path {{Session.WorkingDirectory}} to the identical path within the container.
Credit for inspiration of this idea goes to Pydantic's WrapValidator, while implementing OpenJobDescription/openjd-model-for-python#164.
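As a small illustration of the identical-path bind mount, the Python sketch below builds the argv for starting such a background container. `container_run_argv`, the image name, and the session path are placeholders, not part of the RFC; the docker flags are the ones used in the replies in this thread.

```python
def container_run_argv(image: str, working_dir: str) -> list[str]:
    """Build a `docker container run` argv with an identical-path bind mount."""
    # src and dst are the same path, so file paths the runtime puts into the
    # task's command and args resolve to the same files inside the container.
    mount = f"type=bind,src={working_dir},dst={working_dir}"
    return [
        "docker", "container", "run", "--rm", "--detach",
        "--mount", mount,
        image,
        "bash", "-c", "sleep infinity",  # keep the container alive for later exec calls
    ]


argv = container_run_argv("example/render:latest", "/sessions/session-1234")
print(argv)
```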
A sketch of how a docker environment template could look with this feature: