feat: Implement environment variable isolation for components#3401
feat: Implement environment variable isolation for components#3401tschneidereit wants to merge 5 commits intospinframework:mainfrom
Conversation
This implements the SIP introduced in spinframework#3400. Note: I made extensive use of GitHub Copilot in implementing this, but made sure that I fully understand the entire implementation. Additionally, I ensured that everything is thoroughly documented and tested, since some of the implementation is very subtle and low-level. In particular, the isolation itself is implemented by synthesizing a new component containing two, also synthesized, core modules, plus a wrapper component for each target component. This code makes substantial use of the `wasm-encoder` crate to generate the necessary WebAssembly components and modules, and the `wasmparser` crate to validate the generated components. By comparison, the integration into Spin is delightfully simple: manifest loading and composition are the only two pieces that needed updating, with everything else not needing to have any awareness of the isolation at all. Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
Not sure why, but locally Clippy is entirely happy with the previous code, running the exact same clippy invocation as in CI 🤷 Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
Turns out `cargo fmt` doesn't like virtual workspaces unless it's called with `--all`. (Also includes a little bit of lockfile cleanup.) Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
@lann pointed out to me that those functions on `wasi:cli/environment` that shouldn't have filtering applied to them can just be forwarded at the Component Model level, instead of taking a detour through core wasm—beatiful as the scenery may be. That simplifies the implementation *and* improves performance. Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
|
|
||
| // Apply env isolation when there is at least one dependency. | ||
| if self.should_apply_env_isolation(&prepared) { | ||
| self.apply_env_isolation(component.id(), instantiation_id, world_id, &prepared) |
There was a problem hiding this comment.
I am not sure if I am misunderstanding this or if I have missed the relevant change elsewhere, but this seems to happen after the deny-all adapter has been applied (in prepare_dependencies). So my understanding would be that when the dep called get-environment, that would be denied by the adapter and never reach the isolator. I know you've tested this and it works, so this is probably more for my education, but what am I missing? (Sorry if this is a stupid question.)
| // but in practice almost all components will import `wasi:cli/environment`, | ||
| // so we go with this simpler heuristic for now. | ||
| fn should_apply_env_isolation(&self, prepared: &IndexMap<String, DependencyInfo>) -> bool { | ||
| prepared.values().len() > 0 |
There was a problem hiding this comment.
Any particular reason for preferring this over !prepared.is_empty()? Is the call to values() significant? If so maybe comment on why?
|
|
||
| /// Apply environment variable isolation. | ||
| /// | ||
| /// Generates an isolator component shared across all targets, plus a small |
There was a problem hiding this comment.
Does "target" mean "dependency" here?
(Edit: I think is means "component" in the sense of any component, main or dependency.)
| // Each target tracks the specific wasi:cli/environment version it imports, | ||
| // since different components may import different versions. | ||
| let mut targets = Vec::new(); | ||
| let mut target_instances: Vec<(String, NodeId, String)> = Vec::new(); |
There was a problem hiding this comment.
This might merit being a struct with field names, in particular because I for one will mix up the order of the two strings!
| self.register_package(&wrapper_pkg_name, None, wrapper_bytes)?; | ||
|
|
||
| // Wire isolator → wrapper (filtered get-environment) | ||
| let export_name = format!("environment-{target_name}-get-environment"); |
There was a problem hiding this comment.
Two sources of truth for this format string? (cf isolator line 180)
| // Apply --env to component environments | ||
| // Apply --env to component environments. | ||
| // Dynamic env vars are added as-is: if env isolation is active, | ||
| // the env map is already prefixed from lock time, and the isolator |
There was a problem hiding this comment.
| // the env map is already prefixed from lock time, and the isolator | |
| // environment variable names are already prefixed in the lockfile, and the isolator |
(I had to read the original a few times, sorry.)
| // Apply --env to component environments. | ||
| // Dynamic env vars are added as-is: if env isolation is active, | ||
| // the env map is already prefixed from lock time, and the isolator | ||
| // will route vars by prefix. The user provides the prefix they want. |
There was a problem hiding this comment.
The user provides the prefix they want.
I am not sure what this is getting at. The user has no control over the prefix. They have to use the prescribed prefix for the component or dep that will be consuming the EV.
| } | ||
|
|
||
| #[test] | ||
| fn dynamic_env_vars_passed_through_unchanged() { |
There was a problem hiding this comment.
| fn dynamic_env_vars_passed_through_unchanged() { | |
| fn command_line_env_vars_passed_through_unchanged() { |
| } | ||
|
|
||
| #[test] | ||
| fn dynamic_env_vars_passed_through_unchanged() { |
There was a problem hiding this comment.
I don't get the significance of "passed through unchanged." Why would they be anything other than "unchanged"? What are you testing here?
Is this a loader test, perhaps, that the bit where it munges env names from the manifest skips names supplied on the CLI? (No, because the loader never sees the CLI vars. Okay I think I see what this is testing now but I'm not sure I can think of a good name for it.)
|
|
||
| let cmd = UpCommand { | ||
| env: vec![ | ||
| ("MAIN_FOO".into(), "bar".into()), |
There was a problem hiding this comment.
As per comments on SIP, I find this very awkward. I am concerned about the coupling of internal and external naming. I am bothered that the operator needs to know the internal composition structure. And I dislike that if you have a dep-less component that uses EVs, adding a dep to it is a breaking change to the application (because the existing EV names stop working). I would really like us to explore alternatives here, please.
This implements the SIP introduced in #3400.
Note: I made extensive use of GitHub Copilot in implementing this, but made sure that I fully understand the entire implementation. Additionally, I ensured that everything is thoroughly documented and tested, since some of the implementation is very subtle and low-level.
In particular, the isolation itself is implemented by synthesizing a new component containing two, also synthesized, core modules, plus a wrapper component for each target component. This code makes substantial use of the
wasm-encodercrate to generate the necessary WebAssembly components and modules, and thewasmparsercrate to validate the generated components.By comparison, the integration into Spin is delightfully simple: manifest loading and composition are the only two pieces that needed updating, with everything else not needing to have any awareness of the isolation at all.