Skip to content

feat: Implement environment variable isolation for components#3401

Open
tschneidereit wants to merge 5 commits intospinframework:mainfrom
tschneidereit:env-isolator-impl
Open

feat: Implement environment variable isolation for components#3401
tschneidereit wants to merge 5 commits intospinframework:mainfrom
tschneidereit:env-isolator-impl

Conversation

@tschneidereit
Copy link
Contributor

This implements the SIP introduced in #3400.

Note: I made extensive use of GitHub Copilot in implementing this, but made sure that I fully understand the entire implementation. Additionally, I ensured that everything is thoroughly documented and tested, since some of the implementation is very subtle and low-level.

In particular, the isolation itself is implemented by synthesizing a new component containing two, also synthesized, core modules, plus a wrapper component for each target component. This code makes substantial use of the wasm-encoder crate to generate the necessary WebAssembly components and modules, and the wasmparser crate to validate the generated components.

By comparison, the integration into Spin is delightfully simple: manifest loading and composition are the only two pieces that needed updating, with everything else not needing to have any awareness of the isolation at all.

This implements the SIP introduced in spinframework#3400.

Note: I made extensive use of GitHub Copilot in implementing this, but made sure that I fully understand the entire implementation. Additionally, I ensured that everything is thoroughly documented and tested, since some of the implementation is very subtle and low-level.

In particular, the isolation itself is implemented by synthesizing a new component containing two, also synthesized, core modules, plus a wrapper component for each target component. This code makes substantial use of the `wasm-encoder` crate to generate the necessary WebAssembly components and modules, and the `wasmparser` crate to validate the generated components.

By comparison, the integration into Spin is delightfully simple: manifest loading and composition are the only two pieces that needed updating, with everything else not needing to have any awareness of the isolation at all.

Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
Not sure why, but locally Clippy is entirely happy with the previous code, running the exact same clippy invocation as in CI 🤷

Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
Turns out `cargo fmt` doesn't like virtual workspaces unless it's called with `--all`.

(Also includes a little bit of lockfile cleanup.)

Signed-off-by: Till Schneidereit <till@tillschneidereit.net>
@lann pointed out to me that those functions on `wasi:cli/environment` that shouldn't have filtering applied to them can just be forwarded at the Component Model level, instead of taking a detour through core wasm—beatiful as the scenery may be.

That simplifies the implementation *and* improves performance.

Signed-off-by: Till Schneidereit <till@tillschneidereit.net>

// Apply env isolation when there is at least one dependency.
if self.should_apply_env_isolation(&prepared) {
self.apply_env_isolation(component.id(), instantiation_id, world_id, &prepared)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I am misunderstanding this or if I have missed the relevant change elsewhere, but this seems to happen after the deny-all adapter has been applied (in prepare_dependencies). So my understanding would be that when the dep called get-environment, that would be denied by the adapter and never reach the isolator. I know you've tested this and it works, so this is probably more for my education, but what am I missing? (Sorry if this is a stupid question.)

// but in practice almost all components will import `wasi:cli/environment`,
// so we go with this simpler heuristic for now.
fn should_apply_env_isolation(&self, prepared: &IndexMap<String, DependencyInfo>) -> bool {
prepared.values().len() > 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for preferring this over !prepared.is_empty()? Is the call to values() significant? If so maybe comment on why?


/// Apply environment variable isolation.
///
/// Generates an isolator component shared across all targets, plus a small
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "target" mean "dependency" here?

(Edit: I think is means "component" in the sense of any component, main or dependency.)

// Each target tracks the specific wasi:cli/environment version it imports,
// since different components may import different versions.
let mut targets = Vec::new();
let mut target_instances: Vec<(String, NodeId, String)> = Vec::new();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might merit being a struct with field names, in particular because I for one will mix up the order of the two strings!

self.register_package(&wrapper_pkg_name, None, wrapper_bytes)?;

// Wire isolator → wrapper (filtered get-environment)
let export_name = format!("environment-{target_name}-get-environment");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two sources of truth for this format string? (cf isolator line 180)

// Apply --env to component environments
// Apply --env to component environments.
// Dynamic env vars are added as-is: if env isolation is active,
// the env map is already prefixed from lock time, and the isolator
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// the env map is already prefixed from lock time, and the isolator
// environment variable names are already prefixed in the lockfile, and the isolator

(I had to read the original a few times, sorry.)

// Apply --env to component environments.
// Dynamic env vars are added as-is: if env isolation is active,
// the env map is already prefixed from lock time, and the isolator
// will route vars by prefix. The user provides the prefix they want.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user provides the prefix they want.

I am not sure what this is getting at. The user has no control over the prefix. They have to use the prescribed prefix for the component or dep that will be consuming the EV.

}

#[test]
fn dynamic_env_vars_passed_through_unchanged() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn dynamic_env_vars_passed_through_unchanged() {
fn command_line_env_vars_passed_through_unchanged() {

}

#[test]
fn dynamic_env_vars_passed_through_unchanged() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the significance of "passed through unchanged." Why would they be anything other than "unchanged"? What are you testing here?

Is this a loader test, perhaps, that the bit where it munges env names from the manifest skips names supplied on the CLI? (No, because the loader never sees the CLI vars. Okay I think I see what this is testing now but I'm not sure I can think of a good name for it.)


let cmd = UpCommand {
env: vec![
("MAIN_FOO".into(), "bar".into()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per comments on SIP, I find this very awkward. I am concerned about the coupling of internal and external naming. I am bothered that the operator needs to know the internal composition structure. And I dislike that if you have a dep-less component that uses EVs, adding a dep to it is a breaking change to the application (because the existing EV names stop working). I would really like us to explore alternatives here, please.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants