fix(network-agent): allow dots in sandboxID validation#163
Conversation
|
/cubebot review |
|
Claude finished @fslongjin's task in 1m 10s —— View job PR Review: fix(network-agent): allow dots in sandboxID validation
Verdict: LGTM — the fix is correct and well-aligned with existing patternsThe change ( // Before:
if strings.ContainsAny(sandboxID, `/\.`) || sandboxID == "" {
// After:
if sandboxID == "" || strings.ContainsAny(sandboxID, `/\`) || strings.Contains(sandboxID, "..") {CorrectnessThe fix correctly distinguishes between a single Consistency with codebaseThis now matches exactly the pattern used in if strings.ContainsAny(id, `/\`) || strings.Contains(id, "..") {It also aligns with if strings.ContainsAny(untrusted, `/\`) || untrusted == "." || untrusted == ".." ||
strings.Contains(untrusted, "..") {One subtle difference: SecurityPath traversal protection remains intact:
Suggestion (non-blocking)The existing func TestStateStorePathValidation(t *testing.T) {
store, _ := newStateStore(t.TempDir())
tests := []struct {
id string
wantErr bool
}{
{"valid-id-123", false},
{"cubesandbox-python-slim-3.11-nydus_0", false}, // single dots ok
{"", true}, // empty
{"foo/bar", true}, // forward slash
{"foo\\bar", true}, // backslash
{"foo..bar", true}, // double dot
{"../etc/passwd", true}, // traversal prefix
}
for _, tt := range tests {
_, err := store.path(tt.id)
if (err != nil) != tt.wantErr {
t.Errorf("path(%q) err=%v, wantErr=%v", tt.id, err, tt.wantErr)
}
}
}This mirrors the test coverage that already exists for |
|
@rogeroger-yu Hi, could you please add some unit tests to stabilize this behavior? |
|
@fslongjin okey. |
164bd78 to
1713a7c
Compare
1713a7c to
e15e0a4
Compare
|
After reviewing this, I don't think this is the right way to fix it. The sandbox ID should be designed as a UUID. The root cause here is that when CubeMaster generates the template, it sends the template OCI image name to Cubelet as the sandbox ID. |
e15e0a4 to
16fd291
Compare
|
@fslongjin I agree with your judgment and resubmit a version. I will review it after my hard work. |
|
|
||
| templateID, ok := opts.GetSnapshotTemplateID() | ||
| if !ok { | ||
| if _, ok := opts.GetSnapshotTemplateID(); !ok { |
There was a problem hiding this comment.
This if statement should be dropped.
16fd291 to
5a7bee8
Compare
Cubelet: generate UUID-based sandboxID for snapshot creation
5a7bee8 to
4f8cedb
Compare
|
cc @fslongjin |
|
/cubebot review |
|
Claude finished @kinwin-ustc's task in 1m 58s —— View job Review: fix: use UUID-based sandboxID
Verdict: Core fix is correct, but has unaddressed downstream breakageThe UUID-based approach (as suggested by @fslongjin) is the right fix — prevent dotted IDs from ever reaching state_store rather than weakening validation. However, the change leaves two stale Change 1:
|
| Test case | ID | wantErr | Correct? |
|---|---|---|---|
| valid simple id | sb-1 |
false | ✓ UUID-safe format |
| valid uuid | a1b2c3d4e5f6 |
false | ✓ matches GenerateID output |
| valid snapshot id | a1b2c3d4e5f6_snapshot |
false | ✓ matches new snapshot format |
| empty string | `` | true | ✓ |
| contains dot | sb.1 |
true | ✓ still rejected by state_store |
| forward slash | a/b |
true | ✓ |
| backslash | a\\b |
true | ✓ |
| dot prefix | .hidden |
true | ✓ |
| double dot traversal | .. |
true | ✓ |
| oci image name | registry.example.com/path/image:tag |
true | ✓ |
The oci image name case is good documentation: it proves state_store remains strict — but since Cubelet no longer sends IDs in this format, the validation will never fire in normal operation.
⚠️ Issue: Stale templateID + "_0" references will break
Two files still hardcode the old sandboxID format and will become stale after this commit:
1. Cubelet/services/cubebox/appsnapshot.go:116
expectedSandboxID := templateID + "_0"This cleanup code (triggered when cubebox.Create returns PreConditionFailed) reconstructs the sandbox ID in the old format. With UUID-based IDs, there's no deterministic mapping from templateID to sandboxID, so the cleanup will attempt to destroy a non-existent sandbox. The correct approach would be to use createRsp.SandboxID (the actual sandboxID from the workflow response).
2. Cubelet/plugins/cbri/cubeboxcbri/cubebox.go:210
annotations[constants.AnnotationAppSnapshotContainerID] = templateID + "_0"This sets an annotation in the old format. If anything downstream reads this annotation and expects it to match the sandboxID, it will break. The annotation value is no longer correlated with the actual sandboxID.
These files are NOT modified by this PR, but they will be logically inconsistent after this change lands.
Suggestion: Fix appsnapshot.go:116 to use the response sandboxID
The cleanup path should reference createRsp.SandboxID instead of reconstructing the ID:
// Before (appsnapshot.go:116):
expectedSandboxID := templateID + "_0"
// After:
expectedSandboxID := createRsp.SandboxIDMinor: PR description may need updating
The PR body still references "Fix overly strict sandboxID validation in network-agent's state_store" but state_store.go:95 was not changed — it still uses strings.ContainsAny(sandboxID, "/\\."). The actual fix is at the Cubelet level (use UUIDs instead of template IDs). The description could be updated to reflect the actual approach.
Summary
| Aspect | Assessment |
|---|---|
| Core fix | ✅ Correct |
| Test coverage | ✅ Good |
| Security hardening | ✅ Improved (no external string data enters sandboxID) |
| Consistency with codebase | ✅ Aligns with reviewer guidance |
| Downstream correctness | appsnapshot.go:116 and cubebox.go:210 need updating |
Summary
rejected any ID containing
.(dot). IDs likecubesandbox-python-slim-3.11-nydus_0(containing version numbers) were incorrectly flagged as path traversal.
Changed to only reject
/,\, and..— consistent with Cubelet'spathutil.ValidateID()./etc/NetworkManager/dnsmasq.ddoes notexist (hosts using systemd-resolved instead of NM-dnsmasq).
already pre-staged locally.
Root Cause
state_store.go:path()usedstrings.ContainsAny(sandboxID, "/\\.")whichtreats a single
.the same as path separators. A dot in a version string(e.g.
3.11) is perfectly safe — only..is a traversal risk.Test Plan
Original Error