Description
Problem
When /api/register returns an error response (e.g. 403 Forbidden), vpc-node-setup.sh writes the string "null" to /shared/server_url, /shared/pre_auth_key, and /shared/shared_key. The
VPC client then loops forever:
Received error: fetch control key: Get "null/key?v=123": unsupported protocol scheme ""
The script also has no retry logic: it runs once and exits, so it cannot recover if the VPC server isn't ready or the app isn't in the allowlist yet.
Root Cause
Two issues in scripts/vpc-node-setup.sh:
1. jq -r prints the string "null" for missing fields, which slips past the -z check:
    PRE_AUTH_KEY=$(jq -r .pre_auth_key <<<"$RESPONSE")
    # When RESPONSE is {"error":"Forbidden"}, jq -r .pre_auth_key outputs "null" (a non-empty string)
    if [ -z "$PRE_AUTH_KEY" ] || [ -z "$SHARED_KEY" ] || [ -z "$VPC_SERVER_URL" ]; then
    # Validation passes: "null" is non-empty, so this condition is false
2. No retry logic:
The script runs the registration request once. In orchestrated environments, the VPC server's ALLOWED_APPS may be updated after the node is deployed, creating a race condition where the node always gets 403.
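For the first issue, the emptiness check itself can also be hardened so that it does not depend on how the jq filter is written. The following is a minimal sketch, not code from the script: `is_missing` is a hypothetical helper name, and the sample value stands in for what `jq -r` prints when the field is absent.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of vpc-node-setup.sh): treat both the empty
# string and the literal "null" that jq -r prints for a missing field as
# "no value".
is_missing() {
    [ -z "$1" ] || [ "$1" = "null" ]
}

# Sample value, standing in for: jq -r .pre_auth_key <<<'{"error":"Forbidden"}'
PRE_AUTH_KEY="null"

if is_missing "$PRE_AUTH_KEY"; then
    echo "pre_auth_key missing from registration response" >&2
fi
```

A plain `[ -z "$PRE_AUTH_KEY" ]` would have accepted this value, whereas `is_missing` rejects it.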
Reproduction
- Deploy a VPC node whose app_id is NOT yet in VPC_ALLOWED_APPS
- /api/register returns {"error":"Forbidden"}
- jq -r .server_url outputs null (string, not empty)
- Script writes null to /shared/server_url and exits reporting "VPC setup completed"
- VPC client loops forever on Get "null/key?v=123": unsupported protocol scheme ""
Suggested Fix
    # Use jq '// empty' to return empty string instead of "null" for missing fields
    PRE_AUTH_KEY=$(jq -r '.pre_auth_key // empty' <<<"$RESPONSE")
    SHARED_KEY=$(jq -r '.shared_key // empty' <<<"$RESPONSE")
    VPC_SERVER_URL=$(jq -r '.server_url // empty' <<<"$RESPONSE")

    # Add retry loop for race conditions
    MAX_RETRIES=30
    RETRY_INTERVAL=10
    for i in $(seq 1 "$MAX_RETRIES"); do
        RESPONSE=$(curl -s ...)
        PRE_AUTH_KEY=$(jq -r '.pre_auth_key // empty' <<<"$RESPONSE")
        # ... parse other fields ...
        if [ -n "$PRE_AUTH_KEY" ] && [ -n "$SHARED_KEY" ] && [ -n "$VPC_SERVER_URL" ]; then
            break
        fi
        echo "Registration failed (attempt $i/$MAX_RETRIES): $RESPONSE"
        sleep "$RETRY_INTERVAL"
    done
Environment
- dstack-vpc: main branch
- Phala CVM with cgroup v2
- VPC server with VPC_ALLOWED_APPS set to specific app IDs
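Returning to the suggested fix: a retry loop that only breaks on success still falls through once every attempt has failed, so the script would go on to write empty values. One way to make that failure explicit is a small generic helper that reports whether any attempt succeeded. This is a sketch under assumptions, not the script's code: `retry` is a hypothetical name, and the command it wraps would be whatever function performs the registration request and field checks.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of vpc-node-setup.sh).
# Usage: retry MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    local max_attempts=$1 interval=$2
    shift 2
    local attempt
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        if "$@"; then
            return 0
        fi
        echo "Attempt $attempt/$max_attempts failed" >&2
        # No point sleeping after the final attempt
        if ((attempt < max_attempts)); then
            sleep "$interval"
        fi
    done
    return 1
}
```

In the setup script this could be used as `retry 30 10 try_register || exit 1`, where `try_register` is an assumed function that performs the request, parses the three fields with `// empty`, and returns non-zero unless all of them are non-empty. Exiting on failure means nothing is written to /shared when registration never succeeds.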