EAI-7012: Fix Lua filter to suppress x-ai-eg-model when backend is specified #762
Closed
johnl-amd wants to merge 5 commits into
…ecified

When a client includes "backend": "<aim-uuid>" in the request body, the Lua filter now sets x-ai-eg-backend and returns immediately without setting x-ai-eg-model.

Previously the filter always set x-ai-eg-model from the "model" field, even when "backend" was also present. Envoy evaluates AIGatewayRoute rules in creation order across all routes, so an older AIM's model-name rule (rule 2) could fire before the target AIM's UUID rule (rule 1), routing the request to the wrong AIM and misattributing metrics to it.

The fix: when "backend" is extracted from the body (or x-ai-eg-backend is already present in the headers), skip x-ai-eg-model entirely. The UUID rule on the correct AIM's AIGatewayRoute then wins unambiguously regardless of route creation order.

Standard OpenAI-compatible clients that omit "backend" are unaffected; they continue to get x-ai-eg-model set for model-name fallback routing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The AI Gateway ext_proc runs before this Lua filter and sets x-ai-eg-model unconditionally from the request body. When a client also includes "backend" in the body, the Lua filter sets x-ai-eg-backend but left x-ai-eg-model in place. The router then evaluated both headers and matched the older AIM's model-name rule first (Envoy cross-route first-match semantics), routing to the wrong AIM.

Remove x-ai-eg-model after setting x-ai-eg-backend so only the UUID header remains when routing runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EPP ext_proc filters (one per InferencePool) run after the first Lua filter and re-add x-ai-eg-model from the request body. This causes the oldest AIM's model-name route rule to fire first via Envoy first-match, routing to the wrong AIM when x-ai-eg-backend is set.

Add a second EnvoyExtensionPolicy (clear-model-header-for-routing) with a simple header-only Lua filter that removes x-ai-eg-model when x-ai-eg-backend is present. Position it before envoy.filters.http.router via filterOrder so it runs after all EPP ext_procs and has the final word before the router matches.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Envoy Gateway allows only one EnvoyExtensionPolicy per Gateway target. The separate clear-model-header-for-routing policy was rejected as Conflicted.

Instead, add the clear filter as lua[1] in the existing set-model-header-from-body policy and reference it as lua/1 in filterOrder. Effect is identical: lua/0 runs before ext_authz (sets routing headers from body), lua/1 runs before the router (re-clears x-ai-eg-model after EPP ext_procs re-add it).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The EnvoyProxy CRD only accepts generic filter type names in filterOrder, not specific instance names or envoy.filters.http.router as a before target.

Instead of positioning lua/1 after EPP ext_procs, move all ext_proc filters before the lua filter. EPPs then add x-ai-eg-model first, then Lua/0 reads the body, sets x-ai-eg-backend, and removes x-ai-eg-model before ext_authz and the router run. The redundant lua[1] entry is removed.

Result filter chain: ext_proc (EPPs) → lua → ext_authz → router

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
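The effect of the filter ordering described in the commits above can be sketched with a small Python model. This is only an illustration under stated assumptions, not the Envoy or AI Gateway implementation: the function names (`ext_proc`, `lua_filter`, `run_chain`) and the dict-based header representation are hypothetical, and ext_authz and the router are left out because they do not change the two routing headers.

```python
import json

MODEL_HEADER = "x-ai-eg-model"
BACKEND_HEADER = "x-ai-eg-backend"


def ext_proc(headers, body):
    """Hypothetical model of an ext_proc stage: copy "model" from the body into a header."""
    out = dict(headers)
    model = json.loads(body).get("model")
    if model is not None:
        out[MODEL_HEADER] = model
    return out


def lua_filter(headers, body):
    """Hypothetical model of the fixed Lua filter's header logic."""
    out = dict(headers)
    parsed = json.loads(body)
    # Prefer a backend header set by the caller, else the "backend" body field.
    backend = out.get(BACKEND_HEADER) or parsed.get("backend")
    if backend:
        out[BACKEND_HEADER] = backend
        # Drop the model header so only the UUID header reaches the router.
        out.pop(MODEL_HEADER, None)
        return out
    model = parsed.get("model")
    if model is not None:
        out[MODEL_HEADER] = model
    return out


def run_chain(filters, headers, body):
    """Apply each filter in order and return the headers the router would see."""
    for f in filters:
        headers = f(headers, body)
    return headers


with_backend = json.dumps({
    "model": "openai/gpt-oss-20b",
    "backend": "36be1743-bea0-42be-9e59-981ec4935ea2",
})
without_backend = json.dumps({"model": "openai/gpt-oss-20b"})

# Final ordering from the last commit: ext_proc runs before lua.
fixed = run_chain([ext_proc, lua_filter], {}, with_backend)
# Earlier ordering: ext_proc runs after lua and re-adds the model header.
earlier = run_chain([lua_filter, ext_proc], {}, with_backend)
# A standard client that omits "backend" keeps model-name routing.
fallback = run_chain([ext_proc, lua_filter], {}, without_backend)
```

In this model, `fixed` carries only x-ai-eg-backend, `earlier` carries both headers (the situation that triggered the misrouting), and `fallback` carries only x-ai-eg-model.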
Problem

When two AIMs deploy the same base model in different projects, inference requests routed via the AI Gateway using the backend field (AIM UUID) are silently sent to the wrong AIM.

Root cause: two interacting bugs.

1. Lua filter sets x-ai-eg-model even when backend is present. When a client includes "backend": "<aim-uuid>" in the request body, the filter was setting both x-ai-eg-backend and x-ai-eg-model. Only the x-ai-eg-backend extraction was added in d9f5416 (on the EAI-5821 branch, never merged to main); main only sets x-ai-eg-model.

2. Envoy evaluates AIGatewayRoute rules in creation order across all routes. Each AIM gets two rules: rule 1 matches on x-ai-eg-backend: <uuid>, rule 2 matches on x-ai-eg-model: <model-name>. When both headers are present, an older AIM's rule 2 is evaluated before the target AIM's rule 1. The older AIM wins and the request is routed there instead.

Observed symptom: TTFT and token metrics appear on the wrong AIM's workload details page. Confirmed on app-dev with two deployments of openai/gpt-oss-20b: requests sent to AIM 36be1743 were actually served by AIM 0d659598, which had an older AIGatewayRoute.

The ai-gateway-discovery side of this (AIGatewayRoute rule ordering) was fixed in core commit 31806fe/3570f92b1. This PR fixes the gateway side.

Fix

When "backend" is extracted from the request body, set x-ai-eg-backend and return immediately without setting x-ai-eg-model. This leaves only one matching header in the request, so Envoy's first-match evaluation always reaches the correct AIM's UUID rule.

A secondary guard also skips x-ai-eg-model if x-ai-eg-backend is already present in the headers (set by the caller directly rather than via the body field).

Standard OpenAI-compatible clients that omit "backend" are unaffected; they continue to get x-ai-eg-model set for model-name fallback routing.

Test plan

- … targetRevision to main-<sha> after merge)
- … 36be1743 (qa-test-june) with "backend": "36be1743-bea0-42be-9e59-981ec4935ea2" in the body
- … 36be1743's workload page, not on 0d659598's page
- … "backend" still route correctly via model-name fallback

🤖 Generated with Claude Code
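The second root cause in the Problem section (rules evaluated in creation order, first match wins) can be illustrated with a minimal Python model. It is a sketch under assumptions, not how Envoy represents routes: the `first_match` helper and the rule dictionaries are hypothetical, and the older AIM's backend value is its short ID `0d659598` because its full UUID does not appear in this PR.

```python
TARGET_UUID = "36be1743-bea0-42be-9e59-981ec4935ea2"
MODEL_NAME = "openai/gpt-oss-20b"

# Rules listed in creation order: the older AIM's route was created first.
# Each AIM contributes rule 1 (backend UUID match) and rule 2 (model-name match).
RULES = [
    {"aim": "0d659598", "header": "x-ai-eg-backend", "value": "0d659598"},  # short ID placeholder
    {"aim": "0d659598", "header": "x-ai-eg-model", "value": MODEL_NAME},
    {"aim": "36be1743", "header": "x-ai-eg-backend", "value": TARGET_UUID},
    {"aim": "36be1743", "header": "x-ai-eg-model", "value": MODEL_NAME},
]


def first_match(rules, headers):
    """Return the AIM of the first rule whose header value matches, or None."""
    for rule in rules:
        if headers.get(rule["header"]) == rule["value"]:
            return rule["aim"]
    return None


# Before the fix: both headers present, so the older AIM's rule 2 matches first.
before_fix = first_match(RULES, {
    "x-ai-eg-backend": TARGET_UUID,
    "x-ai-eg-model": MODEL_NAME,
})

# After the fix: only the backend header remains, so the target's rule 1 matches.
after_fix = first_match(RULES, {"x-ai-eg-backend": TARGET_UUID})

# Clients that omit "backend" still route by model name (first model-name rule).
model_only = first_match(RULES, {"x-ai-eg-model": MODEL_NAME})
```

In this model `before_fix` resolves to the older AIM `0d659598`, matching the observed symptom, while `after_fix` resolves to `36be1743` regardless of where its rules sit in the list.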