Skip to content

EAI-7012: Fix Lua filter to suppress x-ai-eg-model when backend is specified#762

Closed
johnl-amd wants to merge 5 commits into
mainfrom
EAI-7012-fix-lua-backend-routing
Closed

EAI-7012: Fix Lua filter to suppress x-ai-eg-model when backend is specified#762
johnl-amd wants to merge 5 commits into
mainfrom
EAI-7012-fix-lua-backend-routing

Conversation

@johnl-amd

Copy link
Copy Markdown
Contributor

Problem

When two AIMs deploy the same base model in different projects, inference requests routed via the AI Gateway using the backend field (AIM UUID) are silently sent to the wrong AIM.

Root cause — two interacting bugs:

  1. Lua filter sets x-ai-eg-model even when backend is present. When a client includes "backend": "<aim-uuid>" in the request body, the filter was setting both x-ai-eg-backend and x-ai-eg-model. Only the x-ai-eg-backend extraction was added in d9f5416 (on the EAI-5821 branch, never merged to main); main only sets x-ai-eg-model.

  2. Envoy evaluates AIGatewayRoute rules in creation order across all routes. Each AIM gets two rules: rule 1 matches on x-ai-eg-backend: <uuid>, rule 2 matches on x-ai-eg-model: <model-name>. When both headers are present, an older AIM's rule 2 is evaluated before the target AIM's rule 1. The older AIM wins and the request is routed there instead.

Observed symptom: TTFT and token metrics appear on the wrong AIM's workload details page. Confirmed on app-dev with two deployments of openai/gpt-oss-20b — requests sent to AIM 36be1743 were actually served by AIM 0d659598, which had an older AIGatewayRoute.

The ai-gateway-discovery side of this (AIGatewayRoute rule ordering) was fixed in core commit 31806fe / 3570f92b1. This PR fixes the gateway side.

Fix

When "backend" is extracted from the request body, set x-ai-eg-backend and return immediately without setting x-ai-eg-model. This leaves only one matching header in the request, so Envoy's first-match evaluation always reaches the correct AIM's UUID rule.

A secondary guard also skips x-ai-eg-model if x-ai-eg-backend is already present in the headers (set by the caller directly rather than via the body field).

Standard OpenAI-compatible clients that omit "backend" are unaffected — they continue to get x-ai-eg-model set for model-name fallback routing.

Test plan

  • Deploy to app-dev (update targetRevision to main-<sha> after merge)
  • Send requests to AIM 36be1743 (qa-test-june) with "backend": "36be1743-bea0-42be-9e59-981ec4935ea2" in the body
  • Verify TTFT and request count metrics appear on 36be1743's workload page, not on 0d659598's page
  • Verify standard requests without "backend" still route correctly via model-name fallback

🤖 Generated with Claude Code

…ecified

When a client includes "backend": "<aim-uuid>" in the request body, the Lua
filter now sets x-ai-eg-backend and returns immediately without setting
x-ai-eg-model.

Previously the filter always set x-ai-eg-model from the "model" field,
even when "backend" was also present. Envoy evaluates AIGatewayRoute rules
in creation order across all routes, so an older AIM's model-name rule
(rule 2) could fire before the target AIM's UUID rule (rule 1), routing
the request to the wrong AIM and misattributing metrics to it.

The fix: when "backend" is extracted from the body (or x-ai-eg-backend is
already present in the headers), skip x-ai-eg-model entirely. The UUID
rule on the correct AIM's AIGatewayRoute then wins unambiguously regardless
of route creation order.

Standard OpenAI-compatible clients that omit "backend" are unaffected —
they continue to get x-ai-eg-model set for model-name fallback routing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@johnl-amd johnl-amd requested a review from a team as a code owner June 23, 2026 11:07
johnl-amd and others added 4 commits June 23, 2026 11:43
The AI Gateway ext_proc runs before this Lua filter and sets x-ai-eg-model
unconditionally from the request body. When a client also includes "backend"
in the body, the Lua filter sets x-ai-eg-backend but left x-ai-eg-model in
place. The router then evaluated both headers and matched the older AIM's
model-name rule first (Envoy cross-route first-match semantics), routing to
the wrong AIM.

Remove x-ai-eg-model after setting x-ai-eg-backend so only the UUID header
remains when routing runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EPP ext_proc filters (one per InferencePool) run after the first Lua filter
and re-add x-ai-eg-model from the request body. This causes the oldest AIM's
model-name route rule to fire first via Envoy first-match, routing to the
wrong AIM when x-ai-eg-backend is set.

Add a second EnvoyExtensionPolicy (clear-model-header-for-routing) with a
simple header-only Lua filter that removes x-ai-eg-model when x-ai-eg-backend
is present. Position it before envoy.filters.http.router via filterOrder so it
runs after all EPP ext_procs and has the final word before the router matches.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Envoy Gateway allows only one EnvoyExtensionPolicy per Gateway target.
The separate clear-model-header-for-routing policy was rejected as Conflicted.

Instead, add the clear filter as lua[1] in the existing set-model-header-from-body
policy and reference it as lua/1 in filterOrder. Effect is identical: lua/0 runs
before ext_authz (sets routing headers from body), lua/1 runs before the router
(re-clears x-ai-eg-model after EPP ext_procs re-add it).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The EnvoyProxy CRD only accepts generic filter type names in filterOrder,
not specific instance names or envoy.filters.http.router as a before target.

Instead of positioning lua/1 after EPP ext_procs, move all ext_proc filters
before the lua filter. EPPs then add x-ai-eg-model first, then Lua/0 reads
the body, sets x-ai-eg-backend, and removes x-ai-eg-model before ext_authz
and the router run. The redundant lua[1] entry is removed.

Result filter chain: ext_proc (EPPs) → lua → ext_authz → router

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@johnl-amd johnl-amd closed this Jun 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant