Summary
Add support for fetching routing policies from an external HTTP endpoint. Plano makes an HTTP call to a configured URL, receives routing preferences as JSON, and caches them locally with a configurable TTL. This keeps Plano's policy integration generic — the external service can be backed by anything (database, config service, API gateway).
In a multitenant deployment, the caller includes a policy_id and revision in the routing request payload. Plano uses these when fetching and caching the policy — enabling per-tenant/per-customer routing policies with revision-aware caching.
Configuration
routing:
  policy_provider:
    url: "https://my-service.internal/v1/routing-policy"
    headers:
      Authorization: "Bearer $POLICY_API_KEY"
    ttl_seconds: 300
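As a rough illustration of how this block could be consumed, the sketch below reads an already-parsed config mapping into a small settings object. It is not Plano's loader. The names `PolicyProviderConfig` and `load_policy_provider` are hypothetical, and two behaviors are assumptions rather than part of the proposal: that `$POLICY_API_KEY` is substituted from the environment, and that `ttl_seconds` defaults to 300 when omitted.

```python
import os
from dataclasses import dataclass, field
from string import Template
from typing import Dict, Mapping


@dataclass(frozen=True)
class PolicyProviderConfig:
    """Hypothetical in-memory form of the routing.policy_provider block."""
    url: str
    headers: Dict[str, str] = field(default_factory=dict)
    ttl_seconds: int = 300  # assumed default; the proposal only shows 300 as an example


def load_policy_provider(config: Mapping, env: Mapping = os.environ) -> PolicyProviderConfig:
    section = config["routing"]["policy_provider"]
    # Assumption: "$NAME" placeholders in header values come from the environment.
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    headers = {
        name: Template(value).safe_substitute(env)
        for name, value in section.get("headers", {}).items()
    }
    ttl = int(section.get("ttl_seconds", 300))
    if ttl <= 0:
        raise ValueError("ttl_seconds must be positive")
    return PolicyProviderConfig(url=section["url"], headers=headers, ttl_seconds=ttl)
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.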
Routing request
{
  "messages": [...],
  "policy_id": "customer-abc-123",
  "revision": 42
}
revision is a monotonically increasing integer. When the caller sends a higher revision than what's cached, Plano fetches the updated policy.
When policy_id is present and no inline routing_policy is provided, Plano fetches the policy from the configured endpoint:
GET https://my-service.internal/v1/routing-policy?policy_id=customer-abc-123&revision=42
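A minimal sketch of how the fetch URL above could be assembled from the configured base URL. `build_policy_url` is a hypothetical helper, and leaving out the `revision` parameter when the caller did not send one is an assumption; the proposal only shows the form with both parameters.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_policy_url(base_url: str, policy_id: str, revision: Optional[int] = None) -> str:
    parts = urlsplit(base_url)
    # Preserve any query parameters already present on the configured URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("policy_id", policy_id))
    if revision is not None:
        query.append(("revision", str(revision)))
    # urlencode percent-encodes values, so unusual policy IDs stay safe in the URL.
    return urlunsplit(parts._replace(query=urlencode(query)))


url = build_policy_url(
    "https://my-service.internal/v1/routing-policy", "customer-abc-123", 42
)
```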
Routing response (returned to the caller)
{
  "model": "gpt-4o",
  "route": "quick",
  "trace_id": "abc123..."
}
Expected payload from external policy endpoint
{
  "policy_id": "customer-abc-123",
  "revision": 42,
  "schema_version": "v1",
  "routing_preferences": [
    {
      "model": "gpt-4o",
      "routing_preferences": [
        {"name": "quick response", "description": "fast lightweight responses"}
      ]
    },
    {
      "model": "claude-sonnet",
      "routing_preferences": [
        {"name": "deep analysis", "description": "comprehensive detailed analysis"}
      ]
    }
  ]
}
policy_id: identifies the policy (e.g. per-customer)
revision: monotonically increasing integer identifying the revision of the policy
schema_version: the format of the policy document itself (e.g. "v1"). Plano validates this and rejects unsupported versions. The first implementation supports v1 only.
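The field checks described above might look roughly like the following. This is a sketch under stated assumptions, not the actual validator: `validate_policy` and `PolicyValidationError` are hypothetical names, and the check that `routing_preferences` is a list goes beyond what the proposal spells out.

```python
from typing import Any, Dict, Optional

SUPPORTED_SCHEMA_VERSIONS = {"v1"}  # first implementation supports v1 only


class PolicyValidationError(ValueError):
    """Raised when the endpoint's payload cannot be used."""


def validate_policy(
    payload: Dict[str, Any], policy_id: str, revision: Optional[int] = None
) -> Dict[str, Any]:
    version = payload.get("schema_version")
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise PolicyValidationError(f"unsupported schema_version: {version!r}")
    if payload.get("policy_id") != policy_id:
        raise PolicyValidationError("policy_id does not match the request")
    got = payload.get("revision")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(got, int) or isinstance(got, bool):
        raise PolicyValidationError("revision must be an integer")
    if revision is not None and got != revision:
        raise PolicyValidationError(f"expected revision {revision}, got {got}")
    if not isinstance(payload.get("routing_preferences"), list):
        raise PolicyValidationError("routing_preferences must be a list")
    return payload
```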
Caching behavior
- Cache key: policy_id with stored revision
- On request: if cached revision >= requested revision, use cache. If requested revision > cached revision, fetch fresh.
- Cache entries also expire via TTL as a safety net
- If revision is omitted, cache key is just policy_id and TTL is the only invalidation
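The caching rules above can be sketched as a small in-memory cache. This is an illustration only: `PolicyCache` and `CacheEntry` are hypothetical names, the injectable `clock` exists purely to make expiry testable, and treating an entry stored without a revision as a miss for any revisioned request is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheEntry:
    policy: dict
    revision: Optional[int]
    expires_at: float


class PolicyCache:
    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, CacheEntry] = {}

    def get(self, policy_id: str, requested_revision: Optional[int] = None) -> Optional[dict]:
        entry = self._entries.get(policy_id)
        if entry is None:
            return None
        if self._clock() >= entry.expires_at:
            # TTL safety net: drop stale entries regardless of revision.
            del self._entries[policy_id]
            return None
        if requested_revision is not None and (
            entry.revision is None or entry.revision < requested_revision
        ):
            return None  # caller asked for something newer than what is cached
        return entry.policy

    def put(self, policy_id: str, revision: Optional[int], policy: dict) -> None:
        self._entries[policy_id] = CacheEntry(policy, revision, self._clock() + self._ttl)
```

A request for an older revision than the cached one is served from cache, which matches the "cached revision >= requested revision" rule.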
Flow
- Routing request comes in with policy_id and revision (and no inline policy)
- Plano checks local cache for policy_id
- If cached and cached revision >= requested revision, use cached policy
- Otherwise, make HTTP request to configured URL with policy_id and revision
- Validate response: policy_id and revision match, schema_version is supported
- Cache result and pass policy to RouterService::determine_route()
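Put together, the steps above could be orchestrated along these lines. This is a simplified sketch, not the Rust implementation: `resolve_remote_policy` is a hypothetical name, `fetch` is a stand-in for the HTTP call, and the cache is a plain dict mapping policy_id to `(policy, expires_at)`. The hand-off to `RouterService::determine_route()` is left to the caller.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Cache = Dict[str, Tuple[dict, float]]


def resolve_remote_policy(
    policy_id: str,
    revision: Optional[int],
    cache: Cache,
    fetch: Callable[[str, Optional[int]], dict],
    ttl_seconds: float,
    now: Callable[[], float] = time.monotonic,
) -> dict:
    # Steps 2-3: serve from cache when the entry is fresh and new enough.
    hit = cache.get(policy_id)
    if hit is not None:
        policy, expires_at = hit
        cached_rev = policy.get("revision")
        new_enough = revision is None or (
            isinstance(cached_rev, int) and cached_rev >= revision
        )
        if now() < expires_at and new_enough:
            return policy

    # Step 4: otherwise ask the external endpoint.
    policy = fetch(policy_id, revision)

    # Step 5: validate before trusting the response.
    if policy.get("policy_id") != policy_id:
        raise ValueError("policy_id mismatch")
    if revision is not None and policy.get("revision") != revision:
        raise ValueError("revision mismatch")
    if policy.get("schema_version") != "v1":
        raise ValueError("unsupported schema_version")

    # Step 6: cache and return; the caller passes the policy on to the router.
    cache[policy_id] = (policy, now() + ttl_seconds)
    return policy
```

Because validation runs before the cache write, a bad response never replaces a good cached entry.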
Resolution order
- Inline routing_policy in request payload (highest priority)
- policy_id + revision → HTTP policy provider (with cache)
- Config-file preferences (default fallback)
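The precedence above reduces to a short decision function. The sketch below is illustrative: `select_policy_source` and its string labels are hypothetical, and it treats the presence of `policy_id` alone as enough to choose the HTTP provider, since `revision` may be omitted.

```python
from typing import Any, Mapping


def select_policy_source(request: Mapping[str, Any]) -> str:
    """Return which source should supply the routing policy for this request."""
    if request.get("routing_policy") is not None:
        return "inline"          # highest priority: policy carried in the payload
    if request.get("policy_id") is not None:
        return "http_provider"   # fetched from the configured endpoint, with cache
    return "config_file"         # default fallback: preferences from the config file
```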