Stackbilt-dev · stackbilt-admin · May 6, 2026
diff --git a/README.md b/README.md
@@ -43,8 +43,8 @@ Zero LLM calls for file generation. ~20ms for structure, ~2s with oracle prose.
 - **OAuth 2.1 with PKCE** — GitHub SSO, Google SSO, and email/password authentication
 - **Backend adapter pattern** — tool catalogs aggregated from multiple service bindings, namespaced to avoid collisions
 - **Per-tier rate limiting** — fixed-window per-tenant limits via `RATELIMIT_KV` (free=20/min, hobby=60, pro=300, enterprise=1000); 429 with `Retry-After` and `X-RateLimit-*` headers
-- **Cost attribution & quota** — every tool call carries a credit cost; quota is reserved via `edge-auth` before dispatch and committed/refunded on outcome; `image_generate` cost scales with `quality_tier` (1×/1×/3×/5×/8× for draft/standard/premium/ultra/ultra_plus)
-- **Scope + tier enforcement** — `tools/list` is filtered by token scopes; `tools/call` requires the `generate` scope for mutating tools; expensive `image_generate` quality tiers (`premium` and above) are gated to Pro+ plans
+- **Cost attribution & quota** — every tool call carries a credit cost; quota is reserved via `edge-auth` before dispatch and committed/refunded on outcome; `image_generate` cost scales with the effective quality tier (1×/1×/3×/5×/8× for draft/standard/premium/ultra/ultra_plus); when `model` is set it takes billing precedence over `quality_tier`
+- **Scope + tier enforcement** — `tools/list` is filtered by token scopes; `tools/call` requires the `generate` scope for mutating tools; expensive `image_generate` quality tiers (`premium` and above) are gated to Pro+ plans; specifying `model` directly enforces the same gate via model→tier mapping
 - **Security Constitution compliance** — every tool declares a risk level (`READ_ONLY`, `LOCAL_MUTATION`, `EXTERNAL_MUTATION`); structured audit logging with secret redaction; HMAC-signed identity tokens
 - **Coming-soon gate** — `PUBLIC_SIGNUPS_ENABLED` flag to control public access
 - **MCP JSON-RPC over HTTP** — supports both streaming (SSE) and request/response transport

diff --git a/docs/api-reference.md b/docs/api-reference.md
@@ -208,7 +208,13 @@ Routed to the `IMG_FORGE` service binding (`img-forge-mcp`).
 Generate an image from a text prompt.
 
 - **Risk level**: `EXTERNAL_MUTATION`
-- **Arguments**: `prompt` (string), plus optional model/quality parameters
+- **Arguments**:
+  - `prompt` (string, required) — text description of the image
+  - `quality_tier` (string, optional) — `draft`, `standard` (default), `premium`, `ultra`, `ultra_plus`
+  - `negative_prompt` (string, optional) — things to avoid; effective for `draft` tier only
+  - `aspect_ratio` (string, optional) — `1:1` (default), `3:2`, `2:3`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`
+  - `image_size` (string, optional) — `512`, `1K` (default), `2K`, `4K`
+  - `model` (string, optional) — `gemini-3.1-flash-image-preview` (maps to `ultra`), `gemini-3-pro-image-preview` (maps to `ultra_plus`); when set, takes billing and tier-enforcement precedence over `quality_tier`
 
 ### `image_list_models`
 
@@ -362,11 +368,11 @@ The window is fixed (aligned to the start of each 60-second slot), not sliding.
 
 ## Quota & Cost Attribution
 
-Mutating tool calls reserve credits via `AUTH_SERVICE.consumeQuota` before dispatch. The cost table lives in `src/cost-attribution.ts`; `image_generate` cost is `5 × quality multiplier` where multipliers are `draft=1, standard=1, premium=3, ultra=5, ultra_plus=8`. Read-only tools (`*_status`, `*_classify`, `image_list_models`, etc.) are free.
+Mutating tool calls reserve credits via `AUTH_SERVICE.consumeQuota` before dispatch. The cost table lives in `src/cost-attribution.ts`; `image_generate` cost is `5 × quality multiplier` where multipliers are `draft=1, standard=1, premium=3, ultra=5, ultra_plus=8`. When `model` is set, the effective tier is derived from the model (`gemini-3.1-flash-image-preview` → `ultra`, `gemini-3-pro-image-preview` → `ultra_plus`) and takes precedence over `quality_tier` for billing. Read-only tools (`*_status`, `*_classify`, `image_list_models`, etc.) are free.
 
 If quota is exceeded, the call is rejected with `INVALID_PARAMS` and the message `Quota exceeded for <tool>`.
 
-For free and hobby tiers, `image_generate` quality tiers above `standard` are rejected at the gateway with `Quality tier "<x>" requires a Pro plan or higher` — these calls do not reach the backend or consume quota.
+For free and hobby tiers, `image_generate` quality tiers above `standard` are rejected at the gateway with `Quality tier "<x>" requires a Pro plan or higher` — these calls do not reach the backend or consume quota. This gate applies whether the tier is set via `quality_tier` or derived from `model`.
 
 ---
 

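The billing rule in the Quota & Cost Attribution text above can be checked with a small standalone sketch. It copies the multiplier and model tables from the documentation; the helper names `effectiveTier` and `imageCredits` are hypothetical and are not the functions exported by `src/cost-attribution.ts`.

```typescript
// Illustrative sketch only: tables copied from the documented values.
const BASE_COST = 5; // documented image_generate base cost

const QUALITY_MULTIPLIER: Record<string, number> = {
  draft: 1,
  standard: 1,
  premium: 3,
  ultra: 5,
  ultra_plus: 8,
};

const MODEL_TIER: Record<string, string> = {
  'gemini-3.1-flash-image-preview': 'ultra',
  'gemini-3-pro-image-preview': 'ultra_plus',
};

interface ImageArgs {
  quality_tier?: string;
  model?: string;
}

// Hypothetical helper: a mapped model wins, then quality_tier, then 'standard'.
function effectiveTier(args: ImageArgs = {}): string {
  const fromModel = args.model ? MODEL_TIER[args.model] : undefined;
  return fromModel ?? args.quality_tier ?? 'standard';
}

// Hypothetical helper: credits = base cost x multiplier of the effective tier.
function imageCredits(args: ImageArgs = {}): number {
  const multiplier = QUALITY_MULTIPLIER[effectiveTier(args)] ?? 1;
  return BASE_COST * multiplier;
}

// Model overrides an explicit, cheaper quality_tier.
const overridden = imageCredits({
  model: 'gemini-3-pro-image-preview',
  quality_tier: 'standard',
});
console.log(overridden);
```

Under these assumptions, a call that sets `model` to `gemini-3-pro-image-preview` costs 5 × 8 = 40 credits even when `quality_tier` is `standard`, while a call with neither argument costs 5 credits.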
diff --git a/docs/architecture.md b/docs/architecture.md
@@ -170,7 +170,7 @@ Risk levels drive both audit classification AND authorization:
 
 - **`tools/list` filter** — `READ_ONLY` tools are visible to any authenticated session; tools with any other risk level are hidden from sessions that lack the `generate` scope.
 - **`tools/call` enforcement** — `LOCAL_MUTATION`, `EXTERNAL_MUTATION`, and `DESTRUCTIVE` tools require the `generate` scope and return `INVALID_REQUEST` with audit outcome `insufficient_scope` otherwise.
-- **Tier-restricted quality tiers** — `image_generate` arguments with `quality_tier` of `premium`, `ultra`, or `ultra_plus` require a Pro+ plan; free/hobby calls are rejected at the gateway with audit outcome `tier_denied` (see `enforceTierRestriction` in `src/gateway.ts`).
+- **Tier-restricted quality tiers** — `image_generate` with `quality_tier` of `premium`, `ultra`, or `ultra_plus`, or with `model` set to a Gemini variant, requires a Pro+ plan; the effective tier is resolved via `resolveImageQualityTier` (model wins over `quality_tier`); free/hobby calls are rejected at the gateway with audit outcome `tier_denied` (see `enforceTierRestriction` in `src/gateway.ts`).
 
 ## Audit — `audit.ts`
 
@@ -237,7 +237,7 @@ The gateway-side limiter fires first (immediately after auth resolution) and sho
 
 ### Quota & Cost Attribution
 
-`src/cost-attribution.ts` declares per-tool credit costs and an `image_generate` quality multiplier (`draft=1, standard=1, premium=3, ultra=5, ultra_plus=8` × `image_generate.baseCost: 5`). On `tools/call`:
+`src/cost-attribution.ts` declares per-tool credit costs and an `image_generate` quality multiplier (`draft=1, standard=1, premium=3, ultra=5, ultra_plus=8` × `image_generate.baseCost: 5`). When `model` is set, `resolveImageQualityTier` maps it to the effective tier (`gemini-3.1-flash-image-preview` → `ultra`, `gemini-3-pro-image-preview` → `ultra_plus`) before applying the multiplier — model wins over `quality_tier`. On `tools/call`:
 
 1. Resolve cost via `resolveToolCost(toolName, args)`.
 2. If cost is non-zero, call `AUTH_SERVICE.consumeQuota({tenantId, userId, feature, amount})`. On failure, reject with `INVALID_PARAMS` and audit outcome `tier_denied` (overloaded — see follow-ups).

diff --git a/docs/user-guide.md b/docs/user-guide.md
@@ -237,7 +237,7 @@ The client calls `image_generate` with your prompt. img-forge enhances the promp
 }
 ```
 
-**Quality tiers**: `draft` (fastest, SDXL), `standard` (FLUX Klein, default), `premium` (FLUX Dev), `ultra` (Gemini 2.5 Flash), `ultra_plus` (Gemini 3.1 Flash). See [§5 Quota & Billing](#5-quota--billing) for credit costs and plan availability — `premium` and above require Pro or Enterprise.
+**Quality tiers**: `draft` (fastest, SDXL), `standard` (FLUX Klein, default), `premium` (FLUX Dev), `ultra` (`gemini-3.1-flash-image-preview`), `ultra_plus` (`gemini-3-pro-image-preview`). You can also pass `aspect_ratio` (e.g. `2:3` for portrait), `image_size` (`512`/`1K`/`2K`/`4K`), and `model` directly. When `model` is set it determines the billing tier. See [§5 Quota & Billing](#5-quota--billing) for credit costs and plan availability — `premium` and above require Pro or Enterprise.
 
 ### Classify Intent
 
@@ -344,15 +344,15 @@ Most read-only tools (`*_status`, `*_classify`, `*_summary`, `*_quality`, `*_gov
 
 ### `image_generate` quality multipliers
 
-| Quality tier | Multiplier | Effective cost | Available on |
-|--------------|-----------|----------------|--------------|
-| `draft` | 1× | 5 credits | All tiers |
-| `standard` | 1× | 5 credits | All tiers |
-| `premium` | 3× | 15 credits | Pro + Enterprise only |
-| `ultra` | 5× | 25 credits | Pro + Enterprise only |
-| `ultra_plus` | 8× | 40 credits | Pro + Enterprise only |
+| Quality tier | Model | Multiplier | Effective cost | Available on |
+|--------------|-------|-----------|----------------|--------------|
+| `draft` | SDXL | 1× | 5 credits | All plans |
+| `standard` | FLUX Klein | 1× | 5 credits | All plans |
+| `premium` | FLUX Dev | 3× | 15 credits | Pro + Enterprise only |
+| `ultra` | `gemini-3.1-flash-image-preview` | 5× | 25 credits | Pro + Enterprise only |
+| `ultra_plus` | `gemini-3-pro-image-preview` | 8× | 40 credits | Pro + Enterprise only |
 
-Free and Hobby plans can request `draft` or `standard` only. Calls with higher quality tiers are rejected at the gateway with `Quality tier "<x>" requires a Pro plan or higher`.
+When `model` is set explicitly, billing uses the model-derived tier regardless of `quality_tier` — `gemini-3-pro-image-preview` always bills at `ultra_plus` (40 credits). Free and Hobby plans can request `draft` or `standard` only; calls with higher tiers (or Gemini models) are rejected at the gateway with `Quality tier "<x>" requires a Pro plan or higher`.
 
 ### How metering works
 

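For readers of the user guide, a request that uses the new arguments might be assembled as follows. This is a sketch under assumptions: the argument names and allowed values come from the tool schema in this PR, the envelope follows JSON-RPC 2.0, and `buildImageGenerateCall` is a hypothetical helper. The endpoint URL and authentication headers are deliberately left out because they are not part of this diff.

```typescript
// Allowed values copied from the image_generate input schema in this PR.
type QualityTier = 'draft' | 'standard' | 'premium' | 'ultra' | 'ultra_plus';
type AspectRatio =
  | '1:1' | '3:2' | '2:3' | '3:4' | '4:3'
  | '4:5' | '5:4' | '9:16' | '16:9' | '21:9';
type ImageSize = '512' | '1K' | '2K' | '4K';
type ImageModel = 'gemini-3.1-flash-image-preview' | 'gemini-3-pro-image-preview';

interface ImageGenerateArgs {
  prompt: string;
  quality_tier?: QualityTier;
  negative_prompt?: string;
  aspect_ratio?: AspectRatio;
  image_size?: ImageSize;
  model?: ImageModel;
}

interface ToolsCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: 'image_generate'; arguments: ImageGenerateArgs };
}

// Hypothetical helper: wraps the arguments in a JSON-RPC 2.0 tools/call envelope.
function buildImageGenerateCall(id: number, args: ImageGenerateArgs): ToolsCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'image_generate', arguments: args },
  };
}

const request = buildImageGenerateCall(1, {
  prompt: 'a tarot card',
  aspect_ratio: '2:3', // portrait
  image_size: '2K',
  model: 'gemini-3-pro-image-preview',
});

// Serialized body, ready to send over whichever transport the client uses.
const payload = JSON.stringify(request);
console.log(payload);
```

According to the table above, this request resolves to `ultra_plus` and bills 40 credits, so it is only accepted on Pro and Enterprise plans.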
diff --git a/src/cost-attribution.ts b/src/cost-attribution.ts
@@ -52,6 +52,21 @@ const IMAGE_QUALITY_MULTIPLIER: Record<string, number> = {
   ultra_plus: 8,
 };
 
+// Maps model names to their effective quality tier for quota/audit.
+// A mapped model wins over quality_tier; unmapped models fall back to quality_tier.
+const MODEL_QUALITY_TIER: Record<string, string> = {
+  'gemini-3.1-flash-image-preview': 'ultra',
+  'gemini-3-pro-image-preview':     'ultra_plus',
+};
+
+/** Resolve the effective quality tier from image_generate args: mapped model, else quality_tier, else 'standard'. */
+export function resolveImageQualityTier(args?: Record<string, unknown>): string {
+  if (args?.model) {
+    return MODEL_QUALITY_TIER[args.model as string] ?? (args.quality_tier as string) ?? 'standard';
+  }
+  return (args?.quality_tier as string) ?? 'standard';
+}
+
 /**
  * Resolve the credit cost for a tool call, factoring in quality tier for images.
  */
@@ -65,9 +80,10 @@ export function resolveToolCost(
     return { baseCost: 1, feature: `mcp.${toolName}` };
   }
 
-  // Apply quality multiplier for image_generate
-  if (toolName === 'image_generate' && args?.quality_tier) {
-    const multiplier = IMAGE_QUALITY_MULTIPLIER[args.quality_tier as string] ?? 1;
+  // Apply quality multiplier for image_generate (model wins over quality_tier)
+  if (toolName === 'image_generate') {
+    const effectiveTier = resolveImageQualityTier(args);
+    const multiplier = IMAGE_QUALITY_MULTIPLIER[effectiveTier] ?? 1;
     return { ...base, baseCost: base.baseCost * multiplier };
   }
 

diff --git a/src/gateway.ts b/src/gateway.ts
@@ -14,7 +14,7 @@ import { publishToGitHub } from './scaffold-publish.js';
 import { classifyIntention, type IntentClassification } from './intent-classifier.js';
 import { logDivergence } from './divergence-logger.js';
 import { checkRateLimit, rateLimitHeaders, type RateLimitResult } from './rate-limiter.js';
-import { reserveQuota, settleQuota, buildCostAttribution, isFreeTool } from './cost-attribution.js';
+import { reserveQuota, settleQuota, buildCostAttribution, isFreeTool, resolveImageQualityTier } from './cost-attribution.js';
 
 const MCP_PROTOCOL_VERSION = '2025-03-26';
 const JSON_RPC_PARSE_ERROR = -32700;
@@ -189,7 +189,7 @@ function enforceTierRestriction(
   tier: Tier,
 ): string | null {
   if (toolName !== 'image_generate') return null;
-  const qualityTier = (args?.quality_tier as string) ?? 'standard';
+  const qualityTier = resolveImageQualityTier(args);
   const allowed = TIER_ALLOWED_QUALITY[tier];
   if (!allowed || allowed.has(qualityTier)) return null;
   return `Quality tier "${qualityTier}" requires a Pro plan or higher. Your current plan: ${tier}. Available tiers: ${[...allowed].join(', ')}.`;
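
The effect of switching the gate to the resolved tier can be illustrated with a standalone sketch. The contents of `TIER_ALLOWED_QUALITY` are not shown in this diff, so the `ALLOWED_BY_PLAN` table below is an assumption based on the documented rule that free and hobby plans may use `draft` and `standard` only. The names `ALLOWED_BY_PLAN`, `MODEL_TO_TIER`, and `gateImageCall` are hypothetical.

```typescript
type Plan = 'free' | 'hobby' | 'pro' | 'enterprise';

const ALL_TIERS = ['draft', 'standard', 'premium', 'ultra', 'ultra_plus'];

// Assumed stand-in for TIER_ALLOWED_QUALITY (not shown in the diff).
const ALLOWED_BY_PLAN: Record<Plan, ReadonlySet<string>> = {
  free: new Set(['draft', 'standard']),
  hobby: new Set(['draft', 'standard']),
  pro: new Set(ALL_TIERS),
  enterprise: new Set(ALL_TIERS),
};

// Model-to-tier mapping as documented in this PR.
const MODEL_TO_TIER: Record<string, string> = {
  'gemini-3.1-flash-image-preview': 'ultra',
  'gemini-3-pro-image-preview': 'ultra_plus',
};

interface GateArgs {
  quality_tier?: string;
  model?: string;
}

// Hypothetical gate: returns an error message, or null when the call is allowed.
function gateImageCall(plan: Plan, args: GateArgs = {}): string | null {
  const fromModel = args.model ? MODEL_TO_TIER[args.model] : undefined;
  const tier = fromModel ?? args.quality_tier ?? 'standard';
  if (ALLOWED_BY_PLAN[plan].has(tier)) return null;
  return `Quality tier "${tier}" requires a Pro plan or higher. Your current plan: ${plan}.`;
}

// Before this change, a free-plan call with only `model` set would have been
// checked against the default 'standard' tier and allowed through.
const denied = gateImageCall('free', { model: 'gemini-3-pro-image-preview' });
const allowed = gateImageCall('pro', { model: 'gemini-3-pro-image-preview' });
console.log(denied, allowed);
```

In this sketch an unmapped model with no `quality_tier` resolves to `standard` and passes the gate on every plan; whether schema validation rejects such a value earlier is not visible in the diff.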

diff --git a/src/tool-registry.ts b/src/tool-registry.ts
@@ -141,7 +141,8 @@ const TOOL_SPECS: ToolSpec[] = [
       'Generate an image from a text prompt. Returns a URL to the generated image ' +
       'and metadata about how the prompt was enhanced. Supports 5 quality tiers: ' +
       'draft (fastest, SDXL), standard (FLUX Klein, default), premium (FLUX Dev), ' +
-      'ultra (Gemini 2.5 Flash), ultra_plus (Gemini 3.1 Flash). ' +
+      'ultra (gemini-3.1-flash-image-preview), ultra_plus (gemini-3-pro-image-preview). ' +
+      'Optionally override aspect ratio, output resolution, and model directly. ' +
       'Generation takes 5-30 seconds depending on tier.',
     inputSchema: {
       type: 'object',
@@ -166,6 +167,26 @@ const TOOL_SPECS: ToolSpec[] = [
           type: 'string',
           description: 'Things to avoid in the image (only effective for draft tier with SDXL).',
         },
+        aspect_ratio: {
+          type: 'string',
+          enum: ['1:1', '3:2', '2:3', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9'],
+          default: '1:1',
+          description: 'Image aspect ratio. Defaults to 1:1 (square).',
+        },
+        image_size: {
+          type: 'string',
+          enum: ['512', '1K', '2K', '4K'],
+          default: '1K',
+          description: 'Output resolution. Defaults to 1K.',
+        },
+        model: {
+          type: 'string',
+          enum: ['gemini-3.1-flash-image-preview', 'gemini-3-pro-image-preview'],
+          description:
+            'Model override. gemini-3.1-flash-image-preview=ultra tier, ' +
+            'gemini-3-pro-image-preview=ultra_plus tier (both require a Pro plan or higher). ' +
+            'When set, model takes precedence over quality_tier for billing and plan gating.',
+        },
       },
       required: ['prompt'],
     },

diff --git a/test/cost-attribution.test.ts b/test/cost-attribution.test.ts
@@ -5,6 +5,7 @@ import {
   reserveQuota,
   settleQuota,
   buildCostAttribution,
+  resolveImageQualityTier,
 } from '../src/cost-attribution.js';
 import type { AuthServiceRpc } from '../src/types.js';
 
@@ -42,6 +43,29 @@ describe('resolveToolCost', () => {
     expect(ultra.baseCost).toBeGreaterThan(draft.baseCost);
   });
 
+  it('applies model-derived tier for image_generate (model wins)', () => {
+    const flash = resolveToolCost('image_generate', { model: 'gemini-3.1-flash-image-preview' });
+    const pro   = resolveToolCost('image_generate', { model: 'gemini-3-pro-image-preview' });
+    expect(flash.baseCost).toBe(5 * 5);   // ultra multiplier
+    expect(pro.baseCost).toBe(5 * 8);     // ultra_plus multiplier
+  });
+
+  it('model wins over quality_tier when both are set', () => {
+    const cost = resolveToolCost('image_generate', {
+      model: 'gemini-3-pro-image-preview',
+      quality_tier: 'standard',
+    });
+    expect(cost.baseCost).toBe(5 * 8); // ultra_plus, not standard
+  });
+
+  it('falls back to quality_tier for unknown model', () => {
+    const cost = resolveToolCost('image_generate', {
+      model: 'unknown-model',
+      quality_tier: 'premium',
+    });
+    expect(cost.baseCost).toBe(5 * 3); // premium multiplier
+  });
+
   it('returns default cost for unknown tools', () => {
     const cost = resolveToolCost('unknown_tool');
     expect(cost.baseCost).toBe(1);
@@ -142,3 +166,40 @@ describe('buildCostAttribution', () => {
     expect(attr.creditCost).toBe(2);
   });
 });
+
+describe('resolveImageQualityTier', () => {
+  it('returns quality_tier when no model', () => {
+    expect(resolveImageQualityTier({ quality_tier: 'premium' })).toBe('premium');
+  });
+
+  it('returns standard when no args', () => {
+    expect(resolveImageQualityTier()).toBe('standard');
+    expect(resolveImageQualityTier({})).toBe('standard');
+  });
+
+  it('maps gemini-3.1-flash-image-preview to ultra', () => {
+    expect(resolveImageQualityTier({ model: 'gemini-3.1-flash-image-preview' })).toBe('ultra');
+  });
+
+  it('maps gemini-3-pro-image-preview to ultra_plus', () => {
+    expect(resolveImageQualityTier({ model: 'gemini-3-pro-image-preview' })).toBe('ultra_plus');
+  });
+
+  it('model wins over quality_tier', () => {
+    expect(resolveImageQualityTier({
+      model: 'gemini-3-pro-image-preview',
+      quality_tier: 'standard',
+    })).toBe('ultra_plus');
+  });
+
+  it('falls back to quality_tier for unknown model', () => {
+    expect(resolveImageQualityTier({
+      model: 'unknown-future-model',
+      quality_tier: 'ultra',
+    })).toBe('ultra');
+  });
+
+  it('falls back to standard for unknown model with no quality_tier', () => {
+    expect(resolveImageQualityTier({ model: 'unknown-future-model' })).toBe('standard');
+  });
+});
diff --git a/test/gateway.test.ts b/test/gateway.test.ts
@@ -441,6 +441,72 @@ describe('handleMcpRequest', () => {
       expect(body.result).toBeTruthy();
     });
 
+    it('denies free-tier user who passes model=gemini-3-pro-image-preview without quality_tier', async () => {
+      const env = makeEnv({
+        AUTH_SERVICE: {
+          ...mockAuthService(),
+          validateApiKey: async () => ({
+            valid: true,
+            tenant_id: 'tenant-1',
+            tier: 'free',
+            scopes: ['generate'],
+          }),
+        },
+      });
+
+      const initReq = rpcRequest('initialize', { protocolVersion: '2025-03-26', capabilities: {}, clientInfo: { name: 'test' } });
+      const initRes = await handleMcpRequest(initReq, env);
+      const sessionId = initRes.headers.get('MCP-Session-Id')!;
+
+      const req = rpcRequest(
+        'tools/call',
+        { name: 'image_generate', arguments: { prompt: 'a tarot card', model: 'gemini-3-pro-image-preview' } },
+        { 'MCP-Session-Id': sessionId },
+      );
+      const res = await handleMcpRequest(req, env);
+      const body = await res.json() as any;
+
+      expect(body.result?.isError ?? body.error).toBeTruthy();
+      const text = body.result?.content?.[0]?.text ?? body.error?.message ?? '';
+      expect(text).toMatch(/pro plan|ultra_plus/i);
+    });
+
+    it('allows pro-tier user who passes model=gemini-3-pro-image-preview', async () => {
+      const env = makeEnv({
+        AUTH_SERVICE: {
+          ...mockAuthService(),
+          validateApiKey: async () => ({
+            valid: true,
+            tenant_id: 'tenant-1',
+            tier: 'pro',
+            scopes: ['generate'],
+          }),
+        },
+        IMG_FORGE: {
+          fetch: async () => new Response(JSON.stringify({
+            jsonrpc: '2.0', id: 1,
+            result: { content: [{ type: 'text', text: 'generated' }] },
+          }), { headers: { 'Content-Type': 'application/json' } }),
+          connect: () => { throw new Error('not implemented'); },
+        } as unknown as Fetcher,
+      });
+
+      const initReq = rpcRequest('initialize', { protocolVersion: '2025-03-26', capabilities: {}, clientInfo: { name: 'test' } });
+      const initRes = await handleMcpRequest(initReq, env);
+      const sessionId = initRes.headers.get('MCP-Session-Id')!;
+
+      const req = rpcRequest(
+        'tools/call',
+        { name: 'image_generate', arguments: { prompt: 'a tarot card', model: 'gemini-3-pro-image-preview' } },
+        { 'MCP-Session-Id': sessionId },
+      );
+      const res = await handleMcpRequest(req, env);
+      const body = await res.json() as any;
+
+      expect(body.result?.isError).toBeFalsy();
+      expect(body.result?.content?.[0]?.text).toBe('generated');
+    });
+
     it('denies all tool calls when token has no scopes', async () => {
       const env = makeEnv({
         AUTH_SERVICE: {