Merged
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,8 +43,8 @@ Zero LLM calls for file generation. ~20ms for structure, ~2s with oracle prose.
- **OAuth 2.1 with PKCE** — GitHub SSO, Google SSO, and email/password authentication
- **Backend adapter pattern** — tool catalogs aggregated from multiple service bindings, namespaced to avoid collisions
- **Per-tier rate limiting** — fixed-window per-tenant limits via `RATELIMIT_KV` (free=20/min, hobby=60, pro=300, enterprise=1000); 429 with `Retry-After` and `X-RateLimit-*` headers
- **Cost attribution & quota** — every tool call carries a credit cost; quota is reserved via `edge-auth` before dispatch and committed/refunded on outcome; `image_generate` cost scales with `quality_tier` (1×/1×/3×/5×/8× for draft/standard/premium/ultra/ultra_plus)
- **Scope + tier enforcement** — `tools/list` is filtered by token scopes; `tools/call` requires the `generate` scope for mutating tools; expensive `image_generate` quality tiers (`premium` and above) are gated to Pro+ plans
- **Cost attribution & quota** — every tool call carries a credit cost; quota is reserved via `edge-auth` before dispatch and committed/refunded on outcome; `image_generate` cost scales with the effective quality tier (1×/1×/3×/5×/8× for draft/standard/premium/ultra/ultra_plus); when `model` is set it takes billing precedence over `quality_tier`
- **Scope + tier enforcement** — `tools/list` is filtered by token scopes; `tools/call` requires the `generate` scope for mutating tools; expensive `image_generate` quality tiers (`premium` and above) are gated to Pro+ plans; specifying `model` directly enforces the same gate via model→tier mapping
- **Security Constitution compliance** — every tool declares a risk level (`READ_ONLY`, `LOCAL_MUTATION`, `EXTERNAL_MUTATION`); structured audit logging with secret redaction; HMAC-signed identity tokens
- **Coming-soon gate** — `PUBLIC_SIGNUPS_ENABLED` flag to control public access
- **MCP JSON-RPC over HTTP** — supports both streaming (SSE) and request/response transport
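
The per-tier rate limiting bullet above can be illustrated with a minimal fixed-window sketch. This is not the gateway's actual limiter: `checkFixedWindow`, `Decision`, and the in-memory `Map` (standing in for `RATELIMIT_KV`) are hypothetical names, and falling back to the free limit for an unknown tier is an assumption. The per-minute limits are the ones listed in the README.

```typescript
// Per-minute limits as listed in the README.
const LIMITS: Record<string, number> = { free: 20, hobby: 60, pro: 300, enterprise: 1000 };
const WINDOW_MS = 60000;

type Decision = { allowed: boolean; remaining: number; retryAfterSeconds: number };

// Hypothetical helper: counts requests per tenant within a fixed 60-second window.
function checkFixedWindow(
  counters: Map<string, number>,
  tenantId: string,
  tier: string,
  nowMs: number,
): Decision {
  const limit = LIMITS[tier] ?? 20; // assumption: unknown tiers get the free limit
  // Fixed window: aligned to the start of the current 60-second slot, not sliding.
  const windowStart = Math.floor(nowMs / WINDOW_MS) * WINDOW_MS;
  const key = `${tenantId}:${windowStart}`;
  const used = counters.get(key) ?? 0;
  // Seconds until the window resets; a 429 response would carry this as Retry-After.
  const retryAfterSeconds = Math.ceil((windowStart + WINDOW_MS - nowMs) / 1000);
  if (used >= limit) {
    return { allowed: false, remaining: 0, retryAfterSeconds };
  }
  counters.set(key, used + 1);
  return { allowed: true, remaining: limit - used - 1, retryAfterSeconds };
}
```

Because the key includes the window start, counters from earlier windows are never consulted again; a KV-backed version would presumably expire them with a TTL.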
12 changes: 9 additions & 3 deletions docs/api-reference.md
Expand Up @@ -208,7 +208,13 @@ Routed to the `IMG_FORGE` service binding (`img-forge-mcp`).
Generate an image from a text prompt.

- **Risk level**: `EXTERNAL_MUTATION`
- **Arguments**: `prompt` (string), plus optional model/quality parameters
- **Arguments**:
  - `prompt` (string, required) — text description of the image
  - `quality_tier` (string, optional) — `draft`, `standard` (default), `premium`, `ultra`, `ultra_plus`
  - `negative_prompt` (string, optional) — things to avoid; effective for `draft` tier only
  - `aspect_ratio` (string, optional) — `1:1` (default), `3:2`, `2:3`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`
  - `image_size` (string, optional) — `512`, `1K` (default), `2K`, `4K`
  - `model` (string, optional) — `gemini-3.1-flash-image-preview` (maps to `ultra`), `gemini-3-pro-image-preview` (maps to `ultra_plus`); when set, takes billing and tier-enforcement precedence over `quality_tier`
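
A client could type and pre-check these arguments along the following lines. This is only a sketch: `ImageGenerateArgs` and `validateImageGenerateArgs` are hypothetical names that do not appear in this PR, and the allowed values are copied from the argument list above.

```typescript
// Allowed values copied from the documented argument list.
const QUALITY_TIERS = ['draft', 'standard', 'premium', 'ultra', 'ultra_plus'] as const;
const ASPECT_RATIOS = ['1:1', '3:2', '2:3', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9'] as const;
const IMAGE_SIZES = ['512', '1K', '2K', '4K'] as const;
const MODELS = ['gemini-3.1-flash-image-preview', 'gemini-3-pro-image-preview'] as const;

// Hypothetical client-side type for image_generate arguments.
type ImageGenerateArgs = {
  prompt: string;
  quality_tier?: (typeof QUALITY_TIERS)[number];
  negative_prompt?: string;
  aspect_ratio?: (typeof ASPECT_RATIOS)[number];
  image_size?: (typeof IMAGE_SIZES)[number];
  model?: (typeof MODELS)[number];
};

// Returns a list of problems; an empty list means the arguments look valid.
function validateImageGenerateArgs(args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  if (typeof args['prompt'] !== 'string') {
    problems.push('prompt (string) is required');
  }
  const negative = args['negative_prompt'];
  if (negative !== undefined && typeof negative !== 'string') {
    problems.push('negative_prompt must be a string');
  }
  const checkEnum = (key: string, allowed: readonly string[]): void => {
    const value = args[key];
    if (value === undefined) return; // optional arguments may be omitted
    if (typeof value !== 'string' || !allowed.includes(value)) {
      problems.push(`${key} must be one of: ${allowed.join(', ')}`);
    }
  };
  checkEnum('quality_tier', QUALITY_TIERS);
  checkEnum('aspect_ratio', ASPECT_RATIOS);
  checkEnum('image_size', IMAGE_SIZES);
  checkEnum('model', MODELS);
  return problems;
}

const example: ImageGenerateArgs = { prompt: 'a tarot card', aspect_ratio: '2:3', image_size: '2K' };
const exampleProblems = validateImageGenerateArgs({ ...example });
```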

### `image_list_models`

Expand Down Expand Up @@ -362,11 +368,11 @@ The window is fixed (aligned to the start of each 60-second slot), not sliding.

## Quota & Cost Attribution

Mutating tool calls reserve credits via `AUTH_SERVICE.consumeQuota` before dispatch. The cost table lives in `src/cost-attribution.ts`; `image_generate` cost is `5 × quality multiplier` where multipliers are `draft=1, standard=1, premium=3, ultra=5, ultra_plus=8`. Read-only tools (`*_status`, `*_classify`, `image_list_models`, etc.) are free.
Mutating tool calls reserve credits via `AUTH_SERVICE.consumeQuota` before dispatch. The cost table lives in `src/cost-attribution.ts`; `image_generate` cost is `5 × quality multiplier` where multipliers are `draft=1, standard=1, premium=3, ultra=5, ultra_plus=8`. When `model` is set, the effective tier is derived from the model (`gemini-3.1-flash-image-preview` → `ultra`, `gemini-3-pro-image-preview` → `ultra_plus`) and takes precedence over `quality_tier` for billing. Read-only tools (`*_status`, `*_classify`, `image_list_models`, etc.) are free.

If quota is exceeded, the call is rejected with `INVALID_PARAMS` and the message `Quota exceeded for <tool>`.

For free and hobby tiers, `image_generate` quality tiers above `standard` are rejected at the gateway with `Quality tier "<x>" requires a Pro plan or higher` — these calls do not reach the backend or consume quota.
For free and hobby tiers, `image_generate` quality tiers above `standard` are rejected at the gateway with `Quality tier "<x>" requires a Pro plan or higher` — these calls do not reach the backend or consume quota. This gate applies whether the tier is set via `quality_tier` or derived from `model`.
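
As a worked example of the rules in this section, the sketch below combines the plan gate and the credit calculation in one standalone function. `quoteImageGenerate`, `Plan`, and `Quote` are hypothetical names used for illustration only; the multipliers, the base cost of 5, and the model mapping are taken from the text above.

```typescript
type Plan = 'free' | 'hobby' | 'pro' | 'enterprise';
type Quote = { allowed: true; credits: number } | { allowed: false; message: string };

const BASE_COST = 5; // image_generate base cost
const MULTIPLIER: Record<string, number> = { draft: 1, standard: 1, premium: 3, ultra: 5, ultra_plus: 8 };
const MODEL_TIER: Record<string, string> = {
  'gemini-3.1-flash-image-preview': 'ultra',
  'gemini-3-pro-image-preview': 'ultra_plus',
};
// Tiers that free and hobby plans may request.
const OPEN_TIERS = new Set(['draft', 'standard']);

// Hypothetical helper: decide whether a call passes the plan gate and what it would cost.
function quoteImageGenerate(plan: Plan, args: { quality_tier?: string; model?: string }): Quote {
  // A recognized model wins; otherwise fall back to quality_tier, then to standard.
  const fromModel: string | undefined = args.model !== undefined ? MODEL_TIER[args.model] : undefined;
  const tier = fromModel ?? args.quality_tier ?? 'standard';
  const proOrAbove = plan === 'pro' || plan === 'enterprise';
  if (!proOrAbove && !OPEN_TIERS.has(tier)) {
    // Rejected at the gate: nothing is dispatched and no quota is consumed.
    return { allowed: false, message: `Quality tier "${tier}" requires a Pro plan or higher` };
  }
  return { allowed: true, credits: BASE_COST * (MULTIPLIER[tier] ?? 1) };
}
```

With these rules, a Pro call that sets `model: 'gemini-3-pro-image-preview'` alongside `quality_tier: 'standard'` is quoted at 5 × 8 = 40 credits, while the same call on a free plan is rejected before any credits are reserved.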

---

4 changes: 2 additions & 2 deletions docs/architecture.md
Expand Up @@ -170,7 +170,7 @@ Risk levels drive both audit classification AND authorization:

- **`tools/list` filter** — `READ_ONLY` tools are visible to any authenticated session; tools with any other risk level are hidden from sessions that lack the `generate` scope.
- **`tools/call` enforcement** — `LOCAL_MUTATION`, `EXTERNAL_MUTATION`, and `DESTRUCTIVE` tools require the `generate` scope and return `INVALID_REQUEST` with audit outcome `insufficient_scope` otherwise.
- **Tier-restricted quality tiers** — `image_generate` arguments with `quality_tier` of `premium`, `ultra`, or `ultra_plus` require a Pro+ plan; free/hobby calls are rejected at the gateway with audit outcome `tier_denied` (see `enforceTierRestriction` in `src/gateway.ts`).
- **Tier-restricted quality tiers** — `image_generate` with `quality_tier` of `premium`, `ultra`, or `ultra_plus`, or with `model` set to a Gemini variant, requires a Pro+ plan; the effective tier is resolved via `resolveImageQualityTier` (model wins over `quality_tier`); free/hobby calls are rejected at the gateway with audit outcome `tier_denied` (see `enforceTierRestriction` in `src/gateway.ts`).

## Audit — `audit.ts`

Expand Down Expand Up @@ -237,7 +237,7 @@ The gateway-side limiter fires first (immediately after auth resolution) and sho

### Quota & Cost Attribution

`src/cost-attribution.ts` declares per-tool credit costs and an `image_generate` quality multiplier (`draft=1, standard=1, premium=3, ultra=5, ultra_plus=8` × `image_generate.baseCost: 5`). On `tools/call`:
`src/cost-attribution.ts` declares per-tool credit costs and an `image_generate` quality multiplier (`draft=1, standard=1, premium=3, ultra=5, ultra_plus=8` × `image_generate.baseCost: 5`). When `model` is set, `resolveImageQualityTier` maps it to the effective tier (`gemini-3.1-flash-image-preview` → `ultra`, `gemini-3-pro-image-preview` → `ultra_plus`) before applying the multiplier — model wins over `quality_tier`. On `tools/call`:

1. Resolve cost via `resolveToolCost(toolName, args)`.
2. If cost is non-zero, call `AUTH_SERVICE.consumeQuota({tenantId, userId, feature, amount})`. On failure, reject with `INVALID_PARAMS` and audit outcome `tier_denied` (overloaded — see follow-ups).
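
The reserve, dispatch, and settle sequence described in this section can be sketched with an in-memory stand-in. Everything here is hypothetical: `InMemoryQuota` and `callWithQuota` are illustration-only names, the real reservation goes through `AUTH_SERVICE.consumeQuota` (whose exact signature and return shape are not shown in this diff), and the sketch is synchronous for brevity where the real calls are presumably asynchronous RPCs.

```typescript
// Hypothetical in-memory stand-in for the quota service; not the edge-auth API.
class InMemoryQuota {
  private remaining: number;

  constructor(initialCredits: number) {
    this.remaining = initialCredits;
  }

  // Hold credits before dispatch; false means the quota would be exceeded.
  reserve(amount: number): boolean {
    if (amount > this.remaining) return false;
    this.remaining -= amount;
    return true;
  }

  // Give held credits back after a failed call.
  refund(amount: number): void {
    this.remaining += amount;
  }

  balance(): number {
    return this.remaining;
  }
}

type CallOutcome = 'free' | 'quota_exceeded' | 'committed' | 'refunded';

// Reserve first, dispatch second, then commit on success or refund on failure.
function callWithQuota(quota: InMemoryQuota, cost: number, dispatch: () => void): CallOutcome {
  if (cost === 0) {
    dispatch(); // free tools skip the quota round trip entirely
    return 'free';
  }
  if (!quota.reserve(cost)) return 'quota_exceeded'; // backend is never reached
  try {
    dispatch();
  } catch {
    quota.refund(cost);
    return 'refunded';
  }
  return 'committed'; // the reservation stands as the final charge
}
```

Reserving before dispatch means a tenant cannot overspend through concurrent calls; the trade-off is that a failed backend call needs an explicit refund step.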
18 changes: 9 additions & 9 deletions docs/user-guide.md
Expand Up @@ -237,7 +237,7 @@ The client calls `image_generate` with your prompt. img-forge enhances the promp
}
```

**Quality tiers**: `draft` (fastest, SDXL), `standard` (FLUX Klein, default), `premium` (FLUX Dev), `ultra` (Gemini 2.5 Flash), `ultra_plus` (Gemini 3.1 Flash). See [§5 Quota & Billing](#5-quota--billing) for credit costs and plan availability — `premium` and above require Pro or Enterprise.
**Quality tiers**: `draft` (fastest, SDXL), `standard` (FLUX Klein, default), `premium` (FLUX Dev), `ultra` (`gemini-3.1-flash-image-preview`), `ultra_plus` (`gemini-3-pro-image-preview`). You can also pass `aspect_ratio` (e.g. `2:3` for portrait), `image_size` (`512`/`1K`/`2K`/`4K`), and `model` directly. When `model` is set to one of the two Gemini models above, it determines both the billing tier and the plan gate, overriding `quality_tier`. See [§5 Quota & Billing](#5-quota--billing) for credit costs and plan availability; `premium` and above require Pro or Enterprise.

### Classify Intent

Expand Down Expand Up @@ -344,15 +344,15 @@ Most read-only tools (`*_status`, `*_classify`, `*_summary`, `*_quality`, `*_gov

### `image_generate` quality multipliers

| Quality tier | Multiplier | Effective cost | Available on |
|--------------|-----------|----------------|--------------|
| `draft` | 1× | 5 credits | All tiers |
| `standard` | 1× | 5 credits | All tiers |
| `premium` | 3× | 15 credits | Pro + Enterprise only |
| `ultra` | 5× | 25 credits | Pro + Enterprise only |
| `ultra_plus` | 8× | 40 credits | Pro + Enterprise only |
| Quality tier | Model | Multiplier | Effective cost | Available on |
|--------------|-------|-----------|----------------|--------------|
| `draft` | SDXL | 1× | 5 credits | All plans |
| `standard` | FLUX Klein | 1× | 5 credits | All plans |
| `premium` | FLUX Dev | 3× | 15 credits | Pro + Enterprise only |
| `ultra` | `gemini-3.1-flash-image-preview` | 5× | 25 credits | Pro + Enterprise only |
| `ultra_plus` | `gemini-3-pro-image-preview` | 8× | 40 credits | Pro + Enterprise only |

Free and Hobby plans can request `draft` or `standard` only. Calls with higher quality tiers are rejected at the gateway with `Quality tier "<x>" requires a Pro plan or higher`.
When `model` is set explicitly, billing uses the model-derived tier regardless of `quality_tier` — `gemini-3-pro-image-preview` always bills at `ultra_plus` (40 credits). Free and Hobby plans can request `draft` or `standard` only; calls with higher tiers (or Gemini models) are rejected at the gateway with `Quality tier "<x>" requires a Pro plan or higher`.

### How metering works

22 changes: 19 additions & 3 deletions src/cost-attribution.ts
Expand Up @@ -52,6 +52,21 @@ const IMAGE_QUALITY_MULTIPLIER: Record<string, number> = {
ultra_plus: 8,
};

// Maps model names to their effective quality tier for quota/audit.
// model wins over quality_tier when both are present.
const MODEL_QUALITY_TIER: Record<string, string> = {
  'gemini-3.1-flash-image-preview': 'ultra',
  'gemini-3-pro-image-preview': 'ultra_plus',
};

/** Resolve the effective quality tier from image_generate args. Model wins when set. */
export function resolveImageQualityTier(args?: Record<string, unknown>): string {
  if (args?.model) {
    return MODEL_QUALITY_TIER[args.model as string] ?? (args.quality_tier as string) ?? 'standard';
  }
  return (args?.quality_tier as string) ?? 'standard';
}

/**
* Resolve the credit cost for a tool call, factoring in quality tier for images.
*/
Expand All @@ -65,9 +80,10 @@ export function resolveToolCost(
return { baseCost: 1, feature: `mcp.${toolName}` };
}

// Apply quality multiplier for image_generate
if (toolName === 'image_generate' && args?.quality_tier) {
const multiplier = IMAGE_QUALITY_MULTIPLIER[args.quality_tier as string] ?? 1;
// Apply quality multiplier for image_generate (model wins over quality_tier)
if (toolName === 'image_generate') {
const effectiveTier = resolveImageQualityTier(args);
const multiplier = IMAGE_QUALITY_MULTIPLIER[effectiveTier] ?? 1;
return { ...base, baseCost: base.baseCost * multiplier };
}

4 changes: 2 additions & 2 deletions src/gateway.ts
Expand Up @@ -14,7 +14,7 @@ import { publishToGitHub } from './scaffold-publish.js';
import { classifyIntention, type IntentClassification } from './intent-classifier.js';
import { logDivergence } from './divergence-logger.js';
import { checkRateLimit, rateLimitHeaders, type RateLimitResult } from './rate-limiter.js';
import { reserveQuota, settleQuota, buildCostAttribution, isFreeTool } from './cost-attribution.js';
import { reserveQuota, settleQuota, buildCostAttribution, isFreeTool, resolveImageQualityTier } from './cost-attribution.js';

const MCP_PROTOCOL_VERSION = '2025-03-26';
const JSON_RPC_PARSE_ERROR = -32700;
Expand Down Expand Up @@ -189,7 +189,7 @@ function enforceTierRestriction(
tier: Tier,
): string | null {
if (toolName !== 'image_generate') return null;
const qualityTier = (args?.quality_tier as string) ?? 'standard';
const qualityTier = resolveImageQualityTier(args);
const allowed = TIER_ALLOWED_QUALITY[tier];
if (!allowed || allowed.has(qualityTier)) return null;
return `Quality tier "${qualityTier}" requires a Pro plan or higher. Your current plan: ${tier}. Available tiers: ${[...allowed].join(', ')}.`;
23 changes: 22 additions & 1 deletion src/tool-registry.ts
Expand Up @@ -141,7 +141,8 @@ const TOOL_SPECS: ToolSpec[] = [
'Generate an image from a text prompt. Returns a URL to the generated image ' +
'and metadata about how the prompt was enhanced. Supports 5 quality tiers: ' +
'draft (fastest, SDXL), standard (FLUX Klein, default), premium (FLUX Dev), ' +
'ultra (Gemini 2.5 Flash), ultra_plus (Gemini 3.1 Flash). ' +
'ultra (gemini-3.1-flash-image-preview), ultra_plus (gemini-3-pro-image-preview). ' +
'Optionally override aspect ratio, output resolution, and model directly. ' +
'Generation takes 5-30 seconds depending on tier.',
inputSchema: {
type: 'object',
Expand All @@ -166,6 +167,26 @@ const TOOL_SPECS: ToolSpec[] = [
      type: 'string',
      description: 'Things to avoid in the image (only effective for draft tier with SDXL).',
    },
    aspect_ratio: {
      type: 'string',
      enum: ['1:1', '3:2', '2:3', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9'],
      default: '1:1',
      description: 'Image aspect ratio. Defaults to 1:1 (square).',
    },
    image_size: {
      type: 'string',
      enum: ['512', '1K', '2K', '4K'],
      default: '1K',
      description: 'Output resolution. Defaults to 1K.',
    },
    model: {
      type: 'string',
      enum: ['gemini-3.1-flash-image-preview', 'gemini-3-pro-image-preview'],
      description:
        'Model override. gemini-3.1-flash-image-preview=ultra tier, ' +
        'gemini-3-pro-image-preview=ultra_plus tier (Pro plan required). ' +
        'When set, model takes precedence over quality_tier for billing.',
    },
  },
  required: ['prompt'],
},
61 changes: 61 additions & 0 deletions test/cost-attribution.test.ts
Expand Up @@ -5,6 +5,7 @@ import {
reserveQuota,
settleQuota,
buildCostAttribution,
resolveImageQualityTier,
} from '../src/cost-attribution.js';
import type { AuthServiceRpc } from '../src/types.js';

Expand Down Expand Up @@ -42,6 +43,29 @@ describe('resolveToolCost', () => {
    expect(ultra.baseCost).toBeGreaterThan(draft.baseCost);
  });

  it('applies model-derived tier for image_generate (model wins)', () => {
    const flash = resolveToolCost('image_generate', { model: 'gemini-3.1-flash-image-preview' });
    const pro = resolveToolCost('image_generate', { model: 'gemini-3-pro-image-preview' });
    expect(flash.baseCost).toBe(5 * 5); // ultra multiplier
    expect(pro.baseCost).toBe(5 * 8); // ultra_plus multiplier
  });

  it('model wins over quality_tier when both are set', () => {
    const cost = resolveToolCost('image_generate', {
      model: 'gemini-3-pro-image-preview',
      quality_tier: 'standard',
    });
    expect(cost.baseCost).toBe(5 * 8); // ultra_plus, not standard
  });

  it('falls back to quality_tier for unknown model', () => {
    const cost = resolveToolCost('image_generate', {
      model: 'unknown-model',
      quality_tier: 'premium',
    });
    expect(cost.baseCost).toBe(5 * 3); // premium multiplier
  });

  it('returns default cost for unknown tools', () => {
    const cost = resolveToolCost('unknown_tool');
    expect(cost.baseCost).toBe(1);
Expand Down Expand Up @@ -142,3 +166,40 @@ describe('buildCostAttribution', () => {
    expect(attr.creditCost).toBe(2);
  });
});

describe('resolveImageQualityTier', () => {
  it('returns quality_tier when no model', () => {
    expect(resolveImageQualityTier({ quality_tier: 'premium' })).toBe('premium');
  });

  it('returns standard when no args', () => {
    expect(resolveImageQualityTier()).toBe('standard');
    expect(resolveImageQualityTier({})).toBe('standard');
  });

  it('maps gemini-3.1-flash-image-preview to ultra', () => {
    expect(resolveImageQualityTier({ model: 'gemini-3.1-flash-image-preview' })).toBe('ultra');
  });

  it('maps gemini-3-pro-image-preview to ultra_plus', () => {
    expect(resolveImageQualityTier({ model: 'gemini-3-pro-image-preview' })).toBe('ultra_plus');
  });

  it('model wins over quality_tier', () => {
    expect(resolveImageQualityTier({
      model: 'gemini-3-pro-image-preview',
      quality_tier: 'standard',
    })).toBe('ultra_plus');
  });

  it('falls back to quality_tier for unknown model', () => {
    expect(resolveImageQualityTier({
      model: 'unknown-future-model',
      quality_tier: 'ultra',
    })).toBe('ultra');
  });

  it('falls back to standard for unknown model with no quality_tier', () => {
    expect(resolveImageQualityTier({ model: 'unknown-future-model' })).toBe('standard');
  });
});
66 changes: 66 additions & 0 deletions test/gateway.test.ts
Expand Up @@ -441,6 +441,72 @@ describe('handleMcpRequest', () => {
    expect(body.result).toBeTruthy();
  });

  it('denies free-tier user who passes model=gemini-3-pro-image-preview without quality_tier', async () => {
    const env = makeEnv({
      AUTH_SERVICE: {
        ...mockAuthService(),
        validateApiKey: async () => ({
          valid: true,
          tenant_id: 'tenant-1',
          tier: 'free',
          scopes: ['generate'],
        }),
      },
    });

    const initReq = rpcRequest('initialize', { protocolVersion: '2025-03-26', capabilities: {}, clientInfo: { name: 'test' } });
    const initRes = await handleMcpRequest(initReq, env);
    const sessionId = initRes.headers.get('MCP-Session-Id')!;

    const req = rpcRequest(
      'tools/call',
      { name: 'image_generate', arguments: { prompt: 'a tarot card', model: 'gemini-3-pro-image-preview' } },
      { 'MCP-Session-Id': sessionId },
    );
    const res = await handleMcpRequest(req, env);
    const body = await res.json() as any;

    expect(body.result?.isError ?? body.error).toBeTruthy();
    const text = body.result?.content?.[0]?.text ?? body.error?.message ?? '';
    expect(text).toMatch(/pro plan|ultra_plus/i);
  });

  it('allows pro-tier user who passes model=gemini-3-pro-image-preview', async () => {
    const env = makeEnv({
      AUTH_SERVICE: {
        ...mockAuthService(),
        validateApiKey: async () => ({
          valid: true,
          tenant_id: 'tenant-1',
          tier: 'pro',
          scopes: ['generate'],
        }),
      },
      IMG_FORGE: {
        fetch: async () => new Response(JSON.stringify({
          jsonrpc: '2.0', id: 1,
          result: { content: [{ type: 'text', text: 'generated' }] },
        }), { headers: { 'Content-Type': 'application/json' } }),
        connect: () => { throw new Error('not implemented'); },
      } as unknown as Fetcher,
    });

    const initReq = rpcRequest('initialize', { protocolVersion: '2025-03-26', capabilities: {}, clientInfo: { name: 'test' } });
    const initRes = await handleMcpRequest(initReq, env);
    const sessionId = initRes.headers.get('MCP-Session-Id')!;

    const req = rpcRequest(
      'tools/call',
      { name: 'image_generate', arguments: { prompt: 'a tarot card', model: 'gemini-3-pro-image-preview' } },
      { 'MCP-Session-Id': sessionId },
    );
    const res = await handleMcpRequest(req, env);
    const body = await res.json() as any;

    expect(body.result?.isError).toBeFalsy();
    expect(body.result?.content?.[0]?.text).toBe('generated');
  });

  it('denies all tool calls when token has no scopes', async () => {
    const env = makeEnv({
      AUTH_SERVICE: {