bug: Streaming responses record 0 actual tokens; cost and quota tracking broken for all streams #62

## Bug

Every streaming response is recorded with zero input and output tokens. Cost and quota tracking are entirely inaccurate for streamed requests.

**Location:** `src/factory.ts:1064`

```typescript
// buildFactoryStream() — called for all streaming responses
const usage = { inputTokens: 0, outputTokens: 0, totalTokens: 0, cost: estimatedCost };
```

The token fields are hardcoded to `0`. Only the pre-request cost *estimate* (derived from character count) is stored. No post-stream reconciliation happens.

**Downstream effects:**
- `CreditLedger` spend is based on character-count estimates rather than actual usage, so monthly budgets can drift away from real spend
- `CostTracker` per-provider totals are wrong for any provider used via streaming
- `QuotaHook` receives `actualCost: estimatedCost` with no token breakdown (`src/factory.ts:1079`)

## Root cause

`generateResponseStream()` returns a `ReadableStream<string>` rather than an `LLMResponse`, so the token counts in the final SSE event are never extracted.

Providers do surface final usage in their SSE streams:
- **Anthropic:** `message_start` event with `message.usage.input_tokens`; `message_delta` event with `usage.output_tokens`
- **OpenAI / Groq / Cerebras:** final `data:` chunk with `usage.prompt_tokens` / `usage.completion_tokens` when `stream_options: { include_usage: true }` is set
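
The event shapes listed above could be read with something like the following. This is a minimal sketch, not code from the repository: `StreamUsage`, `ProviderFamily`, `extractUsage`, and `mergeUsage` are hypothetical names, and it assumes each SSE `data:` payload has already been split out as a string.

```typescript
// Hypothetical helper types; none of these names come from the codebase.
interface StreamUsage {
  inputTokens?: number;
  outputTokens?: number;
}

type ProviderFamily = "anthropic" | "openai-compatible";

// Reads a nested numeric field, returning undefined if any step is missing.
function numberAt(value: unknown, path: string[]): number | undefined {
  let current: unknown = value;
  for (const key of path) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return typeof current === "number" ? current : undefined;
}

// Pulls whatever usage fields a single SSE `data:` payload carries.
function extractUsage(family: ProviderFamily, data: string): StreamUsage {
  let event: unknown;
  try {
    event = JSON.parse(data);
  } catch {
    return {}; // non-JSON payloads such as "[DONE]" carry no usage
  }
  if (family === "anthropic") {
    const type =
      typeof event === "object" && event !== null
        ? (event as Record<string, unknown>).type
        : undefined;
    if (type === "message_start") {
      return {
        inputTokens: numberAt(event, ["message", "usage", "input_tokens"]),
        outputTokens: numberAt(event, ["message", "usage", "output_tokens"]),
      };
    }
    if (type === "message_delta") {
      return { outputTokens: numberAt(event, ["usage", "output_tokens"]) };
    }
    return {};
  }
  // OpenAI-style chunks: `usage` is null until the final chunk.
  return {
    inputTokens: numberAt(event, ["usage", "prompt_tokens"]),
    outputTokens: numberAt(event, ["usage", "completion_tokens"]),
  };
}

// Later values win; fields missing from `next` keep the earlier value.
function mergeUsage(prev: StreamUsage, next: StreamUsage): StreamUsage {
  return {
    inputTokens: next.inputTokens ?? prev.inputTokens,
    outputTokens: next.outputTokens ?? prev.outputTokens,
  };
}
```

Folding every payload through `mergeUsage` means the Anthropic path picks up input tokens from the first event and output tokens from the last, while the OpenAI-style path simply ignores chunks whose `usage` is `null`.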

## Fix

1. Pass `stream_options: { include_usage: true }` in OpenAI/Groq/Cerebras streaming requests
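2. Capture the usage-bearing SSE events in each provider's stream path and extract usage (for Anthropic this means both `message_start` and `message_delta`, not only the final chunk)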
3. After the stream drains, call the same `recordQuota` / `costTracker.trackCost` path used by non-streaming responses
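
One possible shape for steps 1 and 3 is sketched below. Everything here is an assumption made for illustration: `withUsageReporting`, `buildUsageRecord`, `recordAfterDrain`, the `Pricing` shape, and the `estimated` flag do not exist in the codebase, and the real `recordQuota` / `costTracker.trackCost` signatures may differ.

```typescript
// Hypothetical types; the real usage and pricing shapes may differ.
interface StreamUsage {
  inputTokens?: number;
  outputTokens?: number;
}

interface UsageRecord {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cost: number;
  estimated: boolean; // true when the provider reported no usage
}

interface Pricing {
  inputPerMTok: number; // cost per million input tokens
  outputPerMTok: number; // cost per million output tokens
}

// Step 1: ask OpenAI-compatible providers to append a usage chunk.
function withUsageReporting<T extends object>(
  body: T,
): T & { stream: true; stream_options: { include_usage: true } } {
  return { ...body, stream: true, stream_options: { include_usage: true } };
}

// Step 3a: turn observed usage into a record, falling back to the
// pre-request estimate only when the stream carried incomplete usage.
function buildUsageRecord(
  observed: StreamUsage,
  pricing: Pricing,
  estimatedCost: number,
): UsageRecord {
  const inputTokens = observed.inputTokens ?? 0;
  const outputTokens = observed.outputTokens ?? 0;
  const complete =
    observed.inputTokens !== undefined && observed.outputTokens !== undefined;
  const actualCost =
    (inputTokens * pricing.inputPerMTok + outputTokens * pricing.outputPerMTok) /
    1_000_000;
  return {
    inputTokens,
    outputTokens,
    totalTokens: inputTokens + outputTokens,
    cost: complete ? actualCost : estimatedCost,
    estimated: !complete,
  };
}

// Step 3b: pass chunks through unchanged, then record once the stream ends.
// `finally` also runs when the consumer stops iterating early, so partially
// consumed streams are still recorded.
async function* recordAfterDrain<T>(
  source: AsyncIterable<T>,
  getUsage: () => StreamUsage,
  record: (usage: StreamUsage) => void,
): AsyncGenerator<T> {
  try {
    for await (const chunk of source) yield chunk;
  } finally {
    record(getUsage());
  }
}
```

The `record` callback is where the existing non-streaming accounting path would be invoked, so that streamed and non-streamed requests are recorded by the same code. Keeping the estimate as a flagged fallback avoids writing zero-cost rows when a provider omits usage.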

## Acceptance criteria

- [ ] Streaming responses record actual `inputTokens` and `outputTokens` post-stream
- [ ] `CreditLedger` and `CostTracker` reflect real token counts for streamed requests
- [ ] Existing streaming tests updated; new tests cover token accumulation across providers

## Found by

Codebase audit (automated) — `src/factory.ts:1047–1090`
