Releases · Stackbilt-dev/llm-providers

22 May 11:22

stackbilt-admin

v1.9.0

fb20e0e

v1.9.0 — getRoutingInfo, onIteration abort, deprecation warnings Latest

What's new

`getRoutingInfo()` — pre-flight routing snapshot

Call once at ingress to get { useCase, provider, model, estimatedInputTokens, modelLifecycle, deprecationWarning, ... } without dispatching. Pair with request.metadata.useCase to pre-classify at the gateway layer and let the catalog engine drive dispatch.

`metadata.useCase` passthrough

resolveUseCase() in the factory now reads request.metadata.useCase directly. Gateways classify once, set the field, and the catalog honours it.
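A minimal sketch of the gateway-side flow described above, written so it runs without the package. Only the field names (useCase, provider, model, estimatedInputTokens, modelLifecycle, deprecationWarning) and request.metadata.useCase come from these notes; the types, the lifecycle values, the token estimate, and `getRoutingInfoStub` are assumptions standing in for the real call, whose signature may differ.

```typescript
// Hypothetical shapes: field names follow the release notes, types are guesses.
type Lifecycle = 'active' | 'compatibility' | 'retired';

interface RoutingInfo {
  useCase: string;
  provider: string;
  model: string;
  estimatedInputTokens: number;
  modelLifecycle: Lifecycle;
  deprecationWarning?: string;
}

interface GatewayRequest {
  messages: { role: 'user' | 'system' | 'assistant'; content: string }[];
  metadata?: { useCase?: string };
}

// Stand-in for the library call so the sketch is self-contained.
function getRoutingInfoStub(request: GatewayRequest): RoutingInfo {
  const useCase = request.metadata?.useCase ?? 'general';
  const chars = request.messages.reduce((n, m) => n + m.content.length, 0);
  return {
    useCase,
    provider: 'example-provider',
    model: 'example-model',
    estimatedInputTokens: Math.ceil(chars / 4), // rough 4-chars-per-token guess
    modelLifecycle: 'active',
  };
}

// Gateway step: classify once, attach the use case, inspect the snapshot
// before anything is dispatched.
function preflight(request: GatewayRequest, useCase: string, maxInputTokens: number) {
  const classified: GatewayRequest = {
    ...request,
    metadata: { ...request.metadata, useCase },
  };
  const info = getRoutingInfoStub(classified);
  const allowed = info.estimatedInputTokens <= maxInputTokens;
  return { classified, info, allowed };
}

const result = preflight(
  { messages: [{ role: 'user', content: 'Summarize this ticket.' }] },
  'summarization',
  1000,
);
```

In a real gateway the stub would be replaced by the library's getRoutingInfo(), and the `classified` request would be the one passed on for dispatch.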

`onIteration` abort — `ToolLoopAbortSignal`

Return { abort: true, reason? } from the onIteration callback in generateResponseWithTools to stop a tool loop immediately. ToolLoopAbortedError is thrown. Void-returning callbacks are unaffected.
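The abort contract can be illustrated with a simplified, synchronous loop. The names ToolLoopAbortSignal, ToolLoopAbortedError, and onIteration come from these notes; the callback arguments, the error message format, and `runToolLoop` itself are assumptions, not the package's implementation (the real helper is asynchronous and calls a model).

```typescript
// Shape taken from the notes: { abort: true, reason? }.
interface ToolLoopAbortSignal {
  abort: true;
  reason?: string;
}

// Assumed error class body; only the name is from the release notes.
class ToolLoopAbortedError extends Error {
  constructor(reason?: string) {
    super(reason ? `Tool loop aborted: ${reason}` : 'Tool loop aborted');
    this.name = 'ToolLoopAbortedError';
  }
}

// Hypothetical callback signature: receives the iteration number only.
type OnIteration = (iteration: number) => ToolLoopAbortSignal | void;

// Simplified stand-in for a tool loop: runs up to maxIterations and
// checks the callback's return value after every iteration.
function runToolLoop(maxIterations: number, onIteration: OnIteration): number {
  let completed = 0;
  for (let i = 1; i <= maxIterations; i++) {
    // A real loop would call the model and execute tool calls here.
    completed = i;
    const signal = onIteration(i);
    if (signal && signal.abort) {
      throw new ToolLoopAbortedError(signal.reason);
    }
  }
  return completed;
}

// Abort on the third iteration.
let abortedWith = '';
try {
  runToolLoop(10, (i) => (i >= 3 ? { abort: true, reason: 'budget exceeded' } : undefined));
} catch (err) {
  abortedWith = err instanceof Error ? `${err.name}: ${err.message}` : String(err);
}

// A void-returning callback leaves the loop untouched.
const untouched = runToolLoop(2, () => {});
```

Checking the return value only when it is truthy is what keeps existing void-returning callbacks working unchanged.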

Response deprecation annotations

generateResponse() attaches metadata.llmProvidersDeprecationWarning to any response using a compatibility or retired lifecycle model. The two Cerebras models deprecating 2026-05-27 surface warnings starting today.

`VERSION` constant fix

Was hardcoded '0.1.0' since v1.0.0; now tracks the real package version.

Full changelog

See CHANGELOG.md.

22 May 10:39

stackbilt-admin

v1.8.0

6cb2153

v1.8.0 — NVIDIA NIM provider

Adds NvidiaProvider with 9 NVIDIA-hosted models (Llama 3.3/3.1 70B, Llama 4 Maverick, Nemotron 70B/49B/253B, Mistral Large 2, DeepSeek V4 Flash/Pro). Tool calling verified live on Meta Llama and Nemotron families. Also ships the previously uncommitted v1.7.0 Cerebras scope. See CHANGELOG.md for full details.

15 May 10:04

stackbilt-admin

v1.6.5

56ee587

v1.6.5

1.6.5

### Fixed
- Published package ESM import resolution by switching runtime relative imports to explicit `.js` specifiers, fixing Node `ERR_MODULE_NOT_FOUND` for installed consumers.

### Added
- Tarball consumer smoke test (`npm run test:package`) that packs, installs in a clean temp project, and verifies both `require` and `import` entrypoints.
- CI and publish workflow gates now run `npm run test:package` before release publish.

27 Apr 23:23

stackbilt-admin

v1.6.0

e91148e

v1.6.0 — SSE validation, cache hints, schema canary

What's new

Streaming schema validation (#41)

All four providers now surface malformed SSE frames as SchemaDriftError and fire onSchemaDrift instead of swallowing silently. Anthropic additionally validates content_block_delta event shape and delta.text type; future tool-streaming delta types are skipped via forward-compat discriminator.
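As a rough illustration of the Anthropic checks described above, the sketch below validates one SSE data payload. The content_block_delta / text_delta frame shape follows Anthropic's streaming format; the SchemaDriftError class body, the onSchemaDrift callback signature, and `readTextDelta` are assumptions, not the package's code.

```typescript
// Only the error name is from the release notes; the body is assumed.
class SchemaDriftError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'SchemaDriftError';
  }
}

// Hypothetical callback signature.
type OnSchemaDrift = (error: SchemaDriftError) => void;

// Returns the text of a text_delta frame, null for frames this sketch
// deliberately skips, and throws SchemaDriftError on malformed frames.
function readTextDelta(data: string, onSchemaDrift?: OnSchemaDrift): string | null {
  const fail = (message: string): never => {
    const error = new SchemaDriftError(message);
    if (onSchemaDrift) onSchemaDrift(error); // notify before throwing
    throw error;
  };

  let frame: unknown = undefined;
  try {
    frame = JSON.parse(data);
  } catch {
    return fail('SSE frame is not valid JSON');
  }
  if (typeof frame !== 'object' || frame === null) {
    return fail('SSE frame is not an object');
  }

  const event = frame as { type?: unknown; delta?: unknown };
  if (event.type !== 'content_block_delta') return null; // other event kinds
  if (typeof event.delta !== 'object' || event.delta === null) {
    return fail('content_block_delta is missing its delta object');
  }

  const delta = event.delta as { type?: unknown; text?: unknown };
  // Forward-compat discriminator: unknown delta types are skipped, not errors.
  if (delta.type !== 'text_delta') return null;
  if (typeof delta.text !== 'string') {
    return fail('delta.text is not a string');
  }
  return delta.text;
}
```

Skipping unknown delta types while failing loudly on a wrong `text` type is the distinction the paragraph above draws: new frame kinds are tolerated, changed shapes of known frames are not.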

Cache-aware routing (#52)

- New CacheHints type; LLMRequest.cache is a no-op for callers that don't set it
- Anthropic: strategy: 'provider-prefix' wraps the system prompt as a content block with cache_control: { type: 'ephemeral' } and marks the last tool as a breakpoint
- OpenAI / Groq / Cerebras: automatic caching with no request-side translation needed
- Cached token counts normalized into TokenUsage: cachedInputTokens, cacheReadInputTokens, cacheCreationInputTokens
- supportsPromptCache flag added to ModelCapabilities
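The Anthropic translation above can be sketched as a small pure function. The cache_control: { type: 'ephemeral' } marker and its placement on a system content block and on the last tool follow Anthropic's prompt-caching request format; `applyProviderPrefixCache` and its types are hypothetical and not the package's internals.

```typescript
interface CacheControl {
  type: 'ephemeral';
}

interface ToolDef {
  name: string;
  description: string;
  input_schema: object;
  cache_control?: CacheControl;
}

interface SystemBlock {
  type: 'text';
  text: string;
  cache_control?: CacheControl;
}

// Hypothetical helper: wraps the system prompt as one cached content block
// and marks the last tool as a cache breakpoint. Inputs are not mutated.
function applyProviderPrefixCache(system: string, tools: ToolDef[]) {
  const ephemeral: CacheControl = { type: 'ephemeral' };
  const systemBlocks: SystemBlock[] = [
    { type: 'text', text: system, cache_control: ephemeral },
  ];
  const markedTools: ToolDef[] = tools.map((tool, i) =>
    i === tools.length - 1 ? { ...tool, cache_control: ephemeral } : tool,
  );
  return { system: systemBlocks, tools: markedTools };
}

const cached = applyProviderPrefixCache('You are a support agent.', [
  { name: 'lookup_order', description: 'Find an order', input_schema: { type: 'object' } },
  { name: 'refund', description: 'Issue a refund', input_schema: { type: 'object' } },
]);
```

Marking only the last tool means the whole stable prefix (system prompt plus all tool definitions) falls before a single breakpoint.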

Schema drift canary (#39 Part 2)

- extractShape(obj): flat path → type map from any response object
- compareShapes(golden, live): diffs two shape maps into { added, removed, changed }
- runCanaryCheck(provider, golden, liveResponse): one-shot canary returning a CanaryReport
- Golden fixtures committed for all five providers under src/__tests__/fixtures/response-shapes/
- All three utilities exported from the package root
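To make the canary idea concrete, here is one possible reading of the first two utilities, reimplemented from their one-line descriptions. The path syntax (dots, `[]` for array items), the type names, and the use of string arrays for the diff are assumptions; the package's actual output format may differ.

```typescript
type ShapeMap = { [path: string]: string };

// Distinguishes null and arrays, which typeof reports as 'object'.
function typeName(value: unknown): string {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  return typeof value;
}

// Flattens a value into a path -> type-name map.
// Arrays are sampled by their first element only (an assumption).
function extractShape(obj: unknown, prefix = '', out: ShapeMap = {}): ShapeMap {
  const kind = typeName(obj);
  if (prefix) out[prefix] = kind;
  if (kind === 'object') {
    const record = obj as { [key: string]: unknown };
    for (const key of Object.keys(record)) {
      extractShape(record[key], prefix ? `${prefix}.${key}` : key, out);
    }
  } else if (kind === 'array') {
    const items = obj as unknown[];
    if (items.length > 0) extractShape(items[0], `${prefix}[]`, out);
  }
  return out;
}

// Diffs two shape maps into added, removed, and changed paths.
function compareShapes(golden: ShapeMap, live: ShapeMap) {
  const added: string[] = [];
  const removed: string[] = [];
  const changed: string[] = [];
  for (const path of Object.keys(live)) {
    if (!(path in golden)) added.push(path);
    else if (golden[path] !== live[path]) changed.push(path);
  }
  for (const path of Object.keys(golden)) {
    if (!(path in live)) removed.push(path);
  }
  return { added, removed, changed };
}

const golden = extractShape({ id: 'a', stop_sequence: 'END', usage: { input_tokens: 1 } });
const live = extractShape({ id: 'b', stop_sequence: null, usage: { input_tokens: 2, cached: 0 } });
const diff = compareShapes(golden, live);
```

In this example `diff.changed` contains `stop_sequence` (string in the golden fixture, null in the live response) and `diff.added` contains `usage.cached`, which is the kind of drift a canary run is meant to surface.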

Previously merged, now documented

- Factory-level streaming with fallback (#26): generateResponseStream uses the same circuit-breaker and fallback chain as generateResponse
- Tool-use loop helper (#28): generateResponseWithTools with ToolLoopLimitError, ToolLoopAbortedError, iteration/cost caps, and abort-signal support
- Cloudflare AI Gateway metadata forwarding (#29): cf-aig-* headers forwarded only when baseUrl matches the Gateway pattern
- Cloudflare LoRA / fine-tune forwarding (#51): LLMRequest.lora forwarded to Workers AI binding

Bug fixes

- stop_sequence schema false positive: was typed as string; real Anthropic API returns null when no stop sequence triggers, causing SchemaDriftError on every normal response. Fixed to string-or-null.
- AnthropicProvider.getProviderBalance(): was calling a non-existent endpoint (/v1/organizations/cost_report). Now returns unavailable with a message directing users to the Admin API, matching the Groq pattern.

Full changelog

See CHANGELOG.md for the complete entry.

27 Apr 12:05

stackbilt-admin

v1.5.1

68f6daf

v1.5.1 — fix Cloudflare llama-3.2 vision silent empty response

Fixed

analyzeImage() silent empty response on Cloudflare — @cf/meta/llama-3.2-11b-vision-instruct via the Workers AI binding requires a raw { image: number[], prompt, max_tokens } input shape, not the OpenAI-compatible messages/image_url format. The chat path returns choices[0].message.content === null via the binding, causing extractText() to silently return "". The provider now detects this model and dispatches to the raw binding format. Other vision models are unaffected. Fixes #53.
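A minimal sketch of the dispatch decision described above. The model ID and the { image: number[], prompt, max_tokens } input shape are taken from this entry; the helper names, the Uint8Array input, and the 512-token default are assumptions for illustration.

```typescript
// Model that needs the raw binding format, per the fix description.
const RAW_INPUT_MODEL = '@cf/meta/llama-3.2-11b-vision-instruct';

interface RawVisionInput {
  image: number[];
  prompt: string;
  max_tokens: number;
}

// Hypothetical check: only this one model is routed to the raw format.
function needsRawInput(model: string): boolean {
  return model === RAW_INPUT_MODEL;
}

// Hypothetical builder: converts image bytes into a plain number array.
// The default max_tokens value is arbitrary.
function toRawVisionInput(bytes: Uint8Array, prompt: string, maxTokens = 512): RawVisionInput {
  const image = Array.prototype.slice.call(bytes) as number[];
  return { image, prompt, max_tokens: maxTokens };
}

const rawInput = toRawVisionInput(new Uint8Array([255, 216, 255]), 'Describe this image');
```

Other vision models would keep using the OpenAI-compatible messages/image_url format, so `needsRawInput` returning false leaves their path untouched.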

Full changelog: https://github.com/Stackbilt-dev/llm-providers/blob/main/CHANGELOG.md

23 Apr 11:58

stackbilt-admin

v1.5.0

cebcede

v1.5.0

Consolidates the unreleased 1.4.0 scope and undocumented features into a single minor release. 1.4.0 was tagged in package.json but never published to npm; consumers upgrading from 1.3.0 receive all of the following atomically.

Added

- Declarative model catalog (src/model-catalog.ts): semantic catalog for provider/model metadata, recommendation use cases, lifecycle status, and runtime scoring
- Runtime recommendation API: LLMProviders#getRecommendedModel(request, useCase?) exposes the same routing logic the factory uses internally
- Schema drift envelope validation: OpenAIProvider, GroqProvider, CerebrasProvider, and AnthropicProvider now validate response envelopes at the provider boundary, throwing SchemaDriftError on mismatch instead of corrupting downstream consumers silently
- LLMProviders.fromEnv() static factory: auto-discovers providers from Cloudflare Workers env bindings without manual wiring
- Model drift test: asserts every provider's models[] is symmetrically covered by its capabilities map
- Catalog tests: coverage for retired-model exclusion, health-aware ranking, request-shape use-case inference

Changed

- Factory routing selects provider/model pairs from the catalog instead of hardcoded ordering
- Health-aware dispatch considers circuit-breaker state including degraded and recovering providers, not just fully open
- Budget-aware dispatch: with a CreditLedger attached, selection demotes providers under high utilization or near projected depletion
- Provider defaults for OpenAI, Anthropic, Cloudflare, Cerebras, and Groq resolve through the shared catalog
- Cloudflare model recommendation prefers modern active baselines (Gemma 4, GPT-OSS) instead of legacy TinyLlama/Qwen heuristics
- Recommendation exports exclude retired targets (e.g. gpt-4o) while preserving deprecated constants for compatibility
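One way to picture health-aware and budget-aware selection is a scoring pass over catalog candidates. Everything in this sketch is hypothetical: the candidate fields, the penalty weights, and the 0.8 utilization cutoff are invented for illustration and are not the catalog's actual scoring rules. It only shows the behaviour listed above: retired targets and open circuits are excluded, while degraded, recovering, and high-utilization providers are demoted rather than dropped.

```typescript
type CircuitState = 'closed' | 'degraded' | 'recovering' | 'open';

// Hypothetical candidate record.
interface Candidate {
  provider: string;
  model: string;
  baseScore: number;         // catalog fit for the use case, 0..1
  circuit: CircuitState;     // current breaker state
  budgetUtilization: number; // share of budget already spent, 0..1
  retired: boolean;
}

// Invented weights: partial health states cost less than a hard exclusion.
const HEALTH_PENALTY: Record<CircuitState, number> = {
  closed: 0,
  degraded: 0.2,
  recovering: 0.4,
  open: 1,
};

function scoreCandidate(c: Candidate): number {
  const budgetPenalty = c.budgetUtilization > 0.8 ? 0.3 : 0;
  return c.baseScore - HEALTH_PENALTY[c.circuit] - budgetPenalty;
}

// Returns eligible candidates, best first. The input array is not mutated
// because filter() produces a new array before sorting.
function rankCandidates(candidates: Candidate[]): Candidate[] {
  return candidates
    .filter((c) => !c.retired && c.circuit !== 'open')
    .sort((a, b) => scoreCandidate(b) - scoreCandidate(a));
}

const ranked = rankCandidates([
  { provider: 'a', model: 'a-large', baseScore: 0.9, circuit: 'closed', budgetUtilization: 0.95, retired: false },
  { provider: 'b', model: 'b-large', baseScore: 0.85, circuit: 'degraded', budgetUtilization: 0.1, retired: false },
  { provider: 'c', model: 'c-legacy', baseScore: 1.0, circuit: 'closed', budgetUtilization: 0, retired: true },
  { provider: 'd', model: 'd-large', baseScore: 0.95, circuit: 'open', budgetUtilization: 0, retired: false },
]);
```

Here provider `b` ranks ahead of `a` even though it is degraded, because `a` is close to exhausting its budget; `c` and `d` never enter the ranking.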

Deprecated

- MODELS.CLAUDE_3_HAIKU: migrate to CLAUDE_HAIKU_4_5 or CLAUDE_3_5_HAIKU
- MODELS.GPT_4O: migrate to GPT_4O_MINI or a current GPT-4 successor

Removed

claude-3-haiku-20240307, gpt-4o, and dead alias gpt-4-turbo-preview dropped from provider models[] and capabilities tables. Arbitrary-string passthrough on request inputs is unchanged — consumers pinning older MODELS enum values via string literals are not affected.

Full changelog: CHANGELOG.md

17 Apr 00:05

stackbilt-admin

v1.3.0

bf713cd

v1.3.0 — Cloudflare Workers AI vision support

Added

- Cloudflare Workers AI vision support: CloudflareProvider now accepts request.images and routes to vision-capable models. Previously image data was silently dropped on the CF path.
- Three new CF vision models:
  - @cf/google/gemma-4-26b-a4b-it: 256K context, vision + function calling + reasoning
  - @cf/meta/llama-4-scout-17b-16e-instruct: natively multimodal, tool calling
  - @cf/meta/llama-3.2-11b-vision-instruct: image understanding
- CloudflareProvider.supportsVision = true: factory's analyzeImage now dispatches to CF when configured.
- Factory default vision fallback: getDefaultVisionModel() falls back to @cf/google/gemma-4-26b-a4b-it when neither Anthropic nor OpenAI is configured, enabling CF-only deployments to use analyzeImage().

Changed

- Images are passed to CF using the OpenAI-compatible image_url content-part shape (base64 data URIs). HTTP image URLs throw a helpful ConfigurationError; fetch the image and pass bytes in image.data.
- Attempting request.images on a non-vision CF model throws a ConfigurationError naming the vision-capable alternatives.
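The image handling above can be sketched as a conversion from an image input to an OpenAI-style content part. The { type: 'image_url', image_url: { url } } shape and the data: URI format are standard; the `ImageInput` fields other than data and mimeType, the default MIME type, the error message, and `toImageUrlPart` are assumptions, not the provider's code.

```typescript
// Only the error name is from the release notes; the body is assumed.
class ConfigurationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ConfigurationError';
  }
}

// Hypothetical input type: `url` models the rejected HTTP case.
interface ImageInput {
  data?: string; // base64-encoded bytes
  url?: string;
  mimeType?: string;
}

// Builds an image_url content part carrying a base64 data URI.
function toImageUrlPart(image: ImageInput) {
  if (!image.data) {
    throw new ConfigurationError(
      'HTTP image URLs are not supported on this path; fetch the image and pass its bytes in image.data',
    );
  }
  const mimeType = image.mimeType ?? 'image/jpeg'; // assumed default
  return {
    type: 'image_url' as const,
    image_url: { url: `data:${mimeType};base64,${image.data}` },
  };
}

const part = toImageUrlPart({ data: 'AAAA', mimeType: 'image/png' });
```

Rejecting URL-only inputs up front gives the caller an actionable error instead of a silently empty model response.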

Usage

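// Assumes `factory` is an already configured provider factory and `base64`
// holds the base64-encoded image bytes.
factory.analyzeImage({
  image: { data: base64, mimeType: 'image/jpeg' },
  prompt: 'Extract recipe data',
  model: '@cf/google/gemma-4-26b-a4b-it',
});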

See #43 for details.

01 Apr 15:54

stackbilt-admin

v1.1.0

e0df15e

v1.1.0 — Multi-Modal: Image Generation

Image Generation Provider

@stackbilt/llm-providers is now multi-modal — text + image inference under one package.

New: `ImageProvider`

import { ImageProvider } from '@stackbilt/llm-providers';

const img = new ImageProvider({
  cloudflareAi: env.AI,
  geminiApiKey: env.GEMINI_API_KEY,
});

const result = await img.generateImage({
  prompt: 'a mountain landscape at sunset',
  model: 'flux-dev',
});
// result.image: ArrayBuffer, result.responseTime, result.provider

Built-in Models

| Model | Provider | Use Case |
| --- | --- | --- |
| `sdxl-lightning` | Cloudflare | Fast drafts, free tier |
| `flux-klein` | Cloudflare | Balanced quality/speed |
| `flux-dev` | Cloudflare | Highest CF quality |
| `gemini-flash-image` | Google | Text rendering capable |
| `gemini-flash-image-preview` | Google | Latest preview model |

Extracted from img-forge production codebase. Battle-tested response normalization handles all Workers AI return formats.

Full changelog: CHANGELOG.md

01 Apr 14:12

stackbilt-admin

v1.0.0

b474b6a

v1.0.0 — Production Release

First stable release. Production-tested in AEGIS cognitive kernel since v1.72.0.

Highlights

- Zero runtime dependencies: supply chain security by design
- 5 providers: OpenAI, Anthropic, Cloudflare Workers AI, Cerebras, Groq
- LLMProviders.fromEnv(): one-line multi-provider setup
- Graduated circuit breakers: automatic failover with half-open probe recovery
- CreditLedger: per-provider budget tracking with threshold alerts + burn rate projection
- npm provenance: every version cryptographically linked to its source commit
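The burn rate projection mentioned for CreditLedger can be illustrated with plain arithmetic. This is not the CreditLedger API: `projectDepletion`, the `Spend` record, and the linear average-rate model are assumptions chosen to show how a threshold alert and a time-to-depletion estimate could be derived from recorded spend.

```typescript
// Hypothetical spend record.
interface Spend {
  timestampMs: number;
  amount: number;
}

const MS_PER_HOUR = 3600000;

// Projects depletion from the average burn rate since the first spend.
function projectDepletion(budget: number, spends: Spend[], nowMs: number, alertThreshold: number) {
  const spent = spends.reduce((sum, s) => sum + s.amount, 0);
  const remaining = Math.max(budget - spent, 0);
  const firstMs = spends.reduce((min, s) => Math.min(min, s.timestampMs), Infinity);
  const elapsedHours = (nowMs - firstMs) / MS_PER_HOUR;
  // No spend yet, or no elapsed time, means no measurable burn rate.
  const burnPerHour = elapsedHours > 0 ? spent / elapsedHours : 0;
  const hoursLeft = burnPerHour > 0 ? remaining / burnPerHour : Infinity;
  const utilization = budget > 0 ? spent / budget : 1;
  return { spent, remaining, burnPerHour, hoursLeft, alert: utilization >= alertThreshold };
}

// 20 units spent over 2 hours against a budget of 100.
const projection = projectDepletion(
  100,
  [
    { timestampMs: 0, amount: 10 },
    { timestampMs: MS_PER_HOUR, amount: 10 },
  ],
  2 * MS_PER_HOUR,
  0.8,
);
```

With these numbers the burn rate is 10 units per hour, 80 units remain, and the projection is 8 hours to depletion, with no alert because utilization (0.2) is below the 0.8 threshold.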

Install

npm install @stackbilt/llm-providers

Quick Start

import { LLMProviders } from '@stackbilt/llm-providers';

const llm = LLMProviders.fromEnv(process.env);
const response = await llm.generateResponse({
  messages: [{ role: 'user', content: 'Hello!' }],
});

See README for full documentation.
See SECURITY.md for supply chain security policy.
