docs: add openai-bridge-demo.mjs (sister of embeddinggemma-demo) by MauricioPerera · Pull Request #9 · MauricioPerera/just-bash-data

MauricioPerera · 2026-05-05T10:59:47Z

Summary

Adds a second end-to-end smoke demo that targets openai-workers-ai-bridge (an OpenAI-compatible Worker in front of Workers AI) instead of calling the Workers AI REST API directly. Same multilingual cat/rocket/bread corpus, same vec store + cross-lingual search — different embedding backend.

Why a sister demo, not a replacement

The existing embeddinggemma-demo.mjs stays as the zero-extra-infrastructure path (just CF account + token).
The bridge demo is for users who already deploy the bridge to share one OpenAI base URL across LangChain / n8n / LibreChat / OpenAI SDK.
The bridge performs Matryoshka truncation and L2 renormalization server-side when you pass the OpenAI dimensions parameter, so the demo creates a smaller vec collection (256d by default) and skips the client-side truncate(vec, dim) + zero-pad gymnastics from the original.
The bridge edge-caches identical embeddings, so re-runs cost 0 neurons.
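
For readers unfamiliar with the truncation step, here is a minimal sketch of what "Matryoshka truncation + L2 renorm" means for a single vector. It is not the bridge's actual code, which this PR does not show; `truncateAndRenorm` is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper: keep the first `dim` components of an embedding,
// then rescale the result to unit L2 length.
function truncateAndRenorm(vec, dim) {
  if (!Number.isInteger(dim) || dim < 1 || dim > vec.length) {
    throw new RangeError(`dim must be an integer in 1..${vec.length}`);
  }
  // Array.from also accepts typed arrays such as Float32Array.
  const head = Array.from(vec).slice(0, dim);
  let sumSq = 0;
  for (const x of head) sumSq += x * x;
  const norm = Math.sqrt(sumSq);
  // An all-zero prefix cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Toy 4-d vector truncated to 2-d; the result has unit length.
const shortened = truncateAndRenorm([3, 4, 12, 0], 2);
console.log(shortened, Math.hypot(...shortened));
```

Renormalizing after truncation keeps cosine and dot-product scores comparable across documents, which is why the demo can store the 256d vectors as returned.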

What this PR contains

examples/smoke/openai-bridge-demo.mjs — runnable end-to-end demo (mirrors the structure of the existing direct-REST demo)
Small "End-to-end demos" section in README.md indexing both files side by side with a one-line "why pick this one" each
No code changes, no test changes

Run it

node examples/smoke/openai-bridge-demo.mjs \
  --bridge=https://openai-workers-ai-bridge.<sub>.workers.dev/v1 \
  --key=sk-cfwai-...

The equivalent flow has been verified end-to-end against a deployed bridge in openai-workers-ai-bridge#examples/rag-with-js-vector-store.mjs — query "How do I store embeddings on a tight memory budget?" surfaces the 3 storage-focused docs in correct order (PolarQuant 0.64, BGE-small 0.62, Matryoshka 0.58) using the same OpenAI SDK + dimensions=256 + EmbeddingGemma path this demo exercises.
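
As a rough illustration of the request this flow sends, the sketch below builds an OpenAI-style embeddings call with the `dimensions` parameter. The endpoint path, bearer header, and body fields follow the OpenAI embeddings API. The helper name `buildEmbeddingsRequest`, the base URL, the key, and the model id are placeholders, not values taken from this PR or from the bridge.

```javascript
// Hypothetical helper: assemble the URL and fetch options for an
// OpenAI-compatible POST {baseUrl}/embeddings request.
function buildEmbeddingsRequest(baseUrl, apiKey, texts, { model, dimensions = 256 }) {
  const url = baseUrl.replace(/\/+$/, "") + "/embeddings";
  const init = {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // `dimensions` asks the server to return shortened vectors.
    body: JSON.stringify({ model, input: texts, dimensions }),
  };
  return { url, init };
}

// Placeholder values; substitute your own bridge URL, key, and model id.
const request = buildEmbeddingsRequest(
  "https://bridge.example/v1/",
  "sk-placeholder",
  ["the cat sleeps", "el gato duerme"],
  { model: "your-embedding-model-id" },
);
console.log(request.url);

// To send it (requires network access and a deployed bridge):
//   const res = await fetch(request.url, request.init);
//   const { data } = await res.json(); // data[i].embedding is one vector
```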

Test plan

Deploy openai-workers-ai-bridge or fork the existing template
`pnpm install && pnpm build`
Run with --bridge=... --key=...
Confirm the cross-lingual search surfaces the right concepts at k=3 (same expectations as the direct-REST demo)
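
For the last step, this is a plain-JavaScript sketch of what a top-k cosine check looks like. It is a stand-in for illustration only and does not use the vec store's API; `cosine`, `topK`, and the toy 2-d vectors are all hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 0 : dot / denom;
}

// Score every document against the query and keep the k best.
function topK(queryVec, docs, k = 3) {
  return docs
    .map((d) => ({ id: d.id, score: cosine(queryVec, d.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data: real embeddings would be 256-d vectors from the bridge.
const toyDocs = [
  { id: "cat-en", vec: [1, 0] },
  { id: "bread-en", vec: [0, 1] },
  { id: "cat-es", vec: [1, 1] },
  { id: "rocket-en", vec: [-1, 0] },
];
console.log(topK([1, 0], toyDocs, 3).map((hit) => hit.id));
```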

🤖 Generated with Claude Code

…dex both

Adds a second end-to-end smoke demo that targets the openai-workers-ai-bridge OpenAI-compatible Worker instead of the Workers AI REST API directly. Same multilingual cat/rocket/bread corpus, same vec store + cross-lingual search.

Why a sister demo rather than replacing the existing one:
- the direct-REST demo stays as the zero-infra path (just a CF account and an API token)
- the bridge demo is for users who already deploy the bridge to share one OpenAI base URL across LangChain / n8n / LibreChat / OpenAI SDK
- the bridge does Matryoshka truncation + L2 renorm server-side when you pass `dimensions`, so the demo creates a smaller `vec` collection (256d default) and skips the client-side `truncate(vec, dim)` + zero-pad gymnastics from the original demo
- the bridge edge-caches identical embeddings, so re-runs cost 0 neurons

README gets a small "End-to-end demos" section indexing both files side by side with a one-line "why pick this one" each. No code changes, no test changes; this PR is docs + a single new mjs file.

gemini-code-assist

Code Review

This pull request introduces end-to-end demo scripts, including a new OpenAI bridge demo that highlights server-side Matryoshka truncation and edge caching. The README has been updated to document these additions. Regarding the implementation, a high-severity issue was found in the DiskFs.readFile method, which incorrectly defaults to UTF-8 encoding; this behavior deviates from standard file system expectations and will lead to data corruption when handling binary files used by the vector store.

gemini-code-assist · 2026-05-05T11:01:14Z

+  async readFile(p, opts) {
+    const enc = typeof opts === "string" ? opts : opts?.encoding ?? "utf8";
+    return fsp.readFile(p, enc);
+  }


The DiskFs.readFile implementation incorrectly defaults to utf8 encoding when no options are provided. In the IFileSystem interface (and standard Node.js fs), readFile should return a Uint8Array (or Buffer) if no encoding is specified. Defaulting to utf8 will cause data corruption when reading binary files, such as the .bin files used by the vector store for quantized embeddings.

  async readFile(p, opts) {
    const enc = typeof opts === "string" ? opts : opts?.encoding;
    return fsp.readFile(p, enc);
  }
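
To see why a utf8 default is unsafe for binary files, the sketch below round-trips raw bytes through a UTF-8 string in memory. It does not touch DiskFs or the vector store's .bin format; `roundTripsThroughUtf8` is a hypothetical helper, and the Float32Array is only a stand-in for binary embedding data.

```javascript
// Hypothetical helper: decode bytes as UTF-8 text, re-encode the text,
// and report whether the original bytes survived.
function roundTripsThroughUtf8(bytes) {
  const original = Buffer.from(bytes);
  const asText = original.toString("utf8"); // invalid sequences become U+FFFD
  const reencoded = Buffer.from(asText, "utf8");
  return original.equals(reencoded);
}

// Plain ASCII text survives the round trip.
console.log(roundTripsThroughUtf8(Buffer.from("hello", "utf8")));

// Raw float bytes contain sequences that are not valid UTF-8, so they
// are replaced during decoding and the original data is lost.
const floats = new Float32Array([0.5, -1.25]);
console.log(roundTripsThroughUtf8(Buffer.from(floats.buffer)));
```

Leaving the encoding undefined makes Node's `readFile` resolve to a Buffer, so binary callers receive the bytes untouched while callers that pass "utf8" still get a string.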

MauricioPerera added 2 commits May 5, 2026 04:49

feat: openai-bridge-demo.mjs — sister of embeddinggemma-demo using Op…

5c99f86

…enAI SDK + bridge

gemini-code-assist Bot reviewed May 5, 2026

MauricioPerera wants to merge 2 commits into main from docs/openai-bridge-example
