WEBWAIFU 3 is a complete rewrite of WEBWAIFU V2. Same concept — a browser-based AI companion with a 3D avatar — but rebuilt from scratch with a proper framework, typed codebase, and a more focused feature set.
V2 was vanilla JS with no build system, supported both VRM and Live2D, used Edge TTS, and ran on Netlify. V3 drops the cruft, picks better defaults, and ships as a real SvelteKit app.
Primary routes:
- `/main` — companion UI
- `/manager` — provider config, memory controls, voice management, and data tools
| Area | V2 | V3 |
|---|---|---|
| Framework | Vanilla JS, no build | SvelteKit 2 + Vite 7 + TypeScript |
| Avatar | VRM + Live2D (Pixi.js) | VRM only — deeper Three.js integration, post-processing, animation sequencer |
| TTS | Edge TTS (free) + Fish Audio | Kokoro (local, runs on WebGPU/WASM) + Fish Audio (realtime PCM streaming) |
| LLM | Gemini, OpenAI, OpenRouter, Ollama | OpenAI, OpenRouter, Ollama, LM Studio — all via Vercel AI SDK Responses API |
| STT | Whisper tiny | Whisper tiny with silence trimming + transcript sanitization |
| Memory | Embeddings + summarization | Same core but proper Web Worker isolation, hybrid mode, configurable summarization LLM |
| Lip sync | Phoneme (Edge TTS) + amplitude (Fish) | Approximate phoneme mapping + PCM amplitude analysis (both providers) |
| Deploy | Netlify serverless | Vercel (adapter-vercel) |
| State | localStorage + IndexedDB | Svelte 5 runes + IndexedDB (StorageManager singleton) |
| Persistence | Partial | Full — every setting, conversation, VRM binary, voice list persisted |
Dropped: Live2D, Gemini, Edge TTS, DistilBERT, Pixi.js, Netlify functions. Added: Kokoro local TTS, LM Studio, realtime Fish PCM streaming, post-processing pipeline, animation sequencer, character system, TTS formatting rules auto-injection, semantic memory with vector search.
- Providers: `ollama`, `lmstudio`, `openai`, `openrouter`
- Streaming token output wired into TTS sentence accumulator
- Per-request Ollama tuning: `num_ctx`, `flash_attn`, `kv_cache_type`
- Character-based system prompts with user nickname support
- Auto-injected TTS formatting rules when voice is enabled (no emojis, spoken prose, proper punctuation)
- Kokoro: local TTS via Web Worker, runs on WebGPU with WASM fallback, configurable device + precision (fp32/fp16/q8/q4/q4f16)
- Fish Audio: cloud TTS with realtime PCM streaming over WebSocket, configurable latency mode
- Sentence accumulator splits LLM output into natural TTS chunks
- Fish voice model operations from manager UI: list, search, create, delete
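The sentence accumulator mentioned above can be sketched roughly as follows. This is an illustrative reconstruction, not the app's actual implementation; the class name, callback shape, and split regex are assumptions.

```typescript
// Buffers streamed LLM tokens and emits complete sentences for TTS.
export class SentenceAccumulator {
  private buffer = "";

  constructor(private onSentence: (sentence: string) => void) {}

  // Feed each streamed token/chunk as it arrives.
  push(chunk: string): void {
    this.buffer += chunk;
    // Emit everything up to sentence-ending punctuation followed by whitespace.
    let match: RegExpMatchArray | null;
    while ((match = this.buffer.match(/^(.*?[.!?])\s+/s)) !== null) {
      this.onSentence(match[1].trim());
      this.buffer = this.buffer.slice(match[0].length);
    }
  }

  // Flush any trailing partial sentence when the stream ends.
  flush(): void {
    const rest = this.buffer.trim();
    if (rest) this.onSentence(rest);
    this.buffer = "";
  }
}
```

Feeding it `"Hello there. How are"` then `" you? I am"` and flushing yields three chunks, so TTS can start speaking the first sentence while the model is still generating.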
- Whisper model: `Xenova/whisper-tiny.en` in a Web Worker
- Silence trimming before transcription to reduce hallucinations
- Transcript sanitization (filters repeated-char artifacts)
- Push-to-talk with optional auto-send and mic permission pre-check
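The two cleanup steps above can be sketched as pure functions. The threshold and the repeated-character pattern here are illustrative assumptions, not the app's tuned values.

```typescript
// Trim leading/trailing near-silence from mono PCM samples so Whisper
// is not fed long quiet spans (a common source of hallucinated text).
// The amplitude threshold is an assumed placeholder value.
export function trimSilence(samples: Float32Array, threshold = 0.01): Float32Array {
  let start = 0;
  let end = samples.length;
  while (start < end && Math.abs(samples[start]) < threshold) start++;
  while (end > start && Math.abs(samples[end - 1]) < threshold) end--;
  return samples.slice(start, end);
}

// Collapse long runs of one repeated character, a typical Whisper
// artifact on noisy or near-silent input.
export function sanitizeTranscript(text: string): string {
  return text.replace(/(.)\1{5,}/g, "$1").trim();
}
```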
- Embeddings model: `Xenova/all-MiniLM-L6-v2` (384-dim) in a Web Worker
- Modes: `auto-prune`, `auto-summarize`, `hybrid` (default)
- Cosine similarity search injects relevant history into prompt context
- Optional summarization LLM with separate provider/model/key configuration
- Model can be loaded/unloaded on demand to free GPU memory
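The retrieval step amounts to ranking stored message embeddings by cosine similarity against the current query embedding (384-dim MiniLM vectors in the real app). A minimal sketch, with illustrative names:

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface MemoryEntry {
  text: string;
  embedding: number[];
}

// Return the top-k most relevant past messages for the current query;
// these are then injected into the prompt context.
export function searchMemory(query: number[], entries: MemoryEntry[], k = 3): string[] {
  return entries
    .map((e) => ({ text: e.text, score: cosineSimilarity(query, e.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((e) => e.text);
}
```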
- VRM load from built-in asset or user upload (binary persisted in IndexedDB)
- Animation playlist/sequencer with crossfade controls
- Realistic material toggle (PBR path)
- Post-processing: bloom, chromatic aberration, film grain, glitch, FXAA/SMAA/TAA, bleach bypass, color correction, outline
- Adjustable key/fill/rim/hemi/ambient lighting
- Lip sync driven from both HTMLAudioElement (Kokoro) and PCM AudioBufferSourceNode (Fish) playback paths
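For the amplitude side of lip sync, the core of such an approach is mapping the RMS level of a PCM frame to a 0..1 mouth-open value. This is a sketch under assumed constants (the gain is a placeholder, not the app's tuned value):

```typescript
// Map the RMS amplitude of one PCM frame to a mouth-open weight in [0, 1].
export function mouthOpenFromPcm(frame: Float32Array, gain = 8): number {
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    sumSquares += frame[i] * frame[i];
  }
  const rms = Math.sqrt(sumSquares / frame.length);
  // Clamp so the resulting blend-shape weight stays valid.
  return Math.min(1, rms * gain);
}
```

In the browser the frame would come from a Web Audio analyser tapped into the playback path (whether an `HTMLAudioElement` or an `AudioBufferSourceNode`), with the result written each animation frame to the VRM's mouth expression.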
- All settings saved in IndexedDB via StorageManager singleton
- Provider defaults, visual settings, active tab, conversation state, Fish voice lists all persisted
- Conversation auto-save on every user + assistant message
- Conversation export (`JSON`, `TXT`)
- Data tools in manager: export all, import, clear history, factory reset
- Custom VRM binary persisted in IndexedDB
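The two export formats could be produced by helpers along these lines; the message shape and serialization details here are illustrative, and may differ from the app's actual output.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Pretty-printed JSON dump of the conversation.
export function exportJson(messages: Message[]): string {
  return JSON.stringify(messages, null, 2);
}

// Plain-text transcript, one "role: content" line per message.
export function exportTxt(messages: Message[]): string {
  return messages.map((m) => `${m.role}: ${m.content}`).join("\n");
}
```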
- Node.js (current LTS recommended)
- npm
- Modern browser with WebGL + WebAudio support
- WebGPU recommended for Kokoro TTS (falls back to WASM automatically)
- At least one chat backend:
  - Local (`Ollama` or `LM Studio`)
  - Cloud (`OpenAI` or `OpenRouter`)
```sh
npm install
npm run dev
```

Dev URL: https://localhost:5173

Note: HTTPS in development is provided by `@vitejs/plugin-basic-ssl`.
- Install Ollama and pull a model (example: `ollama pull llama3.2`).
- Enable "Allow through network" in Ollama settings.
- Set CORS origins so the browser can access Ollama.
  - Mac/Linux: `OLLAMA_ORIGINS=* ollama serve`
  - Windows: add the system environment variable `OLLAMA_ORIGINS=*`.
- Restart Ollama.
- Download a model.
- Start the local server (default `http://localhost:1234`).
- Enable CORS in LM Studio server settings.
- Open `/manager`.
- Add your API key.
- Select provider and model defaults.
- Add your Fish API key in `/manager`.
- Fish requests are proxied through server routes:
  - `POST /api/tts/fish` (single request)
  - `POST /api/tts/fish-stream` (realtime WebSocket streaming, PCM)
On first use, browser-side model downloads may occur and be cached:
| Model | Size | Purpose | Runtime |
|---|---|---|---|
| Kokoro 82M ONNX | ~86 MB | Local TTS | WebGPU / WASM |
| Whisper tiny.en | ~40 MB | Local STT | Web Worker |
| MiniLM-L6-v2 | ~23 MB | Embeddings / memory | Web Worker |
Models are loaded on demand — Whisper and embeddings only init when you use them. Kokoro inits automatically when TTS is enabled with the Kokoro provider.
- Keys are stored in browser IndexedDB only
- Keys are sent only to selected providers and required proxy endpoints
- API key inputs use CSS text-security masking to prevent browser password manager interference
- Fish TTS requires API key transit through your deployed SvelteKit server route
- Use scoped keys and provider spending limits for production
```sh
npm run dev      # Dev server with HTTPS
npm run build    # Production build
npm run preview  # Preview production build
npm run check    # Svelte type checking
```

- Frontend: SvelteKit 2, Svelte 5 runes, TypeScript
- 3D: `three`, `@pixiv/three-vrm`
- LLM: Vercel AI SDK (`ai`, `@ai-sdk/openai`) — Responses API
- STT/Memory models: `@huggingface/transformers` in Web Workers
- TTS: `kokoro-js` (local WebGPU/WASM), `fish-audio` (cloud WebSocket)
- Persistence: IndexedDB via `src/lib/storage/index.ts`
- Analytics: Vercel Web Analytics
The current project config uses `@sveltejs/adapter-vercel` (`svelte.config.js`).
If you deploy to a different target, switch adapters and ensure the Fish API routes (`src/routes/api/tts/`) are deployed server-side.
Live: webwaifu3.vercel.app
This repository currently does not include a LICENSE file. Add one before public distribution.