Add OPFS model caching to Flare engine for instant repeat loads #297

@sauravpanda

Summary

Integrate Flare's OPFS caching APIs into the FlareEngine adapter so models are cached after first download and load instantly on repeat visits.

How it works

Flare already has OPFS caching built into its WASM module:

  • is_model_cached(name) — check if model exists in OPFS
  • cache_model(name, bytes) — save model to OPFS
  • load_cached_model(name) — load from OPFS (3-4x faster than IndexedDB)
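
For readers unfamiliar with OPFS, here is a rough sketch of what the three operations above could look like if written directly against the browser's File System API. This is not Flare's implementation (that lives inside the WASM module); the names `opfsHas`, `opfsWrite`, `opfsRead` and the `DirLike` types are hypothetical, chosen so the sketch can also run against an in-memory fake.

```typescript
// Minimal structural types covering only the OPFS calls used below, so the
// sketch can be exercised with a fake directory outside the browser.
interface WritableLike {
  write(data: Uint8Array): Promise<void>;
  close(): Promise<void>;
}
interface FileHandleLike {
  createWritable(): Promise<WritableLike>;
  getFile(): Promise<{ arrayBuffer(): Promise<ArrayBuffer> }>;
}
interface DirLike {
  getFileHandle(name: string, opts?: { create?: boolean }): Promise<FileHandleLike>;
}

// Hypothetical counterpart of is_model_cached: any lookup failure counts as "not cached".
async function opfsHas(root: DirLike, name: string): Promise<boolean> {
  try {
    await root.getFileHandle(name); // rejects when the file does not exist
    return true;
  } catch {
    return false;
  }
}

// Hypothetical counterpart of cache_model.
async function opfsWrite(root: DirLike, name: string, bytes: Uint8Array): Promise<void> {
  const handle = await root.getFileHandle(name, { create: true });
  const writable = await handle.createWritable();
  await writable.write(bytes);
  await writable.close(); // the write is committed when the stream is closed
}

// Hypothetical counterpart of load_cached_model: returns null on a cache miss.
async function opfsRead(root: DirLike, name: string): Promise<Uint8Array | null> {
  try {
    const handle = await root.getFileHandle(name);
    const file = await handle.getFile();
    return new Uint8Array(await file.arrayBuffer());
  } catch {
    return null;
  }
}
```

In a browser the root directory would come from `await navigator.storage.getDirectory()`; passing it in as a parameter keeps the functions testable.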

Integration in FlareEngine.loadModel()

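async loadModel(modelId: string, options?: LoadOptions): Promise<void> {
    const modelName = `${modelId}.gguf`;

    // Check the OPFS cache first
    const cached = await load_cached_model(modelName);
    if (cached) {
        this.engine = FlareEngine.load(cached);
        options?.onProgress?.(1, 1); // report completion immediately
        return;
    }

    // Cache miss: download with progress.
    // NOTE: `resolveModelUrl` is a hypothetical helper; the original snippet
    // referenced an undefined `url`, so the URL lookup still needs to be decided.
    const url = this.resolveModelUrl(modelId);
    const bytes = await fetchWithProgress(url, options?.onProgress);
    this.engine = FlareEngine.load(bytes);

    // Cache for next time (fire-and-forget); a failed write only costs a re-download
    cache_model(modelName, bytes).catch((err) => {
        console.warn('OPFS cache write failed', err);
    });
}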
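
The snippet above calls `fetchWithProgress` without defining it. One possible shape, assuming the `onProgress(loaded, total)` signature implied by the `onProgress?.(1, 1)` call, is sketched below. `readWithProgress` and `ProgressFn` are hypothetical names, and `total` is reported as 0 when the server sends no `Content-Length`.

```typescript
type ProgressFn = (loaded: number, total: number) => void;

// Reads a response body chunk by chunk, reporting progress after each chunk.
async function readWithProgress(
  response: Response,
  onProgress?: ProgressFn,
): Promise<Uint8Array> {
  if (!response.ok) {
    throw new Error(`Download failed: HTTP ${response.status}`);
  }
  // 0 means the size is unknown (missing or unparsable Content-Length).
  const total = Number(response.headers.get("content-length")) || 0;

  if (!response.body) {
    // No stream available: read everything at once and report completion.
    const all = new Uint8Array(await response.arrayBuffer());
    onProgress?.(all.length, all.length);
    return all;
  }

  const reader = response.body.getReader();
  const chunks: Uint8Array[] = [];
  let loaded = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    loaded += value.length;
    onProgress?.(loaded, total);
  }

  // Concatenate the chunks into one contiguous buffer.
  const bytes = new Uint8Array(loaded);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.length;
  }
  return bytes;
}

async function fetchWithProgress(url: string, onProgress?: ProgressFn): Promise<Uint8Array> {
  return readWithProgress(await fetch(url), onProgress);
}
```

Splitting the stream-reading part from the `fetch` call is a design choice for testability; the adapter only needs `fetchWithProgress`.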

User experience

  • First load: normal download with progress bar
  • Second load: instant (< 100ms from OPFS)
  • Show "Cached" badge on models that are already stored
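
A minimal sketch of how the "Cached" badge could be derived, assuming the UI has a list of model IDs and that cache keys follow the same `modelId + '.gguf'` naming as `loadModel`. `listModelsWithBadges` and `ModelEntry` are hypothetical names; `isCached` stands in for Flare's `is_model_cached` and is injected so the sketch runs without the WASM module.

```typescript
interface ModelEntry {
  id: string;
  label: string;   // text to render in the model picker
  cached: boolean; // true when the model is already stored
}

// Annotates each model ID with its cache status.
async function listModelsWithBadges(
  modelIds: string[],
  isCached: (name: string) => boolean | Promise<boolean>,
): Promise<ModelEntry[]> {
  return Promise.all(
    modelIds.map(async (id) => {
      const cached = await isCached(`${id}.gguf`); // same key as loadModel
      return { id, label: cached ? `${id} (Cached)` : id, cached };
    }),
  );
}
```

The result would need to be refreshed after a successful `cache_model` call so that the badge appears without a page reload.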

Depends on

Related

Labels: flare-integration (Flare WASM inference engine integration)
