ONNX export and inference tools for LFM2 models.
| Family | Quant Formats |
|---|---|
| LFM2.5, LFM2 | fp32, fp16, q4, q8 |
| LFM2.5-VL, LFM2-VL | fp32, fp16, q4, q8 |
| LFM2-MoE | fp32, fp16, q4, q4f16 |
| LFM2.5-Audio | fp32, fp16, q4, q8 |
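Each precision in the table is exported as a separate ONNX graph. As a rough sketch of how precisions map to file names (the `model_<precision>.onnx` pattern follows the inference examples below; treating the fp32 graph as plain `model.onnx` is an assumption):

```python
# Hypothetical helper: pick the ONNX file name for a requested precision.
# The model_<precision>.onnx pattern matches the paths used in the inference
# examples below; "model.onnx" for fp32 is an assumption, not a guarantee.
PRECISION_TO_FILENAME = {
    "fp32": "model.onnx",
    "fp16": "model_fp16.onnx",
    "q4": "model_q4.onnx",
    "q8": "model_q8.onnx",
    "q4f16": "model_q4f16.onnx",
}

def onnx_filename(precision: str) -> str:
    try:
        return PRECISION_TO_FILENAME[precision]
    except KeyError:
        raise ValueError(f"Unknown precision: {precision}") from None

print(onnx_filename("q4"))  # model_q4.onnx
```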
git clone https://github.com/Liquid4All/onnx-export.git
cd onnx-export
uv sync
# For GPU inference support
uv sync --extra gpu
# For development (testing, benchmarking)
uv sync --extra dev
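After syncing, a quick sanity check confirms that onnxruntime is importable and, if you installed the `gpu` extra, that the CUDA execution provider is visible. This is a verification sketch, not part of the project's CLI:

```python
# Verify the environment: print the onnxruntime version and the available
# execution providers ("CUDAExecutionProvider" shows up with a working GPU setup).
import onnxruntime as ort

print("onnxruntime", ort.__version__)
print("available providers:", ort.get_available_providers())
```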
# All precisions
uv run lfm2-export LiquidAI/LFM2.5-1.2B-Instruct --precision
uv run lfm2-vl-export LiquidAI/LFM2.5-VL-1.6B --precision
# Conv2d vision format (alternative to default tiled)
uv run lfm2-vl-export LiquidAI/LFM2.5-VL-1.6B --vision-format conv2d

# All precisions
uv run lfm2-moe-export LiquidAI/LFM2-MoE-8B-A1B --precision

All inference commands provide interactive multi-turn chat with streaming output. They automatically detect CUDA availability and fall back to CPU if needed.
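In onnxruntime terms, that detection amounts to picking execution providers. A minimal sketch of the selection logic, assuming the standard provider names (illustrative, not the project's exact code):

```python
# Prefer CUDA when onnxruntime reports it as available, otherwise fall back to
# CPU; --cpu-style flags simply force the CPU provider.
import onnxruntime as ort

def make_session(model_path: str, force_cpu: bool = False) -> ort.InferenceSession:
    providers = ["CPUExecutionProvider"]
    if not force_cpu and "CUDAExecutionProvider" in ort.get_available_providers():
        providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ort.InferenceSession(model_path, providers=providers)

# Hypothetical usage with a path produced by the export step above:
# session = make_session("./exports/LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx")
```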
# Interactive chat (starts conversation loop)
uv run lfm2-infer --model ./exports/LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx
# Single prompt (non-interactive)
uv run lfm2-infer --model ./exports/LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx \
--prompt "Explain quantum computing"
# Force CPU execution
uv run lfm2-infer --model ./exports/LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx --cpu

# Single image analysis
uv run lfm2-vl-infer --model ./exports/LFM2.5-VL-1.6B-ONNX \
--images photo.jpg \
--prompt "What do you see in this image?"
# Multi-image comparison (up to 2 images)
uv run lfm2-vl-infer --model ./exports/LFM2.5-VL-1.6B-ONNX \
--images image1.jpg image2.jpg \
--prompt "Compare these two images"
# Text-only (no images)
uv run lfm2-vl-infer --model ./exports/LFM2.5-VL-1.6B-ONNX \
--prompt "Hello, how are you?"Note: VL inference requires the model directory path (not a single .onnx file) since it loads multiple components:
embed_tokens.onnx,embed_images.onnx, anddecoder.onnx.
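Loading those components with plain onnxruntime looks roughly like the sketch below. The directory layout is an assumption (the files may sit under an onnx/ subfolder or carry precision suffixes), and the CLI handles the actual wiring of token and image embeddings into the decoder:

```python
# Sketch: one InferenceSession per exported VL component, printing each
# component's input names. Paths and file names are assumptions based on the
# note above, not the project's guaranteed layout.
from pathlib import Path
import onnxruntime as ort

model_dir = Path("./exports/LFM2.5-VL-1.6B-ONNX")
components = ["embed_tokens.onnx", "embed_images.onnx", "decoder.onnx"]

sessions = {}
for name in components:
    sessions[name] = ort.InferenceSession(str(model_dir / name))
    print(name, "inputs:", [i.name for i in sessions[name].get_inputs()])
```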
# Interactive chat
uv run lfm2-moe-infer --model ./exports/LFM2-MoE-8B-A1B-ONNX/onnx/model_q4.onnx
# Force CPU (when the model does not fit in VRAM)
uv run lfm2-moe-infer --model ./exports/LFM2-MoE-8B-A1B-ONNX/onnx/model_q4.onnx --cpu

LFM2.5-Audio is a multimodal audio-language model supporting three modes:
- ASR (Automatic Speech Recognition): Transcribe audio to text
- TTS (Text-to-Speech): Generate audio from text
- Interleaved: Mixed text and audio input/output for conversational audio
The model uses 5 ONNX components:
- `decoder.onnx` - LFM2 language model backbone
- `audio_encoder.onnx` - Conformer encoder for ASR input
- `audio_embedding.onnx` - Audio code embeddings for TTS/interleaved
- `audio_detokenizer.onnx` - Converts audio codes to STFT features
- `vocoder_depthformer.onnx` - Autoregressive audio codebook prediction
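As a rough orientation, the components each `--mode` is likely to exercise can be sketched as follows. This mapping is inferred from the component roles listed above, not taken from the project's code:

```python
# Inferred (not authoritative) mapping of inference mode -> ONNX components used.
MODE_COMPONENTS = {
    "asr": ["audio_encoder.onnx", "decoder.onnx"],
    "tts": ["decoder.onnx", "audio_embedding.onnx",
            "vocoder_depthformer.onnx", "audio_detokenizer.onnx"],
    "interleaved": ["audio_encoder.onnx", "decoder.onnx", "audio_embedding.onnx",
                    "vocoder_depthformer.onnx", "audio_detokenizer.onnx"],
}

def components_for(mode: str) -> list[str]:
    """Return the ONNX files a given --mode is expected to load."""
    return MODE_COMPONENTS[mode]

print(components_for("asr"))  # ['audio_encoder.onnx', 'decoder.onnx']
```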
# ASR: Transcribe audio to text
uv run lfm2-audio-infer LFM2.5-Audio-1.5B-ONNX --mode asr \
--audio input.wav --precision q4
# TTS: Generate speech from text
uv run lfm2-audio-infer LFM2.5-Audio-1.5B-ONNX --mode tts \
--prompt "Hello, how are you today?" \
--system "Perform TTS. Use the UK female voice." \
--output output.wav --precision q4
# Interleaved: Audio input with text+audio response
uv run lfm2-audio-infer LFM2.5-Audio-1.5B-ONNX --mode interleaved \
--audio question.wav --output response.wav --precision q4
# Interactive chat mode (multi-turn with stateful KV cache)
uv run lfm2-audio-infer LFM2.5-Audio-1.5B-ONNX --mode interleaved --chat \
--output output.wav --precision q4
# Commands in chat mode:
# /audio <file> [text] - Send audio with optional text
# <text> - Send text message
# reset - Clear conversation state
# quit - Exit

Note: Audio inference requires the model directory path (not a single .onnx file) since it loads multiple components. Use `--precision` to select the quantization level (fp16, q4, q8).
Tests verify ONNX exports against PyTorch reference models.
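Concretely, the comparison amounts to running the same inputs through the PyTorch reference and the exported graph, then checking that the outputs are numerically close. A generic sketch of that kind of check (tolerances are illustrative, not the suite's actual thresholds):

```python
# Generic parity check: compare a PyTorch reference output against the ONNX
# output for numerical closeness. Tolerances here are illustrative only.
import numpy as np

def assert_close(reference: np.ndarray, onnx_out: np.ndarray,
                 rtol: float = 1e-3, atol: float = 1e-3) -> None:
    max_abs = float(np.max(np.abs(reference - onnx_out)))
    print(f"max abs diff: {max_abs:.6f}")
    np.testing.assert_allclose(onnx_out, reference, rtol=rtol, atol=atol)

# Hypothetical usage: assert_close(torch_logits.numpy(), onnx_logits)
```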
# Install dev dependencies
uv sync --extra dev
# LFM2 text model tests
uv run pytest tests/test_lfm2/test_decoder.py -v -k "q4"
# LFM2-VL vision-language tests
uv run pytest tests/test_lfm2_vl/test_decoder.py -v -k "450M"
uv run pytest tests/test_lfm2_vl/test_vision_encoder.py -v
# LFM2-MoE tests
uv run pytest tests/test_lfm2_moe/test_decoder.py -v

Benchmarking compares the CPU performance of the ONNX export against the original PyTorch model.
# Text model benchmark
uv run lfm2-bench --model LiquidAI/LFM2.5-1.2B-Instruct \
--onnx ./exports/LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx

Text models:
- LiquidAI/LFM2.5-1.2B-Base-ONNX
- LiquidAI/LFM2.5-1.2B-Instruct-ONNX
- LiquidAI/LFM2.5-1.2B-JP-ONNX
- LiquidAI/LFM2-2.6B-Transcript-ONNX
Vision-Language:
Audio:
Text models:
- onnx-community/LFM2-350M-ONNX
- onnx-community/LFM2-700M-ONNX
- onnx-community/LFM2-1.2B-ONNX
- onnx-community/LFM2-2.6B-ONNX
- onnx-community/LFM2-2.6B-Exp-ONNX
Specialized:
- onnx-community/LFM2-350M-ENJP-MT-ONNX — translation
- onnx-community/LFM2-350M-Extract-ONNX
- onnx-community/LFM2-350M-Math-ONNX
- onnx-community/LFM2-1.2B-Extract-ONNX
- onnx-community/LFM2-1.2B-RAG-ONNX
- onnx-community/LFM2-1.2B-Tool-ONNX
Vision-Language:
MoE:
Note: The onnx-community models are exported using Transformers.js tooling with a different export pipeline. This project aims to produce compatible graph structures and file naming conventions to ensure interoperability with Transformers.js and other ONNX consumers.
Special thanks to Joshua Lochner for his work on Transformers.js and the onnx-community models, which inspired and informed this project's ONNX export approach.
See LICENSE for details.