Skip to content

Releases: 6block/openmodel

M1 v1.1.1 - Streaming, Token Counting & Multi-Partition Fix

21 May 06:02

Choose a tag to compare

Patch release for the M1 Sidecar Worker. Includes all images (scheduler, inference, foc-bridge).

Changes

py-inference

  • Real token counting (prompt_tokens, completion_tokens) from vLLM RequestOutput
  • SSE streaming support (POST /v1/chat/completions with stream: true)
  • Accurate finish_reason (stop vs length)
  • top_p and stop parameter passthrough
  • vLLM reload retry (3 attempts with 5s backoff, auto-restart on failure)
  • Longer GPU cleanup wait (2s → 5s) for reliable CUDA resource release

go-scheduler

  • Fix: WindowPoSt proof detection now waits for ALL partitions to complete before resuming GPU (previously resumed after first partition, causing conflicts on large miners with multiple partitions per deadline)

foc-bridge

  • No changes (included for completeness)

Images

File Size Component
openmodel-scheduler.tar.gz 14 MB go-scheduler
openmodel-foc-bridge.tar.gz 83 MB foc-bridge
openmodel-inference.tar.gz.part-aa 1.9 GB py-inference (part 1/4)
openmodel-inference.tar.gz.part-ab 1.9 GB py-inference (part 2/4)
openmodel-inference.tar.gz.part-ac 1.9 GB py-inference (part 3/4)
openmodel-inference.tar.gz.part-ad 793 MB py-inference (part 4/4)

Download & Load

# Scheduler
docker load -i openmodel-scheduler.tar.gz

# Foc-bridge
docker load -i openmodel-foc-bridge.tar.gz

# Inference (reassemble then load)
cat openmodel-inference.tar.gz.part-* > openmodel-inference.tar.gz
docker load -i openmodel-inference.tar.gz

# Restart services
docker compose restart

Upgrade from v1.0.0

Replace all three images and restart. No config or docker-compose.yml changes required.

This release supersedes v1.1.0.

M1 Release - Sidecar Worker

15 Apr 02:46

Choose a tag to compare

Docker images for OpenModel Sidecar Worker (scheduler, inference, foc-bridge).

To reassemble the inference image:

cat openmodel-inference.tar.part_* > openmodel-inference.tar
docker load -i openmodel-inference.tar

See README for full deployment instructions.