Releases: undreamai/LLMUnity

Release v3.0.3

08 Mar 21:18

πŸš€ Features

  • Support Qwen 3.5 models (PR: #391)
  • Upgrade LlamaLib to v2.0.5 (llama.cpp b8209) (PR: #391)
  • Add button to re-download LlamaLib (PR: #393)

Release v3.0.2

27 Feb 14:05

πŸš€ Features

  • Cache LlamaLib to prevent re-downloads (PR: #386)
  • Implement strategies for context overflow (chat truncation, chat summarization) (PR: #384)
  • Upgrade LlamaLib to v2.0.4 (PR: #384)
  • Re-introduce UI dropdown for level of debug messages (PR: #384)
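
The two context-overflow strategies above can be sketched in a few lines. This is a hedged illustration of the general techniques (chat truncation and chat summarization), not LLMUnity's actual API; all function names here are hypothetical.

```python
def truncate_history(messages, max_messages):
    """Chat truncation: keep the system prompt (first message)
    and only the most recent turns, dropping the oldest ones."""
    if len(messages) <= max_messages:
        return messages
    system, rest = messages[0], messages[1:]
    return [system] + rest[-(max_messages - 1):]

def summarize_history(messages, summarize):
    """Chat summarization: replace older turns with one summary message.
    `summarize` is any callable that condenses a list of turns into a
    string -- in a real integration, typically another LLM call."""
    system, rest = messages[0], messages[1:]
    if len(rest) <= 2:
        return messages
    summary = summarize(rest[:-2])
    return [system, {"role": "system", "content": summary}] + rest[-2:]
```

Truncation is cheap but loses old information outright; summarization keeps a compressed trace of the dropped turns at the cost of an extra model call.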

πŸ› Fixes

  • Fix context overflow with caching and overflow strategies (PR: #384)
  • Ensure macOS build includes the required runtime library (PR: #382)
  • Fix inference for AMD GPUs using Vulkan (PR: #384)

Release v3.0.1

28 Jan 14:26

πŸš€ Features

  • Add Unity.Nuget.Newtonsoft-Json to the assembly definition (PR: #379)
  • Update LlamaLib to v2.0.2 (llama.cpp b7777) (PR: #380)

πŸ› Fixes

  • Fix running in the Editor with the Android/iOS platform selected (PR: #378)

Release v3.0.0

12 Jan 22:19

Rewrite the LLM backend, LlamaLib, as a standalone C++/C# library, and adapt LLMUnity to it.

πŸš€ Features

  • Implement LlamaLib as an object-oriented C++/C# library
  • Update llama.cpp to b7664
  • Fix Vulkan GPU backend
  • Support Android 16 KB page sizes
  • Fix iOS / Xcode building
  • Fix RAG functionality for iOS
  • Polish samples
  • Optimise streaming functionality and implement callbacks on the C++ side
  • Remove chat templates from LLMUnity; use llama.cpp templating instead
  • Implement property checks
  • Common handling for both JSON and GBNF grammars
  • Simplify integration of tinyBLAS (lightweight GPU backend for Nvidia GPUs)
  • Move client/server functionality into LlamaLib

Release v2.5.2

29 May 18:31

πŸš€ Features

  • Support Android x86-64 architecture (Magic Leap 2) (PR: #344)
  • Combine ARM and Intel architectures of macOS (PR: #345)

Release v2.5.1

05 May 12:15

πŸš€ Features

  • Allow JSON schema grammars (PR: #333)
  • Add support for Qwen3 models (PR: #335)
  • Add support for BitNet models (PR: #334)
  • Upgrade LlamaLib to v1.2.5 (llama.cpp b5261) (PR: #335)
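
To convey what a JSON schema grammar enforces, here is a minimal, hypothetical after-the-fact check: the model's output must parse as JSON and satisfy the schema's required keys and primitive types. Real grammar-constrained decoding enforces this at generation time rather than validating afterwards; this sketch only illustrates the contract.

```python
import json

def matches_schema(text, schema):
    """Return True if `text` is JSON matching a tiny subset of JSON Schema:
    `required` keys must be present, and `properties` with primitive
    `type` hints must have values of that type."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict):
        return False
    types = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    for key in schema.get("required", []):
        if key not in obj:
            return False
    for key, spec in schema.get("properties", {}).items():
        if key in obj and not isinstance(obj[key], types[spec["type"]]):
            return False
    return True
```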

πŸ› Fixes

  • Fix Unity Editor hanging after stopping a completion and restarting scene (PR: #335)

Release v2.5.0

28 Mar 14:55

πŸš€ Features

  • VisionOS support (PR: #299)
  • Add support for Gemma 3 and Phi 4 models (PR: #327)
  • Fix Android support for older devices (use ARMv8-A instead of ARMv8.4-A) (PR: #325)
  • Upgrade LlamaLib to v1.2.4 (llama.cpp b4969) (PR: #325)
  • Default number of predicted tokens (num_predict) to infinity (-1) (PR: #328)

Release v2.4.2

19 Feb 13:07

πŸš€ Features

  • Integrate DeepSeek models (PR: #312)
  • Update LlamaLib to v1.2.3 (llama.cpp b4688) (PR: #312)
  • Drop CUDA 11.7.1 support (PR: #312)
  • Add warm-up function for provided prompt (PR: #301)
  • Add documentation in Unity tooltips (PR: #302)

πŸ› Fixes

  • Fix code signing on iOS (PR: #298)
  • Persist debug mode and use of extras to the build (PR: #304)
  • Fix dependency resolution for full CUDA and Vulkan architectures (PR: #313)

Release v2.4.1

18 Dec 11:27

πŸš€ Features

  • Static library linking on mobile (fixes iOS signing) (PR: #289)

πŸ› Fixes

  • Fix support for extras (flash attention, iQ quants) (PR: #292)

Release v2.4.0

02 Dec 16:40

πŸš€ Features

  • iOS deployment (PR: #267)
  • Improve building process (PR: #282)
  • Add structured output / function calling sample (PR: #281)
  • Update LlamaLib to v1.2.0 (llama.cpp b4218) (PR: #283)
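
The structured-output / function-calling pattern from the new sample can be sketched as follows. This is a hedged illustration of the general idea only; the tool names and reply format below are assumptions, not the sample's actual code.

```python
import json

# Hypothetical tool registry: the model is constrained (e.g. by a JSON
# grammar) to reply with {"function": ..., "argument": ...}, which the
# host application parses and dispatches.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(reply_text):
    """Parse a structured model reply and call the matching tool."""
    call = json.loads(reply_text)
    fn = TOOLS[call["function"]]
    return fn(call["argument"])
```

Because the output format is enforced at generation time, the dispatcher can parse it directly instead of scraping free-form text.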

πŸ› Fixes

  • Clear temp build directory before building (PR: #278)

πŸ“¦ General

  • Remove support for extras (flash attention, iQ quants) (PR: #284)
  • Remove support for the LLM base prompt (PR: #285)