bernardladenthin · Jul 4, 2026
@@ -1392,8 +1392,24 @@ versioning** here), so a version change must be applied to **all three poms in l
 The safe way is `mvn -q versions:set -DnewVersion=X.Y.Z -DgenerateBackupPoms=false` from the repo
 root (it updates the parent and every child `<parent>` reference at once). Changing only the root
 `<version>` leaves the children pointing at a non-existent parent and **fails the reactor build**
-(`Could not find artifact net.ladenthin:llama-parent:pom:X.Y.Z`). The README version examples and
-badge still need the usual manual update. (If single-source ergonomics are wanted, the Maven
+(`Could not find artifact net.ladenthin:llama-parent:pom:X.Y.Z`).
+
+`versions:set` only rewrites the **poms**. The **two README files** that carry hardcoded
+release-version dependency snippets must be bumped **manually and in the same commit** — miss either
+and the published docs point consumers at the previous release. (The `llama-langchain4j/README.md`
+snippet was exactly the one forgotten on the `5.0.4 → 5.0.5` bump; it is listed here so it is not
+missed again.)
+
+- **`README.md`** (root) — the install snippet, the two classifier-example snippets (default + the
+  `<classifier>` template), and the `llama-langchain4j` snippet. The Maven Central **badge**
+  auto-pulls the latest released version, so leave it. The **`-SNAPSHOT` line** in the "Snapshot
+  builds" section documents the snapshot channel — set it to the *next* dev version, not the release.
+  (The per-classifier snippets were **deduplicated** to a single canonical + template pair, so the
+  release version now appears in only ~4 spots here, not ~20 — the runtime details live once in the
+  classifier table.)
+- **`llama-langchain4j/README.md`** — its own `<dependency>` snippet.
+
+(If single-source ergonomics are wanted, the Maven
 CI-friendly `${revision}` property + `flatten-maven-plugin` would let a bump touch only the root —
 that plugin is not configured today, so do not rely on "root only".)
 

@@ -199,141 +199,27 @@ exclusive — and optionally a CPU Windows build.
 > inference is verified locally / on self-hosted hardware. As with every GPU JAR,
 > the vendor runtime is supplied by the consumer's driver/toolkit and is not bundled.
 
+For the default CPU JAR, omit the `<classifier>`. For a GPU/accelerator or
+alternate-CPU build, add the `<classifier>` for your platform from the table
+above — the backend, target platform and runtime requirement are all listed
+there. Pick **at most one** classifier (they are mutually exclusive):
+
 ```xml
-<!-- CPU (default) -->
+<!-- Default (CPU) — no classifier -->
 <dependency>
     <groupId>net.ladenthin</groupId>
     <artifactId>llama</artifactId>
     <version>5.0.5</version>
 </dependency>
 
-<!-- CUDA on Linux x86-64 (requires CUDA 13 runtime on the host) -->
+<!-- GPU / accelerator or alternate-CPU build: add the <classifier> from the
+     table above. Example shown — CUDA 13 on Linux x86-64. -->
 <dependency>
     <groupId>net.ladenthin</groupId>
     <artifactId>llama</artifactId>
     <version>5.0.5</version>
     <classifier>cuda13-linux-x86-64</classifier>
 </dependency>
-
-<!-- OpenCL/Adreno on Android (requires device-provided OpenCL ICD) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>opencl-android-aarch64</classifier>
-</dependency>
-
-<!-- CUDA on Windows x86-64 (requires CUDA 13 Toolkit on the host) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>cuda13-windows-x86-64</classifier>
-</dependency>
-
-<!-- Vulkan on Windows x86-64 (NVIDIA/AMD/Intel; vulkan-1.dll from the driver) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>vulkan-windows-x86-64</classifier>
-</dependency>
-
-<!-- Vulkan on Linux x86-64 (NVIDIA/AMD/Intel; libvulkan.so.1 from the driver) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>vulkan-linux-x86-64</classifier>
-</dependency>
-
-<!-- Vulkan on Linux aarch64 (libvulkan.so.1 from the device/driver) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>vulkan-linux-aarch64</classifier>
-</dependency>
-
-<!-- OpenCL on Windows x86-64 (requires a driver-provided OpenCL ICD) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>opencl-windows-x86-64</classifier>
-</dependency>
-
-<!-- Windows CPU natives built with the MSVC / Visual Studio generator -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>msvc-windows</classifier>
-</dependency>
-
-<!-- ROCm/HIP on Linux x86-64 (requires an AMD ROCm runtime on the host) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>rocm-linux-x86-64</classifier>
-</dependency>
-
-<!-- ROCm/HIP on Windows x86-64 (requires the AMD HIP SDK runtime on the host) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>rocm-windows-x86-64</classifier>
-</dependency>
-
-<!-- SYCL (Intel oneAPI, fp16) on Linux x86-64 (requires the oneAPI/Level-Zero runtime) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>sycl-fp16-linux-x86-64</classifier>
-</dependency>
-
-<!-- SYCL (Intel oneAPI, fp32) on Linux x86-64 (requires the oneAPI/Level-Zero runtime) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>sycl-fp32-linux-x86-64</classifier>
-</dependency>
-
-<!-- SYCL (Intel oneAPI) on Windows x86-64 (requires the oneAPI/Level-Zero runtime) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>sycl-windows-x86-64</classifier>
-</dependency>
-
-<!-- OpenCL/Adreno on Windows-on-ARM aarch64 (Snapdragon X; device-provided OpenCL ICD) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>opencl-windows-aarch64</classifier>
-</dependency>
-
-<!-- OpenVINO on Linux x86-64 (requires the Intel OpenVINO runtime on the host) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>openvino-linux-x86-64</classifier>
-</dependency>
-
-<!-- OpenVINO on Windows x86-64 (requires the Intel OpenVINO runtime on the host) -->
-<dependency>
-    <groupId>net.ladenthin</groupId>
-    <artifactId>llama</artifactId>
-    <version>5.0.5</version>
-    <classifier>openvino-windows-x86-64</classifier>
-</dependency>
 ```
 
 > [!IMPORTANT]

@@ -62,7 +62,7 @@ ScoringModel reranker     = new JllamaScoringModel(rerankLlama);
 <dependency>
     <groupId>net.ladenthin</groupId>
     <artifactId>llama-langchain4j</artifactId>
-    <version>5.0.4</version>
+    <version>5.0.5</version>
 </dependency>
 ```
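
The manual README bump described in the first hunk is the step that was missed on `5.0.4 → 5.0.5`, so it may be worth guarding with a small check. The following is a minimal sketch, not part of this repository: the function name `stale_snippets` and both regular expressions are hypothetical, and it assumes the README snippets are plain `<dependency>` blocks with `groupId` `net.ladenthin`, as in the hunks above. It skips `-SNAPSHOT` versions, since the snapshot line is meant to carry the next dev version rather than the release.

```python
import re

# Hypothetical helper, not part of the repo: finds README dependency
# snippets whose <version> lags behind the release being published.
DEPENDENCY = re.compile(r"<dependency>(.*?)</dependency>", re.DOTALL)
ARTIFACT = re.compile(r"<artifactId>\s*([^<\s]+)\s*</artifactId>")
VERSION = re.compile(r"<version>\s*([^<\s]+)\s*</version>")


def stale_snippets(readme_text, release, group_id="net.ladenthin"):
    """Return (artifactId, version) pairs that do not match `release`.

    Only blocks for `group_id` are checked; -SNAPSHOT versions are
    ignored because they document the snapshot channel, not the release.
    """
    stale = []
    for block in DEPENDENCY.findall(readme_text):
        if f"<groupId>{group_id}</groupId>" not in block:
            continue  # some other project's dependency, not ours
        version = VERSION.search(block)
        artifact = ARTIFACT.search(block)
        if version is None or artifact is None:
            continue  # incomplete snippet, nothing to compare
        if version.group(1).endswith("-SNAPSHOT"):
            continue
        if version.group(1) != release:
            stale.append((artifact.group(1), version.group(1)))
    return stale


if __name__ == "__main__":
    sample = """
    <dependency>
        <groupId>net.ladenthin</groupId>
        <artifactId>llama</artifactId>
        <version>5.0.5</version>
    </dependency>
    <dependency>
        <groupId>net.ladenthin</groupId>
        <artifactId>llama-langchain4j</artifactId>
        <version>5.0.4</version>
    </dependency>
    """
    print(stale_snippets(sample, "5.0.5"))
    # prints [('llama-langchain4j', '5.0.4')]
```

Run against the contents of `README.md` and `llama-langchain4j/README.md` with the new release version, an empty list would suggest both files were bumped. Wiring this into CI or a release script is left open here.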