Ollama setup via Docker PLUS manage your LLM models UI #576
Conversation
Pull request overview
This PR adds Ollama LLM container support to 4CAT's Docker stack, along with an admin UI to manage LLM models (pull, delete, enable/disable). The LLM model refresh logic is extracted from the old refresh_items worker into a dedicated OllamaManager worker. A new llm.enabled_models configuration setting allows admins to control which available models are exposed to users.
Changes:
- New `OllamaManager` backend worker for refreshing, pulling, and deleting Ollama models via the Ollama HTTP API
- New `/admin/llm/` admin panel (`views_llm.py` + `llm-server.html`) for managing LLM models, gated by both admin privileges and `llm.access`
- New `docker-compose_ollama.yml` override for running Ollama as a Docker sidecar, with auto-configuration in `docker_setup.py`
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `backend/workers/ollama_manager.py` | New worker for Ollama model refresh/pull/delete operations |
| `backend/workers/refresh_items.py` | LLM refresh logic removed; worker now does nothing |
| `webtool/views/views_llm.py` | New Flask blueprint for the admin LLM management panel |
| `webtool/templates/controlpanel/llm-server.html` | New admin panel template for model listing and actions |
| `webtool/templates/controlpanel/layout.html` | Adds "LLM Server" nav link when `llm.access` is enabled |
| `webtool/__init__.py` | Registers the new `views_llm` blueprint |
| `processors/machine_learning/llm_prompter.py` | Filters available models by enabled list before showing to users |
| `common/lib/config_definition.py` | Adds `llm.enabled_models` config definition |
| `docker/docker_setup.py` | Auto-configures LLM settings when Ollama is detected on Docker network |
| `docker-compose_ollama.yml` | New Docker Compose override for the Ollama sidecar service |
| `docker/README.md` | Documents the Ollama Docker setup and usage |
```python
elif task == "delete":
    success = self.delete_model(model_name)
    if success:
        self.refresh_models()
```
When a model is successfully deleted from the Ollama server, refresh_models() updates llm.available_models to remove the deleted model, but llm.enabled_models is never cleaned up. This means deleted models accumulate as stale entries in llm.enabled_models. While this doesn't cause an immediate runtime error (since llm_prompter.py intersects the two lists), it's misleading: after a delete-and-refresh cycle, the model would disappear from the available models table in the UI, but it remains in the enabled list. If the model is later re-pulled, it would reappear as already enabled, which could be surprising.
The delete_model() method (or the work() method after a successful delete) should remove the model from llm.enabled_models, or at minimum refresh_models() should reconcile llm.enabled_models to remove entries no longer present in llm.available_models.
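A minimal sketch of that reconciliation, using a hypothetical dict-backed stand-in for 4CAT's config store (the real `self.config.get`/`set` API is assumed to behave similarly; `reconcile_enabled_models` is an illustrative name, not an existing method):

```python
def reconcile_enabled_models(config):
    """Remove models from llm.enabled_models that are no longer present
    in llm.available_models, e.g. after a successful delete + refresh."""
    available = config.get("llm.available_models", {})
    enabled = config.get("llm.enabled_models", [])
    still_valid = [model for model in enabled if model in available]
    if still_valid != enabled:
        config.set("llm.enabled_models", still_valid)
    return still_valid


class DictConfig:
    """Toy stand-in for the real config store, for illustration only."""
    def __init__(self, values):
        self.values = values

    def get(self, key, default=None):
        return self.values.get(key, default)

    def set(self, key, value):
        self.values[key] = value
```

Called at the end of `refresh_models()`, this would keep the two settings consistent regardless of how a model disappeared from the server.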
### Configuring 4CAT to use Ollama

1. Log in as admin and open **Control Panel → Settings**.
2. Set the following LLM fields:

   | Setting | Value |
   |---|---|
   | LLM Provider Type | `ollama` |
   | LLM Server URL | `http://ollama:11434` |
   | LLM Access | enabled |

3. Save settings.
4. Open **Control Panel → LLM Server** (visible once *LLM Access* is enabled).
5. Use the **Refresh** button to load available models, then **Pull** a model
   (e.g. `llama3.2:3b`) to download it from the Ollama library.
6. Enable the models you want to make available to users.
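As a quick sanity check that the server URL from step 2 is actually reachable, something like the following standalone sketch can be run from inside the Docker network (stdlib only; the default URL is an assumption taken from the table above, and `/api/tags` is Ollama's standard model-listing endpoint):

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def ollama_models(server="http://ollama:11434", timeout=5):
    """Return the list of model names the Ollama server reports,
    or None if the server cannot be reached or answers garbage."""
    try:
        with urlopen(f"{server}/api/tags", timeout=timeout) as response:
            data = json.load(response)
    except (URLError, OSError, json.JSONDecodeError):
        return None
    return [model["name"] for model in data.get("models", [])]
```

If this returns `None`, the container is not reachable under that URL and pulling models from the admin panel will fail as well.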
The docker/README.md section "Configuring 4CAT to use Ollama" (steps 1–3) instructs users to manually set the LLM Provider Type, LLM Server URL, and LLM Access fields in the Control Panel Settings. However, docker/docker_setup.py now automatically detects the Ollama sidecar on first startup and configures these settings without user intervention. The README should mention this auto-configuration so users know they can skip steps 1–3 on a fresh install with the Ollama override.
```python
def work(self):
    # Refresh items
    self.refresh_settings()

    self.job.finish()

def refresh_settings(self):
    """
    Refresh settings
    """
    # LLM server settings
    llm_provider = self.config.get("llm.provider_type", "none").lower()
    llm_server = self.config.get("llm.server", "")

    # For now we only support the Ollama API
    if llm_provider == "ollama" and llm_server:
        headers = {"Content-Type": "application/json"}
        llm_api_key = self.config.get("llm.api_key", "")
        llm_auth_type = self.config.get("llm.auth_type", "")
        if llm_api_key and llm_auth_type:
            headers[llm_auth_type] = llm_api_key

        available_models = {}
        try:
            response = requests.get(f"{llm_server}/api/tags", headers=headers, timeout=10)
            if response.status_code == 200:
                settings = response.json()
                for model in settings.get("models", []):
                    model = model["name"]
                    try:
                        model_metadata = requests.post(f"{llm_server}/api/show", headers=headers, json={"model": model}, timeout=10).json()
                        available_models[model] = {
                            "name": f"{model_metadata['model_info']['general.basename']} ({model_metadata['details']['parameter_size']} parameters)",
                            "model_card": f"https://ollama.com/library/{model}",
                            "provider": "local"
                        }
                    except (requests.RequestException, json.JSONDecodeError, KeyError) as e:
                        self.log.debug(f"Could not get metadata for model {model} from Ollama - skipping (error: {e})")

                self.config.set("llm.available_models", available_models)
                self.log.debug("Refreshed LLM server settings cache")
            else:
                self.log.warning(f"Could not refresh LLM server settings cache - server returned status code {response.status_code}")

        except requests.RequestException as e:
            self.log.warning(f"Could not refresh LLM server settings cache - request error: {str(e)}")
```
The ItemUpdater worker now has an empty work() method but still schedules itself every 60 seconds via ensure_job. This means a job is created and claimed every minute to do absolutely nothing. If this worker exists solely as a placeholder for future use, it should either be removed or the scheduling interval should be reduced. If it's no longer needed, consider removing the worker entirely to avoid unnecessary job queue churn.
@copilot open a new pull request to apply changes based on the comments in this thread

@dale-wahl I've opened a new pull request, #581, to work on those changes. Once the pull request is ready, I'll request review from you.
…config docs (#581)

* Initial plan
* Fix stale enabled models, disable refresh_items scheduling, update README docs

Co-authored-by: dale-wahl <32108944+dale-wahl@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: dale-wahl <32108944+dale-wahl@users.noreply.github.com>
That's right, boys and girls, now you can spin up an Ollama container right beside your 4CAT containers. Lil admin UI action to pull and delete models on it (should work with other Ollama servers as well), plus enable and disable models (should work with tags as well, but did not test that). The gist:

```
docker compose -f docker-compose.yml -f docker-compose_ollama.yml up -d
```

It's that simple! (Or almost that simple; you do need to un-comment some lines if you want it to use GPU, but a) it works without GPU--albeit slowly--and b) it doesn't crash for those GPU-less users.)
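For reference, the commented-out GPU lines in a Compose override like this typically follow Docker Compose's standard device-reservation syntax; the exact contents of `docker-compose_ollama.yml` aren't shown here, so treat this as an illustrative fragment rather than the file's actual lines:

```yaml
services:
  ollama:
    # Un-comment to pass NVIDIA GPUs through to the Ollama container
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```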
You're welcome.