Ollama setup via Docker PLUS manage your LLM models UI #576

Open
dale-wahl wants to merge 8 commits into master from ollama_management

Conversation

@dale-wahl (Member)

That's right, boys and girls, now you can spin up an Ollama container right beside your 4CAT containers. A lil admin UI to pull and delete models on it (should work with other Ollama servers as well), plus enable and disable models (should work with tags too, but I did not test that).

The gist: `docker compose -f docker-compose.yml -f docker-compose_ollama.yml up -d`

It's that simple! (Or almost that simple; you do need to un-comment some lines if you want it to use a GPU, but a) it works without a GPU--albeit slowly--and b) it doesn't crash for GPU-less users.)
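For the curious, the lines in question are Docker Compose's standard NVIDIA device reservation. A sketch of what the override file plausibly contains (the actual docker-compose_ollama.yml may differ, and GPU access additionally requires the NVIDIA Container Toolkit on the host):

```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    # Un-comment the block below to give the container GPU access:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: all
    #           capabilities: [gpu]
```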

You're welcome.

Copilot AI left a comment

Pull request overview

This PR adds Ollama LLM container support to 4CAT's Docker stack, along with an admin UI to manage LLM models (pull, delete, enable/disable). The LLM model refresh logic is extracted from the old refresh_items worker into a dedicated OllamaManager worker. A new llm.enabled_models configuration setting allows admins to control which available models are exposed to users.

Changes:

  • New OllamaManager backend worker for refreshing, pulling, and deleting Ollama models via the Ollama HTTP API
  • New /admin/llm/ admin panel (views_llm.py + llm-server.html) for managing LLM models, gated by both admin privileges and llm.access
  • New docker-compose_ollama.yml override for running Ollama as a Docker sidecar, with auto-configuration in docker_setup.py

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
backend/workers/ollama_manager.py New worker for Ollama model refresh/pull/delete operations
backend/workers/refresh_items.py LLM refresh logic removed; worker now does nothing
webtool/views/views_llm.py New Flask blueprint for the admin LLM management panel
webtool/templates/controlpanel/llm-server.html New admin panel template for model listing and actions
webtool/templates/controlpanel/layout.html Adds "LLM Server" nav link when llm.access is enabled
webtool/__init__.py Registers the new views_llm blueprint
processors/machine_learning/llm_prompter.py Filters available models by enabled list before showing to users
common/lib/config_definition.py Adds llm.enabled_models config definition
docker/docker_setup.py Auto-configures LLM settings when Ollama is detected on Docker network
docker-compose_ollama.yml New Docker Compose override for the Ollama sidecar service
docker/README.md Documents the Ollama Docker setup and usage


Comment on lines +46 to +49
```python
elif task == "delete":
    success = self.delete_model(model_name)
    if success:
        self.refresh_models()
```

Copilot AI Mar 10, 2026


When a model is successfully deleted from the Ollama server, refresh_models() updates llm.available_models to remove the deleted model, but llm.enabled_models is never cleaned up. This means deleted models accumulate as stale entries in llm.enabled_models. While this doesn't cause an immediate runtime error (since llm_prompter.py intersects the two lists), it's misleading: after a delete-and-refresh cycle, the model would disappear from the available models table in the UI, but it remains in the enabled list. If the model is later re-pulled, it would reappear as already enabled, which could be surprising.

The delete_model() method (or the work() method after a successful delete) should remove the model from llm.enabled_models, or at minimum refresh_models() should reconcile llm.enabled_models to remove entries no longer present in llm.available_models.
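The suggested reconciliation boils down to a set intersection. A minimal sketch (the function name is hypothetical; the config key shapes follow the snippets quoted in this thread, where `llm.enabled_models` is a list of model names and `llm.available_models` a dict keyed by model name):

```python
def reconcile_enabled_models(enabled_models, available_models):
    """Return only the enabled models that still exist on the Ollama server.

    Dropping stale names here keeps llm.enabled_models consistent with
    llm.available_models after a model is deleted and the list refreshed.
    """
    return [model for model in enabled_models if model in available_models]


# Hypothetical call site, at the end of refresh_models():
# still_enabled = reconcile_enabled_models(
#     self.config.get("llm.enabled_models", []),
#     available_models,
# )
# self.config.set("llm.enabled_models", still_enabled)
```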

Comment on lines +64 to +79
### Configuring 4CAT to use Ollama

1. Log in as admin and open **Control Panel → Settings**.
2. Set the following LLM fields:

| Setting | Value |
|---|---|
| LLM Provider Type | `ollama` |
| LLM Server URL | `http://ollama:11434` |
| LLM Access | enabled |

3. Save settings.
4. Open **Control Panel → LLM Server** (visible once *LLM Access* is enabled).
5. Use the **Refresh** button to load available models, then **Pull** a model
(e.g. `llama3.2:3b`) to download it from the Ollama library.
6. Enable the models you want to make available to users.

Copilot AI Mar 10, 2026


The docker/README.md section "Configuring 4CAT to use Ollama" (steps 1–3) instructs users to manually set the LLM Provider Type, LLM Server URL, and LLM Access fields in the Control Panel Settings. However, docker/docker_setup.py now automatically detects the Ollama sidecar on first startup and configures these settings without user intervention. The README should mention this auto-configuration so users know they can skip steps 1–3 on a fresh install with the Ollama override.
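The auto-detection amounts to probing the sidecar's well-known address on the Docker network. A sketch of the idea, not the actual docker_setup.py code (the function name is hypothetical, and stdlib urllib stands in for the requests calls the quoted worker uses; Ollama answers a plain GET on its API port with "Ollama is running" when up):

```python
import urllib.request


def detect_ollama(server_url="http://ollama:11434", timeout=5):
    """Return True if an Ollama server answers at server_url."""
    try:
        with urllib.request.urlopen(server_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Connection refused, DNS failure, timeout: no sidecar present
        return False


# Hypothetical first-startup hook: auto-configure if the sidecar answers.
# if detect_ollama():
#     config.set("llm.provider_type", "ollama")
#     config.set("llm.server", "http://ollama:11434")
#     config.set("llm.access", True)
```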

Comment on lines 25 to 27
```python
def work(self):
    # Refresh items
    self.refresh_settings()

    self.job.finish()

def refresh_settings(self):
    """
    Refresh settings
    """
    # LLM server settings
    llm_provider = self.config.get("llm.provider_type", "none").lower()
    llm_server = self.config.get("llm.server", "")

    # For now we only support the Ollama API
    if llm_provider == "ollama" and llm_server:
        headers = {"Content-Type": "application/json"}
        llm_api_key = self.config.get("llm.api_key", "")
        llm_auth_type = self.config.get("llm.auth_type", "")
        if llm_api_key and llm_auth_type:
            headers[llm_auth_type] = llm_api_key

        available_models = {}
        try:
            response = requests.get(f"{llm_server}/api/tags", headers=headers, timeout=10)
            if response.status_code == 200:
                settings = response.json()
                for model in settings.get("models", []):
                    model = model["name"]
                    try:
                        model_metadata = requests.post(f"{llm_server}/api/show", headers=headers, json={"model": model}, timeout=10).json()
                        available_models[model] = {
                            "name": f"{model_metadata['model_info']['general.basename']} ({model_metadata['details']['parameter_size']} parameters)",
                            "model_card": f"https://ollama.com/library/{model}",
                            "provider": "local"
                        }

                    except (requests.RequestException, json.JSONDecodeError, KeyError) as e:
                        self.log.debug(f"Could not get metadata for model {model} from Ollama - skipping (error: {e})")

                self.config.set("llm.available_models", available_models)
                self.log.debug("Refreshed LLM server settings cache")
            else:
                self.log.warning(f"Could not refresh LLM server settings cache - server returned status code {response.status_code}")

        except requests.RequestException as e:
            self.log.warning(f"Could not refresh LLM server settings cache - request error: {str(e)}")
```

Copilot AI Mar 10, 2026


The ItemUpdater worker now has an empty work() method but still schedules itself every 60 seconds via ensure_job. This means a job is created and claimed every minute to do absolutely nothing. If this worker exists solely as a placeholder for future use, it should either be removed or the scheduling interval should be increased. If it's no longer needed, consider removing the worker entirely to avoid unnecessary job queue churn.

@dale-wahl (Member, Author)

@copilot open a new pull request to apply changes based on the comments in this thread:

  • remove enabled models via refresh_models if they have been deleted on the server.
  • update docs to reflect Docker auto-setting those variables, but ensure it is also clear how to manually set them for another Ollama server
  • keep the refresh_items worker for later use, but comment out the ensure_job call or have it return prematurely, since it is at the moment doing nothing.
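The "return prematurely" option from the last bullet is a one-line guard. A sketch against the work() method quoted above (the class and job stub here are illustrative, not 4CAT's actual worker base class):

```python
class ItemUpdater:
    """Placeholder sketch of the pared-down refresh_items worker."""

    def __init__(self, job):
        self.job = job

    def work(self):
        # The LLM refresh logic moved to OllamaManager, so this worker
        # currently has nothing to do. Finish the claimed job and return
        # immediately, keeping the class around for future use.
        self.job.finish()
        return
```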

Copilot AI (Contributor) commented Mar 10, 2026

@dale-wahl I've opened a new pull request, #581, to work on those changes. Once the pull request is ready, I'll request review from you.

…config docs (#581)

* Initial plan

* Fix stale enabled models, disable refresh_items scheduling, update README docs

Co-authored-by: dale-wahl <32108944+dale-wahl@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: dale-wahl <32108944+dale-wahl@users.noreply.github.com>


3 participants