docs/scripts: remove stale Ollama references (llama.cpp is sole backend) #65

Merged
AlienWalker1995 merged 1 commit into main from chore/remove-ollama-refs-pr1
Jul 1, 2026
@AlienWalker1995

What & why

The inference backend was switched from Ollama to llama.cpp (service llamacpp, fronted by the LiteLLM model-gateway), but stale Ollama references lingered across docs and wrapper scripts — declared config drifting from reality. This is PR1 of 2: scrub the runtime/infra-level references. PR2 will decommission the dashboard's dead Ollama model-management subsystem (dashboard/app.py, static/index.html, tests) and the docker-compose.yml OLLAMA_* env shims.

Changes

  • Delete overrides/ollama-expose.yml — dead override (added a host port to a service that no longer exists).
  • Scripts/probes: compose, compose.ps1, doctor.ps1, doctor.sh, smoke_test.ps1, detect_hardware.py (docstring) — drop ollama service examples + the :11434 health probes; remove now-orphaned helper functions.
  • Docs: README.md, SECURITY.md, docs/GETTING_STARTED.md, docs/configuration.md, docs/data.md, and the PRD set — reflect llama.cpp/LiteLLM as the sole backend; inference chain reduced from "llama.cpp / Ollama / vLLM" to llama.cpp; host-tools reach models via the gateway at 127.0.0.1:11435/v1.
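
A scrub like this can be spot-checked with a small scan for leftover references. The sketch below is illustrative only and is not part of this PR: the function name `find_stale_refs` and the pattern are assumptions, and it matches just the word "ollama" and the old `:11434` probe port named above. Files deferred to PR2 and CHANGELOG.md would still be expected to report hits.

```python
import re

# Assumed pattern: the old service name plus the :11434 probe port mentioned above.
STALE_PATTERN = re.compile(r"ollama|:11434\b", re.IGNORECASE)


def find_stale_refs(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs that still mention Ollama or :11434."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if STALE_PATTERN.search(line):
            hits.append((lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Made-up sample text: line 1 uses the gateway address, line 2 an old probe.
    sample = "base_url = http://127.0.0.1:11435/v1\nprobe http://localhost:11434\n"
    for lineno, line in find_stale_refs(sample):
        print(f"{lineno}: {line}")
```

The word boundary after `11434` keeps the gateway port 11435 from being reported as stale.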

Verification

  • Every identifier introduced (gguf-puller, GGUF_MODELS, llamacpp-embed, LLAMACPP_URL, LLAMACPP_EMBED_URL, CLAUDE_CODE_LOCAL_MODEL, models/gguf/, port 11435) verified to exist in the real compose/config.
  • model-gateway confirmed to be LiteLLM (litellm_config.yaml, no main.py) — the old doc's main.py/provider-prefix description was itself stale and is corrected.
  • ASCII architecture diagram edits checked for column alignment.
  • detect_hardware.py still parses.
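
The PR does not say how the "still parses" check was run. One minimal way to do it, assuming nothing beyond the Python standard library, is to feed the file's source to `ast.parse`; the helper name `parses_ok` is hypothetical.

```python
import ast


def parses_ok(source: str, filename: str = "<string>") -> bool:
    """Return True if the source is syntactically valid Python, False otherwise."""
    try:
        # ast.parse only compiles to a syntax tree; it does not execute the code.
        ast.parse(source, filename=filename)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    # A docstring-only edit should leave the module parseable.
    print(parses_ok('"""Detect hardware."""\nx = 1\n'))
    print(parses_ok("def broken(:\n"))
```

To check a real file, read it first, for example `parses_ok(Path("detect_hardware.py").read_text(encoding="utf-8"))`, adjusting the path to wherever the script lives in the repository. This catches syntax errors introduced by a docstring edit but says nothing about runtime behavior.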

Deferred / follow-ups

  • PR2: dashboard Ollama subsystem + tests + docker-compose.yml OLLAMA_* env vars + README "Ollama models" section + component-dashboard-ui.md.
  • Out of scope (noticed): stale vLLM references (nonexistent overrides/vllm.yml / removed profile) — warrant a separate cleanup.
  • CHANGELOG.md intentionally untouched (historical record).

🤖 Generated with Claude Code

…ackend)

The inference backend was switched from Ollama to llama.cpp (service `llamacpp`
fronted by the LiteLLM `model-gateway`), but Ollama references lingered across
docs and wrapper scripts. This PR scrubs the runtime/infra-level references so
declared config matches reality.

- Delete dead overrides/ollama-expose.yml (added a port to a nonexistent service)
- Wrappers/probes: compose, compose.ps1, doctor.ps1/.sh, smoke_test.ps1,
  detect_hardware.py docstring — drop ollama service examples + health probes
- Docs (README, GETTING_STARTED, configuration, data, SECURITY, and the PRD set)
  updated to llama.cpp/LiteLLM; inference chain reduced to llama.cpp

Deferred to a follow-up PR (kept in sync with the code they describe):
- dashboard/app.py + static/index.html Ollama model-management subsystem + tests
- docker-compose.yml OLLAMA_* env shims, README "Ollama models" section,
  component-dashboard-ui.md
Untouched: CHANGELOG.md (historical). Note: stale vLLM references also exist and
warrant a separate cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AlienWalker1995 AlienWalker1995 merged commit dceb7e8 into main Jul 1, 2026
5 checks passed
@AlienWalker1995 AlienWalker1995 deleted the chore/remove-ollama-refs-pr1 branch July 1, 2026 20:08