adding support for multiple ollama #19
base: master
Conversation
Pull request overview
This pull request adds support for multiple Ollama instances, enabling requests to be load-balanced across several Ollama servers. The implementation uses a channel-based pool to distribute requests across the available instances.
Changes:
- Refactored the Ollama client to support multiple URIs with a channel-based instance pool for concurrent request handling (see the sketch after this list)
- Modified the CLI configuration to accept multiple Ollama URLs instead of a single URL
- Added three new Qwen3 embedding models (0.6b, 4b, 8b variants)
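
As a rough illustration of the channel-based pool described above, a minimal sketch follows. The `pool` type and its field names are assumptions for illustration, not the PR's actual code:

```go
package main

import "fmt"

// pool distributes work across several Ollama base URIs by handing out
// instance indices over a buffered channel.
type pool struct {
	uris      []string
	available chan int // holds the indices of idle instances
}

func newPool(uris []string) *pool {
	p := &pool{uris: uris, available: make(chan int, len(uris))}
	for i := range uris {
		p.available <- i // every instance starts out idle
	}
	return p
}

// withInstance blocks until an instance is free, runs fn against its URI,
// and returns the instance to the pool afterwards.
func (p *pool) withInstance(fn func(uri string) error) error {
	idx := <-p.available
	defer func() { p.available <- idx }()
	return fn(p.uris[idx])
}

func main() {
	p := newPool([]string{"http://ollama-1:11434", "http://ollama-2:11434"})
	_ = p.withInstance(func(uri string) error {
		fmt.Println("would send request to", uri)
		return nil
	})
}
```

Because the channel is buffered with one slot per URI, a receive blocks only when every instance is checked out, which gives the blocking-until-available behavior the diff below relies on.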
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| services/ollama/ollama.go | Refactored to support multiple URIs with pooling mechanism and improved response body handling |
| services/ollama/llm.go | Added instance selection logic for generation requests with enhanced logging |
| services/ollama/models.go | Added three new Qwen3 embedding model variants |
| bellmand/bellamnd.go | Updated CLI to accept multiple Ollama URLs via StringSliceFlag (see the sketch after this table) |
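
For reference, accepting a repeatable URL flag with a StringSliceFlag looks roughly like this. This assumes the project uses urfave/cli (where that flag type lives); the flag name `ollama-url` is also an assumption:

```go
package main

import (
	"fmt"
	"os"

	"github.com/urfave/cli/v2"
)

func main() {
	app := &cli.App{
		Flags: []cli.Flag{
			// NOTE: flag name assumed for illustration; check the PR for the real one.
			&cli.StringSliceFlag{
				Name:  "ollama-url",
				Usage: "base URL of an Ollama instance (repeatable)",
			},
		},
		Action: func(c *cli.Context) error {
			// e.g. --ollama-url http://a:11434 --ollama-url http://b:11434
			fmt.Println("ollama urls:", c.StringSlice("ollama-url"))
			return nil
		},
	}
	if err := app.Run(os.Args); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```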
Comments suppressed due to low confidence (1)
services/ollama/models.go:170
- The three new Qwen3 embedding models (EmbedModel_qwen3_06b, EmbedModel_qwen3_4b, EmbedModel_qwen3_8b) are defined but not added to the EmbedModels map. This means they cannot be discovered or used by the application. Add entries for all three models to the EmbedModels map.
```go
var EmbedModels = map[string]embed.Model{
	EmbedModel_nomic_embed_text.Name:        EmbedModel_nomic_embed_text,
	EmbedModel_mxbai_embed_large.Name:       EmbedModel_mxbai_embed_large,
	EmbedModel_paraphrase_multilingual.Name: EmbedModel_paraphrase_multilingual,
	EmbedModel_bge_large.Name:               EmbedModel_bge_large,
	EmbedModel_bge_m3.Name:                  EmbedModel_bge_m3,
}
```
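
A sketch of the fix the review suggests, adding entries for the three new variants (variable names taken from the review comment above):

```go
var EmbedModels = map[string]embed.Model{
	EmbedModel_nomic_embed_text.Name:        EmbedModel_nomic_embed_text,
	EmbedModel_mxbai_embed_large.Name:       EmbedModel_mxbai_embed_large,
	EmbedModel_paraphrase_multilingual.Name: EmbedModel_paraphrase_multilingual,
	EmbedModel_bge_large.Name:               EmbedModel_bge_large,
	EmbedModel_bge_m3.Name:                  EmbedModel_bge_m3,
	// New Qwen3 variants, registered so they can be discovered at runtime.
	EmbedModel_qwen3_06b.Name:               EmbedModel_qwen3_06b,
	EmbedModel_qwen3_4b.Name:                EmbedModel_qwen3_4b,
	EmbedModel_qwen3_8b.Name:                EmbedModel_qwen3_8b,
}
```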
```go
// Acquire an available instance (blocks if all instances are busy)
instanceIdx := <-g.available
defer func() {
	// Return the instance to the pool when done
	g.available <- instanceIdx
}()

uri := g.uris[instanceIdx]
g.log("[embed] using instance", "index", instanceIdx, "uri", uri, "request", reqc)

var embeddings [][]float64
var tokenTotal int
for _, text := range request.Texts {
```
Copilot AI · Jan 15, 2026
The instance is acquired once at the function level but multiple HTTP requests are made in the loop (lines 83-130). This holds the instance for the entire duration of processing all texts sequentially, preventing other goroutines from using it. Consider either: (1) moving the instance acquisition inside the loop to allow interleaving of requests across instances, or (2) keeping the current design if you want to ensure all texts in a single request use the same instance for consistency.
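
For illustration, option (1) might look roughly like the sketch below, following the names in the snippet above. The `embedOne` helper is hypothetical, standing in for the HTTP call currently inlined in the loop:

```go
for _, text := range request.Texts {
	// Acquire an instance for this text only, so other goroutines
	// can use the pool between iterations.
	instanceIdx := <-g.available
	uri := g.uris[instanceIdx]

	emb, tokens, err := g.embedOne(uri, text) // hypothetical helper wrapping one HTTP request
	g.available <- instanceIdx                // return the instance immediately
	if err != nil {
		return nil, err
	}
	embeddings = append(embeddings, emb)
	tokenTotal += tokens
}
```

The trade-off the review names is real: per-text acquisition improves throughput under contention, while per-request acquisition pins all texts of one request to a single instance.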
```go
if resp.StatusCode != http.StatusOK {
	d, _ := io.ReadAll(resp.Body)
	resp.Body.Close()
	return nil, fmt.Errorf("unexpected status code, %d, %s", resp.StatusCode, string(d))
}

var respModel embedResponse
err = json.NewDecoder(resp.Body).Decode(&respModel)
resp.Body.Close()
```
Copilot AI · Jan 15, 2026
The error returned by resp.Body.Close() on line 118 is discarded. If the JSON decoding on line 117 fails, the body is still closed, but the error handling on line 120 proceeds without knowing whether the close succeeded. The current logic works, but closing the body manually on each path is fragile: a future early return could skip the close, or close the body a second time.
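
One conventional way to harden this is to move the status check and decode into a helper that closes the body exactly once via defer. This is a sketch using the same standard-library packages as the snippet above; the helper name is an assumption:

```go
func decodeEmbedResponse(resp *http.Response) (*embedResponse, error) {
	// Close exactly once, on every return path. The Close error is
	// deliberately ignored here, but could be logged instead.
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		d, _ := io.ReadAll(resp.Body)
		return nil, fmt.Errorf("unexpected status code, %d, %s", resp.StatusCode, string(d))
	}

	var respModel embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&respModel); err != nil {
		return nil, fmt.Errorf("decoding embed response: %w", err)
	}
	return &respModel, nil
}
```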