feat(neosantara): add Neosantara integration as OpenAI-compatible provider #20641
ErRickow wants to merge 7 commits into BerriAI:main from …
Conversation
Greptile Overview
Greptile Summary
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| docs/my-website/docs/providers/neosantara.md | Adds Neosantara provider documentation with SDK/proxy usage examples; content-only change (minor: missing trailing newline). |
| docs/my-website/sidebars.js | Adds Neosantara doc page to providers sidebar list. |
| litellm/__init__.py | Registers Neosantara model sets into provider/model maps by reading model cost data (works, but relies on model_cost integrity). |
| litellm/constants.py | Adds neosantara to chat/embedding provider lists and to openai_compatible providers/endpoints; adds neosantara embedding model set. |
| litellm/litellm_core_utils/get_llm_provider_logic.py | Adds auto-detection for Neosantara when api_base matches https://api.neosantara.xyz/v1. |
| litellm/llms/openai_like/providers.json | Adds neosantara provider but appears to corrupt JSON structure (sarvam becomes nested/indented incorrectly), likely breaking JSON provider registry loading. |
| litellm/main.py | Extends embedding routing to treat neosantara/openai-compatible/JSON registry providers like OpenAI embeddings; change is OK but relies on valid provider registry. |
| litellm/types/utils.py | Adds NEOSANTARA to LlmProviders enum. |
| model_prices_and_context_window.json | Adds large Neosantara model cost map but introduces many duplicated keys and missing trailing newline; likely causes inconsistent model->provider resolution and pricing data. |
| provider_endpoints_support.json | Adds Neosantara endpoint support entry but omits the full endpoint key set present for other providers (may break tooling expecting those keys). |
| tests/test_litellm/test_neosantara.py | Adds tests for provider mapping and bridges, but mocks may not match actual schema (responses/messages) and uses global env without cleanup. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User as Client
    participant Lite as litellm
    participant Prov as get_llm_provider()
    participant Reg as JSONProviderRegistry/providers.json
    participant HTTP as httpx/OpenAI-like transport
    User->>Lite: completion(model="neosantara/...", messages, api_key?)
    Lite->>Prov: resolve provider/model/api_base
    Prov-->>Lite: provider="neosantara" (or inferred from api_base)
    Lite->>Reg: load OpenAI-like provider config
    Reg-->>Lite: base_url=https://api.neosantara.xyz/v1, param_mappings
    Lite->>HTTP: POST /chat/completions (OpenAI-compatible)
    HTTP-->>Lite: OpenAI-like response
    Lite-->>User: ModelResponse
    User->>Lite: embedding(model="neosantara/...", input)
    Lite->>Prov: resolve provider/model/api_base
    Prov-->>Lite: provider="neosantara"
    Lite->>HTTP: POST /embeddings (OpenAI-compatible)
    HTTP-->>Lite: embedding response
    Lite-->>User: EmbeddingResponse
```
model_prices_and_context_window.json (outdated)
| "neosantara/nusantara-base": { | ||
| "max_input_tokens": 64000, | ||
| "max_output_tokens": 2048, | ||
| "input_cost_per_token": 0.00000001875, |
Duplicate model cost keys
This commit introduces duplicated Neosantara entries in model_prices_and_context_window.json (e.g., neosantara/nusantara-base, neosantara/archipelago-70b, etc. appear multiple times). In JSON, duplicate keys are undefined/implementation-dependent; whichever occurrence is last wins, which can silently change max token limits/costs and provider detection via add_known_models().
Fix: de-duplicate so each model key appears exactly once, and validate that the resulting map is deterministic.
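One way to make that check automatic in CI: a minimal stdlib-only sketch that fails loudly on duplicate keys anywhere in the file (the helper name is ours; only the file path comes from this PR). `json.load` otherwise keeps the last occurrence silently.

```python
import json
from collections import Counter

def assert_no_duplicate_keys(path: str) -> None:
    """Raise if any object in the JSON file repeats a key."""
    def check(pairs):
        # object_pairs_hook receives every object's (key, value) pairs before dict() collapses them
        dupes = [key for key, count in Counter(k for k, _ in pairs).items() if count > 1]
        if dupes:
            raise ValueError(f"duplicate keys in {path}: {dupes}")
        return dict(pairs)

    with open(path) as f:
        json.load(f, object_pairs_hook=check)

assert_no_duplicate_keys("model_prices_and_context_window.json")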
provider_endpoints_support.json (outdated)
| "neosantara": { | ||
| "display_name": "Neosantara (`neosantara`)", | ||
| "url": "https://docs.litellm.ai/docs/providers/neosantara", | ||
| "endpoints": { | ||
| "chat_completions": true, | ||
| "messages": true, | ||
| "responses": true, | ||
| "embeddings": true |
Missing endpoint flags keys
The new neosantara entry’s endpoints object only includes a subset of keys (chat_completions/messages/responses/embeddings), while neighboring providers include the full flag set (e.g., image_generations, audio_transcriptions, moderations, batches, rerank, a2a, interactions, etc.). Any code that assumes these keys exist (rather than using .get(..., False)) will raise KeyError/fail when iterating providers.
Fix: include the full set of endpoint boolean keys with explicit true/false values, matching the schema used for other providers.
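Two sketches of what "schema-complete or defensive" could look like; the reference-provider choice (`"openai"`) is an assumption, not taken from this PR:

```python
import json

with open("provider_endpoints_support.json") as f:
    support = json.load(f)

# Defensive read: a missing flag means "unsupported" instead of a KeyError.
supports_rerank = support["neosantara"]["endpoints"].get("rerank", False)

# CI-style completeness check: every provider must declare the same flag set
# as a reference provider (assumes "openai" carries the full schema).
reference_keys = set(support["openai"]["endpoints"])
for name, entry in support.items():
    missing = reference_keys - set(entry["endpoints"])
    assert not missing, f"{name} is missing endpoint flags: {sorted(missing)}"
```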
```python
def test_neosantara_responses_api_bridge():
    """
    Test that Neosantara works with litellm.responses() API bridge.
    """
    os.environ["NEOSANTARA_API_KEY"] = "sk-1234"

    with pytest.MonkeyPatch().context() as m:
        def mock_send(self, request, **kwargs):
            return httpx.Response(
                200,
                content='{"choices": [{"message": {"content": "Hello from responses API"}, "finish_reason": "stop", "index": 0}], "usage": {"prompt_tokens": 10, "completion_tokens": 10, "total_tokens": 20}, "object": "chat.completion", "id": "resp-123", "created": 123456789}'.encode(),
                request=request
            )

        m.setattr("httpx.Client.send", mock_send)

        response = litellm.responses(
            model="neosantara/claude-3-haiku",
            input="hi",
            api_key="sk-1234"
        )

        assert response.model == "neosantara/claude-3-haiku"
        assert response.output[0].content[0].text == "Hello from responses API"
```
Bridge tests mock wrong schema
test_neosantara_responses_api_bridge() mocks an OpenAI chat.completion payload and then asserts on litellm.responses() output shape. responses() expects a Responses API object (with output blocks), and anthropic_messages() expects an Anthropic messages-shaped response. If the bridge code doesn’t transform chat.completion into those shapes, these tests will fail (or pass for the wrong reasons if the bridge returns raw dicts).
Fix: mock the actual wire-format each bridge expects/produces (Responses API + Anthropic Messages), or assert on the exact transformation behavior implemented in the bridge layer.
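For the anthropic_messages() bridge specifically, the mock body would need to be Anthropic Messages-shaped rather than chat.completion. A sketch of such a payload; the field set follows Anthropic's public Messages schema, not this PR's bridge code:

```python
import json
import httpx

# An Anthropic Messages-shaped body, as the review suggests mocking.
anthropic_shaped = {
    "id": "msg_123",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-haiku",
    "content": [{"type": "text", "text": "Hello from messages API"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 10, "output_tokens": 10},
}

def mock_send(self, request, **kwargs):
    # Return the Anthropic-shaped body instead of a chat.completion payload.
    return httpx.Response(200, content=json.dumps(anthropic_shaped).encode(), request=request)
```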
Additional Comments (1)
Fix: ensure … (Also verify the file parses as JSON after the edit.)
@greptile-apps can you re-review this?
Greptile Overview
Greptile Summary
This PR adds Neosantara as an OpenAI-compatible provider by registering a JSON provider config (litellm/llms/openai_like/providers.json). Provider routing is primarily handled via get_llm_provider(). Tests and metadata files were added/updated to reflect supported endpoints and model pricing/context windows, but the new tests' HTTP mocking and bridge-schema assertions do not appear aligned with LiteLLM's actual request path and bridge output shapes.
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| docs/my-website/docs/providers/neosantara.md | Adds Neosantara provider documentation with SDK and proxy examples; no functional code changes. |
| docs/my-website/sidebars.js | Adds Neosantara doc to providers sidebar; ensure build tooling validates JS syntax in CI. |
| litellm/__init__.py | No Neosantara-specific logic changes observed; relies on existing provider loading and model registration paths. |
| litellm/constants.py | Registers 'neosantara' in openai-compatible lists and embedding providers; includes a pre-existing, likely duplicate 'baseten' entry, and adds neosantara endpoint substring matching. |
| litellm/litellm_core_utils/get_llm_provider_logic.py | Adds api_base auto-detection for https://api.neosantara.xyz/v1 -> neosantara; bug: the endpoint constant string does not match the substring list entry, so auto-detection will not fire. |
| litellm/llms/openai_like/providers.json | Adds JSON-configured provider entry for neosantara with base_url and env var; used by JSONProviderRegistry. |
| litellm/main.py | No Neosantara-specific logic found in scanned sections; large file, but provider routing relies on get_llm_provider logic. |
| litellm/types/utils.py | Adds NEOSANTARA to LlmProviders enum; integrates into provider_list / validation paths. |
| model_prices_and_context_window.json | Adds Neosantara model pricing/context entries; prior duplicate-key issue was noted in review threads and should remain de-duplicated and deterministic. |
| provider_endpoints_support.json | Adds neosantara provider endpoint support flags; prior review notes missing full endpoint key set—ensure schema-complete endpoints map. |
| tests/test_litellm/test_neosantara.py | Adds tests for provider mapping, request formatting, and bridges; tests assume httpx.Client.send patching is sufficient and include bridge-shape assertions that likely don't match actual bridge outputs. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant U as Caller
    participant GL as get_llm_provider()
    participant JPR as JSONProviderRegistry
    participant OH as OpenAI-like handler
    participant NX as Neosantara API
    U->>GL: "get_llm_provider(model=neosantara/..., api_base=None)"
    GL->>JPR: "exists('neosantara')"
    JPR-->>GL: "true"
    GL-->>U: "provider=neosantara, api_base=https://api.neosantara.xyz/v1"
    U->>GL: "get_llm_provider(model=claude-3-haiku, api_base=https://api.neosantara.xyz/v1)"
    GL-->>U: "expected provider=neosantara"
    Note over GL,U: "In this PR, substring list vs. endpoint comparison mismatch can prevent auto-detect"
    U->>OH: "completion(model=neosantara/claude-3-haiku)"
    OH->>NX: "POST /v1/chat/completions"
    NX-->>OH: "chat.completion JSON"
    OH-->>U: "ModelResponse"
```
| elif endpoint == "https://api.inference.wandb.ai/v1": | ||
| custom_llm_provider = "wandb" | ||
| dynamic_api_key = get_secret_str("WANDB_API_KEY") | ||
| elif endpoint == "https://api.neosantara.xyz/v1": | ||
| custom_llm_provider = "neosantara" | ||
| dynamic_api_key = get_secret_str("NEOSANTARA_API_KEY") | ||
|
|
||
| if api_base is not None and not isinstance(api_base, str): |
Neosantara auto-detect never triggers
get_llm_provider() walks litellm.openai_compatible_endpoints with `if endpoint in api_base:`, and the other entries in that list are host/path substrings. The branch added here, however, compares with exact equality (`endpoint == "https://api.neosantara.xyz/v1"`) against a constant that carries an extra `https://` prefix relative to the list entry, so the two are never aligned: when users pass api_base="https://api.neosantara.xyz/v1", the outer substring check and the inner equality cannot both fire. This makes the documented auto-detection path fail (and your test_neosantara_auto_detection_api_base will fail in real execution).
Looking at the code, the issue is that the auto-detection happens inside the loop that checks `if endpoint in api_base`, but your elif check uses exact equality (`==`) instead of the substring check.
The fix is simple - change your condition to match the pattern used by other providers:
| elif endpoint == "https://api.inference.wandb.ai/v1": | |
| custom_llm_provider = "wandb" | |
| dynamic_api_key = get_secret_str("WANDB_API_KEY") | |
| elif endpoint == "https://api.neosantara.xyz/v1": | |
| custom_llm_provider = "neosantara" | |
| dynamic_api_key = get_secret_str("NEOSANTARA_API_KEY") | |
| if api_base is not None and not isinstance(api_base, str): | |
| elif "api.neosantara.xyz" in endpoint: | |
| custom_llm_provider = "neosantara" | |
| dynamic_api_key = get_secret_str("NEOSANTARA_API_KEY") |
This way:
- The outer loop condition `if endpoint in api_base` will match when `api_base="https://api.neosantara.xyz/v1"`
- Your inner check `"api.neosantara.xyz" in endpoint` will then correctly identify it as Neosantara

This pattern matches how other providers like Fireworks and Together AI handle detection in the same file.
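A quick local check after applying the suggestion; this sketch assumes get_llm_provider() is importable from the litellm top level and returns a (model, provider, api_key, api_base) tuple:

```python
import os
from litellm import get_llm_provider

os.environ["NEOSANTARA_API_KEY"] = "sk-test"

model, provider, api_key, api_base = get_llm_provider(
    model="claude-3-haiku",
    api_base="https://api.neosantara.xyz/v1",
)
# Before the fix this branch never fires; after it, the provider resolves.
assert provider == "neosantara", provider
```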
```python
# We mock the actual call to avoid network requests
with pytest.MonkeyPatch().context() as m:
    def mock_send(self, request, **kwargs):
        # Verify request URL and headers
        assert "api.neosantara.xyz" in str(request.url)
        assert request.headers["Authorization"] == "Bearer sk-1234"

        # Verify request body is OpenAI format
        body = json.loads(request.read())
        assert "messages" in body
        assert body["model"] == "claude-3-haiku"

        return httpx.Response(
            200,
            content='{"choices": [{"message": {"content": "Hello world"}, "finish_reason": "stop", "index": 0}], "usage": {"prompt_tokens": 10, "completion_tokens": 10, "total_tokens": 20}, "object": "chat.completion"}'.encode(),
            request=request
        )

    m.setattr("httpx.Client.send", mock_send)

    response = litellm.completion(
        model="neosantara/claude-3-haiku",
        messages=[{"role": "user", "content": "hi"}],
        api_key="sk-1234"
    )

    assert response.choices[0].message.content == "Hello world"
    assert response._hidden_params["custom_llm_provider"] == "neosantara"
    assert response._hidden_params["api_base"] == "https://api.neosantara.xyz/v1"
```
Monkeypatch may not intercept HTTP
These tests patch httpx.Client.send / httpx.AsyncClient.send, but LiteLLM often routes requests through its own HTTPHandler / AsyncHTTPHandler (or an AsyncClient instance), so patching the class method may not affect the actual call path. If the patched method isn’t used, the test can attempt a real network request (and/or fail for the wrong reason). Consider patching the concrete request function LiteLLM uses (e.g., the module-level client/handler used in OpenAI-like providers) so the tests deterministically intercept outbound traffic.
Also appears in: tests/test_litellm/test_neosantara.py:61-85, :105-131, :139-171.
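One deterministic alternative is to mock at the httpx transport layer, e.g. with the third-party respx library, which patches transports rather than a specific Client class (whether this covers every LiteLLM handler path is an assumption worth verifying against the actual code):

```python
import httpx
import respx
import litellm

@respx.mock
def test_neosantara_completion_mocked_transport():
    # respx intercepts at the transport layer, regardless of which
    # Client/AsyncClient instance LiteLLM constructs internally.
    route = respx.post("https://api.neosantara.xyz/v1/chat/completions").mock(
        return_value=httpx.Response(
            200,
            json={
                "id": "chatcmpl-123",
                "object": "chat.completion",
                "created": 1677610602,
                "model": "claude-3-haiku",
                "choices": [{"index": 0, "finish_reason": "stop",
                             "message": {"role": "assistant", "content": "Hello world"}}],
                "usage": {"prompt_tokens": 10, "completion_tokens": 10, "total_tokens": 20},
            },
        )
    )
    response = litellm.completion(
        model="neosantara/claude-3-haiku",
        messages=[{"role": "user", "content": "hi"}],
        api_key="sk-1234",
    )
    assert route.called  # fails fast if the request took a different path
    assert response.choices[0].message.content == "Hello world"
```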
```python
def test_neosantara_responses_api_bridge():
    """
    Test that Neosantara works with litellm.responses() API bridge.
    """
    os.environ["NEOSANTARA_API_KEY"] = "sk-1234"

    with pytest.MonkeyPatch().context() as m:
        def mock_send(self, request, **kwargs):
            # Verify the bridge transformed 'input' into OpenAI 'messages'
            body = json.loads(request.read())
            assert "messages" in body
            assert body["messages"][0]["content"] == "hi from responses"
            assert "/v1/chat/completions" in str(request.url)

            return httpx.Response(
                200,
                content='{"choices": [{"message": {"content": "Hello from bridge"}, "finish_reason": "stop", "index": 0}], "usage": {"prompt_tokens": 10, "completion_tokens": 10, "total_tokens": 20}, "object": "chat.completion", "id": "chatcmpl-123", "created": 1677610602}'.encode(),
                request=request
            )

        m.setattr("httpx.Client.send", mock_send)

        response = litellm.responses(
            model="neosantara/claude-3-haiku",
            input="hi from responses",
            api_key="sk-1234"
        )

        # Verify result is transformed to Responses API schema
        assert response.model == "neosantara/claude-3-haiku"
        assert response.output[0].content[0].text == "Hello from bridge"
        assert hasattr(response, "created_at")
```
Bridge response schema mismatch
test_neosantara_responses_api_bridge() mocks a chat.completion-shaped response ({"choices": ...}) but then asserts litellm.responses() returns a Responses API object with output[0].content[0].text and created_at. Unless the bridge layer actually transforms chat.completions into Responses API output blocks, this will fail (or pass only if responses() is returning an unexpected raw/compat object). The mock payload/assertions need to match the real wire format and transformation implemented by the bridge.
Also appears in: tests/test_litellm/test_neosantara.py:133-170 for anthropic_messages().
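If the bridge really does consume a Responses API wire format at the HTTP layer, the mock body would need roughly this shape instead. Field names follow OpenAI's public Responses API; whether LiteLLM's bridge expects this or transforms chat.completion itself must be checked against the bridge code:

```python
# A Responses API-shaped body, matching the test's output[0].content[0].text
# and created_at assertions.
responses_shaped = {
    "id": "resp_123",
    "object": "response",
    "created_at": 1677610602,
    "status": "completed",
    "model": "claude-3-haiku",
    "output": [
        {
            "type": "message",
            "id": "msg_123",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hello from bridge", "annotations": []}],
        }
    ],
    "usage": {"input_tokens": 10, "output_tokens": 10, "total_tokens": 20},
}
```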
@greptile-apps re-review the changes
Greptile Overview
Greptile Summary
Added Neosantara as a new OpenAI-compatible LLM provider for the Indonesian market. Integration includes 42 models (chat and embeddings) and provider auto-detection via api_base.
Key changes:
Previous review concerns addressed:
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/llms/openai_like/providers.json | Added neosantara provider configuration with correct base_url, api_key_env, and param_mappings |
| litellm/constants.py | Registered neosantara in all required provider lists and endpoints with correct configurations |
| litellm/litellm_core_utils/get_llm_provider_logic.py | Added auto-detection logic for neosantara when api_base contains api.neosantara.xyz/v1 |
| litellm/main.py | Added neosantara to embedding condition (redundant with openai_compatible_providers check), also added broader compatibility checks |
| tests/test_litellm/test_neosantara.py | Comprehensive tests covering provider detection, completions, embeddings, and bridge APIs with proper mocking |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant LiteLLM as litellm.completion()
    participant GetProvider as get_llm_provider()
    participant Registry as JSONProviderRegistry
    participant OpenAI as OpenAILikeAPI
    participant Neosantara as api.neosantara.xyz
    User->>LiteLLM: completion(model="neosantara/claude-3-haiku")
    LiteLLM->>GetProvider: get_llm_provider("neosantara/claude-3-haiku")
    GetProvider->>GetProvider: Parse prefix "neosantara/"
    GetProvider->>Registry: Check providers.json for "neosantara"
    Registry-->>GetProvider: Found: base_url, api_key_env, param_mappings
    GetProvider-->>LiteLLM: provider="neosantara", api_base="https://api.neosantara.xyz/v1"
    alt Auto-detection via api_base
        User->>LiteLLM: completion(model="claude-3-haiku", api_base="https://api.neosantara.xyz/v1")
        LiteLLM->>GetProvider: get_llm_provider(model, api_base)
        GetProvider->>GetProvider: Loop through openai_compatible_endpoints
        GetProvider->>GetProvider: if "api.neosantara.xyz/v1" in api_base
        GetProvider-->>LiteLLM: provider="neosantara", dynamic_api_key
    end
    LiteLLM->>OpenAI: Route to OpenAI-compatible handler
    OpenAI->>OpenAI: Transform params (max_completion_tokens → max_tokens)
    OpenAI->>Neosantara: POST /v1/chat/completions
    Note over Neosantara: OpenAI-compatible endpoint
    Neosantara-->>OpenAI: Chat completion response
    OpenAI-->>LiteLLM: Parse and transform response
    LiteLLM-->>User: ModelResponse object
```
litellm/main.py (outdated)
| or custom_llm_provider == "together_ai" | ||
| or custom_llm_provider == "nvidia_nim" | ||
| or custom_llm_provider == "litellm_proxy" | ||
| or custom_llm_provider == "neosantara" |
Redundant: neosantara is already in openai_compatible_providers (checked on line 4760).
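A condensed, hypothetical view of why the branch is dead code (the surrounding main.py code and line number are assumptions):

```python
import litellm

custom_llm_provider = "neosantara"

# The earlier membership check (around line 4760) already covers any
# registered openai-compatible provider, including "neosantara"...
if custom_llm_provider in litellm.openai_compatible_providers:
    print("routed by the generic openai-compatible branch")
# ...so a later explicit branch can never be the first match:
elif custom_llm_provider == "neosantara":
    print("dead code: never reached")
```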
Hi @jquinter, I was resolving the code review comments there.
@greptile-apps re-review the PR
Greptile Overview
Greptile Summary
This PR adds Neosantara, an Indonesian LLM gateway, as a new OpenAI-compatible provider to LiteLLM. The implementation follows established patterns and includes all necessary components.
Key changes:
Previous issues resolved:
The implementation is clean, follows LiteLLM conventions, and integrates seamlessly with existing OpenAI-compatible provider infrastructure.
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| litellm/llms/openai_like/providers.json | Adds Neosantara provider config with base URL, API key env var, and parameter mappings |
| litellm/constants.py | Registers neosantara in provider lists, endpoints, and embedding models |
| litellm/litellm_core_utils/get_llm_provider_logic.py | Adds auto-detection logic for Neosantara based on api_base URL |
| model_prices_and_context_window.json | Adds 21 Neosantara models with pricing and token limits (duplicates resolved) |
| provider_endpoints_support.json | Adds complete Neosantara endpoint flags with all required keys |
| tests/test_litellm/test_neosantara.py | Comprehensive test suite with provider detection, chat, embeddings, and bridge tests |
Sequence Diagram
```mermaid
sequenceDiagram
    participant User
    participant LiteLLM
    participant get_llm_provider
    participant JSONProviderRegistry
    participant OpenAILike
    participant Neosantara
    User->>LiteLLM: completion(model="neosantara/claude-3-haiku")
    LiteLLM->>get_llm_provider: get_llm_provider(model)
    alt Model has neosantara/ prefix
        get_llm_provider->>JSONProviderRegistry: Check providers.json
        JSONProviderRegistry-->>get_llm_provider: base_url, api_key_env, param_mappings
        get_llm_provider-->>LiteLLM: provider="neosantara", api_base="https://api.neosantara.xyz/v1"
    else api_base contains "api.neosantara.xyz/v1"
        get_llm_provider->>get_llm_provider: Auto-detect from openai_compatible_endpoints
        get_llm_provider-->>LiteLLM: provider="neosantara", dynamic_api_key from NEOSANTARA_API_KEY
    end
    LiteLLM->>OpenAILike: Route to OpenAI-compatible handler
    OpenAILike->>OpenAILike: Apply param_mappings (max_completion_tokens → max_tokens)
    OpenAILike->>Neosantara: POST https://api.neosantara.xyz/v1/chat/completions
    Neosantara-->>OpenAILike: OpenAI-compatible response
    OpenAILike-->>LiteLLM: Formatted response
    LiteLLM-->>User: ModelResponse
```
Neosantara is an LLM gateway from Indonesia that provides OpenAI-compatible and Anthropic-compatible endpoints.
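For reference, a minimal usage sketch based on this PR's docs and tests (the model name and env var come from the test file and providers.json):

```python
import os
import litellm

os.environ["NEOSANTARA_API_KEY"] = "sk-..."  # env var registered in providers.json

response = litellm.completion(
    model="neosantara/claude-3-haiku",  # "neosantara/" prefix routes to this provider
    messages=[{"role": "user", "content": "Halo!"}],
)
print(response.choices[0].message.content)
```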
You may want to verify these changes in litellm/main.py. If this isn't correct, let me know and I will make the changes again.
Relevant issues
None
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Added tests in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
- make test-unit

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
📖 Documentation
✅ Test
Changes
- Registered Neosantara in litellm/llms/openai_like/providers.json.
- Added NEOSANTARA to the LlmProviders enum in litellm/types/utils.py.
- Added neosantara to LITELLM_CHAT_PROVIDERS, openai_compatible_providers, and openai_compatible_endpoints in litellm/constants.py.
- Implemented auto-detection logic in get_llm_provider_logic.py to recognize the provider automatically when api_base="https://api.neosantara.xyz/v1" is used.
- Synchronized model_prices_and_context_window.json with Neosantara models.
- Created comprehensive documentation in docs/my-website/docs/providers/neosantara.md.
- Added Neosantara to the documentation sidebar.
Testing
Created tests/test_litellm/test_neosantara.py with 6 tests, including litellm.responses() and litellm.anthropic_messages().