From bcf79246425a2911609b0d521d2c39236d1e1240 Mon Sep 17 00:00:00 2001 From: NiveditJain Date: Mon, 8 Dec 2025 21:23:00 +0530 Subject: [PATCH 1/4] Enhance inference documentation with batch API support Added details on the batch inference API format compatible with OpenAI and other providers. Included instructions for uploading JSONL files, making inference requests, and retrieving output files. Updated example requests and responses for clarity. --- docs/docs/inference.md | 67 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 67 insertions(+) diff --git a/docs/docs/inference.md b/docs/docs/inference.md index 068cf851..3cfd6a0d 100644 --- a/docs/docs/inference.md +++ b/docs/docs/inference.md @@ -122,3 +122,70 @@ Example response: - The object_id is a unique identifier for the object. It is used to track the status of the object. In case of any failure individual object status will be available to track the failure. > **Note**: Auto retry policy will be triggered for transient failures without any additional cost. + +Exosphere inference APIs also supports the standard batch inference API format used by OpenAI, Gemini, and other providers. You can upload a JSONL file containing multiple inference requests, similar to OpenAI's batch API format and pass the file to `/infer/` API. + +### `PUT /v0/files/` +This API is used to upload a file to the server. 
Example request: +```bash +curl -X PUT https://models.exosphere.host/v0/files/mydata.jsonl \ + -H "Authorization: Bearer " \ + -F file="@mydata.jsonl" +``` +Example response: +```json +{ + "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852", + "file_name": "mydata.jsonl", + "bytes": 1000, + "mime_type": "application/jsonl" +} +``` + +expected file content should like: +```jsonl +{"key": "object-1", "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}], "generation_config": {"temperature": 0.7}, "model": "deepseek:r1-32b"}} +{"key": "object-2", "request": {"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}], "generation_config": {"temperature": 0.7}, "model": "openai:gpt-4o"}} +``` + +Now you can pass the file_id to the `/infer/` API to run inference on the file. Example request: +```bash +curl -X POST https://models.exosphere.host/v0/infer/ \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer " \ + -d '[ + { + "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852", + "sla": 60 + } + ]' +``` + +You can further request outputs as a file by pass the header `Output-Format: jsonl` to the API. 
Example request: +```bash +curl -X POST https://models.exosphere.host/v0/infer/ \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer " \ + -H "Output-Format: jsonl" \ + -d '[ + { + "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852", + "sla": 60 + } + ]' +``` +Example response: +```json +{ + "status": "completed", + "task_id": "2f92fc35-07d6-4737-aefa-8ddffd32f3fc", + "total_items": 2, + "output_url": "https://files.exosphere.host/v0/files/ae0b977c-76a0-4d71-81a5-05a6d8844852.jsonl" +} +``` + +You can download the output file from the `output_url` and the content should like: +```jsonl +{"key": "object-1", "output": {"type": "text", "text": "Photosynthesis is the process by which plants, algae, and some bacteria convert light energy into chemical energy."}} +{"key": "object-2", "output": {"type": "text", "text": "The main ingredients in a Margherita pizza are tomato sauce, mozzarella cheese, and basil."}} +``` \ No newline at end of file From 3565c5692631db424f21cac37d35ec6293726344 Mon Sep 17 00:00:00 2001 From: Nivedit Jain Date: Mon, 8 Dec 2025 21:25:15 +0530 Subject: [PATCH 2/4] Update docs/docs/inference.md Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- docs/docs/inference.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/docs/inference.md b/docs/docs/inference.md index 3cfd6a0d..3633c2fb 100644 --- a/docs/docs/inference.md +++ b/docs/docs/inference.md @@ -123,7 +123,7 @@ Example response: > **Note**: Auto retry policy will be triggered for transient failures without any additional cost. -Exosphere inference APIs also supports the standard batch inference API format used by OpenAI, Gemini, and other providers. You can upload a JSONL file containing multiple inference requests, similar to OpenAI's batch API format and pass the file to `/infer/` API. 
+Exosphere inference APIs also support the standard batch inference API format used by OpenAI, Gemini, and other providers. You can upload a JSONL file containing multiple inference requests, similar to OpenAI's batch API format, and pass the file to the `/infer/` API.
 
 ### `PUT /v0/files/`
 This API is used to upload a file to the server. Example request:

From b70df9dfbdecc79caf469745815e710a9c76fcee Mon Sep 17 00:00:00 2001
From: Nivedit Jain
Date: Mon, 8 Dec 2025 21:25:24 +0530
Subject: [PATCH 3/4] Update docs/docs/inference.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---
 docs/docs/inference.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/docs/inference.md b/docs/docs/inference.md
index 3633c2fb..d86aea4e 100644
--- a/docs/docs/inference.md
+++ b/docs/docs/inference.md
@@ -142,7 +142,7 @@ Example response:
 }
 ```
 
-expected file content should like:
+The expected file content should look like:
 ```jsonl
 {"key": "object-1", "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}], "generation_config": {"temperature": 0.7}, "model": "deepseek:r1-32b"}}
 {"key": "object-2", "request": {"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}], "generation_config": {"temperature": 0.7}, "model": "openai:gpt-4o"}}

From 03c65d6d8f07c76bedebf876304858aab6f62a6f Mon Sep 17 00:00:00 2001
From: NiveditJain
Date: Mon, 8 Dec 2025 21:26:38 +0530
Subject: [PATCH 4/4] Fix typos in inference documentation for clarity

---
 docs/docs/inference.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/docs/inference.md b/docs/docs/inference.md
index d86aea4e..573db486 100644
--- a/docs/docs/inference.md
+++ b/docs/docs/inference.md
@@ -161,7 +161,7 @@ curl -X POST https://models.exosphere.host/v0/infer/ \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer " \
   -d '[
     {
       "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852",
       "sla": 60
     }
   ]'
 ```
 
-You can further request outputs as a file by pass the header `Output-Format: jsonl` to the API. 
Example request:
+You can further request outputs as a file by passing the header `Output-Format: jsonl` to the API. 
Example request: +You can further request outputs as a file by passing the header `Output-Format: jsonl` to the API. Example request: ```bash curl -X POST https://models.exosphere.host/v0/infer/ \ -H "Content-Type: application/json" \ @@ -184,7 +184,7 @@ Example response: } ``` -You can download the output file from the `output_url` and the content should like: +You can download the output file from the `output_url` and the content should look like: ```jsonl {"key": "object-1", "output": {"type": "text", "text": "Photosynthesis is the process by which plants, algae, and some bacteria convert light energy into chemical energy."}} {"key": "object-2", "output": {"type": "text", "text": "The main ingredients in a Margherita pizza are tomato sauce, mozzarella cheese, and basil."}}
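
The batch JSONL payload documented in the patches above can also be assembled programmatically rather than written by hand. The sketch below is illustrative only: it assumes nothing beyond the field names that appear in the documentation's own examples (`key`, `request`, `contents`, `generation_config`, `model`), and `build_batch_jsonl` is a hypothetical helper, not an official Exosphere client.

```python
import json

def build_batch_jsonl(prompts, model, temperature=0.7):
    """Return JSONL text with one batch-inference request per prompt.

    Field names mirror the examples in the documentation above; this is
    an illustrative sketch, not an official Exosphere client.
    """
    lines = []
    for i, prompt in enumerate(prompts, start=1):
        record = {
            "key": f"object-{i}",
            "request": {
                "contents": [{"parts": [{"text": prompt}]}],
                "generation_config": {"temperature": temperature},
                "model": model,
            },
        }
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    payload = build_batch_jsonl(
        [
            "Describe the process of photosynthesis.",
            "What are the main ingredients in a Margherita pizza?",
        ],
        model="deepseek:r1-32b",
    )
    # Write the file that would then be uploaded via `PUT /v0/files/`.
    with open("mydata.jsonl", "w") as f:
        f.write(payload)
```

The resulting `mydata.jsonl` can then be uploaded with the `PUT /v0/files/` request shown in the documentation, and the returned `file_id` passed to `/infer/`.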