docs/docs/inference.md: 67 changes (67 additions, 0 deletions)

@@ -122,3 +122,70 @@ Example response:
- The `object_id` is a unique identifier for the object and is used to track its status. If an individual object fails, its status can be used to trace the failure.

> **Note**: The auto-retry policy is triggered for transient failures at no additional cost.

Exosphere inference APIs also support the standard batch inference API format used by OpenAI, Gemini, and other providers. You can upload a JSONL file containing multiple inference requests, in a format similar to OpenAI's batch API, and pass the file to the `/infer/` API.

### `PUT /v0/files/`

This API is used to upload a file to the server. Example request:
```bash
curl -X PUT https://models.exosphere.host/v0/files/mydata.jsonl \
  -H "Authorization: Bearer <your-api-key>" \
  -F file="@mydata.jsonl"
```
Example response:
```json
{
  "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852",
  "file_name": "mydata.jsonl",
  "bytes": 1000,
  "mime_type": "application/jsonl"
}
```

The expected file content should look like:
```jsonl
{"key": "object-1", "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}], "generation_config": {"temperature": 0.7}, "model": "deepseek:r1-32b"}}
{"key": "object-2", "request": {"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}], "generation_config": {"temperature": 0.7}, "model": "openai:gpt-4o"}}
```
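
Such a batch file can be generated programmatically rather than written by hand. A minimal sketch in Python, copying the request schema (`contents`/`parts`/`generation_config`/`model`) from the example lines above; the prompts and model names here simply mirror that example:

```python
import json

# (key, prompt, model) tuples to turn into one JSONL request line each.
requests = [
    ("object-1", "Describe the process of photosynthesis.", "deepseek:r1-32b"),
    ("object-2", "What are the main ingredients in a Margherita pizza?", "openai:gpt-4o"),
]

with open("mydata.jsonl", "w") as f:
    for key, prompt, model in requests:
        line = {
            "key": key,
            "request": {
                "contents": [{"parts": [{"text": prompt}]}],
                "generation_config": {"temperature": 0.7},
                "model": model,
            },
        }
        # One compact JSON object per line, as the JSONL format requires.
        f.write(json.dumps(line) + "\n")
```

Each `key` must be unique within the file, since it is what ties an output line back to its request.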

Now you can pass the `file_id` to the `/infer/` API to run inference on the file. Example request:
```bash
curl -X POST https://models.exosphere.host/v0/infer/ \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '[
    {
      "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852",
      "sla": 60
    }
  ]'
```
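
Since the request body is just a JSON array of `{file_id, sla}` objects, it is straightforward to build in code. A small sketch; the `sla` value of 60 mirrors the curl example above, and the exact semantics of `sla` are assumed from that example rather than specified here:

```python
import json

def build_infer_payload(file_ids, sla=60):
    """Build the JSON body for POST /v0/infer/ in the shape shown above.

    One entry per uploaded file; `sla` defaults to the value used in the
    documented example.
    """
    return json.dumps([{"file_id": fid, "sla": sla} for fid in file_ids])
```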

You can further request outputs as a file by passing the header `Output-Format: jsonl` to the API. Example request:
```bash
curl -X POST https://models.exosphere.host/v0/infer/ \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -H "Output-Format: jsonl" \
  -d '[
    {
      "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852",
      "sla": 60
    }
  ]'
```
Example response:
```json
{
  "status": "completed",
  "task_id": "2f92fc35-07d6-4737-aefa-8ddffd32f3fc",
  "total_items": 2,
  "output_url": "https://files.exosphere.host/v0/files/ae0b977c-76a0-4d71-81a5-05a6d8844852.jsonl"
}
```

You can download the output file from the `output_url` and the content should look like:
```jsonl
{"key": "object-1", "output": {"type": "text", "text": "Photosynthesis is the process by which plants, algae, and some bacteria convert light energy into chemical energy."}}
{"key": "object-2", "output": {"type": "text", "text": "The main ingredients in a Margherita pizza are tomato sauce, mozzarella cheese, and basil."}}
```
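
Once downloaded, each output line can be joined back to its original request via the `key` field. A minimal parsing sketch, assuming the line format shown above (non-`text` output types are kept as raw objects, since only `text` appears in the example):

```python
import json

def parse_batch_output(jsonl_text):
    """Map each `key` to its output text, following the line format above."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the file
        record = json.loads(line)
        out = record["output"]
        results[record["key"]] = out["text"] if out.get("type") == "text" else out
    return results

sample = '{"key": "object-2", "output": {"type": "text", "text": "Tomato sauce, mozzarella, and basil."}}'
print(parse_batch_output(sample))
```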