
Enhance inference documentation with batch API support#583

Merged
NiveditJain merged 4 commits into main from inference-standards
Dec 8, 2025

Conversation

@NiveditJain
Member

Added details on the batch inference API format compatible with OpenAI and other providers. Included instructions for uploading JSONL files, making inference requests, and retrieving output files. Updated example requests and responses for clarity.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 8, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Summary by CodeRabbit

  • Documentation
    • Added comprehensive guidance for batch inference using JSONL format, including file upload procedures, request/response examples, and batch output configuration.

Walkthrough

Adds documentation describing batch inference via JSONL: how to upload files with PUT /v0/files/, invoke /v0/infer/ using a file_id, specify Output-Format: jsonl, and retrieve completed JSONL outputs. Contains example requests, responses, and notes that existing inference endpoints/status logic remain unchanged.
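The flow the walkthrough describes (upload a JSONL batch, invoke inference by `file_id`, fetch JSONL output) can be sketched by building the two payloads involved. This is a minimal sketch: the field names are taken from the examples in this PR, and the helper names (`build_batch_jsonl`, `build_infer_body`) are hypothetical, not part of any documented SDK.

```python
import json

def build_batch_jsonl(items):
    """Serialize (key, prompt, model) tuples into the one-request-per-line
    JSONL body that the docs show being uploaded via PUT /v0/files/."""
    lines = []
    for key, prompt, model in items:
        lines.append(json.dumps({
            "key": key,
            "request": {
                "contents": [{"parts": [{"text": prompt}]}],
                "generation_config": {"temperature": 0.7},
                "model": model,
            },
        }))
    return "\n".join(lines) + "\n"  # JSONL: newline-terminated records

def build_infer_body(file_id, sla=60):
    """JSON body for POST /v0/infer/ when referencing an uploaded file."""
    return [{"file_id": file_id, "sla": sla}]

# Mirror the two example rows from the PR's sample file
batch = build_batch_jsonl([
    ("object-1", "Describe the process of photosynthesis.", "deepseek:r1-32b"),
    ("object-2", "What are the main ingredients in a Margherita pizza?", "openai:gpt-4o"),
])
```

The actual upload and inference calls would then send `batch` and `build_infer_body(...)` over HTTP with the `Authorization` header shown in the docs' curl examples.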

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Batch inference docs<br>`docs/docs/inference.md` | Added documentation for batch inference using JSONL: file upload (PUT /v0/files/), invoking inference with file_id at /v0/infer/, use of Output-Format: jsonl, example request/response payloads, status tracking, and sample completed output with output_url. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

  • Verify example requests/responses match actual API behavior and schemas
  • Confirm endpoint paths and HTTP methods are correct
  • Ensure JSONL examples are syntactically valid and representative
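The last review item — checking that the JSONL examples are syntactically valid — can be automated with a small validator. This is a sketch under the assumption (from the examples in this PR) that every batch input line must carry `key` and `request` fields; the helper name is hypothetical.

```python
import json

REQUIRED_INPUT_KEYS = {"key", "request"}  # per the batch examples in this PR

def validate_batch_jsonl(text):
    """Return a list of (line_number, error) tuples for a JSONL batch body.
    An empty list means every non-blank line parsed and had the expected keys."""
    errors = []
    for i, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are skipped, not flagged
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((i, f"invalid JSON: {exc.msg}"))
            continue
        missing = REQUIRED_INPUT_KEYS - obj.keys()
        if missing:
            errors.append((i, f"missing keys: {sorted(missing)}"))
    return errors
```

Running this over the documented sample file should return an empty list; a malformed or incomplete line is reported with its 1-based line number.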

Poem

🐰 I nibbled text and found a clue,

Batch files line up, each one true,
PUT and infer, the steps align,
JSONL outputs, tidy, fine,
Hopping on docs — a helpful chew.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Enhance inference documentation with batch API support' directly and accurately summarizes the main change: adding documentation for batch API inference capabilities. |
| Description check | ✅ Passed | The description is directly related to the changeset, detailing the batch inference API documentation additions including JSONL file handling, requests, responses, and output file retrieval. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@gemini-code-assist
Contributor

Summary of Changes

Hello @NiveditJain, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the inference documentation by introducing a new section dedicated to batch inference API support. It provides clear, step-by-step instructions and practical examples for users to leverage batch processing, including file uploads, request formatting, and output retrieval, aligning the API's capabilities with common industry standards.

Highlights

  • Batch Inference API Documentation: Added comprehensive documentation for the batch inference API, detailing its compatibility with formats used by OpenAI and other providers.
  • JSONL File Upload: Included instructions and examples for uploading JSONL files containing multiple inference requests using the PUT /v0/files/ endpoint.
  • Batch Inference Requests: Provided guidance on how to initiate batch inference using the /infer/ API by passing a file_id obtained from the file upload.
  • Output File Retrieval: Documented the process for requesting and retrieving inference outputs as JSONL files, including an example of downloading results from a provided output_url.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for the GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enhances the inference documentation by adding details about the batch inference API. The changes are clear and provide good examples for the new workflow. I've made a few suggestions to correct minor grammatical errors and improve the overall readability of the documentation.

NiveditJain and others added 2 commits December 8, 2025 21:25
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cf44b38 and bcf7924.

📒 Files selected for processing (1)
  • docs/docs/inference.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/docs/inference.md

128-128: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


130-130: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


134-134: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


136-136: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


146-146: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


152-152: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


165-165: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


176-176: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


178-178: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


188-188: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


191-191: Files should end with a single newline character

(MD047, single-trailing-newline)

🔇 Additional comments (1)
docs/docs/inference.md (1)

128-143: The reported markdown formatting issue is incorrect; a blank line already exists before the heading.

The review identifies a non-existent markdown formatting problem. A blank line already exists at line 128 before the ### PUT /v0/files/ heading at line 129, so the proposed diff would introduce a duplicate blank line.

The curl syntax concern about using -F (multipart form-data) with -X PUT is valid to consider, but requires verification against the actual server implementation expectations, which is not visible in the repository.

Likely an incorrect or invalid review comment.

@NiveditJain NiveditJain merged commit 464b667 into main Dec 8, 2025
4 checks passed
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
docs/docs/inference.md (4)

129-149: Fix markdown formatting: add blank lines before code blocks (MD031).

Multiple code blocks are missing required blank lines. Additionally, the text on line 145 can be improved for clarity.

````diff
 This API is used to upload a file to the server. Example request:
+
 ```bash
 curl -X PUT https://models.exosphere.host/v0/files/mydata.jsonl \
   -H "Authorization: Bearer <your-api-key>" \
   -F file="@mydata.jsonl"
 ```

 Example response:
+
 ```json
 {
     "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852",
     "file_name": "mydata.jsonl",
     "bytes": 1000,
     "mime_type": "application/jsonl"
 }
 ```

-The expected file content should look like:
+Expected file format (JSONL):
+
 ```jsonl
 {"key": "object-1", "request": {"contents": [{"parts": [{"text": "Describe the process of photosynthesis."}]}], "generation_config": {"temperature": 0.7}, "model": "deepseek:r1-32b"}}
 {"key": "object-2", "request": {"contents": [{"parts": [{"text": "What are the main ingredients in a Margherita pizza?"}]}], "generation_config": {"temperature": 0.7}, "model": "openai:gpt-4o"}}
 ```
````

---

151-162: Add blank line before code block (MD031).

The bash code block needs a blank line before it.

````diff
 Now you can pass the file_id to the `/infer/` API to run inference on the file. Example request:
+
 ```bash
 curl -X POST https://models.exosphere.host/v0/infer/ \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer <your-api-key>" \
   -d '[
     {
         "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852",
         "sla": 60
     }
   ]'
 ```
````

---

164-186: Add missing blank lines around code blocks (MD031).

Multiple code blocks in this section lack required blank lines before and after.

````diff
 You can further request outputs as a file by passing the header `Output-Format: jsonl` to the API. Example request:
+
 ```bash
 curl -X POST https://models.exosphere.host/v0/infer/ \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer <your-api-key>" \
   -H "Output-Format: jsonl" \
   -d '[
     {
         "file_id": "ae0b977c-76a0-4d71-81a5-05a6d8844852",
         "sla": 60
     }
   ]'
 ```

 Example response:
+
 ```json
 {
     "status": "completed",
     "task_id": "2f92fc35-07d6-4737-aefa-8ddffd32f3fc",
     "total_items": 2,
     "output_url": "https://files.exosphere.host/v0/files/ae0b977c-76a0-4d71-81a5-05a6d8844852.jsonl"
 }
 ```
````

---

187-191: Fix markdown formatting and improve text clarity (MD031, MD047).

Add blank line before the final code block, improve text clarity, and add a trailing newline to the file.

````diff
-You can download the output file from the `output_url` and the content should look like:
+You can download the output file from the `output_url`. Expected output format (JSONL):
+
 ```jsonl
 {"key": "object-1", "output": {"type": "text", "text": "Photosynthesis is the process by which plants, algae, and some bacteria convert light energy into chemical energy."}}
 {"key": "object-2", "output": {"type": "text", "text": "The main ingredients in a Margherita pizza are tomato sauce, mozzarella cheese, and basil."}}
 ```
````
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bcf79246425a2911609b0d521d2c39236d1e1240 and 03c65d6d8f07c76bedebf876304858aab6f62a6f.

📒 Files selected for processing (1)
  • docs/docs/inference.md (1 hunks)

Comment on lines +125 to +128

Exosphere inference APIs also support the standard batch inference API format used by OpenAI, Gemini, and other providers. You can upload a JSONL file containing multiple inference requests, similar to OpenAI's batch API format and pass the file to the `/infer/` API.

### `PUT /v0/files/`
Contributor


⚠️ Potential issue | 🟡 Minor

Add blank line before heading (MD022).

There should be a blank line between the paragraph and the section heading.

```diff
 > **Note**: Auto retry policy will be triggered for transient failures without any additional cost.

 Exosphere inference APIs also support the standard batch inference API format used by OpenAI, Gemini, and other providers. You can upload a JSONL file containing multiple inference requests, similar to OpenAI's batch API format and pass the file to the `/infer/` API.
+
 ### `PUT /v0/files/`
```
🤖 Prompt for AI Agents
In docs/docs/inference.md around lines 125 to 128, there is no blank line
between the preceding paragraph and the "### `PUT /v0/files/`" heading which
violates MD022; insert a single blank line between the paragraph that ends with
"...pass the file to the `/infer/` API." and the "### `PUT /v0/files/`" heading
so the heading is separated by an empty line.

@NiveditJain NiveditJain deleted the inference-standards branch December 8, 2025 16:18