docs(integrations): add LlamaIndex integration guide by Johnny-zbb · Pull Request #285 · TencentCloud/CubeSandbox

Johnny-zbb · 2026-05-16T08:26:51Z

Closes #244

fslongjin · 2026-05-16T09:51:29Z

/cubebot review

github-actions · 2026-05-16T09:51:44Z

Claude finished @fslongjin's task in 57s

Review: LlamaIndex Integration Guide

This is a well-structured documentation PR that adds a bilingual (EN/ZH) LlamaIndex integration guide following the project's contribution guidelines. The code examples are clear and the guide covers the full lifecycle from setup to advanced use cases.

What's good

Bilingual compliance: Both English and Chinese files submitted as required by the contribution guidelines.
Follows the template: Covers all required sections (Integration Target, Prerequisites, Steps, Key Code, Caveats, References).
Index pages updated: Both docs/guide/integrations/index.md and docs/zh/guide/integrations/index.md have the new entry appended.
Useful Before/After comparison: The exec() → Sandbox.run_code() diff clearly communicates the value proposition.
Practical advanced scenarios: Network isolation, persistent sandbox, and pause/resume sections add real depth.
Good caveats section: Documents cold start latency, SDK scope limits, and network policy constraints honestly.

Issues to address

Unused import json in the create_cube_tool snippet (docs/guide/integrations/llamaindex.md:53 and the Chinese equivalent):
```
import json          # ← unused in this function
from llama_index.core.tools import FunctionTool
from cubesandbox import Sandbox, Config
```
The json module is only used in the "Going Further" network policy snippet (a separate code block), not in create_cube_tool itself. Remove this import from the create_cube_tool block, or move it to the network policy snippet.
Sandbox.create(metadata=...) without config (llamaindex.md:192-196 and ZH equivalent):
```
with Sandbox.create(
    metadata={"network-policy": "deny-all"}
) as sb:
```
Earlier examples always pass config=cfg to Sandbox.create(). If the sandbox requires Config (template_id, api_url, etc.), this code would fail at runtime. Either:
- Add a note that Config defaults are used, or
- Include a config=cfg parameter alongside metadata.
Optional: template_id example value — The doc uses python:3.12-slim as the example template ID. If Cube Sandbox uses a different ID format than Docker image tags, this could mislead users. Worth verifying the actual template ID format.
Optional: Redundant stop-word filter in the WordFreq example (llamaindex.md:157,161):
```
self.skip = {'the','a','an','is','are','was','were','in','on','at','to','of','for'}
# ...
if w and w not in self.skip and len(w) > 3:
```
Since len(w) > 3 already excludes every word of 1 to 3 characters, nearly all entries in self.skip (the, a, an, is, are, was, in, on, at, to, of, for) will never match. Only were (length 4) provides actual filtering. Consider either removing the len(w) > 3 check or trimming the skip set to only 4+ letter stop words, whichever better reflects your intent.
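To make the redundancy concrete, here is a minimal sketch that replays the filter condition against the skip set quoted above. The names `MIN_LEN_EXCLUSIVE`, `effective`, `redundant`, and `keep` are illustrative and do not come from the guide.

```python
# Skip set and length threshold copied from the WordFreq snippet quoted above.
skip = {'the', 'a', 'an', 'is', 'are', 'was', 'were', 'in', 'on', 'at', 'to', 'of', 'for'}
MIN_LEN_EXCLUSIVE = 3  # the guide keeps a word only when len(w) > 3

# A skip entry only matters if it would pass the length check on its own.
effective = sorted(w for w in skip if len(w) > MIN_LEN_EXCLUSIVE)
redundant = sorted(w for w in skip if len(w) <= MIN_LEN_EXCLUSIVE)


def keep(word: str) -> bool:
    """Mirror of the guide's condition: non-empty, not skipped, longer than 3."""
    return bool(word) and word not in skip and len(word) > MIN_LEN_EXCLUSIVE


print("entries that actually filter something:", effective)
print("entries already covered by the length check:", redundant)
```

Running this is a quick way to decide which of the two checks to keep before updating the example.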

Overall, this is a solid contribution. The content is accurate, well-organized, and follows project conventions. The issues above are minor and relate mostly to code quality in the examples.

kinwin-ustc · 2026-05-16T10:06:21Z

Can you share your environment? A cold start of 1-2 seconds is too slow for us. Better stateful sandbox processing and domain-based network access control are both in our plan and will be released soon.

fslongjin · 2026-05-16T10:08:24Z

@Johnny-zbb Thanks for submitting this LlamaIndex integration guide — the document structure is solid, and the bilingual (EN/ZH) coverage follows our contribution guidelines.

I want to emphasize that integration tutorials must be verified against a real CubeSandbox deployment before submission, otherwise incorrect examples can mislead other users and cause significant confusion. After reviewing the code examples in this PR, I found several issues that suggest they haven't been run against an actual CubeSandbox instance:

Critical: Template ID format is incorrect

The document uses template_id="python:3.12-slim" in multiple places, but CubeSandbox template IDs are auto-generated in the format tpl-<hex> (e.g. tpl-748094d2f2374b0a8a37e6ec), not Docker image references. Users need to create a template via cubemastercli tpl create-from-image first, then copy the actual template_id from the command output. Passing "python:3.12-slim" would cause the API to return a 404 (TemplateNotFoundError).

See: Creating Templates from OCI Images — Step 3 shows the correct usage: export CUBE_TEMPLATE_ID=tpl-xxx.
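A guide could guard against this mistake with a small check before calling the SDK. The sketch below is an assumption-laden illustration: the pattern `tpl-` plus lowercase hex is inferred only from the single example ID in this comment, and the helper names `looks_like_template_id` and `template_id_from_env` are hypothetical, not part of the CubeSandbox SDK. The environment variable name `CUBE_TEMPLATE_ID` is the one referenced above.

```python
import os
import re

# Assumption: template IDs are "tpl-" followed by lowercase hex digits.
# The exact length and alphabet are inferred from one example, not from a spec.
TEMPLATE_ID_RE = re.compile(r"tpl-[0-9a-f]+")


def looks_like_template_id(value: str) -> bool:
    """Return True if the value has the tpl-<hex> shape described above."""
    return TEMPLATE_ID_RE.fullmatch(value) is not None


def template_id_from_env(var: str = "CUBE_TEMPLATE_ID") -> str:
    """Read the template ID from the environment and fail early on a bad shape."""
    value = os.environ.get(var, "")
    if not looks_like_template_id(value):
        raise ValueError(
            f"{var}={value!r} does not look like a template ID (expected tpl-<hex>); "
            "create a template first and copy the ID from the command output"
        )
    return value
```

A check like this only catches the wrong format locally; whether the template actually exists is still decided by the API.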

Critical: Bare image lacks `envd` daemon

Even if someone tried to create a template from the python:3.12-slim image, it doesn't include envd — the CubeSandbox protocol endpoint — and would fail the readiness probe (:49983/health → 204). The correct approach is to:

- Build on top of cubesandbox-base (FROM ghcr.io/tencentcloud/cubesandbox-base:2026.16), or
- Inject envd and cube-entrypoint.sh via COPY --from=cubesandbox-base in your Dockerfile

See: Bring Your Own Image (envd)
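For readers who want to check an image by hand, the readiness condition mentioned above (an HTTP request to port 49983 at /health answering 204) can be sketched with the standard library. The function name `envd_ready`, the use of plain HTTP, and the idea of probing from outside the platform are assumptions for illustration; the real probe is performed by CubeSandbox itself.

```python
import http.client
import urllib.request

# Bypass any proxy configured in the environment: the probe should go
# straight to the target host.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def envd_ready(host: str, port: int = 49983, timeout: float = 2.0) -> bool:
    """Return True only if GET /health on the given host and port answers 204."""
    url = f"http://{host}:{port}/health"
    try:
        with _DIRECT.open(url, timeout=timeout) as resp:
            return resp.status == 204
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or an HTTP error status all count
        # as "not ready" for the purposes of this sketch.
        return False
```

An image built directly from python:3.12-slim has nothing listening on that port, so a check like this would return False.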

Other code issues

Sandbox.create(metadata=...) missing config — The network isolation examples in "Going Further" call Sandbox.create(metadata={"network-policy": "deny-all"}) without passing template or config, which raises ValueError when env vars aren't set.
Unused import json — The create_cube_tool function imports json but never uses it (it's only used in a later network policy snippet).
Prefer the SDK's native network parameter — Use Sandbox.create(network={"allow_out": [...], "deny_out": [...]}) instead of metadata={"network-policy": ...}. The network parameter is type-checked and more explicit.
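Putting the first and third points together, the network isolation example could look roughly like the sketch below. It relies only on what is mentioned in this thread (`Sandbox.create` accepting `config` and `network`, context-manager use, and a `run_code` method); the helper names `network_policy` and `run_isolated`, the CIDR value, and calling `run_code` on the sandbox instance are assumptions that should be checked against the SDK before publishing.

```python
from typing import Any, Dict, List, Sequence


def network_policy(
    allow_out: Sequence[str] = (), deny_out: Sequence[str] = ()
) -> Dict[str, List[str]]:
    """Build the dict passed as Sandbox.create(network=...), omitting empty keys."""
    policy: Dict[str, List[str]] = {}
    if allow_out:
        policy["allow_out"] = list(allow_out)
    if deny_out:
        policy["deny_out"] = list(deny_out)
    return policy


def run_isolated(code: str, cfg: Any, policy: Dict[str, List[str]]) -> Any:
    """Run code in a sandbox created with both an explicit config and a network policy."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from cubesandbox import Sandbox

    # config=cfg is passed explicitly so the call does not depend on env vars.
    with Sandbox.create(config=cfg, network=policy) as sb:
        # Assumption: run_code is callable on the sandbox instance.
        return sb.run_code(code)


# Assumption: entries are CIDR ranges; "0.0.0.0/0" stands for "all outbound traffic".
deny_all = network_policy(deny_out=["0.0.0.0/0"])
```

Here `cfg` would be the same Config object the earlier examples in the guide already construct.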

Suggestions

To ensure document quality, I'd recommend:

- Running the examples end-to-end against a real CubeSandbox deployment
- Replacing all instances of "python:3.12-slim" with either a real tpl-* template ID or the <your-template-id> placeholder, and noting that readers must create a template first
- Adding a template creation step to the Prerequisites section

Happy to help if you have questions about any of these points!

- Replace invalid template_id='python:3.12-slim' with <your-template-id> placeholder
- Add template creation instructions to Prerequisites section
- Use SDK native network parameter for network isolation (network={"allow_out": [...]})
- Add config=cfg to Sandbox.create() calls in network examples
- Remove unused 'import json' statement
- Fix metadata={} usage to SDK-native parameters

- Complete Python example for LlamaIndex + CubeSandbox integration
- Demonstrates RAG workflow with secure code execution
- Includes network isolation examples
- Bilingual README (English + Chinese)
- Environment configuration templates

Johnny-zbb · 2026-05-16T14:44:23Z

> Can you share your environment? A cold start of 1-2 seconds is too slow for us. Better stateful sandbox processing and domain-based network access control are both in our plan and will be released soon.

Do you mean that you also wrote the guide on adding LlamaIndex integration?

docs(integrations): add LlamaIndex integration guide

446fc7e

Johnny-zbb requested a review from tinklone as a code owner May 16, 2026 08:26

Johnny-zbb mentioned this pull request May 16, 2026

[good first issue] docs: Help us build integration guides with mainstream AI agent frameworks #244

Open

Johnny-zbb added 2 commits May 16, 2026 18:23

Johnny-zbb wants to merge 3 commits into TencentCloud:master from Johnny-zbb:docs/add-llamaindex-integration
