2 changes: 1 addition & 1 deletion .github/CONTRIBUTING.md
@@ -33,7 +33,7 @@ For developers who want to contribute to our code, here is the guidance:
+ A function in plan

## 2. Install environment for development
+ We strongly suggest you to read our **[Document](http://xxx/docs/)** before developing
+ We strongly suggest you to read our **[Document](../docs/src/content/docs/en/index.md)** before developing
+ For setting environment, please check our **[Readme file](/README.md)**

## 3. Build our project
4 changes: 2 additions & 2 deletions .github/CONTRIBUTING_zh.md
@@ -36,7 +36,7 @@ xLLM致力于为每一位用户和开发者提供开放的XX,因此无论您
+ 计划实现的功能

## 2. 配置开发环境
+ 在开发之前,可以参考我们的 **[文档](http://xxx/docs/)**
+ 在开发之前,可以参考我们的 **[文档](../docs/src/content/docs/zh/index.md)**
+ 关于环境配置,参见 **[Readme file](/README.md)**

## 3. 项目构建和运行
@@ -45,4 +45,4 @@ xLLM致力于为每一位用户和开发者提供开放的XX,因此无论您
## 4. 测试

在pr提交之后,我们会对代码进行格式化及进一步测试。
我们的测试目前还很不完善,因此欢迎开发者为测试作出贡献!
我们的测试目前还很不完善,因此欢迎开发者为测试作出贡献!
65 changes: 65 additions & 0 deletions .github/workflows/deploy_docs.yml
@@ -0,0 +1,65 @@
name: Deploy Docs

on:
  push:
    branches: [main]
    paths:
      - 'docs/**'
      - '.github/workflows/deploy_docs.yml'
  pull_request:
    branches: [main]
    paths:
      - 'docs/**'
      - '.github/workflows/deploy_docs.yml'
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: pages
  cancel-in-progress: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
          cache-dependency-path: docs/package-lock.json

      - name: Install dependencies
        working-directory: docs
        run: npm ci

      - name: Build
        working-directory: docs
        env:
          BASE_PATH: /xllm
          SITE_URL: https://jd-opensource.github.io/xllm
        run: npm run build

      - name: Upload Pages artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: docs/dist

  deploy:
    if: github.event_name != 'pull_request'
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
18 changes: 9 additions & 9 deletions README.md
@@ -12,19 +12,19 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. -->

[English](./README.md) | [中文](./docs/project/README_zh.md)
[English](./README.md) | [中文](./README.zh-CN.md)

<div align="center">
<img src="docs/assets/logo_with_llm.png" alt="xLLM" style="width:50%; height:auto;">
<img src="docs/src/content/docs/assets/logo_with_llm.png" alt="xLLM" style="width:50%; height:auto;">

[![Document](https://img.shields.io/badge/Document-black?logo=html5&labelColor=grey&color=red)](https://xllm.readthedocs.io/zh-cn/latest/) [![Docker](https://img.shields.io/badge/Docker-black?logo=docker&labelColor=grey&color=%231E90FF)](https://hub.docker.com/r/xllm/xllm-ai) [![License](https://img.shields.io/badge/license-Apache%202.0-brightgreen?labelColor=grey)](https://opensource.org/licenses/Apache-2.0) [![report](https://img.shields.io/badge/Technical%20Report-red?logo=arxiv&logoColor=%23B31B1B&labelColor=%23F0EBEB&color=%23D42626)](https://arxiv.org/abs/2510.14686) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jd-opensource/xllm)
[![Document](https://img.shields.io/badge/Document-black?logo=html5&labelColor=grey&color=red)](https://jd-opensource.github.io/xllm/en/) [![Docker](https://img.shields.io/badge/Docker-black?logo=docker&labelColor=grey&color=%231E90FF)](https://hub.docker.com/r/xllm/xllm-ai) [![License](https://img.shields.io/badge/license-Apache%202.0-brightgreen?labelColor=grey)](https://opensource.org/licenses/Apache-2.0) [![report](https://img.shields.io/badge/Technical%20Report-red?logo=arxiv&logoColor=%23B31B1B&labelColor=%23F0EBEB&color=%23D42626)](https://arxiv.org/abs/2510.14686) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jd-opensource/xllm)

</div>

---------------------

<p align="center">
| <a href="https://xllm.readthedocs.io/zh-cn/latest/"><b>Documentation</b></a> | <a href="https://arxiv.org/abs/2510.14686"><b>Technical Report</b></a> |
| <a href="https://jd-opensource.github.io/xllm/en/"><b>Documentation</b></a> | <a href="https://arxiv.org/abs/2510.14686"><b>Technical Report</b></a> |
</p>


@@ -43,7 +43,7 @@ limitations under the License. -->
**xLLM** is an **efficient LLM inference framework**, specifically optimized for **Chinese AI accelerators**, enabling enterprise-grade deployment with enhanced efficiency and reduced cost. The framework adopts a **service-engine decoupled** inference architecture, achieving breakthrough efficiency through several technologies: at the service layer, including elastic scheduling of online/offline requests, dynamic PD disaggregation, a hybrid EPD mechanism for multimodal and high-availability fault tolerance; and at the engine layer, combined with technologies such as multi-stream parallel computing, graph fusion optimization, speculative inference, dynamic load balancing and global KV cache management. The overall architecture is shown below:

<div align="center">
<img src="docs/assets/xllm_arch.png" alt="xllm_arch" style="width:90%; height:auto;">
<img src="docs/src/content/docs/assets/xllm_arch.png" alt="xllm_arch" style="width:90%; height:auto;">
</div>

**xLLM** already supports efficient deployment of mainstream large models (such as *DeepSeek-V3.1*, *Qwen2/3*, etc.) on Chinese AI accelerators, empowering enterprises to implement high-performance, low-cost AI large model applications. xLLM has been fully deployed in JD.com’s real core retail businesses, covering a variety of scenarios including intelligent customer service, risk control, supply chain optimization, ad recommendation, and more.
@@ -88,13 +88,13 @@ limitations under the License. -->
| ILU | BI150 | |
| MUSA | S5000 | |

Besides, please check the supported models on different hardwares at [Supported Models List](docs/en/supported_models.md).
Besides, please check the supported models on different hardwares at [Supported Models List](docs/src/content/docs/en/supported_models.md).

---

## Quick Start

Please refer to [Quick Start](docs/en/getting_started/quick_start.md) for more details.
Please refer to [Quick Start](docs/src/content/docs/en/getting_started/quick_start.md) for more details.

---

@@ -114,15 +114,15 @@ There are several ways you can contribute to xLLM:
+ Send your pull request

We appreciate all kinds of contributions! 🎉🎉🎉
If you have problems about development, please check our document: **[Document](https://xllm.readthedocs.io/zh-cn/latest)**
If you have problems about development, please check our document: **[Document](https://jd-opensource.github.io/xllm/en/)**

---

## Community & Support
If you encounter any issues along the way, you are welcomed to submit reproducible steps and log snippets in the project's Issues area, or contact the xLLM Core team directly via your internal Slack. In addition, we have established official WeChat groups. You can access the following QR code to join. Welcome to contact us!

<div align="center">
<img src="docs/assets/wechat_qrcode.png" alt="qrcode3" width="50%" />
<img src="docs/src/content/docs/assets/wechat_qrcode.png" alt="qrcode3" width="50%" />
</div>

## Acknowledgment
18 changes: 9 additions & 9 deletions docs/project/README_zh.md → README.zh-CN.md
@@ -12,18 +12,18 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. -->

[English](../../README.md) | [中文](./README_zh.md)
[English](./README.md) | [中文](./README.zh-CN.md)

<div align="center">
<img src="../assets/logo_with_llm.png" alt="xLLM" style="width:50%; height:auto;">
<img src="docs/src/content/docs/assets/logo_with_llm.png" alt="xLLM" style="width:50%; height:auto;">

[![Document](https://img.shields.io/badge/Document-black?logo=html5&labelColor=grey&color=red)](https://xllm.readthedocs.io/zh-cn/latest/) [![Docker](https://img.shields.io/badge/Docker-black?logo=docker&labelColor=grey&color=%231E90FF)](https://hub.docker.com/r/xllm/xllm-ai) [![License](https://img.shields.io/badge/license-Apache%202.0-brightgreen?labelColor=grey)](https://opensource.org/licenses/Apache-2.0) [![report](https://img.shields.io/badge/Technical%20Report-red?logo=arxiv&logoColor=%23B31B1B&labelColor=%23F0EBEB&color=%23D42626)](https://arxiv.org/abs/2510.14686) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jd-opensource/xllm)
[![Document](https://img.shields.io/badge/Document-black?logo=html5&labelColor=grey&color=red)](https://jd-opensource.github.io/xllm/zh/) [![Docker](https://img.shields.io/badge/Docker-black?logo=docker&labelColor=grey&color=%231E90FF)](https://hub.docker.com/r/xllm/xllm-ai) [![License](https://img.shields.io/badge/license-Apache%202.0-brightgreen?labelColor=grey)](https://opensource.org/licenses/Apache-2.0) [![report](https://img.shields.io/badge/Technical%20Report-red?logo=arxiv&logoColor=%23B31B1B&labelColor=%23F0EBEB&color=%23D42626)](https://arxiv.org/abs/2510.14686) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jd-opensource/xllm)

</div>

---------------------
<p align="center">
| <a href="https://xllm.readthedocs.io/zh-cn/latest/"><b>Documentation</b></a> | <a href="https://arxiv.org/abs/2510.14686"><b>Technical Report</b></a> |
| <a href="https://jd-opensource.github.io/xllm/zh/"><b>Documentation</b></a> | <a href="https://arxiv.org/abs/2510.14686"><b>Technical Report</b></a> |
</p>

### 📢 新闻
@@ -41,7 +41,7 @@ limitations under the License. -->
**xLLM** 是一个高效的开源大模型推理框架,专为**国产芯片**优化设计,提供企业级的服务部署,使得性能更高、成本更低。该框架采用**服务-引擎分离的推理架构**,通过服务层的在离线请求弹性调度、动态PD分离、EPD混合机制及高可用容错设计,结合引擎层的多流并行计算、图融合优化、投机推理、动态负载均衡及全局KV缓存管理,实现推理效率突破性提升。xLLM整体架构和功能如下图所示:

<div align="center">
<img src="../assets/xllm_arch.png" alt="xllm_arch" style="width:90%; height:auto;">
<img src="docs/src/content/docs/assets/xllm_arch.png" alt="xllm_arch" style="width:90%; height:auto;">
</div>

**xLLM** 已支持主流大模型(如 *DeepSeek-V3.1*,*Qwen2/3*等)在国产芯片上的高效部署,助力企业实现高性能、低成本的 AI 大模型应用落地。xLLM已全面落地京东零售核心业务,涵盖智能客服、风控、供应链优化、广告推荐等多种场景。
@@ -85,13 +85,13 @@ xLLM 提供了强大的智能计算能力,通过硬件系统的算力优化与
| ILU | BI150 | |
| MUSA | S5000 | |

此外,请在[模型支持列表](../zh/supported_models.md)查看不同硬件上的模型支持情况。
此外,请在[模型支持列表](docs/src/content/docs/zh/supported_models.md)查看不同硬件上的模型支持情况。

---

## 快速开始

请参考[快速开始文档](../zh/getting_started/quick_start.md)。
请参考[快速开始文档](docs/src/content/docs/zh/getting_started/quick_start.md)。

---

@@ -111,7 +111,7 @@ xLLM 提供了强大的智能计算能力,通过硬件系统的算力优化与
+ 提出pull request

感谢您的贡献! 🎉🎉🎉
如果您在开发中遇到问题,请参阅**[xLLM中文指南](https://xllm.readthedocs.io/zh-cn/latest)**
如果您在开发中遇到问题,请参阅**[xLLM中文指南](https://jd-opensource.github.io/xllm/zh/)**

---

@@ -120,7 +120,7 @@ xLLM 提供了强大的智能计算能力,通过硬件系统的算力优化与
如果您有企业内部Slack,请直接联系xLLM Core团队。另外,我们建立了官方微信群,可以访问以下二维码加入。欢迎沟通和联系我们:

<div align="center">
<img src="../assets/wechat_qrcode.png" alt="qrcode3" width="50%" />
<img src="docs/src/content/docs/assets/wechat_qrcode.png" alt="qrcode3" width="50%" />
</div>

---
21 changes: 21 additions & 0 deletions docs/.gitignore
@@ -0,0 +1,21 @@
# build output
dist/
# generated types
.astro/

# dependencies
node_modules/

# logs
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*


# environment variables
.env
.env.production

# macOS-specific files
.DS_Store
21 changes: 21 additions & 0 deletions docs/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 xLLM-AI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
96 changes: 96 additions & 0 deletions docs/README.md
@@ -0,0 +1,96 @@
# xLLM Documentation

[English](./README.md) | [简体中文](./README.zh-CN.md)

This directory contains the Astro + Starlight documentation site for
[xLLM](https://github.com/jd-opensource/xllm), an LLM inference framework
optimized for high-performance serving on Chinese AI accelerators.

The site is built with Starlight and `starlight-theme-rapide`. It includes a
custom header, bilingual navigation, and a page-level `Copy page` action for
copying documentation content as Markdown.

## Documentation Structure

The documentation is maintained in two parallel language trees:

- English: `src/content/docs/en`
- Simplified Chinese: `src/content/docs/zh`

The root path redirects to the English documentation. When the site is deployed
under a base path such as `/xllm`, the same route structure is prefixed by that
base path.

- `/` redirects to `/en/`
- `/en/` serves the English documentation
- `/zh/` serves the Simplified Chinese documentation
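
As a rough illustration of that prefixing rule, the sketch below joins a base path and a route in plain shell. `route_under_base` is a hypothetical helper written for this README, not something Astro or Starlight provides; the real prefixing is done by the site build.

```sh
# Hypothetical helper (not part of Astro or Starlight): show the route a page
# would have when the site is deployed under a base path.
# Usage: route_under_base BASE ROUTE
route_under_base() {
    base="${1%/}"     # drop one trailing slash from the base, so "/" becomes ""
    route="/${2#/}"   # make sure the route starts with exactly one slash
    printf '%s%s\n' "$base" "$route"
}
```

For example, `route_under_base /xllm /en/` prints `/xllm/en/`, and an empty base leaves the route unchanged.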

When adding or moving pages, keep matching relative paths in both language
trees so Starlight can switch between languages for the same topic.
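
One way to check that the two trees stay aligned is to compare their relative file lists. The function below is a minimal sketch, assuming pages are `.md` or `.mdx` files and that it is run from this directory; `unmatched_pages` is a hypothetical name and the script is not part of the repository.

```sh
# Hypothetical helper, not part of the repository: list pages that exist in
# only one of the two language trees.
# Usage: unmatched_pages [docs_root]   (default: src/content/docs)
unmatched_pages() {
    root="${1:-src/content/docs}"
    en_list=$(mktemp) || return 1
    zh_list=$(mktemp) || { rm -f "$en_list"; return 1; }

    # Collect page paths relative to each language tree, sorted for comm.
    (cd "$root/en" && find . -type f \( -name '*.md' -o -name '*.mdx' \) | LC_ALL=C sort) > "$en_list"
    (cd "$root/zh" && find . -type f \( -name '*.md' -o -name '*.mdx' \) | LC_ALL=C sort) > "$zh_list"

    # comm -23 keeps lines unique to the first file, comm -13 those unique to the second.
    LC_ALL=C comm -23 "$en_list" "$zh_list" | sed 's|^\./|only in en: |'
    LC_ALL=C comm -13 "$en_list" "$zh_list" | sed 's|^\./|only in zh: |'

    rm -f "$en_list" "$zh_list"
}
```

Calling `unmatched_pages` with no arguments prints nothing when every page has a counterpart. Pages that intentionally exist in a single language will still be reported, so treat the output as a prompt for review, not a failure.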

## Project Layout

```text
.
├── astro.config.mjs          # Starlight, locale, sidebar, and component config
├── package.json              # npm scripts and dependencies
├── src/
│   ├── assets/               # Site-level assets such as the logo
│   ├── components/           # Starlight component overrides
│   ├── content/
│   │   └── docs/
│   │       ├── en/           # English documentation
│   │       ├── zh/           # Simplified Chinese documentation
│   │       └── assets/       # Documentation images and diagrams
│   ├── pages/index.astro     # Redirect from / to /en/
│   └── styles/theme.css      # Project theme customizations
└── public/                   # Static public assets
```

## Local Development

From the xLLM repository root, enter this directory first:

```sh
cd docs
```

Install dependencies:

```sh
npm install
```

Start the local development server:

```sh
npm run dev
```

Build the production site:

```sh
npm run build
```

Preview the production build:

```sh
npm run preview
```
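
The deploy workflow in this repository sets `BASE_PATH` and `SITE_URL` for its build step. To approximate that build locally, something like the sketch below may help. It assumes the site configuration reads those two variables, which this README does not confirm, and `ci_like_build` is a hypothetical wrapper, not an existing script.

```sh
# Hypothetical wrapper: run the production build with the same environment
# variables the deploy workflow sets. Values already present in the
# environment take precedence over the defaults below.
ci_like_build() {
    BASE_PATH="${BASE_PATH:-/xllm}" \
    SITE_URL="${SITE_URL:-https://jd-opensource.github.io/xllm}" \
    npm run build
}
```

Run `ci_like_build` from this directory after installing dependencies. If the configuration ignores these variables, the result is the same as a plain `npm run build`.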

## Editing Documentation

- Put user-facing content under `src/content/docs/en` and
  `src/content/docs/zh`.
- Keep English and Chinese files aligned by path when a page exists in both
  languages.
- Store shared documentation images in `src/content/docs/assets`.
- Update the `sidebar` section in `astro.config.mjs` when adding new sections
  that should appear in navigation.
- Run `npm run build` before submitting changes to catch broken routes,
  frontmatter errors, and Starlight content issues.

## Related Repository

- xLLM source code: <https://github.com/jd-opensource/xllm>