jd-opensource · XuZhang99 · May 6, 2026
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
@@ -33,7 +33,7 @@ For developers who want to contribute to our code, here is the guidance:
     + A function in plan
 
 ## 2. Install environment for development
-+ We strongly suggest you to read our **[Document](http://xxx/docs/)** before developing
++ We strongly suggest you to read our **[Document](../docs/src/content/docs/en/index.md)** before developing
 + For setting environment, please check our  **[Readme file](/README.md)**
 
 ## 3. Build our project

diff --git a/.github/CONTRIBUTING_zh.md b/.github/CONTRIBUTING_zh.md
@@ -36,7 +36,7 @@ xLLM致力于为每一位用户和开发者提供开放的XX，因此无论您
     + 计划实现的功能
 
 ## 2. 配置开发环境
-+ 在开发之前，可以参考我们的 **[文档](http://xxx/docs/)**
++ 在开发之前，可以参考我们的 **[文档](../docs/src/content/docs/zh/index.md)**
 + 关于环境配置，参见 **[Readme file](/README.md)**
 
 ## 3. 项目构建和运行
@@ -45,4 +45,4 @@ xLLM致力于为每一位用户和开发者提供开放的XX，因此无论您
 ## 4. 测试
 
 在pr提交之后，我们会对代码进行格式化及进一步测试。
-我们的测试目前还很不完善，因此欢迎开发者为测试作出贡献！
+我们的测试目前还很不完善，因此欢迎开发者为测试作出贡献！
diff --git a/.github/workflows/deploy_docs.yml b/.github/workflows/deploy_docs.yml
@@ -0,0 +1,65 @@
+name: Deploy Docs
+
+on:
+  push:
+    branches: [main]
+    paths:
+      - 'docs/**'
+      - '.github/workflows/deploy_docs.yml'
+  pull_request:
+    branches: [main]
+    paths:
+      - 'docs/**'
+      - '.github/workflows/deploy_docs.yml'
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages
+  cancel-in-progress: false
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: 22
+          cache: npm
+          cache-dependency-path: docs/package-lock.json
+
+      - name: Install dependencies
+        working-directory: docs
+        run: npm ci
+
+      - name: Build
+        working-directory: docs
+        env:
+          BASE_PATH: /xllm
+          SITE_URL: https://jd-opensource.github.io/xllm
+        run: npm run build
+
+      - name: Upload Pages artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: docs/dist
+
+  deploy:
+    if: github.event_name != 'pull_request'
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
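
Review note: the `build` job above can be approximated locally before pushing. The sketch below is only an illustration. `build_docs_like_ci` is a hypothetical helper name, not part of this PR, and it assumes the Astro config reads `BASE_PATH` and `SITE_URL` from the environment, as the workflow's `env` block implies. It does not reproduce the `deploy` job or pin the Node.js version.

```sh
# Hypothetical helper: run the workflow's install and build steps locally.
# $1: path to the docs directory (the workflow uses "docs").
build_docs_like_ci() {
  (
    # Run in a subshell so the caller's working directory is untouched.
    cd "$1" || exit 1
    # Same commands and environment values as the "Install dependencies"
    # and "Build" steps in deploy_docs.yml.
    npm ci &&
      BASE_PATH=/xllm SITE_URL=https://jd-opensource.github.io/xllm npm run build
  )
}

# Usage, from the repository root:
#   build_docs_like_ci docs
```

If the build succeeds, the output should land in `docs/dist`, the directory the workflow uploads as the Pages artifact.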
diff --git a/README.md b/README.md
@@ -12,19 +12,19 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License. -->
 
-[English](./README.md) | [中文](./docs/project/README_zh.md)
+[English](./README.md) | [中文](./README.zh-CN.md)
 
 <div align="center">
-<img src="docs/assets/logo_with_llm.png" alt="xLLM" style="width:50%; height:auto;">
+<img src="docs/src/content/docs/assets/logo_with_llm.png" alt="xLLM" style="width:50%; height:auto;">
 
-[![Document](https://img.shields.io/badge/Document-black?logo=html5&labelColor=grey&color=red)](https://xllm.readthedocs.io/zh-cn/latest/) [![Docker](https://img.shields.io/badge/Docker-black?logo=docker&labelColor=grey&color=%231E90FF)](https://hub.docker.com/r/xllm/xllm-ai) [![License](https://img.shields.io/badge/license-Apache%202.0-brightgreen?labelColor=grey)](https://opensource.org/licenses/Apache-2.0) [![report](https://img.shields.io/badge/Technical%20Report-red?logo=arxiv&logoColor=%23B31B1B&labelColor=%23F0EBEB&color=%23D42626)](https://arxiv.org/abs/2510.14686) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jd-opensource/xllm) 
+[![Document](https://img.shields.io/badge/Document-black?logo=html5&labelColor=grey&color=red)](https://jd-opensource.github.io/xllm/en/) [![Docker](https://img.shields.io/badge/Docker-black?logo=docker&labelColor=grey&color=%231E90FF)](https://hub.docker.com/r/xllm/xllm-ai) [![License](https://img.shields.io/badge/license-Apache%202.0-brightgreen?labelColor=grey)](https://opensource.org/licenses/Apache-2.0) [![report](https://img.shields.io/badge/Technical%20Report-red?logo=arxiv&logoColor=%23B31B1B&labelColor=%23F0EBEB&color=%23D42626)](https://arxiv.org/abs/2510.14686) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jd-opensource/xllm) 
 
 </div>
 
 ---------------------
 
 <p align="center">
-| <a href="https://xllm.readthedocs.io/zh-cn/latest/"><b>Documentation</b></a> | <a href="https://arxiv.org/abs/2510.14686"><b>Technical Report</b></a> |
+| <a href="https://jd-opensource.github.io/xllm/en/"><b>Documentation</b></a> | <a href="https://arxiv.org/abs/2510.14686"><b>Technical Report</b></a> |
 </p>
 
 
@@ -43,7 +43,7 @@ limitations under the License. -->
 **xLLM** is an **efficient LLM inference framework**, specifically optimized for **Chinese AI accelerators**, enabling enterprise-grade deployment with enhanced efficiency and reduced cost. The framework adopts a **service-engine decoupled** inference architecture, achieving breakthrough efficiency through several  technologies: at the service layer, including elastic scheduling of online/offline requests, dynamic PD disaggregation, a hybrid EPD mechanism for multimodal and high-availability fault tolerance; and at the engine layer, combined with technologies such as multi-stream parallel computing, graph fusion optimization, speculative inference, dynamic load balancing and global KV cache management. The overall architecture is shown below:
 
 <div align="center">
-<img src="docs/assets/xllm_arch.png" alt="xllm_arch" style="width:90%; height:auto;">
+<img src="docs/src/content/docs/assets/xllm_arch.png" alt="xllm_arch" style="width:90%; height:auto;">
 </div>
 
 **xLLM** already supports efficient deployment of mainstream large models (such as *DeepSeek-V3.1*, *Qwen2/3*, etc.) on Chinese AI accelerators, empowering enterprises to implement high-performance, low-cost AI large model applications. xLLM has been fully deployed in JD.com’s real core retail businesses, covering a variety of scenarios including intelligent customer service, risk control, supply chain optimization, ad recommendation, and more.
@@ -88,13 +88,13 @@ limitations under the License. -->
 | ILU      | BI150   |                 |
 | MUSA     | S5000   |                 |
 
-Besides, please check the supported models on different hardwares at [Supported Models List](docs/en/supported_models.md).
+Besides, please check the supported models on different hardwares at [Supported Models List](docs/src/content/docs/en/supported_models.md).
 
 ---
 
 ## Quick Start
 
-Please refer to [Quick Start](docs/en/getting_started/quick_start.md) for more details.
+Please refer to [Quick Start](docs/src/content/docs/en/getting_started/quick_start.md) for more details.
 
 --- 
 
@@ -114,15 +114,15 @@ There are several ways you can contribute to xLLM:
     + Send your pull request
 
 We appreciate all kinds of contributions! 🎉🎉🎉
-If you have problems about development, please check our document: **[Document](https://xllm.readthedocs.io/zh-cn/latest)**
+If you have problems about development, please check our document: **[Document](https://jd-opensource.github.io/xllm/en/)**
 
 ---
 
 ## Community & Support
 If you encounter any issues along the way, you are welcomed to submit reproducible steps and log snippets in the project's Issues area, or contact the xLLM Core team directly via your internal Slack. In addition, we have established official WeChat groups. You can access the following QR code to join. Welcome to contact us!
 
 <div align="center">
-  <img src="docs/assets/wechat_qrcode.png" alt="qrcode3" width="50%" />
+  <img src="docs/src/content/docs/assets/wechat_qrcode.png" alt="qrcode3" width="50%" />
 </div>
 
 ## Acknowledgment

diff --git a/docs/project/README_zh.md b/README.zh-CN.md
@@ -12,18 +12,18 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License. -->
 
-[English](../../README.md) | [中文](./README_zh.md)
+[English](./README.md) | [中文](./README.zh-CN.md)
 
 <div align="center">
-<img src="../assets/logo_with_llm.png" alt="xLLM" style="width:50%; height:auto;">
+<img src="docs/src/content/docs/assets/logo_with_llm.png" alt="xLLM" style="width:50%; height:auto;">
 
-[![Document](https://img.shields.io/badge/Document-black?logo=html5&labelColor=grey&color=red)](https://xllm.readthedocs.io/zh-cn/latest/) [![Docker](https://img.shields.io/badge/Docker-black?logo=docker&labelColor=grey&color=%231E90FF)](https://hub.docker.com/r/xllm/xllm-ai) [![License](https://img.shields.io/badge/license-Apache%202.0-brightgreen?labelColor=grey)](https://opensource.org/licenses/Apache-2.0) [![report](https://img.shields.io/badge/Technical%20Report-red?logo=arxiv&logoColor=%23B31B1B&labelColor=%23F0EBEB&color=%23D42626)](https://arxiv.org/abs/2510.14686) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jd-opensource/xllm) 
+[![Document](https://img.shields.io/badge/Document-black?logo=html5&labelColor=grey&color=red)](https://jd-opensource.github.io/xllm/zh/) [![Docker](https://img.shields.io/badge/Docker-black?logo=docker&labelColor=grey&color=%231E90FF)](https://hub.docker.com/r/xllm/xllm-ai) [![License](https://img.shields.io/badge/license-Apache%202.0-brightgreen?labelColor=grey)](https://opensource.org/licenses/Apache-2.0) [![report](https://img.shields.io/badge/Technical%20Report-red?logo=arxiv&logoColor=%23B31B1B&labelColor=%23F0EBEB&color=%23D42626)](https://arxiv.org/abs/2510.14686) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/jd-opensource/xllm) 
 
 </div>
 
 ---------------------
 <p align="center">
-| <a href="https://xllm.readthedocs.io/zh-cn/latest/"><b>Documentation</b></a> |  <a href="https://arxiv.org/abs/2510.14686"><b>Technical Report</b></a> |
+| <a href="https://jd-opensource.github.io/xllm/zh/"><b>Documentation</b></a> |  <a href="https://arxiv.org/abs/2510.14686"><b>Technical Report</b></a> |
 </p>
 
 ### 📢 新闻
@@ -41,7 +41,7 @@ limitations under the License. -->
 **xLLM** 是一个高效的开源大模型推理框架，专为**国产芯片**优化设计，提供企业级的服务部署，使得性能更高、成本更低。该框架采用**服务-引擎分离的推理架构**，通过服务层的在离线请求弹性调度、动态PD分离、EPD混合机制及高可用容错设计，结合引擎层的多流并行计算、图融合优化、投机推理、动态负载均衡及全局KV缓存管理，实现推理效率突破性提升。xLLM整体架构和功能如下图所示：
 
 <div align="center">
-<img src="../assets/xllm_arch.png" alt="xllm_arch" style="width:90%; height:auto;">
+<img src="docs/src/content/docs/assets/xllm_arch.png" alt="xllm_arch" style="width:90%; height:auto;">
 </div>
 
 **xLLM** 已支持主流大模型（如 *DeepSeek-V3.1*，*Qwen2/3*等）在国产芯片上的高效部署，助力企业实现高性能、低成本的 AI 大模型应用落地。xLLM已全面落地京东零售核心业务，涵盖智能客服、风控、供应链优化、广告推荐等多种场景。
@@ -85,13 +85,13 @@ xLLM 提供了强大的智能计算能力，通过硬件系统的算力优化与
 | ILU      | BI150  |                 |
 | MUSA     | S5000  |                 |
 
-此外，请在[模型支持列表](../zh/supported_models.md)查看不同硬件上的模型支持情况。
+此外，请在[模型支持列表](docs/src/content/docs/zh/supported_models.md)查看不同硬件上的模型支持情况。
 
 ---
 
 ## 快速开始
 
-请参考[快速开始文档](../zh/getting_started/quick_start.md)。
+请参考[快速开始文档](docs/src/content/docs/zh/getting_started/quick_start.md)。
 
 ---
 
@@ -111,7 +111,7 @@ xLLM 提供了强大的智能计算能力，通过硬件系统的算力优化与
     + 提出pull request
 
 感谢您的贡献！ 🎉🎉🎉
-如果您在开发中遇到问题，请参阅**[xLLM中文指南](https://xllm.readthedocs.io/zh-cn/latest)**
+如果您在开发中遇到问题，请参阅**[xLLM中文指南](https://jd-opensource.github.io/xllm/zh/)**
 
 ---
 
@@ -120,7 +120,7 @@ xLLM 提供了强大的智能计算能力，通过硬件系统的算力优化与
 如果您有企业内部Slack，请直接联系xLLM Core团队。另外，我们建立了官方微信群，可以访问以下二维码加入。欢迎沟通和联系我们:
 
 <div align="center">
-  <img src="../assets/wechat_qrcode.png" alt="qrcode3" width="50%" />
+  <img src="docs/src/content/docs/assets/wechat_qrcode.png" alt="qrcode3" width="50%" />
 </div>
 
 ---

diff --git a/docs/.gitignore b/docs/.gitignore
@@ -0,0 +1,21 @@
+# build output
+dist/
+# generated types
+.astro/
+
+# dependencies
+node_modules/
+
+# logs
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+pnpm-debug.log*
+
+
+# environment variables
+.env
+.env.production
+
+# macOS-specific files
+.DS_Store
diff --git a/docs/LICENSE b/docs/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 xLLM-AI
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,96 @@
+# xLLM Documentation
+
+[English](./README.md) | [简体中文](./README.zh-CN.md)
+
+This directory contains the Astro + Starlight documentation site for
+[xLLM](https://github.com/jd-opensource/xllm), an LLM inference framework for
+high-performance serving on Chinese AI accelerators.
+
+The site is built with Starlight and `starlight-theme-rapide`. It includes a
+custom header, bilingual navigation, and a page-level `Copy page` action for
+copying documentation content as Markdown.
+
+## Documentation Structure
+
+The documentation is maintained in two parallel language trees:
+
+- English: `src/content/docs/en`
+- Simplified Chinese: `src/content/docs/zh`
+
+The root path redirects to the English documentation. When the site is deployed
+under a base path such as `/xllm`, the same route structure is prefixed by that
+base path.
+
+- `/` redirects to `/en/`
+- `/en/` serves the English documentation
+- `/zh/` serves the Simplified Chinese documentation
+
+When adding or moving pages, keep matching relative paths in both language
+trees so Starlight can switch between languages for the same topic.
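
Review note: the route mapping described above can be sketched as a small shell helper. `prefix_route` is a hypothetical name used only for illustration. It is not part of this PR or of Astro, and it simply joins a base path and a route without doubling the slash.

```sh
# Hypothetical helper: join a deployment base path and a site route.
prefix_route() {
  base="${1%/}"   # drop one trailing slash, so "/xllm/" and "/xllm" behave the same
  route="$2"      # site route, expected to start with "/"
  printf '%s%s\n' "$base" "$route"
}

prefix_route "" /en/        # no base path:                /en/
prefix_route /xllm /en/     # base path from the workflow: /xllm/en/
prefix_route /xllm/ /zh/    # trailing slash is tolerated: /xllm/zh/
```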
+
+## Project Layout
+
+```text
+.
+├── astro.config.mjs          # Starlight, locale, sidebar, and component config
+├── package.json              # npm scripts and dependencies
+├── src/
+│   ├── assets/               # Site-level assets such as the logo
+│   ├── components/           # Starlight component overrides
+│   ├── content/
+│   │   └── docs/
+│   │       ├── en/           # English documentation
+│   │       ├── zh/           # Simplified Chinese documentation
+│   │       └── assets/       # Documentation images and diagrams
+│   ├── pages/index.astro     # Redirect from / to /en/
+│   └── styles/theme.css      # Project theme customizations
+└── public/                   # Static public assets
+```
+
+## Local Development
+
+From the xLLM repository root, enter this directory first:
+
+```sh
+cd docs
+```
+
+Install dependencies:
+
+```sh
+npm install
+```
+
+Start the local development server:
+
+```sh
+npm run dev
+```
+
+Build the production site:
+
+```sh
+npm run build
+```
+
+Preview the production build:
+
+```sh
+npm run preview
+```
+
+## Editing Documentation
+
+- Put user-facing content under `src/content/docs/en` and
+  `src/content/docs/zh`.
+- Keep English and Chinese files aligned by path when a page exists in both
+  languages.
+- Store shared documentation images in `src/content/docs/assets`.
+- Update the `sidebar` section in `astro.config.mjs` when adding new sections
+  that should appear in navigation.
+- Run `npm run build` before submitting changes to catch broken routes,
+  frontmatter errors, and Starlight content issues.
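
Review note: the "aligned by path" guideline above could be checked mechanically. The following is a minimal sketch under stated assumptions. `list_unpaired_pages` is a hypothetical helper, not something this PR adds; it assumes pages are `.md` or `.mdx` files and that `mktemp` is available.

```sh
# Hypothetical check: list pages that exist in one language tree but not the other.
# $1: docs content root, e.g. src/content/docs when run from the docs directory.
list_unpaired_pages() {
  root="$1"
  en_list=$(mktemp) && zh_list=$(mktemp) || return 1
  # Collect page paths relative to each language tree, sorted for comm.
  (cd "$root/en" && find . -type f \( -name '*.md' -o -name '*.mdx' \) | LC_ALL=C sort) > "$en_list"
  (cd "$root/zh" && find . -type f \( -name '*.md' -o -name '*.mdx' \) | LC_ALL=C sort) > "$zh_list"
  # comm -23 keeps lines unique to the first file, comm -13 those unique to the second.
  LC_ALL=C comm -23 "$en_list" "$zh_list" | sed 's|^\./|only in en: |'
  LC_ALL=C comm -13 "$en_list" "$zh_list" | sed 's|^\./|only in zh: |'
  rm -f "$en_list" "$zh_list"
}

# Usage, from the docs directory:
#   list_unpaired_pages src/content/docs
```

Empty output means every page has a counterpart at the same relative path. Pages that intentionally exist in only one language would still be listed, so the result is a prompt for review rather than a hard failure.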
+
+## Related Repository
+
+- xLLM source code: <https://github.com/jd-opensource/xllm>