diff --git a/.sisyphus/plans/hello-world-cli-chain.md b/.sisyphus/plans/hello-world-cli-chain.md
new file mode 100644
index 0000000000..479085c0a9
--- /dev/null
+++ b/.sisyphus/plans/hello-world-cli-chain.md
@@ -0,0 +1,146 @@
+# Minimal CLI Hello World Workflow Demo
+
+## TL;DR
+> **Summary**: Demonstrate the smallest plausible OpenCode workflow for a trivial task: Prometheus creates this plan, `/start-work` launches a Sisyphus work session from it, and the executor runs a single shell command that prints `helloworld` to stdout.
+> **Deliverables**:
+> - A completed work session that runs exactly one stdout-producing command
+> - Evidence files capturing stdout and the non-zero edge case
+> - Parallel final verification review results
+> **Effort**: Quick
+> **Parallel**: NO
+> **Critical Path**: 1 → F1/F2/F3/F4
+
+## Context
+### Original Request
+Create a minimal calling-chain example for the task: output `helloworld` in the command line.
+
+### Interview Summary
+- The user challenged whether Prometheus, Sisyphus, and Hephaestus form a real chain.
+- Confirmed evidence supports orchestration hints, not a proven universal runtime call graph.
+- This plan therefore models a **minimal plausible workflow**, not a claim about internal implementation.
+- The chosen planning location is `openimsdk-docs/.sisyphus/plans/` because it is the only verified `.sisyphus` directory in the workspace and its `plans/` subdirectory is empty.
+
+### Metis Review (gaps addressed)
+- Guardrail applied: do **not** expand this into a proof of actual runtime internals.
+- Guardrail applied: treat this as a command-only demo with **no git commit**.
+- Default applied: success evidence must include stdout content and shell exit code behavior.
+
+## Work Objectives
+### Core Objective
+Produce a minimal executor-ready workflow that prints `helloworld` in a command-line session and captures machine-verifiable evidence.
+
+### Deliverables
+- `stdout` evidence showing exactly `helloworld`
+- Failure-case evidence showing a non-zero exit and no false positive success claim
+- Final review outputs from F1-F4
+
+### Definition of Done (verifiable conditions with commands)
+- Running `printf 'helloworld\n'` exits with code `0`.
+- Captured stdout is exactly `helloworld` followed by a newline.
+- The failure scenario uses `exit 1` and records a non-zero status.
+- Evidence files exist at the exact paths defined in this plan.
+
+### Must Have
+- Single-command implementation path
+- No source-code edits
+- No dependency installation
+- Evidence recorded under `.sisyphus/evidence/`
+- Explicit statement that the workflow is a planning/execution demo, not proof of hidden orchestration internals
+
+### Must NOT Have (guardrails, AI slop patterns, scope boundaries)
+- Do not create a script file, program file, Makefile, test file, or README change for this task.
+- Do not replace `printf` with a multi-step shell script unless Task 1 fails for environment-specific reasons.
+- Do not add package managers, runtimes, containers, or CI steps.
+- Do not claim that this plan proves a real internal `Prometheus -> Sisyphus -> Hephaestus` runtime chain.
+- Do not create any git commit.
+
+## Verification Strategy
+> ZERO HUMAN INTERVENTION — all verification is agent-executed.
+- Test decision: none + shell-command verification only
+- QA policy: Every task has agent-executed scenarios
+- Evidence: `.sisyphus/evidence/task-1-hello-world-cli.txt` and `.sisyphus/evidence/task-1-hello-world-cli-error.txt`
+
+## Execution Strategy
+### Parallel Execution Waves
+> Target: 5-8 tasks per wave. <3 per wave (except final) = under-splitting.
+> This is an intentional exception because the task is a single atomic shell action.
+
+Wave 1: Task 1 (single command execution + evidence capture)
+Wave 2: Final verification wave F1-F4 in parallel
+
+### Dependency Matrix (full, all tasks)
+- Task 1: no prerequisites
+- F1: blocked by Task 1
+- F2: blocked by Task 1
+- F3: blocked by Task 1
+- F4: blocked by Task 1
+
+### Agent Dispatch Summary (wave → task count → categories)
+- Wave 1 → 1 task → `quick`
+- Wave 2 → 4 tasks → `oracle`, `unspecified-high`, `unspecified-high`, `deep`
+
+## TODOs
+> Implementation + Test = ONE task. Never separate.
+> EVERY task MUST have: Agent Profile + Parallelization + QA Scenarios.
+
+- [ ] 1. Run the minimal CLI command and capture evidence
+
+  **What to do**: Start work from this plan, then execute exactly `printf 'helloworld\n'` from the repository root `/Users/openim/Work/openimsdk-docs`. Capture stdout to `.sisyphus/evidence/task-1-hello-world-cli.txt`. Separately execute `sh -c 'exit 1'` and capture its exit-status evidence to `.sisyphus/evidence/task-1-hello-world-cli-error.txt`. If stdout capture requires redirection, use a single shell invocation that preserves the exact printed bytes.
+  **Must NOT do**: Do not create any intermediate script file. Do not use `echo` if it risks shell-specific escape behavior. Do not install tools. Do not edit tracked source files.
+
+  **Recommended Agent Profile**:
+  - Category: `quick` — Reason: single atomic shell task with no code changes
+  - Skills: `[]` — no specialized skill required
+  - Omitted: [`git-master`, `playwright`] — no git workflow or browser verification needed
+
+  **Parallelization**: Can Parallel: NO | Wave 1 | Blocks: F1, F2, F3, F4 | Blocked By: none
+
+  **References** (executor has NO interview context — be exhaustive):
+  - Pattern: `/Users/openim/Work/openimsdk-docs/.sisyphus/plans/hello-world-cli-chain.md` — execute exactly the command and evidence policy defined in this plan
+  - Pattern: `/Users/openim/Work/openimsdk-docs/.sisyphus` — verified existing planning root for this workspace
+  - Pattern: `/Users/openim/Work/openimsdk-docs/.sisyphus/plans` — verified existing empty plans directory chosen for this trivial workflow demo
+  - External: visible command metadata `/start-work` — starts a Sisyphus work session from a Prometheus plan; use it as the workflow entrypoint for this demo
+
+  **Acceptance Criteria** (agent-executable only):
+  - [ ] `pwd` during execution is `/Users/openim/Work/openimsdk-docs`
+  - [ ] `printf 'helloworld\n'` exits with status `0`
+  - [ ] Captured stdout content equals exactly `helloworld` plus one trailing newline
+  - [ ] `.sisyphus/evidence/task-1-hello-world-cli.txt` exists and contains the successful output evidence
+  - [ ] `sh -c 'exit 1'` returns non-zero
+  - [ ] `.sisyphus/evidence/task-1-hello-world-cli-error.txt` exists and records the failure-case status
+
+  **QA Scenarios** (MANDATORY — task incomplete without these):
+  ```
+  Scenario: Happy path
+    Tool: Bash
+    Steps: Run from `/Users/openim/Work/openimsdk-docs`: `printf 'helloworld\n' | tee .sisyphus/evidence/task-1-hello-world-cli.txt >/dev/null`; then validate the file content matches exactly `helloworld` with newline.
+    Expected: Command exits 0; evidence file exists; stdout payload is exact with no extra prefix/suffix.
+    Evidence: .sisyphus/evidence/task-1-hello-world-cli.txt
+
+  Scenario: Failure/edge case
+    Tool: Bash
+    Steps: Run from `/Users/openim/Work/openimsdk-docs`: `sh -c 'exit 1'`; capture the non-zero status in `.sisyphus/evidence/task-1-hello-world-cli-error.txt` using a command that writes the observed exit code.
+    Expected: Exit status is non-zero; no false success message is recorded; error evidence file exists.
+    Evidence: .sisyphus/evidence/task-1-hello-world-cli-error.txt
+  ```
+
+  **Commit**: NO | Message: `n/a` | Files: none
+
+## Final Verification Wave (MANDATORY — after ALL implementation tasks)
+> 4 review agents run in PARALLEL. ALL must APPROVE. Present consolidated results to user and get explicit "okay" before completing.
+> **Do NOT auto-proceed after verification. Wait for user's explicit approval before marking work complete.**
+> **Never mark F1-F4 as checked before getting user's okay.** Rejection or user feedback -> fix -> re-run -> present again -> wait for okay.
+- [ ] F1. Plan Compliance Audit — oracle
+- [ ] F2. Code Quality Review — unspecified-high
+- [ ] F3. Real Manual QA — unspecified-high
+- [ ] F4. Scope Fidelity Check — deep
+
+## Commit Strategy
+- No commit is permitted for this task because the planned implementation does not require tracked file changes.
+- If execution creates only `.sisyphus/evidence/*`, leave them uncommitted unless the user explicitly requests a documentation/evidence commit.
+
+## Success Criteria
+- The workflow can be described concretely as: Prometheus produced this plan; `/start-work` is the entry command for a Sisyphus work session; the executor performs one shell command that prints `helloworld`.
+- The task completes without creating or editing source code.
+- Evidence is binary and machine-checkable.
+- No claim is made beyond the verified workflow metadata available in this environment.
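The two QA scenarios in the plan above can be sketched as a short inline shell session. This is a minimal sketch and not part of the patch: it assumes a POSIX `sh` started from the repository root, and the `EVIDENCE_DIR` variable, the `status` variable, and the `exit_status=` record format are hypothetical choices that the plan does not specify. The commands are meant to be typed inline, since the plan forbids creating a script file.

```sh
# Hypothetical variable: defaults to the evidence directory named in the plan.
EVIDENCE_DIR="${EVIDENCE_DIR:-.sisyphus/evidence}"
mkdir -p "$EVIDENCE_DIR"

# Happy path: print the payload and keep a byte-exact copy of stdout.
printf 'helloworld\n' | tee "$EVIDENCE_DIR/task-1-hello-world-cli.txt" >/dev/null

# Validate byte-for-byte: cmp fails on a missing or extra trailing newline.
if printf 'helloworld\n' | cmp -s - "$EVIDENCE_DIR/task-1-hello-world-cli.txt"; then
  echo "stdout evidence matches"
fi

# Failure case: capture the observed status without recording a success message.
status=0
sh -c 'exit 1' || status=$?
printf 'exit_status=%s\n' "$status" > "$EVIDENCE_DIR/task-1-hello-world-cli-error.txt"
```

Because the happy-path command is a pipeline, its exit status in plain `sh` is that of `tee`, not `printf`. An executor that must assert the status of `printf` itself could run `printf 'helloworld\n'` once on its own before the capture step.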
diff --git a/docs/guides/gettingStarted/admin.assets/change-password.png b/docs/guides/gettingStarted/admin.assets/change-password.png
new file mode 100644
index 0000000000..6afcd29f97
Binary files /dev/null and b/docs/guides/gettingStarted/admin.assets/change-password.png differ
diff --git a/docs/guides/gettingStarted/admin.assets/dashboard.png b/docs/guides/gettingStarted/admin.assets/dashboard.png
new file mode 100644
index 0000000000..b40a056400
Binary files /dev/null and b/docs/guides/gettingStarted/admin.assets/dashboard.png differ
diff --git a/docs/guides/gettingStarted/admin.assets/login.png b/docs/guides/gettingStarted/admin.assets/login.png
new file mode 100644
index 0000000000..73f8ca5314
Binary files /dev/null and b/docs/guides/gettingStarted/admin.assets/login.png differ
diff --git a/docs/guides/gettingStarted/admin.assets/user-list.png b/docs/guides/gettingStarted/admin.assets/user-list.png
new file mode 100644
index 0000000000..28d2fe1669
Binary files /dev/null and b/docs/guides/gettingStarted/admin.assets/user-list.png differ
diff --git a/docs/guides/gettingStarted/admin.md b/docs/guides/gettingStarted/admin.md
index 97e6264e2e..a58c788143 100644
--- a/docs/guides/gettingStarted/admin.md
+++ b/docs/guides/gettingStarted/admin.md
@@ -1,80 +1,89 @@
 ---
-title: '运维系统'
-sidebar_position: 11
+title: '管理后台'
+sidebar_position: 10
 ---
+## 📌 一、访问地址
+管理后台默认地址为 `http://your_server_ip:11002`,其中 `your_server_ip` 为部署管理后台前端页面的服务器 IP。
 
-## 组件说明
+:::tip
+`11002` 属于可选开放端口。仅在需要通过浏览器访问管理后台时,才需要对外放行该端口。
+:::
 
-| 组件名称 | 组件说明 | 部署说明 |
-|-------------|-----------------------------------------|--------------------------------------|
-| prometheus | 用于收集和存储指标数据的监控系统组件 | 需手动启用 |
-| alertmanager | 管理和发送告警的组件 | 需手动启用 |
-| grafana | 用于展示监控数据的仪表板组件 | 需手动启用 |
-| node-exporter | 用于采集节点(如服务器)指标信息 | 需手动启用 |
+![管理后台登录页](./admin.assets/login.png)
 
-## 启动监控
+## 📌 二、登录后台
 
-### 1.启动组件
+1. 在浏览器中访问 `http://your_server_ip:11002`;
+2. 确认页面已正常展示 `账号`、`密码` 输入框,以及协议勾选项;
+3. 默认部署下,可先使用账号 `chatAdmin`、密码 `chatAdmin` 登录;如果你已经修改过配置,请以实际部署值为准。
 
-目前`OpenIM`使用的监控告警组件为`prometheus`、`alertmanager`、`grafana`、`node_exporter`。在使用`docker compose up -d`启动组件时,默认**不会**启动监控组件。如需启动监控组件,需要使用命令为:
+> 如果 `chatAdmin / chatAdmin` 无法登录,优先回查你自己的部署是否已经修改了默认管理后台凭据。
 
-```sh
-docker compose --profile m up -d
-```
+> `imAdmin` 是 OpenIM 内置的 APP 管理员 `userID`,主要用于获取管理员 token 并调用管理类 REST API;不要直接把它理解为管理后台页面的默认登录密码。
 
-> 注意:以上方式不适用于windows系统。如果需要在windows系统中启用监控组件,需要自行修改docker-compose.yml中监控组件的网络模式,并映射相应的端口,最后将prometheus.yml中的`127.0.0.1`替换为内网ip地址。
+## 📌 三、首次登录后立即修改密码
 
-### 2.Grafana导入OpenIM主要指标数据
+建议在首次登录后,立即执行以下操作:
 
-#### 登录grafana
+1. 打开左侧 `账号设置 -> 修改密码`;
+2. 在 `当前密码` 中填写默认密码 `chatAdmin`;
+3. 在 `新密码` 中填写新的强密码;
+4. 点击 `保存`;
+5. 根据页面提示重新登录。
 
-先登录管理后台,再点击左侧数据监控菜单,输入默认用户名(admin)和密码(admin)登入grafana.
+![管理后台修改密码页](./admin.assets/change-password.png)
 
-也可以直接访问`your_ip:13000`进行访问,将`youre_ip`改为部署机器的ip地址。
+> 如果你已经修改成功,后续文档中的 `chatAdmin / chatAdmin` 只可作为默认部署下的首次登录参考,不应继续用于长期使用。
 
-![PC Web Interface](./assets/login1.png)
+## 📌 四、登录后可见的主要页面入口
 
-#### 添加Prometheus数据源
+按当前 `http://localhost:11002/` 实际登录结果,成功登录后会进入 `业务系统 -> 用户管理 -> 用户列表` 页面。左侧导航可见以下模块:
 
-如下图,在左侧菜单栏找到`Connections/Add new connection`,在输入框内输入`prometheus`添加数据源,并输入Prometheus数据源的URL: http://your_ip:19090 (19090为Prometheus默认端口) ,点击"Save and Test"保存.
-![PC Web Interface](./assets/database.png)
+- 数据监控;
+- 业务系统:
+  - 用户管理:`用户列表`、`封禁列表`
+  - 注册管理:`默认好友`、`默认群组`
+- IM 系统:
+  - 用户管理:`用户列表`
+  - 群组管理:`群组列表`
+  - 消息管理:`用户消息`、`群组消息`
+  - 日志管理:`日志列表`(客户端上传日志后,可在这个条目中查看对应记录)
+  - 通知管理:`通知账号`、`发送通知`
+- 账号设置:`个人信息`、`修改密码`
 
-![PC Web Interface](./assets/database2.png)
+![管理后台登录后的用户列表页](./admin.assets/user-list.png)
 
-#### **导入dashboard**
+> 上图为实际登录后的页面截图。当前环境下,默认会落到 `用户列表`,页面提供按 `用户ID/手机号` 查询,以及 `创建新用户` 入口。
 
-在左侧菜单栏选择`Dashboards`,点击`Create Dashboard`按钮,再点击`Import dashboard`导入仪表盘。
+## 📌 五、监控仪表板
 
-![dashboard1](./assets/dashboard.png)
+![Grafana 监控页示例](./admin.assets/dashboard.png)
 
-有两种方式导入`OpenIM`默认的仪表盘:
+管理后台中的 `数据监控` 依赖 `Grafana` 提供监控面板,用于监控 OpenIMServer 与 ChatServer 的运行状态、数据库状态、注册量、消息量等指标。
 
-1. 拷贝 https://github.com/openimsdk/open-im-server/tree/main/config/grafana-template/Demo.json 内容到`Import via dashboard JSON model`区域。
-2. 点击`Upload dashboard JSON file`,上传`open-im-server/config/grafana-template/Demo.json`文件。
+具体部署请参考:[数据监控](./monitoring)
 
-接着点击load按钮
+## 📌 六、联动服务检查
 
-![dashboard2](./assets/dashboard2.png)
+如果登录页可以打开,但登录后页面为空白、列表加载失败,或通知/群组/消息等页面无法使用,优先检查以下三项:
 
-选择刚刚添加的 Data Source,再点击`Import` 即可导入指标信息,如下图
+1. `11002` 管理后台前端端口是否可达;
+2. `10009` APP 管理员接口是否正常;
+3. ChatServer 的 `admin-api`、`admin-rpc` 是否运行正常;
+4. 如果只有 `数据监控` 页面异常,再额外检查 `Grafana` 直连访问、`GRAFANA_URL` 配置,以及浏览器侧跨域 / 嵌入访问限制。
 
-![dashboard3](./assets/dashboard3.png)
+可结合以下文档继续排查:
 
-至此,`OpenIM`的主要监控指标配置完毕。
+- [端口开放](./ports)
+- [快速验证](./quickTestServer)
+- [生产环境](./production)
 
-### 3.Grafana导入node exporter指标数据
+## 📌 七、需要直接调用管理接口时
 
-点击左侧菜单栏的`Dashboard`,选择右侧`New`下拉框中的`Import`。
-
-![image-20260320173607074](./assets/dashboard4.png)
-
-在`Grafana.com dashboard URL or ID`输入框中填入`1860`,点击右边的`Load`,再点击`Import`。
-
-![image-20260320174708460](./assets/dashboard5.png)
-
-node-exporter指标信息,如下图
-![image-20260320175028356](./assets/dashboard6.png)
+如果你当前只需要获取管理员 token,或单独调试管理类 REST API,也可以暂时不依赖管理后台页面,直接参考以下文档:
 
+- [获取管理员 token](../../restapi/apis/authenticationManagement/getAdminToken)
 
+当你通过管理员 token 调通接口后,再回到管理后台页面验证对应功能,会更容易定位是前端访问问题,还是后端服务问题。
diff --git a/docs/guides/gettingStarted/dockerCompose.md b/docs/guides/gettingStarted/dockerCompose.md
index 6302dc0eda..324bf60b19 100644
--- a/docs/guides/gettingStarted/dockerCompose.md
+++ b/docs/guides/gettingStarted/dockerCompose.md
@@ -1,12 +1,14 @@
 ---
 title: 'docker部署'
 sidebar_position: 2
-
 ---
+
 ## 1.环境准备 🌍
+
 对于服务器硬件、软件、操作系统、以及所依赖组件请参考[此文档](./env-comp)
 
 ## 2. 部署 OpenIMServer
+
 ### 2.1 仓库克隆 🗂️
 
 建议使用 GitHub Releases 页面绿色 **Latest** 对应的**最新正式发布 tag**,不要直接按 tag 名字排序,也不要使用 alpha/rc 等预发布版本。
@@ -29,9 +31,7 @@ echo "using openim-docker stable release tag: $LATEST_STABLE_TAG"
 MINIO_EXTERNAL_ADDRESS="http://your-server-ip:10005"
 ```
 
-
-
-### 2.3服务启动 🚀
+### 2.3 服务启动 🚀
 
 - 启动服务:
 
@@ -45,7 +45,6 @@ docker compose up -d
 
 > 如果启动时看到 `ETCD_USERNAME`、`ETCD_PASSWORD`、`KAFKA_USERNAME`、`KAFKA_PASSWORD` 未设置的 warning,而你并未启用这些组件的鉴权,这类提示通常可以忽略。
 
-
 - 停止服务:
 
 ```bash
@@ -66,6 +65,13 @@ docker compose logs -f openim-server openim-chat
 docker compose --profile m up -d
 ```
 
+这里的 `m` 是 `openim-docker` 在 `docker-compose.yaml` 中定义的监控 profile。启用后会额外拉起:
+
+- `Prometheus`:指标采集
+- `Alertmanager`:告警分发
+- `Grafana`:监控面板展示
+- `node-exporter`:主机指标采集
+
 默认端口以当前 `.env` 为准,常用值如下:
 
 - `19090`:Prometheus
@@ -73,17 +79,23 @@ docker compose --profile m up -d
 - `13000`:Grafana
 - `19100`:node-exporter
 
+> 线上环境建议开启监控告警,也就是使用 `docker compose --profile m up -d` 启动 `m` profile。`openim-docker` 已内置 Prometheus 抓取配置、Alertmanager 配置以及基础告警规则,适合用于实例存活、数据库异常、低注册量、低消息量等场景的监控。
+
+> 若你需要真正接收告警通知,还需按实际环境补全 `Alertmanager` 的通知配置(如 SMTP 等)。
+
 ## 3. 快速体验 ⚡
 
 快速体验 OpenIMSDK 核心能力,并测试 OpenIMServer/ChatServer 部署是否正常,请参考[快速验证](./quickTestServer)。
 
 ## 4. 常见问题
 
-### unhealthy定位
+### unhealthy 定位
+
 1. 执行 `docker exec -it openim-server mage check` 与 `docker exec -it openim-chat mage check`,确认是否持续超过一分钟;
 2. 执行 `docker compose logs -f openim-server openim-chat` 查看日志;
 3. 如果 `openim-chat` 在启动初期短暂报 `connect: connection refused`,先等待 `30-60s` 后再复查健康状态;这通常是依赖 `openim-server` 尚未完全就绪导致的启动时序现象。
 
 ### 配置项修改
-进入容器修改config目录下的修改配置文件无效!
+
+进入容器修改 config 目录下的修改配置文件无效!
 必须采用环境变量的方式修改配置,参考[设置环境变量指南](https://github.com/openimsdk/openim-docker/issues/136)。
diff --git a/docs/guides/gettingStarted/monitoring.md b/docs/guides/gettingStarted/monitoring.md
new file mode 100644
index 0000000000..98c19cc4af
--- /dev/null
+++ b/docs/guides/gettingStarted/monitoring.md
@@ -0,0 +1,91 @@
+---
+title: '监控告警'
+sidebar_position: 11
+---
+
+## 📌 一、本文说明
+
+本文用于说明如何在快速部署场景下启用 `OpenIM` 的监控告警能力,并完成 `Grafana` 仪表板初始化。
+
+完成本文后,你可以:
+
+- 启动 `Prometheus`、`Alertmanager`、`Grafana`、`node-exporter`;
+- 登录 `Grafana`;
+- 导入 `OpenIM` 主要指标仪表板;
+- 导入 `node-exporter` 节点监控仪表板。
+
+## 📌 二、启动监控
+
+### 1. 启动组件
+
+目前 `OpenIM` 使用的监控告警组件为 `prometheus`、`alertmanager`、`grafana`、`node_exporter`。
+
+在使用 `docker compose up -d` 启动组件时,默认**不会**启动监控组件。如需启动监控组件,需要使用以下命令:
+
+```sh
+docker compose --profile m up -d
+```
+
+> 注意:以上方式不适用于 Windows 系统。如果需要在 Windows 系统中启用监控组件,需要自行修改 `docker-compose.yml` 中监控组件的网络模式,并映射相应的端口,最后将 `prometheus.yml` 中的 `127.0.0.1` 替换为内网 IP 地址。
+
+## 📌 三、登录 Grafana
+
+先登录管理后台,再点击左侧数据监控菜单,输入默认用户名 (`admin`) 和密码 (`admin`) 登录 `Grafana`。
+
+也可以直接访问 `your_ip:13000`,将 `your_ip` 改为部署机器的 IP 地址。
+
+![PC Web Interface](./assets/login1.png)
+
+## 📌 四、Grafana 导入 OpenIM 主要指标数据
+
+### 1. 添加 Prometheus 数据源
+
+如下图,在左侧菜单栏找到 `Connections/Add new connection`,在输入框内输入 `prometheus` 添加数据源,并输入 Prometheus 数据源的 URL:`http://your_ip:19090`(`19090` 为 Prometheus 默认端口),点击 “Save and Test” 保存。
+
+![PC Web Interface](./assets/database.png)
+
+![PC Web Interface](./assets/database2.png)
+
+### 2. 导入 Dashboard
+
+在左侧菜单栏选择 `Dashboards`,点击 `Create Dashboard` 按钮,再点击 `Import dashboard` 导入仪表盘。
+
+![dashboard1](./assets/dashboard.png)
+
+有两种方式导入 `OpenIM` 默认的仪表盘:
+
+1. 拷贝 `https://github.com/openimsdk/open-im-server/tree/main/config/grafana-template/Demo.json` 内容到 `Import via dashboard JSON model` 区域。
+2. 点击 `Upload dashboard JSON file`,上传 `open-im-server/config/grafana-template/Demo.json` 文件。
+
+接着点击 `Load` 按钮。
+
+![dashboard2](./assets/dashboard2.png)
+
+选择刚刚添加的 Data Source,再点击 `Import` 即可导入指标信息,如下图。
+
+![dashboard3](./assets/dashboard3.png)
+
+至此,`OpenIM` 的主要监控指标配置完毕。
+
+## 📌 五、Grafana 导入 node exporter 指标数据
+
+点击左侧菜单栏的 `Dashboards`,选择右侧 `New` 下拉框中的 `Import`。
+
+![image-20260320173607074](./assets/dashboard4.png)
+
+在 `Grafana.com dashboard URL or ID` 输入框中填入 `1860`,点击右边的 `Load`,再点击 `Import`。
+
+![image-20260320174708460](./assets/dashboard5.png)
+
+node-exporter 指标信息,如下图。
+
+![image-20260320175028356](./assets/dashboard6.png)
+
+## 📌 六、组件说明
+
+| 组件名称 | 组件说明 | 部署说明 |
+| ------------- | ------------------------------------ | ---------- |
+| prometheus | 用于收集和存储指标数据的监控系统组件 | 需手动启用 |
+| alertmanager | 管理和发送告警的组件 | 需手动启用 |
+| grafana | 用于展示监控数据的仪表板组件 | 需手动启用 |
+| node-exporter | 用于采集节点(如服务器)指标信息 | 需手动启用 |
diff --git a/docs/guides/gettingStarted/quickTestServer.md b/docs/guides/gettingStarted/quickTestServer.md
index d2934abdba..69277be5f0 100644
--- a/docs/guides/gettingStarted/quickTestServer.md
+++ b/docs/guides/gettingStarted/quickTestServer.md
@@ -5,7 +5,7 @@ sidebar_position: 9
 
 ## 📌 一、部署服务端
 
-请参考 [docker部署](./dockerCompose) 或 [源码部署](./imSourceCodeDeployment) 来进行部署。
+请参考 [docker 部署](./dockerCompose) 或 [源码部署](./imSourceCodeDeployment) 来进行部署。
 
 ---
 
@@ -13,7 +13,7 @@ sidebar_position: 9
 
 参考 [端口和防火墙](./ports)
 
-## 📌 三、PC Web验证
+## 📌 三、PC Web 验证
 
 :::tip
 在浏览器中输入 `http://your_server_ip:11001` 来访问 PC Web。`your_server_ip` 为部署前端服务的服务器 IP 地址。
@@ -33,11 +33,13 @@ sidebar_position: 9
 
 > 该值来自 `chat/config/chat-rpc-chat.yml` 中的 `verifyCode.superCode`;默认 `phone.use` 和 `mail.use` 也都是 `superCode`。如果你已经修改过配置,请以实际部署值为准。
 
+如果你还需要启用和配置监控告警,可继续参考 [监控告警](./monitoring)。
+
 ## 📌 四、服务进程验证
 
 确认 OpenIMServer 与 ChatServer 进程状态正常。
 
-### Docker部署
+### Docker 部署
 
 ```bash
 docker ps | grep -E 'openim-server|openim-chat'
@@ -84,6 +86,7 @@ wss://your_domain/msg_gateway
 ```
 
 > 建议在生产环境统一通过 `443` 端口访问,OpenIMClientSDK 初始化时使用:
+>
 > - `apiAddr`: `https://your_domain/api`
 > - `wsAddr`: `wss://your_domain/msg_gateway`
 
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/change-password.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/change-password.png
new file mode 100644
index 0000000000..6afcd29f97
Binary files /dev/null and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/change-password.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/dashboard.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/dashboard.png
new file mode 100644
index 0000000000..b40a056400
Binary files /dev/null and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/dashboard.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/login.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/login.png
new file mode 100644
index 0000000000..73f8ca5314
Binary files /dev/null and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/login.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/user-list.png b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/user-list.png
new file mode 100644
index 0000000000..28d2fe1669
Binary files /dev/null and b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/admin.assets/user-list.png differ
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/dockerCompose.md b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/dockerCompose.md
index 509324ee39..699af89831 100644
--- a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/dockerCompose.md
+++ b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/dockerCompose.md
@@ -1,12 +1,14 @@
 ---
 title: 'Docker Deployment'
 sidebar_position: 2
-
 ---
+
 ## 1. Environment Preparation 🌍
+
 For server hardware, software, operating system, and dependent components, please refer to [this document](./env-comp).
 
 ## 2. Deploy OpenIMServer
+
 ### 2.1 Clone the Repository 🗂️
 
 Use the latest official release tag marked with the green **Latest** badge on the GitHub Releases page. Do not sort tags manually, and do not use pre-release versions such as alpha or rc.
@@ -63,6 +65,13 @@ If you also want to start `Prometheus`, `Alertmanager`, `Grafana`, and `node-exp
 docker compose --profile m up -d
 ```
 
+Here, `m` is the monitoring profile defined by `openim-docker` in `docker-compose.yaml`. Enabling it starts these extra services:
+
+- `Prometheus`: metrics collection
+- `Alertmanager`: alert routing
+- `Grafana`: monitoring dashboards
+- `node-exporter`: host metrics collection
+
 Default ports follow the current `.env`. Common values are:
 
 - `19090`: Prometheus
@@ -70,6 +79,10 @@ Default ports follow the current `.env`. Common values are:
 - `13000`: Grafana
 - `19100`: node-exporter
 
+> For production environments, it is recommended to enable monitoring and alerting, which means starting the `m` profile with `docker compose --profile m up -d`. `openim-docker` already includes Prometheus scrape configuration, Alertmanager configuration, and basic alert rules for cases such as instance failures, database exceptions, low registrations, and low message volume.
+
+> To actually receive alert notifications, complete the `Alertmanager` notification configuration for your environment, such as SMTP settings.
+
 ## 3. Quick Experience ⚡
 
 To quickly experience core OpenIMSDK capabilities and verify whether OpenIMServer / ChatServer deployment is working, refer to [Quick Verification](./quickTestServer).
@@ -77,10 +90,12 @@ To quickly experience core OpenIMSDK capabilities and verify whether OpenIMServe
 ## 4. FAQ
 
 ### Troubleshooting `unhealthy`
+
 1. Run `docker exec -it openim-server mage check` and `docker exec -it openim-chat mage check`, then confirm whether either state lasts longer than one minute.
 2. Run `docker compose logs -f openim-server openim-chat` to inspect logs.
 3. If `openim-chat` briefly reports `connect: connection refused` during startup, wait `30-60s` and check again. This is usually a startup ordering issue while `openim-server` is still becoming ready.
 
 ### Configuration Changes
+
 Editing files under the container `config` directory does not work.
 Configuration changes must be made through environment variables. See the [environment variable guide](https://github.com/openimsdk/openim-docker/issues/136).
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/monitoring.md b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/monitoring.md
new file mode 100644
index 0000000000..2eb67f1ed3
--- /dev/null
+++ b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/monitoring.md
@@ -0,0 +1,91 @@
+---
+title: 'Monitoring & Alerting'
+sidebar_position: 11
+---
+
+## 📌 1. What This Page Covers
+
+This page explains how to enable `OpenIM` monitoring and alerting in a quick-deployment setup and complete the initial `Grafana` dashboard configuration.
+
+After finishing this page, you will be able to:
+
+- start `Prometheus`, `Alertmanager`, `Grafana`, and `node-exporter`
+- sign in to `Grafana`
+- import the main `OpenIM` metrics dashboard
+- import the `node-exporter` host monitoring dashboard
+
+## 📌 2. Start Monitoring
+
+### 1. Start Components
+
+The monitoring and alerting components used by `OpenIM` are `prometheus`, `alertmanager`, `grafana`, and `node_exporter`.
+
+When you start components with `docker compose up -d`, the monitoring components are **not** started by default. To start the monitoring components, use:
+
+```sh
+docker compose --profile m up -d
+```
+
+> Note: This approach does not apply to Windows systems. If you need to enable the monitoring components on Windows, you must modify the network mode of the monitoring services in `docker-compose.yml`, map the corresponding ports, and then replace `127.0.0.1` in `prometheus.yml` with the internal IP address.
+
+## 📌 3. Sign in to Grafana
+
+First sign in to the admin console, then click the `Data Monitoring` menu on the left. Enter the default username (`admin`) and password (`admin`) to sign in to `Grafana`.
+
+You can also access `your_ip:13000` directly. Replace `your_ip` with the IP address of the deployment machine.
+
+![PC Web Interface](./assets/login1.png)
+
+## 📌 4. Import the Main OpenIM Metrics into Grafana
+
+### 1. Add the Prometheus Data Source
+
+As shown below, find `Connections/Add new connection` in the left navigation bar, enter `prometheus` in the input box to add the data source, and enter the Prometheus data source URL: `http://your_ip:19090` (`19090` is the default Prometheus port). Then click "Save and Test" to save it.
+
+![PC Web Interface](./assets/database.png)
+
+![PC Web Interface](./assets/database2.png)
+
+### 2. Import the Dashboard
+
+In the left navigation bar, select `Dashboards`, click `Create Dashboard`, and then click `Import dashboard` to import the dashboard.
+
+![dashboard1](./assets/dashboard.png)
+
+There are two ways to import the default `OpenIM` dashboard:
+
+1. Copy the content of `https://github.com/openimsdk/open-im-server/tree/main/config/grafana-template/Demo.json` into the `Import via dashboard JSON model` area.
+2. Click `Upload dashboard JSON file` and upload the `open-im-server/config/grafana-template/Demo.json` file.
+
+Then click the `Load` button.
+
+![dashboard2](./assets/dashboard2.png)
+
+Select the Data Source you just added, then click `Import` to import the metrics information, as shown below.
+
+![dashboard3](./assets/dashboard3.png)
+
+At this point, the main monitoring metrics for `OpenIM` are configured.
+
+## 📌 5. Import node exporter Metrics into Grafana
+
+Click `Dashboards` in the left navigation bar, then select `Import` from the `New` dropdown on the right.
+
+![image-20260320173607074](./assets/dashboard4.png)
+
+In the `Grafana.com dashboard URL or ID` input box, enter `1860`, click `Load` on the right, and then click `Import`.
+
+![image-20260320174708460](./assets/dashboard5.png)
+
+The node-exporter metrics are shown below.
+
+![image-20260320175028356](./assets/dashboard6.png)
+
+## 📌 6. Component Overview
+
+| Component | Description | Deployment |
+| ------------- | ------------------------------------------------------------------ | ------------------------ |
+| prometheus | Monitoring system component used to collect and store metrics data | Must be enabled manually |
+| alertmanager | Component used to manage and send alerts | Must be enabled manually |
+| grafana | Dashboard component used to display monitoring data | Must be enabled manually |
+| node-exporter | Component used to collect node metrics such as server metrics | Must be enabled manually |
diff --git a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/quickTestServer.md b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/quickTestServer.md
index a487b84e28..4de5d49572 100644
--- a/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/quickTestServer.md
+++ b/i18n/en/docusaurus-plugin-content-docs-guides/current/gettingStarted/quickTestServer.md
@@ -33,6 +33,8 @@ If you need to register an account on PC Web, the default verification code is `
 
 > This value comes from `verifyCode.superCode` in `chat/config/chat-rpc-chat.yml`. By default, both `phone.use` and `mail.use` are also set to `superCode`. If you have changed the configuration, use the actual deployed value.
 
+If you also need to enable and configure monitoring and alerting, continue with [Monitoring & Alerting](./monitoring).
+
 ## 📌 4. Service Process Verification
 
 Confirm that OpenIMServer and ChatServer are running normally.
@@ -84,6 +86,7 @@ wss://your_domain/msg_gateway
 ```
 
 > In production, it is recommended to access everything through port `443`. OpenIMClientSDK should use:
+>
 > - `apiAddr`: `https://your_domain/api`
 > - `wsAddr`: `wss://your_domain/msg_gateway`