diff --git a/apps/docs/content/docs/cn/admin.mdx b/apps/docs/content/docs/cn/admin.mdx index 24b9de33..b964e8a6 100644 --- a/apps/docs/content/docs/cn/admin.mdx +++ b/apps/docs/content/docs/cn/admin.mdx @@ -106,7 +106,7 @@ GET /api/audit-logs?limit=50&offset=0 | `/api/tasks/{id}` | GET | 获取任务详情 | | `/api/tasks/{id}/results` | GET | 获取执行结果 | -创建任务请求体: +创建一次性任务请求体: ```json { @@ -116,6 +116,75 @@ GET /api/audit-logs?limit=50&offset=0 } ``` +## 计划任务 + +计划任务在远程命令基础上增加了 cron 调度、重试和运行历史。 + +### 创建计划任务 + +1. 进入 **Settings → Tasks**。 +2. 切换到计划任务区域。 +3. 填写名称、命令、cron 表达式、目标服务器、超时、重试次数和重试间隔。 +4. 保存任务。启用状态的任务会立即注册到服务端调度器。 + +计划任务使用服务端 `scheduler.timezone` 配置计算 `next_run_at`。 + +### 字段 + +| 字段 | 说明 | +|------|------| +| `name` | 计划任务显示名称 | +| `task_type` | cron 任务使用 `scheduled`;省略/默认表示 `oneshot` | +| `cron_expression` | 服务端调度器解析的 cron 表达式 | +| `command` | 下发给每个目标 Agent 的 Shell 命令 | +| `server_ids` | 目标服务器 | +| `timeout` | 每次尝试的命令超时时间(秒);默认执行超时为 300 秒 | +| `retry_count` | 首次尝试后的重试次数,范围 0-10 | +| `retry_interval` | 重试之间等待秒数,必须至少为 1 | +| `enabled` | 禁用后会从调度器移除,不再自动运行 | + +### 执行行为 + +- 每次计划触发都会生成 `run_id`,用于把所有目标服务器和重试尝试的结果分组。 +- ServerBee 会避免同一个计划任务重叠运行。如果上一次运行仍未结束,新的触发会被跳过。 +- `POST /api/tasks/{id}/run` 可手动运行任务,且手动运行会跳过重试逻辑。 +- 禁用或删除任务会取消当前活动运行。 +- 结果中包含 `attempt`,用于区分重试次数。 + +合成退出码: + +| 退出码 | 含义 | +|--------|------| +| `-2` | `CAP_EXEC` 未启用,或被 Agent 本地 capability 策略阻止 | +| `-3` | 服务器离线或下发失败 | +| `-4` | 超时前没有收到 Agent 响应 | + +### API 示例 + +```json +{ + "task_type": "scheduled", + "name": "Daily disk check", + "command": "df -h", + "server_ids": ["server-id-1"], + "cron_expression": "0 0 8 * * *", + "timeout": 120, + "retry_count": 2, + "retry_interval": 60 +} +``` + +相关端点: + +| 端点 | 方法 | 说明 | +|------|------|------| +| `/api/tasks?type=scheduled` | GET | 列出计划任务 | +| `/api/tasks` | POST | 创建一次性任务或计划任务 | +| `/api/tasks/{id}` | PUT | 更新任务字段、启用/禁用状态和调度注册 | +| `/api/tasks/{id}` | DELETE | 删除任务并取消活动执行 | +| `/api/tasks/{id}/run` | POST | 手动运行任务 | +| `/api/tasks/{id}/results` | GET | 获取按 
task/run/attempt 分组的任务结果 | + ## 计费信息 管理员可以为每台服务器记录计费相关信息,方便追踪 VPS 费用和到期时间。 diff --git a/apps/docs/content/docs/cn/alerts.mdx b/apps/docs/content/docs/cn/alerts.mdx index b5fb4370..8a88362d 100644 --- a/apps/docs/content/docs/cn/alerts.mdx +++ b/apps/docs/content/docs/cn/alerts.mdx @@ -4,7 +4,7 @@ description: 配置告警规则和通知渠道,及时发现和响应服务器 icon: Bell --- -ServerBee 提供灵活的告警系统,支持多种指标类型的阈值监控、多种通知渠道以及精细的触发控制。 +ServerBee 提供灵活的告警系统,支持多种指标类型的阈值监控、事件驱动告警、多种通知渠道以及精细的触发控制。 ## 告警概述 @@ -71,38 +71,45 @@ ServerBee 支持 14 种以上的告警指标: | `network_latency` | 平均探测延迟超过阈值时触发 | | `network_packet_loss` | 丢包率超过阈值时触发 | -### 离线检测 +### 离线、到期和事件 | 指标类型 | 说明 | |----------|------| | `offline` | 服务器持续离线超过指定时长后触发 | +| `expiration` | 服务器 `expired_at` 距今小于等于指定天数时触发 | +| `ip_changed` | Agent 上报 IP 变化事件时触发(事件驱动,不参与每分钟轮询) | ## 阈值配置 -每个告警条件支持设置最小值和最大值: +每个告警条件支持以下字段: ```json { "rule_type": "cpu", - "min": null, - "max": 90.0 + "min": 90.0, + "max": null, + "duration": null, + "cycle_interval": null, + "cycle_limit": null } ``` -- **max**:当指标值超过此阈值时触发(上限告警) -- **min**:当指标值低于此阈值时触发(下限告警) -- 可以同时设置 min 和 max,形成范围告警 +- **`min`**:下界。对大多数资源指标,指标值大于等于该值时触发。 +- **`max`**:上界。当同时设置 `min` 和 `max` 时,指标值落在该范围内才触发。 +- **`duration`**:用于 `offline`(离线秒数)和 `expiration`(到期天数)。 +- **`cycle_interval`**:流量周期类型:`hour`、`day`、`week`、`month`、`year`。 +- **`cycle_limit`**:流量周期规则的字节阈值。 一条告警规则可以包含多个条件,所有条件必须同时满足(AND 逻辑)才会触发告警。例如: ```json [ - { "rule_type": "cpu", "max": 90.0 }, - { "rule_type": "memory", "max": 85.0 } + { "rule_type": "cpu", "min": 90.0 }, + { "rule_type": "memory", "min": 85.0 } ] ``` -上述规则表示:CPU 超过 90% 且内存超过 85% 时才触发。 +上述规则表示:CPU 大于等于 90% 且内存大于等于 85% 时才触发。 ## 覆盖类型 @@ -156,6 +163,12 @@ ServerBee 支持 14 种以上的告警指标: 2. 如果配置了恢复触发任务 (`recover_trigger_tasks`),自动执行对应的远程命令 3. 
清除告警状态,下次满足条件时可以重新触发 +### 维护窗口抑制 + +当受影响服务器处于活动维护窗口时,ServerBee 会抑制该服务器的告警通知。规则评估仍会执行,但维护结束前不会发送通知。 + +`ip_changed` 属于事件驱动规则,不参与每分钟轮询。它在 Agent 上报 IP 变化事件时评估,并遵循同样的覆盖范围和维护窗口抑制逻辑。 + ## 通知渠道 ServerBee 支持以下通知渠道: @@ -193,7 +206,7 @@ ServerBee 支持以下通知渠道: 邮件通知通过 [Resend](https://resend.com/) 发送。使用前两步准备: -1. 在服务器设置 `SERVERBEE_RESEND__API_KEY`(参考[配置](/docs/cn/configuration)页面)。 +1. 在服务器设置 `SERVERBEE_RESEND__API_KEY`(参考[配置](/cn/docs/configuration)页面)。 2. 在 [resend.com/domains](https://resend.com/domains) 添加并验证发件域名。各通道的 `from` 必须属于已验证的域名。 通道配置: @@ -207,6 +220,22 @@ ServerBee 支持以下通知渠道: `to` 是数组——单个通道可以一次投递给多个收件人。主题格式为 `[ServerBee] {server_name} {event}`,正文使用 HTML 并附纯文本兜底。 +### APNs + +通过 Apple Push Notification service 向已注册的移动端设备发送原生推送。 + +```json +{ + "key_id": "ABC123DEFG", + "team_id": "TEAM999888", + "private_key": "-----BEGIN PRIVATE KEY-----...", + "bundle_id": "com.example.serverbee", + "sandbox": false +} +``` + +APNs 需要 Apple Developer key、Team ID、Bundle ID 和私钥。只有开发构建才应设置 `sandbox: true`。 + ### 模板变量 通知内容支持以下模板变量: diff --git a/apps/docs/content/docs/cn/api-reference.mdx b/apps/docs/content/docs/cn/api-reference.mdx index e1582740..697f0590 100644 --- a/apps/docs/content/docs/cn/api-reference.mdx +++ b/apps/docs/content/docs/cn/api-reference.mdx @@ -1,127 +1,166 @@ --- title: API 参考 -description: ServerBee REST API 概览、认证方式和 Swagger UI 交互式文档。 +description: ServerBee REST API 概览、认证方式、WebSocket 端点和 Swagger UI 交互式文档。 icon: FileCode --- -ServerBee 提供完整的 REST API,支持所有 Dashboard 功能的程序化访问。所有 API 均通过 OpenAPI 3.0 规范文档化。 +ServerBee 将 Web 管理面板使用的能力同时通过 REST 和 WebSocket API 暴露。最权威的 schema 级参考由服务端二进制中的 OpenAPI 注解自动生成。 ## Swagger UI -ServerBee 内置 Swagger UI 交互式 API 文档: +内置交互式文档地址: -``` +```text https://your-server/swagger-ui/ ``` -你可以在 Swagger UI 中浏览所有 50+ 个 API 端点、查看请求/响应模型、直接发送测试请求。 +你可以在 Swagger UI 中查看请求/响应模型、认证要求,并直接向自己的部署发送测试请求。原始 OpenAPI 文档地址: -## 认证方式 +```text +https://your-server/api-docs/openapi.json +``` + +## 响应格式 + +REST 成功响应统一包装为: + +```json +{ + "data": {} +} +``` + 
+错误响应格式: + +```json +{ + "error": "Error message describing what went wrong" +} +``` -ServerBee API 支持两种认证方式: +## 认证方式 ### Session Cookie -浏览器登录后自动使用。调用 `/api/auth/login` 获取 session: +Web 管理面板登录后自动使用: ```bash curl -X POST https://your-server/api/auth/login \ -H "Content-Type: application/json" \ - -d '{"username": "admin", "password": "your-password"}' \ + -d '{"username":"admin","password":"your-password"}' \ -c cookies.txt -# 后续请求带上 cookie curl https://your-server/api/servers -b cookies.txt ``` ### API Key -适合自动化场景。在 Settings → API Keys 页面创建: +自动化场景推荐使用 API Key。在 Settings → API Keys 创建。 ```bash curl https://your-server/api/servers \ -H "X-API-Key: serverbee_your-api-key-here" ``` -API Key 格式为 `serverbee_` 前缀 + 43 字符随机字符串,创建时仅显示一次。 +API Key 使用 `serverbee_` 前缀,创建时只显示一次。 -## 端点概览 +### Bearer Session Token -### 公开端点(无需认证) +移动端流程在部分 REST 和 WebSocket 端点中使用 Bearer token: -| 方法 | 路径 | 说明 | -|------|------|------| -| POST | `/api/auth/login` | 用户登录 | -| GET | `/api/auth/oauth/{provider}` | OAuth 授权跳转 | -| GET | `/api/auth/oauth/{provider}/callback` | OAuth 回调 | -| GET | `/api/status` | 公开状态页数据 | +```bash +curl https://your-server/api/auth/me \ + -H "Authorization: Bearer " +``` -### 认证端点(Session 或 API Key) +## 公开端点 | 方法 | 路径 | 说明 | |------|------|------| -| POST | `/api/auth/logout` | 用户登出 | -| GET | `/api/auth/me` | 获取当前用户信息 | -| POST | `/api/auth/change-password` | 修改密码 | -| GET/POST | `/api/auth/2fa/*` | 2FA 管理 | -| GET/DELETE | `/api/auth/oauth/accounts` | OAuth 账号管理 | -| GET | `/api/servers` | 列出服务器 | -| GET | `/api/servers/{id}` | 获取服务器详情 | -| GET | `/api/servers/{id}/records` | 获取指标记录 | -| GET | `/api/servers/{id}/gpu-records` | 获取 GPU 记录 | -| GET | `/api/server-groups` | 列出服务器分组 | -| GET | `/api/ping-tasks` | 列出 Ping 任务 | -| GET | `/api/ping-tasks/{id}/records` | 获取 Ping 记录 | - -### 管理员端点(需要 Admin 角色) - -| 方法 | 路径 | 说明 | +| POST | `/api/auth/login` | Web 登录 | +| GET | `/api/auth/oauth/providers` | 列出已启用的 OAuth Provider | +| GET | `/api/auth/oauth/{provider}` | 
OAuth 授权跳转 | +| GET | `/api/auth/oauth/{provider}/callback` | OAuth 回调 | +| POST | `/api/mobile/auth/login` | 移动端登录 | +| POST | `/api/mobile/auth/refresh` | 刷新移动端会话 | +| POST | `/api/mobile/auth/pair` | 兑换移动端配对码 | +| POST | `/api/agent/register` | Agent 使用 discovery key 自动注册 | +| GET | `/api/status` | 默认公开状态页数据 | +| GET | `/api/status/{slug}` | 可配置公开状态页数据 | +| GET | `/api/settings/brand` | 公开品牌设置 | +| GET | `/api/brand/logo` | 返回上传的 Logo | +| GET | `/api/brand/favicon` | 返回上传的 Favicon | + +## 已认证读取端点 + +除特别说明外,读取端点对 Admin 和 Member 均可用。 + +| 端点族 | 代表端点 | +|--------|----------| +| 当前用户和 API Key | `GET /api/auth/me`、`PUT /api/auth/password`、`GET/POST /api/auth/api-keys`、`DELETE /api/auth/api-keys/{id}` | +| 2FA 和 OAuth 账号 | `/api/auth/2fa/*`、`GET/DELETE /api/auth/oauth/accounts/*` | +| 移动端设备 | `POST /api/mobile/auth/logout`、`GET /api/mobile/auth/devices`、`DELETE /api/mobile/auth/devices/{id}` | +| 服务器 | `GET /api/servers`、`GET /api/servers/{id}`、`GET /api/servers/{id}/records`、`GET /api/servers/{id}/gpu-records` | +| 分组和标签 | `GET /api/server-groups`、`GET /api/server-tags` | +| 可用性和流量 | `GET /api/servers/{id}/uptime-daily`、`GET /api/servers/{id}/traffic` | +| GeoIP | `GET /api/geoip/status` | +| Ping 任务 | `GET /api/ping-tasks`、`GET /api/ping-tasks/{id}/records` | +| 网络探测 | `/api/network-probes/*`、`/api/servers/{id}/network-probes/*` | +| Traceroute 结果 | `GET /api/servers/{id}/traceroute/{request_id}` | +| 文件读取 | `POST /api/files/{server_id}/list`、`stat`、`read`、`GET /api/files/transfers`、`GET /api/files/download/{transfer_id}` | +| Docker 读取 | `GET /api/servers/{id}/docker/containers`、`stats`、`info`、`events`、`networks`、`volumes` | +| 服务监控 | `GET /api/service-monitors`、`GET /api/service-monitors/{id}`、`GET /api/service-monitors/{id}/records` | +| 状态页配置 | `GET /api/status-pages` | +| 仪表盘 | `GET /api/dashboards`、`GET /api/dashboards/default`、`GET /api/dashboards/{id}` | +| 主题 | `GET /api/themes/*` | +| 告警事件 | 仪表盘使用的告警事件读取端点 | + +## 管理员写入和管理端点 + +写入操作和系统管理需要 Admin 角色。 + +| 
端点族 | 代表端点 | +|--------|----------| +| 服务器管理 | `POST/PUT/DELETE /api/servers/*`、`PUT /api/servers/batch-capabilities`、`POST /api/servers/{id}/upgrade` | +| Agent 恢复 | `GET /api/servers/{target_id}/recovery-candidates`、`GET /api/servers/recovery-jobs/{job_id}`、`POST /api/servers/{target_id}/recover-merge` | +| 分组和标签 | CRUD `/api/server-groups/*`、CRUD `/api/server-tags/*` | +| Ping 和网络探测 | CRUD `/api/ping-tasks/*`、`/api/network-probes/*` 下的写入端点 | +| Traceroute | `POST /api/servers/{id}/traceroute` | +| 文件管理 | `POST /api/files/{server_id}/write`、`delete`、`mkdir`、`move`、`download`、`upload`、`DELETE /api/files/transfers/{transfer_id}` | +| Docker 操作 | `POST /api/servers/{id}/docker/containers/{cid}/action` | +| 服务监控 | CRUD `/api/service-monitors/*`、`POST /api/service-monitors/{id}/check` | +| 仪表盘 | `POST /api/dashboards`、`PUT/DELETE /api/dashboards/{id}` | +| 主题和外观 | 主题写入端点、`PUT /api/settings/brand`、`POST /api/settings/brand/logo`、`POST /api/settings/brand/favicon` | +| 状态页 | CRUD `/api/status-pages/*` | +| 事件公告 | CRUD `/api/incidents/*`、`POST /api/incidents/{id}/updates` | +| 维护窗口 | CRUD `/api/maintenances/*` | +| 告警和通知 | CRUD `/api/alert-rules/*`、`/api/notifications/*`、`/api/notification-groups/*` | +| 任务 | `GET/POST /api/tasks`、`GET/PUT/DELETE /api/tasks/{id}`、`GET /api/tasks/{id}/results`、`POST /api/tasks/{id}/run` | +| 用户 | CRUD `/api/users/*` | +| 审计和设置 | `GET /api/audit-logs`、`/api/settings/*`、备份/恢复端点 | +| GeoIP | `POST /api/geoip/download` | +| 移动端推送 | `POST /api/mobile/pair`、`POST /api/mobile/push/register`、`POST /api/mobile/push/unregister` | + +## WebSocket 端点 + +| 路径 | 认证 | 说明 | |------|------|------| -| POST/PUT/DELETE | `/api/servers/*` | 服务器管理 | -| PUT | `/api/servers/batch-capabilities` | 批量更新功能开关 | -| POST | `/api/servers/{id}/upgrade` | 触发 Agent 升级 | -| GET | `/api/servers/{target_id}/recovery-candidates` | 列出推荐的恢复候选项 | -| GET | `/api/servers/recovery-jobs/{job_id}` | 获取恢复任务详情 | -| POST | `/api/servers/{target_id}/recover-merge` | 启动 Agent 恢复任务 | -| 
CRUD | `/api/server-groups/*` | 服务器分组管理 | -| CRUD | `/api/notifications/*` | 通知渠道管理 | -| CRUD | `/api/notification-groups/*` | 通知组管理 | -| CRUD | `/api/alert-rules/*` | 告警规则管理 | -| CRUD | `/api/ping-tasks/*` | Ping 任务管理 | -| POST | `/api/tasks` | 创建远程命令任务 | -| GET | `/api/tasks/{id}` | 获取任务详情和结果 | -| CRUD | `/api/users/*` | 用户管理 | -| GET | `/api/audit-logs` | 审计日志 | -| GET/PUT | `/api/settings/*` | 系统设置 | -| POST | `/api/settings/backup` | 数据库备份 | -| POST | `/api/settings/restore` | 数据库恢复 | - -### WebSocket 端点 - -| 路径 | 说明 | -|------|------| -| `/api/ws/browser` | 浏览器实时数据推送 | -| `/api/ws/terminal/{server_id}` | Web 终端代理 | - -## 错误响应 - -所有 API 错误返回统一格式: - -```json -{ - "error": "Error message describing what went wrong" -} -``` +| `/api/agent/ws?token=` | Agent token 查询参数 | Agent 指标、命令、Ping、文件、Docker、Traceroute | +| `/api/ws/servers` | Session cookie、API Key 或 Bearer token | 浏览器/移动端实时服务器更新 | +| `/api/ws/terminal/{server_id}` | 已认证 Admin + `CAP_TERMINAL` | Web 终端代理;终端数据使用二进制帧 | +| `/api/ws/docker/logs/{server_id}` | 已认证 + `CAP_DOCKER` | 按容器流式传输 Docker 日志 | -常见状态码: +## 常见状态码 -| 状态码 | 说明 | +| 状态码 | 含义 | |--------|------| -| 400 | 请求参数错误 | +| 400 | 请求错误或操作无效 | | 401 | 未认证 | -| 403 | 无权限(角色不足或功能被禁用) | -| 404 | 资源不存在 | -| 429 | 请求过于频繁(限流) | +| 403 | 无权限:角色、capability 或 Agent 本地策略阻止操作 | +| 404 | 资源不存在、服务器离线或公开页面已禁用 | +| 409 | 冲突,例如 slug/name 重复 | +| 422 | 参数校验失败 | +| 429 | 请求过于频繁 | | 500 | 服务器内部错误 | diff --git a/apps/docs/content/docs/cn/capabilities.mdx b/apps/docs/content/docs/cn/capabilities.mdx index 6f8ca0d1..2d70ca74 100644 --- a/apps/docs/content/docs/cn/capabilities.mdx +++ b/apps/docs/content/docs/cn/capabilities.mdx @@ -87,8 +87,11 @@ ServerBee 采用纵深防御(defense in depth)策略,在 Server 端和 Age ### Server 端拦截 - **Terminal**:WebSocket 升级请求被 403 拦截 -- **Exec**:`POST /api/tasks` 过滤无权限服务器,写入合成结果(`exit_code = -2`,提示 "Capability 'exec' is disabled") -- **Ping**:按 capability 过滤任务,不向无权限 Agent 同步相关探测任务 +- **Exec**:`POST /api/tasks` 和计划任务运行会过滤无权限服务器,写入合成结果(`exit_code = -2`,提示 
"Capability 'exec' is disabled") +- **Auto Upgrade**:`POST /api/servers/{id}/upgrade` 在未启用 `CAP_UPGRADE` 时返回 403 +- **Ping 和 Traceroute**:按 capability 过滤探测任务;Traceroute 需要 effective `CAP_PING_ICMP` +- **File Manager**:文件端点在下发前检查 `CAP_FILE`,未启用时直接拒绝 +- **Docker**:Docker 读取/操作端点和 Docker 日志 WebSocket 需要 `CAP_DOCKER`,并要求 Agent 运行时支持 Docker ### Agent 端拒绝 @@ -129,6 +132,7 @@ ServerBee 采用纵深防御(defense in depth)策略,在 Server 端和 Age + diff --git a/apps/docs/content/docs/cn/configuration.mdx b/apps/docs/content/docs/cn/configuration.mdx index 1ce13a54..8f42ee10 100644 --- a/apps/docs/content/docs/cn/configuration.mdx +++ b/apps/docs/content/docs/cn/configuration.mdx @@ -79,7 +79,7 @@ ServerBee 使用 [figment](https://github.com/SergioBenitez/Figment) 库加载 | 环境变量 | 默认值 | 说明 | |----------|--------|------| -| `SERVERBEE_GEOIP__MMDB_PATH` | `""` | MaxMind GeoLite2-City.mmdb 文件路径,路径非空即启用 GeoIP | +| `SERVERBEE_GEOIP__MMDB_PATH` | `""` | MaxMind 兼容 MMDB 文件路径。路径非空时使用该自定义 GeoIP 数据库;为空时管理员可在 Settings → GeoIP Database 下载 DB-IP Lite 数据库 | #### Resend(邮件通知) @@ -303,7 +303,7 @@ custom_themes = true # --- GeoIP 地理位置 --- [geoip] -# MaxMind MMDB 数据库文件路径,路径非空即启用 GeoIP +# MaxMind 兼容 MMDB 数据库文件路径,路径非空时使用自定义 GeoIP;为空时可在设置页下载 DB-IP Lite # 默认: "" mmdb_path = "" @@ -483,7 +483,7 @@ file = "/var/log/serverbee-agent.log" | Docker 事件保留 | `7` 天 | Docker 容器生命周期事件 | | 审计日志保留 | `180` 天 | 半年审计记录 | | 调度时区 | `UTC` | 流量日聚合时区 | -| GeoIP | 关闭 | 提供 MMDB 文件路径即启用 | +| GeoIP | 关闭 | 可提供 MMDB 文件路径,或在设置页下载 DB-IP Lite 数据库 | | 日志级别 | `info` | 推荐生产环境使用 | ### Agent 默认值 diff --git a/apps/docs/content/docs/cn/custom-themes.mdx b/apps/docs/content/docs/cn/custom-themes.mdx index 06a0c822..f2aa7eee 100644 --- a/apps/docs/content/docs/cn/custom-themes.mdx +++ b/apps/docs/content/docs/cn/custom-themes.mdx @@ -54,6 +54,39 @@ ServerBee 内置多套预设主题,管理员也可以创建完整的自定义 - Alpha 必须在 `0` 到 `1` 之间,或在 `0%` 到 `100%` 之间。 - Chroma 不设硬上限。浏览器可能会对无法显示的颜色做色域裁切。 +## 品牌和白标 + +同一个 **设置 → 外观** 页面也用于控制基础品牌信息。品牌设置保存在服务端数据库中,后台外壳和公开页面都会读取。 + +| 字段 | 说明 | 
+|------|------| +| `site_title` | UI 显示的浏览器/应用标题 | +| `footer_text` | UI 渲染产品页脚时显示的文本 | +| `logo_path` | 上传 Logo 的公开路径,通常为 `/api/brand/logo` | +| `favicon_path` | 上传 Favicon 的公开路径,通常为 `/api/brand/favicon` | + +公开端点: + +| 方法 | 路径 | 说明 | +|------|------|------| +| GET | `/api/settings/brand` | 无需认证读取品牌配置 | +| GET | `/api/brand/logo` | 返回上传的 Logo | +| GET | `/api/brand/favicon` | 返回上传的 Favicon | + +管理员端点: + +| 方法 | 路径 | 说明 | +|------|------|------| +| PUT | `/api/settings/brand` | 以 JSON 更新 `site_title`、`footer_text`、`logo_path` 和 `favicon_path` | +| POST | `/api/settings/brand/logo` | 通过 multipart 字段 `file` 上传 Logo | +| POST | `/api/settings/brand/favicon` | 通过 multipart 字段 `file` 上传 Favicon | + +Logo 和 Favicon 仅支持 PNG 或 ICO 文件,大小上限为 512 KB。上传新资源会替换同类型旧资源。 + + +`PUT /api/settings/brand` 期望 JSON。图片文件需要通过独立的 logo/favicon 上传端点提交,而不是通过 JSON 更新端点提交。 + + ## 禁用自定义主题 设置: diff --git a/apps/docs/content/docs/cn/dashboards.mdx b/apps/docs/content/docs/cn/dashboards.mdx new file mode 100644 index 00000000..7e1aca11 --- /dev/null +++ b/apps/docs/content/docs/cn/dashboards.mdx @@ -0,0 +1,114 @@ +--- +title: 仪表盘与组件 +description: 使用可拖拽组件和可复用布局构建自定义监控仪表盘。 +icon: LayoutDashboard +--- + +ServerBee 仪表盘由可配置组件组成。你可以保留默认总览仪表盘,也可以按地区、团队或用途创建多个仪表盘,并设置一个作为所有用户默认打开的仪表盘。 + +## 管理仪表盘 + +在首页顶部的仪表盘切换器中可以: + +- 切换不同仪表盘 +- 创建新仪表盘 +- 重命名仪表盘 +- 删除不再需要的仪表盘 +- 将某个仪表盘设为默认 + +如果系统中还没有仪表盘,ServerBee 会自动创建第一个默认仪表盘。已经是默认的仪表盘不能直接取消默认;请把另一个仪表盘设为默认。 + +## 编辑布局 + +1. 打开一个仪表盘。 +2. 点击 **Edit**。 +3. 从组件选择器添加组件。 +4. 拖拽组件调整位置。 +5. 在组件的最小/最大尺寸约束内调整大小。 +6. 配置组件数据源和标题。 +7. 
点击 **Save**。 + +布局以网格坐标保存:`grid_x`、`grid_y`、`grid_w`、`grid_h`,组件设置保存为 JSON。保存时会做差异更新:已有组件更新,新组件插入,从布局中移除的组件会被删除。 + +## 组件类型 + +| 组件 | 分类 | 典型用途 | +|------|------|----------| +| `stat-number` | 实时 | 展示单台服务器的一个指标,例如 CPU 或内存 | +| `server-cards` | 实时 | 展示选中服务器的紧凑卡片 | +| `gauge` | 实时 | 用仪表盘形式展示某台服务器的一个指标 | +| `line-chart` | 图表 | 展示单台服务器单一指标的历史曲线 | +| `multi-line` | 图表 | 对比多台服务器的同一指标 | +| `top-n` | 实时 | 按某个指标给服务器排行 | +| `alert-list` | 状态 | 展示活动或最近告警状态 | +| `service-status` | 状态 | 展示服务监控状态 | +| `traffic-bar` | 图表 | 展示单台服务器流量使用情况 | +| `disk-io` | 图表 | 展示磁盘读写吞吐历史 | +| `server-map` | 状态 | 安装 GeoIP 后在地图上展示服务器位置 | +| `markdown` | 状态 | 用 Markdown 添加说明、Runbook 或链接 | +| `uptime-timeline` | 状态 | 展示选中服务器的可用性条形时间线 | + +## 常见组件配置 + +多数组件设置保存在 `config_json` 中。常见字段包括: + +| 字段 | 使用场景 | 含义 | +|------|----------|------| +| `server_id` | 单服务器组件 | 要查询的服务器 ID | +| `server_ids` | 多服务器组件 | 要包含的服务器 ID 列表 | +| `metric` | 指标组件 | 指标键,例如 CPU、内存、磁盘、流量或负载 | +| `hours` | 历史组件 | 图表回看时间窗口 | +| `interval` | 历史组件 | 数据粒度(`raw`、`hourly` 或 `auto`) | +| `monitor_ids` | Service Status | 要展示的服务监控 ID | +| `content` | Markdown | Markdown 内容 | + +## GeoIP 与服务器地图 + +Server Map 组件需要 GeoIP 数据。你可以: + +- 通过 `geoip.mmdb_path` 配置自定义 MaxMind 兼容 MMDB 文件,或 +- 在 **Settings → GeoIP Database** 中下载 DB-IP Lite 数据库。 + +缺少 GeoIP 数据时,组件会显示安装提示。 + +## API + +| 方法 | 路径 | 说明 | +|------|------|------| +| GET | `/api/dashboards` | 列出仪表盘 | +| GET | `/api/dashboards/default` | 获取默认仪表盘,必要时自动创建 | +| GET | `/api/dashboards/{id}` | 获取仪表盘及其组件 | +| POST | `/api/dashboards` | 创建仪表盘 | +| PUT | `/api/dashboards/{id}` | 更新元数据和/或组件 | +| DELETE | `/api/dashboards/{id}` | 删除仪表盘 | + +更新请求示例: + +```json +{ + "name": "Production", + "is_default": true, + "widgets": [ + { + "widget_type": "stat-number", + "title": "Web CPU", + "config_json": { + "server_id": "server-id", + "metric": "cpu", + "unit": "%" + }, + "grid_x": 0, + "grid_y": 0, + "grid_w": 2, + "grid_h": 1, + "sort_order": 0 + } + ] +} +``` + + + + + + diff --git 
a/apps/docs/content/docs/cn/file-manager.mdx b/apps/docs/content/docs/cn/file-manager.mdx new file mode 100644 index 00000000..f7ab2585 --- /dev/null +++ b/apps/docs/content/docs/cn/file-manager.mdx @@ -0,0 +1,132 @@ +--- +title: 文件管理器 +description: 通过 ServerBee 浏览、读取、编辑、上传、下载和管理远程文件。 +icon: FolderOpen +--- + +文件管理器通过 ServerBee Agent 提供受控的远程文件系统访问能力。它适合查看日志、编辑小型配置文件、传输文件等运维场景,不必打开完整终端。 + + +文件管理器属于高风险功能。请只在可信服务器上启用,并把 `root_paths` 限制到最小必要目录。 + + +## 启用条件 + +文件管理必须在两层同时启用: + +1. **Server 端 capability:** 在 **Settings → Capabilities** 或服务器详情页启用 `CAP_FILE`。 +2. **Agent 本地策略:** 设置 `[file].enabled = true`,并配置至少一个允许访问的根目录。 + +示例 `agent.toml`: + +```toml +[file] +enabled = true +root_paths = ["/home", "/var/log", "/etc/serverbee"] +max_file_size = 1073741824 +# 默认值列在这里便于参考 +deny_patterns = ["*.key", "*.pem", "id_rsa*", ".env*", "shadow", "passwd"] +``` + +等价环境变量: + +```bash +SERVERBEE_FILE__ENABLED=true +SERVERBEE_FILE__ROOT_PATHS=/home,/var/log,/etc/serverbee +SERVERBEE_FILE__MAX_FILE_SIZE=1073741824 +``` + +## 访问方式 + +在服务器操作菜单中点击 **Files**,或直接访问: + +``` +/files/{serverId} +``` + +当服务器的 effective capabilities 中没有 `CAP_FILE` 时,前端会隐藏 Files 按钮。 + +## 权限 + +| 角色 | 允许操作 | +|------|----------| +| Admin | 浏览、stat、读取、写入、上传、下载、删除、移动、新建目录、取消传输 | +| Member | 浏览、stat、读取、下载、查看自己的传输 | + +所有高风险文件操作都会写入审计日志。因 capability 关闭而被拒绝的尝试也会记录。 + +## 支持的操作 + +| 操作 | 说明 | +|------|------| +| 列目录 | 浏览允许根目录下的文件和目录 | +| Stat | 获取单一路径的元数据 | +| 读取 | 读取 UTF-8 文本内容,用于预览或编辑器 | +| 写入 | 用提供的文本替换文件内容 | +| 上传 | 上传本地文件到远程路径 | +| 下载 | 启动经 ServerBee 中转的下载传输,再从 ServerBee 获取文件 | +| 删除 | 删除文件,或递归删除目录 | +| 新建目录 | 创建目录 | +| 移动 | 重命名或移动文件/目录 | +| 传输管理 | 查看和取消活动传输 | + +## 安全模型 + +Agent 在访问文件系统前会执行路径安全检查: + +- `root_paths` 是允许列表。空列表会拒绝所有文件操作。 +- 路径解析后必须位于某个允许根目录内。 +- `deny_patterns` 会拒绝敏感名称,例如私钥、`.env*`、`shadow`、`passwd`。 +- Agent 同样检查本地 capability,因此 Server 端不能覆盖 Agent 本地拒绝策略。 +- Server 在下发文件消息前也会检查 `CAP_FILE`。 + +## 限制 + +| 限制 | 默认值 | 配置位置 | +|------|--------|----------| +| 上传大小 | 100 MB | Server `file.max_upload_size` 
/ `SERVERBEE_FILE__MAX_UPLOAD_SIZE` | +| Agent 读取/下载文件大小 | 1 GB | Agent `[file].max_file_size` / `SERVERBEE_FILE__MAX_FILE_SIZE` | +| 内联读取分块 | 384 KB | 协议限制,用于保证 WebSocket 帧小于最大消息大小 | + +上传和下载均采用分块传输。下载会在 Server 端创建临时传输,pending 或 in progress 状态时可以取消。 + +## API + +读取类端点对 Admin 和 Member 可用。写入类端点需要 Admin。 + +| 方法 | 路径 | 说明 | +|------|------|------| +| POST | `/api/files/{server_id}/list` | 列目录;请求体 `{ "path": "/var/log" }` | +| POST | `/api/files/{server_id}/stat` | 获取路径元数据 | +| POST | `/api/files/{server_id}/read` | 读取 UTF-8 文本内容 | +| GET | `/api/files/download/{transfer_id}` | 下载当前用户拥有的 ready 传输 | +| GET | `/api/files/transfers` | 列出当前用户拥有的传输 | +| POST | `/api/files/{server_id}/write` | 替换文件内容 | +| POST | `/api/files/{server_id}/delete` | 删除文件/目录;支持 `recursive` | +| POST | `/api/files/{server_id}/mkdir` | 创建目录 | +| POST | `/api/files/{server_id}/move` | 移动或重命名路径 | +| POST | `/api/files/{server_id}/download` | 启动下载传输 | +| POST | `/api/files/{server_id}/upload` | multipart 上传,字段为 `path` 和 `file` | +| DELETE | `/api/files/transfers/{transfer_id}` | 取消传输 | + +示例: + +```bash +curl -X POST https://your-server/api/files/server-id/list \ + -H "X-API-Key: serverbee_..." \ + -H "Content-Type: application/json" \ + -d '{"path":"/var/log"}' +``` + +```bash +curl -X POST https://your-server/api/files/server-id/upload \ + -H "X-API-Key: serverbee_..." 
\ + -F 'path=/tmp/example.txt' \ + -F 'file=@example.txt' +``` + + + + + + diff --git a/apps/docs/content/docs/cn/index.mdx b/apps/docs/content/docs/cn/index.mdx index e684f300..d96ec653 100644 --- a/apps/docs/content/docs/cn/index.mdx +++ b/apps/docs/content/docs/cn/index.mdx @@ -10,32 +10,36 @@ ServerBee 是一个轻量级、自托管的 VPS 监控探针系统,专为个 ServerBee 由两个核心组件构成: -- **Server**:部署在管理服务器上的控制中心,提供 Web 管理面板、数据存储、告警评估和 API 接口 -- **Agent**:部署在每台被监控服务器上的轻量级采集程序,负责采集系统指标并实时上报 +- **Server**:部署在管理服务器上的控制中心,提供 Web 管理面板、数据存储、告警评估、后台任务和 API 接口 +- **Agent**:部署在每台被监控服务器上的轻量级采集程序,负责采集系统指标并实时上报,也可按授权执行探测、终端、文件、Docker 等操作 -Agent 通过 WebSocket 与 Server 保持长连接,实现毫秒级的实时数据推送。所有数据存储在 SQLite 数据库中,无需安装任何外部数据库服务。 +Agent 通过 WebSocket 与 Server 保持长连接,实现实时数据推送。所有数据存储在 SQLite 数据库中,无需安装外部数据库服务。 ## 核心功能 -### 实时监控 +### 实时监控与自定义仪表盘 -通过 WebSocket 驱动的实时仪表盘,一目了然地查看所有服务器的运行状态。支持 CPU、内存、磁盘、网络、负载、温度等多维度指标,以及历史趋势图表。 +通过 WebSocket 驱动的实时仪表盘查看所有服务器运行状态。支持 CPU、内存、磁盘、网络、负载、温度、GPU、磁盘 I/O、流量等指标,以及多仪表盘、多组件布局。 -### 告警通知 +### 告警通知与服务监控 -灵活的告警规则引擎,支持 14 种以上指标类型的阈值监控。采用采样窗口和触发比例机制降低误报。支持 Webhook、Telegram、Bark、Email 等多种通知渠道。 +灵活的告警规则引擎支持资源阈值、流量周期、网络质量、离线、到期和 IP 变化事件。服务监控支持 SSL、DNS、HTTP 关键字、TCP 和 WHOIS 检查,并可通过 Webhook、Telegram、Bark、Email、APNs 等通知组发送告警。 -### Web 终端 +### Web 终端、远程任务与文件管理 -通过浏览器直接访问服务器的 Shell 终端,无需额外的 SSH 客户端。基于 PTY 实现完整的终端体验,支持交互式程序和彩色输出。 +通过浏览器访问服务器 Shell 终端,执行一次性命令或 cron 计划任务。文件管理器提供受控的远程浏览、读取、编辑、上传和下载能力,配合路径沙箱和审计日志使用。 -### Ping 探测 +### Ping、网络质量与 Traceroute -支持 ICMP、TCP、HTTP 三种探测协议。可以从多个 Agent 节点同时探测同一目标,对比不同地区的网络质量。 +支持 ICMP、TCP、HTTP 探测,可从多个 Agent 节点检测目标可用性、延迟和丢包。网络详情页还提供 Traceroute 排障能力。 -### GPU 监控 +### Docker 管理 -支持 NVIDIA GPU 指标采集,包括利用率、显存使用量、温度等。每块 GPU 独立记录,适合监控 GPU 服务器。 +启用 Docker capability 后,可以查看容器列表、实时统计、生命周期事件、日志流、网络、卷,并执行容器操作。 + +### 公开状态页、主题和品牌 + +可创建多个公开状态页,配置独立 slug、服务器范围、事件公告、维护窗口、可用性阈值、主题和自定义 CSS。外观设置支持预设/自定义主题,以及站点标题、Logo、Favicon 和页脚文本。 ## 技术栈 @@ -62,16 +66,21 @@ Agent 通过 WebSocket 与 Server 保持长连接,实现毫秒级的实时数 + + + + + + - diff --git a/apps/docs/content/docs/cn/meta.json 
b/apps/docs/content/docs/cn/meta.json index 5d5a077d..345e5729 100644 --- a/apps/docs/content/docs/cn/meta.json +++ b/apps/docs/content/docs/cn/meta.json @@ -10,8 +10,11 @@ "configuration", "---功能---", "monitoring", + "dashboards", "alerts", + "service-monitors", "terminal", + "file-manager", "ping", "capabilities", "status-page", diff --git a/apps/docs/content/docs/cn/mobile.mdx b/apps/docs/content/docs/cn/mobile.mdx index 1027ab6d..4f676031 100644 --- a/apps/docs/content/docs/cn/mobile.mdx +++ b/apps/docs/content/docs/cn/mobile.mdx @@ -138,7 +138,7 @@ refresh_ttl = 2592000 # 刷新令牌有效期(秒),默认 30 天 ### WebSocket 连接 iOS 应用维护 WebSocket 连接以获取实时更新: -- 连接 URL:`wss://your-server/api/ws/browser` +- 连接 URL:`wss://your-server/api/ws/servers` - 握手期间通过 `Authorization: Bearer` 头进行认证 - 自动重连,使用指数退避策略 - 访问令牌过期时连接自动关闭(重连触发刷新) diff --git a/apps/docs/content/docs/cn/monitoring.mdx b/apps/docs/content/docs/cn/monitoring.mdx index adf2bf08..89e01808 100644 --- a/apps/docs/content/docs/cn/monitoring.mdx +++ b/apps/docs/content/docs/cn/monitoring.mdx @@ -18,6 +18,12 @@ ServerBee 提供全面的服务器监控能力,通过 WebSocket 实时推送 - 服务器名称、地区、国旗、操作系统及网络质量迷你图 - **实时刷新**:所有数据通过 WebSocket 驱动,无需手动刷新 +如需创建面向不同场景的运维视图,可以使用 [仪表盘与组件](/cn/docs/dashboards) 创建额外仪表盘布局,包含图表、地图、服务状态、Markdown 说明和可用性时间线等组件。 + +### GeoIP 显示 + +地区/国家标签和 Server Map 组件需要 GeoIP 数据。你可以通过 `geoip.mmdb_path` 配置自定义 MaxMind 兼容 MMDB 文件,也可以在 **Settings → GeoIP Database** 下载 DB-IP Lite 数据库。状态查询端点为 `GET /api/geoip/status`,管理员可通过 `POST /api/geoip/download` 触发下载。 + ## 实时指标推送 ServerBee 通过 WebSocket 实现端到端的实时数据传输: @@ -32,7 +38,7 @@ Agent (每3秒采集) --WebSocket--> Server (内存缓存) --WebSocket--> 浏览 2. **Server 缓存**:AgentManager 将每台服务器的最新指标保存在内存中 3. 
**浏览器推送**:Server 通过 `broadcast` 通道将更新推送给所有已连接的浏览器 -浏览器端连接 WebSocket 后的消息流程: +浏览器端通过 `/api/ws/servers` 连接 WebSocket 后的消息流程: | 消息类型 | 时机 | 说明 | |----------|------|------| @@ -399,6 +405,26 @@ ServerBee 内置了网络质量监控系统,通过各 Agent 对网络目标发 - **统计栏** -- 综合平均延迟、可用性百分比、目标数量 - **CSV 导出** -- 下载选定时间范围的探测数据 +### Traceroute + +网络详情页可以从选中的 Agent 对目标主机或 IP 发起 Traceroute,用于排查常规探测间隔之外的路由变化和丢包问题。 + +- 需要 Agent 的 effective `CAP_PING_ICMP` 能力。 +- 目标只允许字母、数字、点、连字符和冒号。 +- Server 下发 `max_hops = 30`。 +- Agent 侧命令 60 秒超时。 +- Linux/macOS 上 Agent 优先尝试 `traceroute`,如果不存在则回退到 `mtr`。 +- Windows 上使用平台对应的 traceroute 命令。 + +API 流程: + +```http +POST /api/servers/{id}/traceroute +GET /api/servers/{id}/traceroute/{request_id} +``` + +第一个请求返回 `request_id`;轮询第二个端点直到 `completed` 为 true。 + ### 数据保留 网络探测记录采用与系统指标相同的两级存储: diff --git a/apps/docs/content/docs/cn/service-monitors.mdx b/apps/docs/content/docs/cn/service-monitors.mdx new file mode 100644 index 00000000..c0cf869d --- /dev/null +++ b/apps/docs/content/docs/cn/service-monitors.mdx @@ -0,0 +1,171 @@ +--- +title: 服务监控 +description: 从 ServerBee 服务端监控 SSL 证书、DNS 记录、HTTP 关键字、TCP 端口和 WHOIS 到期时间。 +icon: Radar +--- + +服务监控(Service Monitors)是一类由 ServerBee 服务端主动执行的合成检查。它适合监控公开服务,即使这些服务并不运行在安装了 ServerBee Agent 的主机上也可以使用。 + +与 Ping 监控不同,Ping 监控由 Agent 从各节点发起探测;服务监控由中心 Server 进程执行。检查结果会写入 SQLite,在仪表盘中展示,也可以通过通知组发送告警。 + +## 支持的监控类型 + +| 类型 | 目标格式 | 检查内容 | +|------|----------|----------| +| `ssl` | `example.com` 或 `example.com:443` | TLS 握手、证书有效期、签发者/主体、SHA-256 指纹 | +| `dns` | `example.com` | `A`、`AAAA`、`CNAME`、`MX`、`TXT` 记录,可与期望值比较 | +| `http_keyword` | `https://example.com/health` | HTTP 状态码,以及响应正文中关键字存在/不存在 | +| `tcp` | `host:port` | TCP 连接是否成功和连接延迟 | +| `whois` | `example.com` 或 URL | 域名到期时间和注册商,优先使用 WHOIS 客户端,失败时回退系统命令 | + +## 创建监控 + +1. 打开 **Settings → Service Monitors**。 +2. 点击 **New Monitor**。 +3. 选择监控类型并填写目标。 +4. 设置检查间隔(秒)。 +5. 配置该类型的专属参数。 +6. 可选:选择通知组。 +7. 
保存监控。 + +后台检查器每 10 秒唤醒一次,根据每个监控的 `interval` 判断是否到期,并最多并发执行 20 个检查。 + +## 类型配置 + +### SSL 证书 + +```json +{ + "port": 443, + "warning_days": 14, + "critical_days": 7, + "timeout": 10 +} +``` + +- `port` 默认 `443`,除非目标中已经包含端口。 +- 证书剩余天数小于等于 `critical_days` 时检查失败。 +- 证书剩余天数小于等于 `warning_days` 时,结果详情中会包含 warning 信息。 + +### DNS 记录 + +```json +{ + "record_type": "A", + "expected_values": ["203.0.113.10"], + "nameserver": "8.8.8.8" +} +``` + +- `record_type` 默认 `A`,支持 `A`、`AAAA`、`CNAME`、`MX`、`TXT`。 +- 不设置 `expected_values` 时,只要解析返回至少一个值即视为成功。 +- 设置 `expected_values` 时,返回值排序后必须与期望值排序后完全一致。 +- `nameserver` 可选;不设置时使用系统解析器。 + +### HTTP 关键字 + +```json +{ + "method": "GET", + "expected_status": [200], + "keyword": "ok", + "keyword_exists": true, + "headers": { + "User-Agent": "ServerBee" + }, + "body": null, + "timeout": 10 +} +``` + +- `method` 支持 `GET` 和 `POST`。 +- `expected_status` 默认 `[200]`。 +- 设置 `keyword` 后,`keyword_exists: true` 表示必须出现,`false` 表示必须不存在。 +- `headers` 的值必须是字符串。 +- `body` 用于 `POST` 请求。 + +### TCP 端口 + +```json +{ + "timeout": 10 +} +``` + +目标必须是 `host:port`。在超时时间内能建立 TCP 连接即视为成功。 + +### WHOIS 到期 + +```json +{ + "warning_days": 30, + "critical_days": 7 +} +``` + +- 目标会归一化为域名,因此 `https://example.com/path` 会变为 `example.com`。 +- 域名剩余天数小于等于 `critical_days` 时检查失败。 +- `.app`、`.dev`、`.page` 等 TLD 不暴露标准 WHOIS 服务,建议改用 SSL 监控。 + +## 通知和重试 + +每个监控都可以关联一个通知组。每次检查时: + +1. 写入一条记录,包含 `success`、`latency`、`detail_json`、`error`、`time`。 +2. 更新监控的 `last_status`、`last_checked_at`、`consecutive_failures`。 +3. 只有当 `consecutive_failures > retry_count` 时才发送失败通知。 +4. 
如果之前处于失败状态,本次恢复成功,则发送恢复通知。 + +如果监控关联了服务器,并且其中任意服务器正处于活动维护窗口,本次检查会跳过通知发送,但检查记录仍会保存。 + +## 历史记录与保留 + +进入监控详情页可以查看: + +- 最新状态和延迟 +- 最近成功/失败历史 +- 延迟趋势图 +- 每次检查的原始详情和错误信息 + +服务监控记录默认保留 30 天(`retention.service_monitor_days`)。 + +## API + +| 方法 | 路径 | 说明 | +|------|------|------| +| GET | `/api/service-monitors` | 列出监控;可用 `?type=ssl` 过滤 | +| GET | `/api/service-monitors/{id}` | 获取单个监控及最新记录 | +| POST | `/api/service-monitors` | 创建监控 | +| PUT | `/api/service-monitors/{id}` | 更新监控 | +| DELETE | `/api/service-monitors/{id}` | 删除监控及其记录 | +| GET | `/api/service-monitors/{id}/records` | 查询记录,可带 `from`、`to`、`limit` | +| POST | `/api/service-monitors/{id}/check` | 立即触发一次检查 | + +创建请求示例: + +```json +{ + "name": "Website SSL", + "monitor_type": "ssl", + "target": "example.com", + "interval": 300, + "config_json": { + "warning_days": 14, + "critical_days": 7 + }, + "notification_group_id": "notification-group-id", + "retry_count": 1, + "server_ids_json": ["server-id"], + "enabled": true +} +``` + + +`server_ids_json` 用于把服务监控与服务器关联,主要用于维护窗口抑制通知和仪表盘展示上下文。检查本身仍由中心 ServerBee 服务端执行。 + + + + + + + diff --git a/apps/docs/content/docs/cn/status-page.mdx b/apps/docs/content/docs/cn/status-page.mdx index f51c962c..47b38db0 100644 --- a/apps/docs/content/docs/cn/status-page.mdx +++ b/apps/docs/content/docs/cn/status-page.mdx @@ -1,100 +1,167 @@ --- title: 公开状态页 -description: 为访客提供无需登录的服务器在线状态展示页面。 +description: 发布包含事件公告、维护窗口、可用性历史和自定义主题的公开健康状态页。 icon: Globe --- -ServerBee 提供一个公开状态页面,无需登录即可查看服务器的在线状态和基本指标。适合向用户或团队展示服务运行状况。 +ServerBee 提供两种公开状态页: -## 访问方式 +- **默认状态页:** `https://your-server/status`,由 `GET /api/status` 提供数据,展示所有非隐藏服务器。 +- **可配置状态页:** `https://your-server/status/{slug}`,由 `GET /api/status/{slug}` 提供数据,展示管理员选择的服务器和配置项。 -状态页面地址:`https://your-server/status` +两者均为公开访问,不需要登录认证。 -该页面不需要任何认证,任何人都可以访问。 +## 默认 `/status` 页面 -## 页面内容 +默认状态页适合不创建自定义页面时快速公开服务器概览。 -状态页展示以下信息: +它展示: -- **在线/总数统计**:显示当前在线服务器数量和总服务器数量 -- **服务器列表**:按分组展示所有非隐藏的服务器 -- **在线状态**:每台服务器显示在线/离线状态指示 -- **实时指标**:在线服务器显示 
CPU、内存、磁盘使用率进度条 -- **90 天可用性时间线**:每台服务器展示过去 90 天的每日可用性彩色条形图 -- **可用性百分比**:根据每日可用性数据计算的总体可用率 -- **自动刷新**:页面每 10 秒自动从 API 获取最新数据 +- 在线/总服务器数量 +- 所有 `hidden = false` 的服务器 +- 服务器分组标签 +- 在线/离线状态 +- 在线服务器的实时指标:CPU、内存、磁盘、网络速率/流量、uptime、负载 +- 已配置的公开备注 -## 显示规则 +公开 API: -- 仅展示 **非隐藏** 的服务器(管理员可在服务器编辑中设置 `hidden` 属性) -- 按服务器分组(Group)组织展示 -- 离线服务器不显示指标数据,仅显示离线状态 - -## API 端点 - -``` +```http GET /api/status ``` -公开端点,无需认证。返回结构: +## 可配置状态页 -```json -{ - "data": { - "servers": [ - { - "id": "server-uuid", - "name": "Web Server 1", - "hostname": "web1.example.com", - "is_online": true, - "group_name": "Production", - "cpu": 45.2, - "mem_used": 8589934592, - "mem_total": 17179869184, - "disk_used": 53687091200, - "disk_total": 107374182400 - } - ], - "groups": [ - { "id": "group-uuid", "name": "Production" } - ], - "online_count": 8, - "total_count": 10 - } -} +在 **Settings → Status Pages** 中创建和管理状态页。每个页面都有独立 slug,可以单独分享。 + +### 页面设置 + +| 设置 | API 字段 | 说明 | +|------|----------|------| +| 标题 | `title` | 公开页面标题 | +| Slug | `slug` | `/status/{slug}` 的 URL 片段 | +| 描述 | `description` | 可选介绍文本 | +| 服务器 | `server_ids_json` | 此页面展示的服务器 | +| 按服务器分组 | `group_by_server_group` | 按 ServerBee 分组组织服务器 | +| 显示数值 | `show_values` | 在公开页显示可用性/状态数值 | +| 自定义 CSS | `custom_css` | 应用于该页面的额外 CSS | +| 启用 | `enabled` | 禁用后页面返回 404 | +| 黄色可用性阈值 | `uptime_yellow_threshold` | 低于该百分比的日期显示为降级 | +| 红色可用性阈值 | `uptime_red_threshold` | 低于该百分比的日期显示为严重故障 | +| 主题 | `theme_ref` | 预设/自定义主题引用,或 `null` 跟随后台默认主题 | + + +当前 API 请求字段使用 `server_ids_json` 和 `status_page_ids_json` 表示选择的 ID。这些字段在请求体中接受 JSON 数组。 + + +### 公开页面数据 + +```http +GET /api/status/{slug} ``` +响应包含: + +- `page` -- 页面元数据和显示选项 +- `theme` -- 解析后的主题变量 +- `servers` -- 选中服务器状态、可用性百分比和 90 天每日可用性数据 +- `active_incidents` -- 关联到页面且尚未解决的事件 +- `planned_maintenances` -- 关联到页面的活动/计划维护窗口 +- `recent_incidents` -- 最近历史中的已解决事件 + +服务器条目包含 `server_id`、`server_name`、地区/国家、操作系统、分组、`online`、`uptime_percent`、`uptime_daily` 和 `in_maintenance`。 + ## 可用性时间线 -状态页上每台服务器都会显示 90 
天可用性时间线。时间线是一个水平条形图,每根条形代表一天: +可配置状态页上的每台服务器都可以展示 90 天可用性时间线。每根条代表一天: -- **绿色** -- 100% 可用(全天在线) -- **黄色** -- 可用性下降(低于黄色阈值,默认 100%) -- **红色** -- 严重故障(低于红色阈值,默认 95%) -- **灰色** -- 该日无数据 +- **绿色** -- 健康可用性 +- **黄色** -- 低于该页面黄色阈值 +- **红色** -- 低于该页面红色阈值 +- **灰色** -- 无数据 -将鼠标悬停在条形上可查看日期、可用性百分比和在线/总分钟数。 +可用性数据来自 `uptime_daily` 表,由服务端后台聚合任务生成。缺失日期会自动补齐,保证时间线连续。 -可用性数据来自 `uptime_daily` 表,由服务端后台聚合任务填充。状态页 API 返回的每日条目会自动补全无数据的日期(显示为灰色条,0 分钟)。 +## 事件公告(Incidents) -### 可用性阈值 +事件公告用于公开说明故障或服务降级。事件可以关联到指定服务器、状态页或两者。 -管理员可以在 **设置 > 状态页** 中为每个状态页自定义颜色阈值: +### 字段 -- **黄色阈值**(默认:100%)-- 低于此百分比的天数显示为黄色 -- **红色阈值**(默认:95%)-- 低于此百分比的天数显示为红色 +| 字段 | 说明 | +|------|------| +| `title` | 事件标题 | +| `status` | `investigating`、`identified`、`monitoring` 或 `resolved` | +| `severity` | `minor`、`major` 或 `critical` | +| `server_ids_json` | 可选,受影响服务器 | +| `status_page_ids_json` | 可选,受影响状态页 | -这些阈值按状态页存储,并在公开 API 响应中返回,允许不同的状态页设置不同的灵敏度。 +一个事件可以包含多条 update。添加 update 会记录消息,并把事件状态更新为该 update 的状态。状态变为 `resolved` 时会设置 `resolved_at`。 -## 隐藏服务器 +### API -如果某些服务器不希望出现在状态页上: +| 方法 | 路径 | 说明 | +|------|------|------| +| GET | `/api/incidents` | 列出事件;支持按状态过滤 | +| POST | `/api/incidents` | 创建事件 | +| PUT | `/api/incidents/{id}` | 更新事件 | +| DELETE | `/api/incidents/{id}` | 删除事件 | +| POST | `/api/incidents/{id}/updates` | 添加事件更新 | + +## 维护窗口 + +维护窗口用于公告计划维护,并在活动期间抑制相关服务器的通知。 + +### 字段 + +| 字段 | 说明 | +|------|------| +| `title` | 维护标题 | +| `description` | 可选详情 | +| `start_at` | UTC 开始时间 | +| `end_at` | UTC 结束时间,必须晚于 `start_at` | +| `server_ids_json` | 可选,受影响服务器 | +| `status_page_ids_json` | 可选,受影响状态页 | +| `active` | 是否启用该维护窗口 | + +### API + +| 方法 | 路径 | 说明 | +|------|------|------| +| GET | `/api/maintenances` | 列出维护窗口 | +| POST | `/api/maintenances` | 创建维护窗口 | +| PUT | `/api/maintenances/{id}` | 更新维护窗口 | +| DELETE | `/api/maintenances/{id}` | 删除维护窗口 | + +## 管理 API + +| 方法 | 路径 | 说明 | +|------|------|------| +| GET | `/api/status-pages` | 列出已配置状态页 | +| POST | `/api/status-pages` | 创建状态页 | +| PUT | 
`/api/status-pages/{id}` | 更新状态页 | +| DELETE | `/api/status-pages/{id}` | 删除状态页 | + +创建示例: + +```json +{ + "title": "Production Status", + "slug": "production", + "description": "Public health for production services", + "server_ids_json": ["server-id-1", "server-id-2"], + "group_by_server_group": true, + "show_values": true, + "enabled": true, + "uptime_yellow_threshold": 99.9, + "uptime_red_threshold": 95 +} +``` -1. 进入服务器详情页 → 编辑 -2. 勾选 **Hidden** 选项 -3. 该服务器不会出现在 `/api/status` 返回的列表中 +更新时可以传入 `theme_ref` 设置自定义主题,例如 `"preset:default"` 或自定义主题引用。传 `null` 表示跟随后台默认主题。 - - + + + diff --git a/apps/docs/content/docs/en/admin.mdx b/apps/docs/content/docs/en/admin.mdx index 31a4c839..00d6999d 100644 --- a/apps/docs/content/docs/en/admin.mdx +++ b/apps/docs/content/docs/en/admin.mdx @@ -106,7 +106,7 @@ Administrators can dispatch commands to online servers and retrieve execution re | `/api/tasks/{id}` | GET | Get task details | | `/api/tasks/{id}/results` | GET | Get execution results | -Request body for creating a task: +Request body for creating a one-shot task: ```json { @@ -116,6 +116,75 @@ Request body for creating a task: } ``` +## Scheduled Tasks + +Scheduled tasks extend remote commands with cron-based execution, retries, and run history. + +### Creating a Scheduled Task + +1. Go to **Settings → Tasks**. +2. Switch to the scheduled task section. +3. Enter a name, command, cron expression, target servers, timeout, retry count, and retry interval. +4. Save the task. Enabled tasks are registered with the server-side scheduler immediately. + +Scheduled tasks use the server's `scheduler.timezone` setting when calculating `next_run_at`. 
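The form fields in the steps above map onto a JSON body for `POST /api/tasks`. The following is a minimal client-side sketch, not ServerBee's own validation: it applies the ranges this page documents for `retry_count` (0-10) and `retry_interval` (at least 1). The helper name `build_scheduled_task`, the default `retry_interval` of 60, and the non-empty `server_ids` check are assumptions made for illustration.

```python
def build_scheduled_task(name, command, server_ids, cron_expression,
                         timeout=300, retry_count=0, retry_interval=60):
    """Build a request body for POST /api/tasks with task_type "scheduled".

    Hypothetical helper: only the retry_count and retry_interval ranges
    come from the documentation; the other checks are illustrative.
    """
    if not 0 <= retry_count <= 10:
        raise ValueError("retry_count must be between 0 and 10")
    if retry_interval < 1:
        raise ValueError("retry_interval must be at least 1 second")
    if not server_ids:
        # Assumption: a task without targets is not useful to submit.
        raise ValueError("at least one target server is required")
    return {
        "task_type": "scheduled",
        "name": name,
        "command": command,
        "server_ids": list(server_ids),
        "cron_expression": cron_expression,
        "timeout": timeout,
        "retry_count": retry_count,
        "retry_interval": retry_interval,
    }
```

The server remains the authority on validation; a pre-check like this only catches obvious mistakes before the request is sent.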
+ +### Fields + +| Field | Description | +|-------|-------------| +| `name` | Display name for the scheduled task | +| `task_type` | Use `scheduled` for cron tasks; omitted/default is `oneshot` | +| `cron_expression` | Cron expression parsed by the server scheduler | +| `command` | Shell command sent to each target agent | +| `server_ids` | Target servers | +| `timeout` | Per-attempt command timeout in seconds; default execution timeout is 300 seconds | +| `retry_count` | Number of retry attempts after the first attempt; must be 0-10 | +| `retry_interval` | Seconds to wait between retries; must be at least 1 | +| `enabled` | Disabled tasks are removed from the scheduler and do not run automatically | + +### Execution Behavior + +- Each scheduled trigger creates a `run_id` used to group results from all target servers and retry attempts. +- ServerBee prevents overlapping runs of the same scheduled task. If the previous run is still active, the new trigger is skipped. +- `POST /api/tasks/{id}/run` starts a manual run and skips retry logic. +- Disabling or deleting a task cancels its active run. +- Results include an `attempt` number so you can distinguish retries. 
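Because every result carries a `run_id` and an `attempt`, a client can rebuild the run history from a flat result list. The sketch below assumes each result row is a dict with `run_id`, `attempt`, `server_id`, and `exit_code` keys; only `run_id` and `attempt` are named on this page, so the other two key names and both function names are hypothetical.

```python
from collections import defaultdict


def group_results(results):
    """Group flat result rows by run_id, then by attempt number."""
    runs = defaultdict(lambda: defaultdict(list))
    for row in results:
        runs[row["run_id"]][row["attempt"]].append(row)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {run_id: dict(attempts) for run_id, attempts in runs.items()}


def final_exit_codes(results):
    """Keep the exit code of the highest attempt per (run_id, server_id)."""
    latest = {}
    for row in results:
        key = (row["run_id"], row["server_id"])
        if key not in latest or row["attempt"] > latest[key]["attempt"]:
            latest[key] = row
    return {key: row["exit_code"] for key, row in latest.items()}
```

Taking the highest attempt as the final outcome is a design choice of this sketch; the actual response shape of `/api/tasks/{id}/results` should be checked in Swagger UI.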
+ +Synthetic exit codes: + +| Exit code | Meaning | +|-----------|---------| +| `-2` | `CAP_EXEC` disabled or blocked by agent-local capability policy | +| `-3` | Server offline or dispatch failed | +| `-4` | No agent response before timeout | + +### API Example + +```json +{ + "task_type": "scheduled", + "name": "Daily disk check", + "command": "df -h", + "server_ids": ["server-id-1"], + "cron_expression": "0 0 8 * * *", + "timeout": 120, + "retry_count": 2, + "retry_interval": 60 +} +``` + +Relevant endpoints: + +| Endpoint | Method | Description | +|----------|--------|-------------| +| `/api/tasks?type=scheduled` | GET | List scheduled tasks | +| `/api/tasks` | POST | Create one-shot or scheduled task | +| `/api/tasks/{id}` | PUT | Update task fields, enable/disable, and scheduler registration | +| `/api/tasks/{id}` | DELETE | Delete a task and cancel active execution | +| `/api/tasks/{id}/run` | POST | Manually run a task | +| `/api/tasks/{id}/results` | GET | Fetch task results grouped by task/run/attempt | + ## Billing Information Administrators can record billing details for each server to track VPS costs and expiration dates. 
diff --git a/apps/docs/content/docs/en/alerts.mdx b/apps/docs/content/docs/en/alerts.mdx index 068167bc..6555e05a 100644 --- a/apps/docs/content/docs/en/alerts.mdx +++ b/apps/docs/content/docs/en/alerts.mdx @@ -52,6 +52,7 @@ An alert rule consists of: | `transfer_out_cycle` | Outbound traffic per cycle | Triggers when cumulative transfer >= `cycle_limit` bytes | | `transfer_all_cycle` | Total traffic per cycle | Triggers when combined transfer >= `cycle_limit` bytes | | `expiration` | Server expiration date | Triggers when expired_at is within `duration` days | +| `ip_changed` | Agent IP address changed | Event-driven rule; triggers when the agent reports an IP change event | ### Threshold Configuration @@ -141,9 +142,15 @@ Each alert rule specifies which servers it applies to: When a previously triggered alert recovers (the condition is no longer met), the alert state is cleared in both the in-memory cache and the database. If the condition triggers again later, notifications will fire according to the trigger mode. +## Maintenance Suppression + +Alert notifications are suppressed while the affected server is in an active maintenance window. The rule evaluation still runs, but ServerBee skips notification delivery for that server until the maintenance window ends. + +`ip_changed` is event-driven rather than polling-based. It is evaluated when an agent reports an IP change event, and it follows the same cover-type and maintenance suppression rules as other alerts. + ## Notification Channels -ServerBee supports four notification channel types. Each channel is configured as a separate entity that can be reused across multiple notification groups. +ServerBee supports five notification channel types. Each channel is configured as a separate entity that can be reused across multiple notification groups. ### Webhook @@ -190,7 +197,7 @@ Send push notifications to iOS devices via [Bark](https://github.com/Finb/Bark). 
Email notifications are delivered through [Resend](https://resend.com/). Two steps before use: -1. Set `SERVERBEE_RESEND__API_KEY` on the server (see the [Configuration](/docs/en/configuration) page). +1. Set `SERVERBEE_RESEND__API_KEY` on the server (see the [Configuration](/en/docs/configuration) page). 2. Add and verify your sender domain at [resend.com/domains](https://resend.com/domains). The `from` address on each channel must belong to a verified domain. Channel config: @@ -204,6 +211,22 @@ Channel config: `to` is an array — a single channel can deliver to multiple recipients in one API call. The subject follows the format `[ServerBee] {server_name} {event}`; the body is HTML with a plain-text fallback. +### APNs + +Send native Apple Push Notification service pushes to registered mobile devices. + +```json +{ + "key_id": "ABC123DEFG", + "team_id": "TEAM999888", + "private_key": "-----BEGIN PRIVATE KEY-----...", + "bundle_id": "com.example.serverbee", + "sandbox": false +} +``` + +APNs requires an Apple developer key, team ID, bundle ID, and private key. Set `sandbox: true` only for development builds. + ## Notification Groups Notification channels are organized into **groups**. An alert rule is linked to a notification group, and when the rule triggers, all enabled channels in the group are dispatched. diff --git a/apps/docs/content/docs/en/api-reference.mdx b/apps/docs/content/docs/en/api-reference.mdx index c366b81f..105e3360 100644 --- a/apps/docs/content/docs/en/api-reference.mdx +++ b/apps/docs/content/docs/en/api-reference.mdx @@ -1,127 +1,166 @@ --- title: API Reference -description: ServerBee REST API overview, authentication, and Swagger UI interactive docs. +description: ServerBee REST API overview, authentication, WebSocket endpoints, and Swagger UI interactive docs. icon: FileCode --- -ServerBee provides a complete REST API for programmatic access to all Dashboard features. All endpoints are documented via OpenAPI 3.0. 
+ServerBee exposes the same capabilities used by the web dashboard through REST and WebSocket APIs. The authoritative, schema-level reference is generated from OpenAPI annotations in the server binary. ## Swagger UI -ServerBee includes a built-in Swagger UI for interactive API exploration: +Open the built-in interactive documentation at: -``` +```text https://your-server/swagger-ui/ ``` -You can browse all 50+ API endpoints, inspect request/response schemas, and send test requests directly. +Swagger UI lets you inspect request/response schemas, authentication requirements, and test requests against your own deployment. The raw OpenAPI document is available at: -## Authentication +```text +https://your-server/api-docs/openapi.json +``` + +## Response Format + +Successful REST responses are wrapped as: + +```json +{ + "data": {} +} +``` -The ServerBee API supports two authentication methods: +Errors use: + +```json +{ + "error": "Error message describing what went wrong" +} +``` + +## Authentication ### Session Cookie -Used automatically after browser login. Call `/api/auth/login` to obtain a session: +Used by the web dashboard after login: ```bash curl -X POST https://your-server/api/auth/login \ -H "Content-Type: application/json" \ - -d '{"username": "admin", "password": "your-password"}' \ + -d '{"username":"admin","password":"your-password"}' \ -c cookies.txt -# Use the cookie for subsequent requests curl https://your-server/api/servers -b cookies.txt ``` ### API Key -Suitable for automation. Create one in Settings → API Keys: +Use API keys for automation. Create them in Settings → API Keys. ```bash curl https://your-server/api/servers \ -H "X-API-Key: serverbee_your-api-key-here" ``` -API keys use the format `serverbee_` prefix + 43-character random string. The key is shown only once at creation time. +API keys use the `serverbee_` prefix and are shown only once when created. 
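For scripted access, the API key header and the response envelope described above can be handled in a few lines. This is a sketch using only the Python standard library; the function names `api_request` and `unwrap` are illustrative, and `https://your-server` is the same placeholder host used elsewhere on this page.

```python
import json
import urllib.request


def api_request(base_url, path, api_key):
    """Build (but do not send) a GET request authenticated with an API key."""
    if not api_key.startswith("serverbee_"):
        raise ValueError("API keys are expected to start with 'serverbee_'")
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def unwrap(body_text):
    """Return the `data` member of a response, or raise on an `error` body."""
    payload = json.loads(body_text)
    if "error" in payload:
        raise RuntimeError(payload["error"])
    return payload["data"]
```

A request built this way can be sent with `urllib.request.urlopen(req)`, and the decoded response body passed to `unwrap`.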
-## Endpoint Overview +### Bearer Session Token -### Public Endpoints (No Auth Required) +Mobile flows use Bearer tokens for selected REST and WebSocket endpoints: -| Method | Path | Description | -|--------|------|-------------| -| POST | `/api/auth/login` | User login | -| GET | `/api/auth/oauth/{provider}` | OAuth authorization redirect | -| GET | `/api/auth/oauth/{provider}/callback` | OAuth callback | -| GET | `/api/status` | Public status page data | - -### Authenticated Endpoints (Session or API Key) +```bash +curl https://your-server/api/auth/me \ + -H "Authorization: Bearer " +``` -| Method | Path | Description | -|--------|------|-------------| -| POST | `/api/auth/logout` | User logout | -| GET | `/api/auth/me` | Get current user info | -| POST | `/api/auth/change-password` | Change password | -| GET/POST | `/api/auth/2fa/*` | 2FA management | -| GET/DELETE | `/api/auth/oauth/accounts` | OAuth account management | -| GET | `/api/servers` | List servers | -| GET | `/api/servers/{id}` | Get server details | -| GET | `/api/servers/{id}/records` | Get metric records | -| GET | `/api/servers/{id}/gpu-records` | Get GPU records | -| GET | `/api/server-groups` | List server groups | -| GET | `/api/ping-tasks` | List ping tasks | -| GET | `/api/ping-tasks/{id}/records` | Get ping records | - -### Admin Endpoints (Admin Role Required) +## Public Endpoints | Method | Path | Description | |--------|------|-------------| -| POST/PUT/DELETE | `/api/servers/*` | Server management | -| PUT | `/api/servers/batch-capabilities` | Batch update capabilities | -| POST | `/api/servers/{id}/upgrade` | Trigger agent upgrade | -| GET | `/api/servers/{target_id}/recovery-candidates` | List recommended recovery candidates | -| GET | `/api/servers/recovery-jobs/{job_id}` | Get recovery job details | -| POST | `/api/servers/{target_id}/recover-merge` | Start an agent recovery job | -| CRUD | `/api/server-groups/*` | Server group management | -| CRUD | `/api/notifications/*` | 
Notification channel management | -| CRUD | `/api/notification-groups/*` | Notification group management | -| CRUD | `/api/alert-rules/*` | Alert rule management | -| CRUD | `/api/ping-tasks/*` | Ping task management | -| POST | `/api/tasks` | Create a remote command task | -| GET | `/api/tasks/{id}` | Get task details and results | -| CRUD | `/api/users/*` | User management | -| GET | `/api/audit-logs` | Audit logs | -| GET/PUT | `/api/settings/*` | System settings | -| POST | `/api/settings/backup` | Database backup | -| POST | `/api/settings/restore` | Database restore | - -### WebSocket Endpoints - -| Path | Description | -|------|-------------| -| `/api/ws/browser` | Browser real-time data push | -| `/api/ws/terminal/{server_id}` | Web terminal proxy | - -## Error Responses - -All API errors use a consistent format: - -```json -{ - "error": "Error message describing what went wrong" -} -``` - -Common status codes: - -| Code | Description | -|------|-------------| -| 400 | Bad request / invalid parameters | +| POST | `/api/auth/login` | Web login | +| GET | `/api/auth/oauth/providers` | List enabled OAuth providers | +| GET | `/api/auth/oauth/{provider}` | OAuth authorization redirect | +| GET | `/api/auth/oauth/{provider}/callback` | OAuth callback | +| POST | `/api/mobile/auth/login` | Mobile login | +| POST | `/api/mobile/auth/refresh` | Refresh a mobile session | +| POST | `/api/mobile/auth/pair` | Redeem a mobile pairing code | +| POST | `/api/agent/register` | Agent auto-registration using discovery key | +| GET | `/api/status` | Default public status page data | +| GET | `/api/status/{slug}` | Configurable public status page data | +| GET | `/api/settings/brand` | Public brand settings | +| GET | `/api/brand/logo` | Serve uploaded logo | +| GET | `/api/brand/favicon` | Serve uploaded favicon | + +## Authenticated Read Endpoints + +Read endpoints are available to Admin and Member users unless noted otherwise. 
+ +| Family | Representative endpoints | +|--------|--------------------------| +| Current user and API keys | `GET /api/auth/me`, `PUT /api/auth/password`, `GET/POST /api/auth/api-keys`, `DELETE /api/auth/api-keys/{id}` | +| 2FA and OAuth accounts | `/api/auth/2fa/*`, `GET/DELETE /api/auth/oauth/accounts/*` | +| Mobile devices | `POST /api/mobile/auth/logout`, `GET /api/mobile/auth/devices`, `DELETE /api/mobile/auth/devices/{id}` | +| Servers | `GET /api/servers`, `GET /api/servers/{id}`, `GET /api/servers/{id}/records`, `GET /api/servers/{id}/gpu-records` | +| Groups and tags | `GET /api/server-groups`, `GET /api/server-tags` | +| Uptime and traffic | `GET /api/servers/{id}/uptime-daily`, `GET /api/servers/{id}/traffic` | +| GeoIP | `GET /api/geoip/status` | +| Ping tasks | `GET /api/ping-tasks`, `GET /api/ping-tasks/{id}/records` | +| Network probes | `/api/network-probes/*`, `/api/servers/{id}/network-probes/*` | +| Traceroute results | `GET /api/servers/{id}/traceroute/{request_id}` | +| Files, read-only | `POST /api/files/{server_id}/list`, `stat`, `read`, `GET /api/files/transfers`, `GET /api/files/download/{transfer_id}` | +| Docker, read-only | `GET /api/servers/{id}/docker/containers`, `stats`, `info`, `events`, `networks`, `volumes` | +| Service monitors | `GET /api/service-monitors`, `GET /api/service-monitors/{id}`, `GET /api/service-monitors/{id}/records` | +| Status page config | `GET /api/status-pages` | +| Dashboards | `GET /api/dashboards`, `GET /api/dashboards/default`, `GET /api/dashboards/{id}` | +| Themes | `GET /api/themes/*` | +| Alert events | alert event read endpoints used by the dashboard | + +## Admin Write and Management Endpoints + +Admin role is required for write operations and system management. 
+ +| Family | Representative endpoints | +|--------|--------------------------| +| Server management | `POST/PUT/DELETE /api/servers/*`, `PUT /api/servers/batch-capabilities`, `POST /api/servers/{id}/upgrade` | +| Agent recovery | `GET /api/servers/{target_id}/recovery-candidates`, `GET /api/servers/recovery-jobs/{job_id}`, `POST /api/servers/{target_id}/recover-merge` | +| Groups and tags | CRUD `/api/server-groups/*`, CRUD `/api/server-tags/*` | +| Ping and network probes | CRUD `/api/ping-tasks/*`, write endpoints under `/api/network-probes/*` | +| Traceroute | `POST /api/servers/{id}/traceroute` | +| Files | `POST /api/files/{server_id}/write`, `delete`, `mkdir`, `move`, `download`, `upload`, `DELETE /api/files/transfers/{transfer_id}` | +| Docker actions | `POST /api/servers/{id}/docker/containers/{cid}/action` | +| Service monitors | CRUD `/api/service-monitors/*`, `POST /api/service-monitors/{id}/check` | +| Dashboards | `POST /api/dashboards`, `PUT/DELETE /api/dashboards/{id}` | +| Themes and appearance | theme write endpoints, `PUT /api/settings/brand`, `POST /api/settings/brand/logo`, `POST /api/settings/brand/favicon` | +| Status pages | CRUD `/api/status-pages/*` | +| Incidents | CRUD `/api/incidents/*`, `POST /api/incidents/{id}/updates` | +| Maintenance windows | CRUD `/api/maintenances/*` | +| Alerts and notifications | CRUD `/api/alert-rules/*`, `/api/notifications/*`, `/api/notification-groups/*` | +| Tasks | `GET/POST /api/tasks`, `GET/PUT/DELETE /api/tasks/{id}`, `GET /api/tasks/{id}/results`, `POST /api/tasks/{id}/run` | +| Users | CRUD `/api/users/*` | +| Audit and settings | `GET /api/audit-logs`, `/api/settings/*`, backup/restore endpoints | +| GeoIP | `POST /api/geoip/download` | +| Mobile push | `POST /api/mobile/pair`, `POST /api/mobile/push/register`, `POST /api/mobile/push/unregister` | + +## WebSocket Endpoints + +| Path | Auth | Description | +|------|------|-------------| +| `/api/agent/ws?token=` | Agent token query parameter | Agent 
metrics, commands, pings, files, Docker, traceroute | +| `/api/ws/servers` | Session cookie, API key, or Bearer token | Browser/mobile real-time server updates | +| `/api/ws/terminal/{server_id}` | Authenticated Admin + `CAP_TERMINAL` | Web terminal proxy; terminal payloads use binary frames | +| `/api/ws/docker/logs/{server_id}` | Authenticated + `CAP_DOCKER` | Per-container Docker log streaming | + +## Common Status Codes + +| Code | Meaning | +|------|---------| +| 400 | Bad request or invalid operation | | 401 | Not authenticated | -| 403 | Forbidden (insufficient role or capability disabled) | -| 404 | Resource not found | -| 429 | Too many requests (rate limited) | +| 403 | Forbidden: role, capability, or local agent policy blocks the operation | +| 404 | Resource not found, server offline, or disabled public page | +| 409 | Conflict, such as duplicate slug/name | +| 422 | Validation error | +| 429 | Rate limited | | 500 | Internal server error | diff --git a/apps/docs/content/docs/en/capabilities.mdx b/apps/docs/content/docs/en/capabilities.mdx index 251b82bf..68c94d54 100644 --- a/apps/docs/content/docs/en/capabilities.mdx +++ b/apps/docs/content/docs/en/capabilities.mdx @@ -87,8 +87,11 @@ ServerBee validates capabilities on both the server side and agent side: ### Server-Side Enforcement - **Terminal**: WebSocket upgrade rejected with 403 -- **Exec**: `POST /api/tasks` filters out disabled servers and writes synthetic results (`exit_code = -2`, message: "Capability 'exec' is disabled") -- **Ping**: Tasks filtered by capability — disabled agents do not receive probe tasks +- **Exec**: `POST /api/tasks` and scheduled task runs filter out disabled servers and write synthetic results (`exit_code = -2`, message: "Capability 'exec' is disabled") +- **Auto Upgrade**: `POST /api/servers/{id}/upgrade` returns 403 when `CAP_UPGRADE` is disabled +- **Ping and Traceroute**: Probe tasks are filtered by capability; traceroute requires effective `CAP_PING_ICMP` +- **File 
Manager**: file endpoints reject requests before dispatch when `CAP_FILE` is disabled +- **Docker**: Docker read/action endpoints and Docker log WebSocket routes require `CAP_DOCKER` and agent runtime Docker support ### Agent-Side Enforcement @@ -129,6 +132,7 @@ When an agent locally disables a capability, the UI shows the toggle as disabled + diff --git a/apps/docs/content/docs/en/configuration.mdx b/apps/docs/content/docs/en/configuration.mdx index dcd07eff..7782b82d 100644 --- a/apps/docs/content/docs/en/configuration.mdx +++ b/apps/docs/content/docs/en/configuration.mdx @@ -73,7 +73,7 @@ These variables are intentionally scoped to local tooling. `ALLOW_WRITES` is not | Environment Variable | Default | Description | |---------------------|---------|-------------| -| `SERVERBEE_GEOIP__MMDB_PATH` | `""` | Path to MaxMind GeoLite2-City.mmdb file. Non-empty path enables GeoIP | +| `SERVERBEE_GEOIP__MMDB_PATH` | `""` | Path to a MaxMind-compatible MMDB file. Non-empty path enables this custom GeoIP database; otherwise admins can download the DB-IP Lite database from Settings → GeoIP Database | #### Resend (Email Notifications) @@ -244,7 +244,7 @@ The log level can also be set via the `RUST_LOG` environment variable, which tak | Key | Type | Default | Description | |-----|------|---------|-------------| -| `mmdb_path` | string | `""` | Path to a MaxMind GeoLite2-City MMDB file. Non-empty path enables GeoIP | +| `mmdb_path` | string | `""` | Path to a MaxMind-compatible MMDB file. Non-empty path enables this custom GeoIP database; if empty, the UI can download DB-IP Lite into the server data directory | ### `[resend]` -- Email Notifications diff --git a/apps/docs/content/docs/en/custom-themes.mdx b/apps/docs/content/docs/en/custom-themes.mdx index 9de6297e..3a0e21b2 100644 --- a/apps/docs/content/docs/en/custom-themes.mdx +++ b/apps/docs/content/docs/en/custom-themes.mdx @@ -54,6 +54,39 @@ Only `version: 1` is accepted. 
- Alpha must be between `0` and `1`, or between `0%` and `100%`. - Chroma has no hard cap. Browsers may gamut-clip values that cannot be displayed. +## Branding and White Label + +The same **Settings → Appearance** page also controls basic product branding. Branding settings are stored in the server database and are read by both the dashboard shell and public pages. + +| Field | Description | +|-------|-------------| +| `site_title` | Browser/app title shown by the UI | +| `footer_text` | Footer text shown where the UI renders a product footer | +| `logo_path` | Public path for the uploaded logo, usually `/api/brand/logo` | +| `favicon_path` | Public path for the uploaded favicon, usually `/api/brand/favicon` | + +Public endpoints: + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/settings/brand` | Read brand configuration without authentication | +| GET | `/api/brand/logo` | Serve uploaded logo | +| GET | `/api/brand/favicon` | Serve uploaded favicon | + +Admin endpoints: + +| Method | Path | Description | +|--------|------|-------------| +| PUT | `/api/settings/brand` | Update `site_title`, `footer_text`, `logo_path`, and `favicon_path` as JSON | +| POST | `/api/settings/brand/logo` | Upload a logo via multipart field `file` | +| POST | `/api/settings/brand/favicon` | Upload a favicon via multipart field `file` | + +Logo and favicon uploads accept PNG or ICO files only and are limited to 512 KB. Uploading a new asset replaces the previous asset of the same type. + + +`PUT /api/settings/brand` expects JSON. Image files are uploaded through the dedicated logo/favicon endpoints, not through the JSON update endpoint. 
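A client can check an image against the documented upload rules (PNG or ICO, at most 512 KB) before calling the logo or favicon endpoint. This is a hedged sketch: how the server actually detects file types is not described here, so the magic-byte check, the reading of "512 KB" as 512 × 1024 bytes, and the name `check_brand_asset` are all assumptions.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"   # standard 8-byte PNG signature
ICO_MAGIC = b"\x00\x00\x01\x00"    # ICO header: reserved = 0, type = 1
MAX_BRAND_ASSET_BYTES = 512 * 1024  # assumption: "512 KB" means 512 KiB


def check_brand_asset(data: bytes) -> str:
    """Return "png" or "ico" for an acceptable asset, else raise ValueError."""
    if len(data) > MAX_BRAND_ASSET_BYTES:
        raise ValueError("asset exceeds the 512 KB upload limit")
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(ICO_MAGIC):
        return "ico"
    raise ValueError("only PNG or ICO files are accepted")
```

The server may apply different or stricter checks, so a passing pre-check does not guarantee that the upload is accepted.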
+ + ## Disable custom themes Set: diff --git a/apps/docs/content/docs/en/dashboards.mdx b/apps/docs/content/docs/en/dashboards.mdx new file mode 100644 index 00000000..2f78420b --- /dev/null +++ b/apps/docs/content/docs/en/dashboards.mdx @@ -0,0 +1,114 @@ +--- +title: Dashboards & Widgets +description: Build custom monitoring dashboards with draggable widgets and reusable layouts. +icon: LayoutDashboard +--- + +ServerBee dashboards are configurable pages made of widgets. You can keep the default overview dashboard, create separate dashboards for regions or teams, and choose one dashboard as the default for all users. + +## Managing Dashboards + +Use the dashboard switcher in the top bar of the home page to: + +- Switch between dashboards +- Create a new dashboard +- Rename a dashboard +- Delete dashboards you no longer need +- Set a dashboard as the default + +The first dashboard is created automatically if none exists. A dashboard marked as default cannot be unset directly; set another dashboard as default instead. + +## Editing Layouts + +1. Open a dashboard. +2. Click **Edit**. +3. Add widgets from the widget picker. +4. Drag widgets to rearrange them. +5. Resize widgets within their min/max size constraints. +6. Configure each widget's data source and title. +7. Click **Save**. + +The layout is stored as grid coordinates (`grid_x`, `grid_y`, `grid_w`, `grid_h`) and widget configuration JSON. Saving a dashboard performs a diff: existing widgets are updated, new widgets are inserted, and widgets removed from the layout are deleted. 
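The save-time diff described above can be pictured as a three-way split of the submitted widget list. The sketch below assumes widgets are matched by an `id` field and that new widgets arrive without one; neither detail is stated on this page, and `diff_widgets` is a hypothetical name, not the server's function.

```python
def diff_widgets(existing, submitted):
    """Split a submitted widget list into updates, inserts, and deletes.

    Assumption: widgets are matched on an "id" key; submitted widgets
    with a missing or unknown id are treated as new.
    """
    existing_ids = {w["id"] for w in existing}
    submitted_ids = {w["id"] for w in submitted if w.get("id")}
    updates = [w for w in submitted if w.get("id") in existing_ids]
    inserts = [w for w in submitted if w.get("id") not in existing_ids]
    deletes = sorted(existing_ids - submitted_ids)
    return updates, inserts, deletes
```

Sending the complete widget list on every save keeps the client simple: widgets left out of the request are removed, so no separate delete call is needed.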
+ +## Widget Types + +| Widget | Category | Typical use | +|--------|----------|-------------| +| `stat-number` | Real-time | Show one metric from one server, such as CPU or memory | +| `server-cards` | Real-time | Show compact server cards for selected servers | +| `gauge` | Real-time | Gauge visualization for one metric and server | +| `line-chart` | Charts | Historical chart for one metric on one server | +| `multi-line` | Charts | Compare one metric across multiple servers | +| `top-n` | Real-time | Rank servers by a selected metric | +| `alert-list` | Status | Show active or recent alert state | +| `service-status` | Status | Show Service Monitor status | +| `traffic-bar` | Charts | Show traffic usage for one server | +| `disk-io` | Charts | Show disk read/write throughput history | +| `server-map` | Status | Show server locations on a map when GeoIP is installed | +| `markdown` | Status | Add notes, runbooks, or links using Markdown | +| `uptime-timeline` | Status | Show uptime bars for selected servers | + +## Common Widget Configuration + +Most widgets store their settings in `config_json`. Common fields include: + +| Field | Used by | Meaning | +|-------|---------|---------| +| `server_id` | Single-server widgets | Server ID to query | +| `server_ids` | Multi-server widgets | List of server IDs to include | +| `metric` | Metric widgets | Metric key such as CPU, memory, disk, traffic, or load | +| `hours` | Historical widgets | Lookback window for chart data | +| `interval` | Historical widgets | Data granularity (`raw`, `hourly`, or `auto`) | +| `monitor_ids` | Service Status | Service Monitor IDs to display | +| `content` | Markdown | Markdown content | + +## GeoIP and Server Map + +The Server Map widget requires GeoIP data. You can either: + +- Configure a custom MaxMind-compatible MMDB file with `geoip.mmdb_path`, or +- Download the DB-IP Lite database from **Settings → GeoIP Database**. 
+ +The widget shows an installation prompt when GeoIP data is missing. + +## API + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/dashboards` | List dashboards | +| GET | `/api/dashboards/default` | Get the default dashboard, creating one if needed | +| GET | `/api/dashboards/{id}` | Get a dashboard with widgets | +| POST | `/api/dashboards` | Create a dashboard | +| PUT | `/api/dashboards/{id}` | Update metadata and/or widgets | +| DELETE | `/api/dashboards/{id}` | Delete a dashboard | + +Update request example: + +```json +{ + "name": "Production", + "is_default": true, + "widgets": [ + { + "widget_type": "stat-number", + "title": "Web CPU", + "config_json": { + "server_id": "server-id", + "metric": "cpu", + "unit": "%" + }, + "grid_x": 0, + "grid_y": 0, + "grid_w": 2, + "grid_h": 1, + "sort_order": 0 + } + ] +} +``` + + + + + + diff --git a/apps/docs/content/docs/en/file-manager.mdx b/apps/docs/content/docs/en/file-manager.mdx new file mode 100644 index 00000000..9bd0bb11 --- /dev/null +++ b/apps/docs/content/docs/en/file-manager.mdx @@ -0,0 +1,132 @@ +--- +title: File Manager +description: Browse, read, edit, upload, download, and manage remote files through ServerBee. +icon: FolderOpen +--- + +File Manager provides controlled remote filesystem access through the ServerBee agent. It is intended for operational tasks such as checking logs, editing small configuration files, and transferring files without opening a full terminal. + + +File Manager is a high-risk feature. Enable it only on trusted servers and restrict `root_paths` to the minimum directories needed. + + +## Requirements + +File Manager must be enabled at two layers: + +1. **Server-side capability:** enable `CAP_FILE` for the server in **Settings → Capabilities** or the server detail page. +2. **Agent-side policy:** set `[file].enabled = true` and configure at least one allowed root path. 
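The agent-side policy comes down to an allow-list check on every requested path. The following is a simplified sketch, not the agent's implementation: it normalizes the path lexically, matches `deny_patterns` against the file name only, and requires the result to sit under one of `root_paths`. The function name, the basename-only matching, and the lack of symlink resolution are assumptions; a real check also needs to resolve symlinks.

```python
import fnmatch
import posixpath


def is_path_allowed(path, root_paths, deny_patterns):
    """Return True if `path` is inside an allowed root and not denied."""
    if not root_paths:
        return False  # an empty allow-list rejects everything
    normalized = posixpath.normpath(path)  # collapses "..", does not follow symlinks
    if not posixpath.isabs(normalized):
        return False
    name = posixpath.basename(normalized)
    if any(fnmatch.fnmatchcase(name, pattern) for pattern in deny_patterns):
        return False
    for root in root_paths:
        root = posixpath.normpath(root)
        # commonpath compares whole components, so /var/logs is not under /var/log.
        if posixpath.commonpath([root, normalized]) == root:
            return True
    return False
```

Comparing path components rather than string prefixes is what keeps a sibling directory such as `/var/logs` from passing as `/var/log`.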
+ +Example `agent.toml`: + +```toml +[file] +enabled = true +root_paths = ["/home", "/var/log", "/etc/serverbee"] +max_file_size = 1073741824 +# Defaults shown here for clarity +deny_patterns = ["*.key", "*.pem", "id_rsa*", ".env*", "shadow", "passwd"] +``` + +Equivalent environment variables: + +```bash +SERVERBEE_FILE__ENABLED=true +SERVERBEE_FILE__ROOT_PATHS=/home,/var/log,/etc/serverbee +SERVERBEE_FILE__MAX_FILE_SIZE=1073741824 +``` + +## Accessing the File Manager + +Open a server's action menu and click **Files**, or navigate directly to: + +``` +/files/{serverId} +``` + +The button is hidden when the server does not have `CAP_FILE` in its effective capabilities. + +## Permissions + +| Role | Allowed operations | +|------|--------------------| +| Admin | Browse, stat, read, write, upload, download, delete, move, create directories, cancel transfers | +| Member | Browse, stat, read, download, list own transfers | + +All high-risk file operations are recorded in the audit log, including denied attempts when the capability is disabled. + +## Supported Operations + +| Operation | Description | +|-----------|-------------| +| List directory | Browse files and directories below allowed roots | +| Stat | Load metadata for one path | +| Read | Read UTF-8 text content for preview/editor use | +| Write | Replace file content with provided text | +| Upload | Upload a local file to a remote path | +| Download | Start a server-mediated download transfer, then fetch it from ServerBee | +| Delete | Delete a file or recursively delete a directory | +| Mkdir | Create a directory | +| Move | Rename or move a file/directory | +| Transfers | View and cancel active file transfers | + +## Security Model + +The agent enforces path safety before touching the filesystem: + +- `root_paths` is an allow-list. Empty `root_paths` rejects all file operations. +- Paths must resolve inside one of the configured roots. 
+- `deny_patterns` blocks sensitive names such as private keys, `.env*`, `shadow`, and `passwd`. +- The agent also checks local capabilities, so server-side capability changes cannot override an agent-local deny. +- The server checks `CAP_FILE` before dispatching file messages to the agent. + +## Limits + +| Limit | Default | Where configured | +|-------|---------|------------------| +| Upload size | 100 MB | Server `file.max_upload_size` / `SERVERBEE_FILE__MAX_UPLOAD_SIZE` | +| Agent read/download max file size | 1 GB | Agent `[file].max_file_size` / `SERVERBEE_FILE__MAX_FILE_SIZE` | +| Inline read chunk | 384 KB | Protocol limit to keep WebSocket frames below the configured max size | + +Uploads and downloads are chunked. Downloads create a temporary transfer on the server and can be cancelled while pending or in progress. + +## API + +Read endpoints are available to Admin and Member users. Write endpoints require Admin. + +| Method | Path | Description | +|--------|------|-------------| +| POST | `/api/files/{server_id}/list` | List a directory; body `{ "path": "/var/log" }` | +| POST | `/api/files/{server_id}/stat` | Stat a path | +| POST | `/api/files/{server_id}/read` | Read UTF-8 text content | +| GET | `/api/files/download/{transfer_id}` | Download a ready transfer owned by the current user | +| GET | `/api/files/transfers` | List transfers owned by the current user | +| POST | `/api/files/{server_id}/write` | Replace file content | +| POST | `/api/files/{server_id}/delete` | Delete file/directory; supports `recursive` | +| POST | `/api/files/{server_id}/mkdir` | Create a directory | +| POST | `/api/files/{server_id}/move` | Move or rename a path | +| POST | `/api/files/{server_id}/download` | Start a download transfer | +| POST | `/api/files/{server_id}/upload` | Upload multipart form with `path` and `file` fields | +| DELETE | `/api/files/transfers/{transfer_id}` | Cancel a transfer | + +Examples: + +```bash +curl -X POST 
https://your-server/api/files/server-id/list \ + -H "X-API-Key: serverbee_..." \ + -H "Content-Type: application/json" \ + -d '{"path":"/var/log"}' +``` + +```bash +curl -X POST https://your-server/api/files/server-id/upload \ + -H "X-API-Key: serverbee_..." \ + -F 'path=/tmp/example.txt' \ + -F 'file=@example.txt' +``` + + + + + + diff --git a/apps/docs/content/docs/en/index.mdx b/apps/docs/content/docs/en/index.mdx index 8eee2bbb..815e1509 100644 --- a/apps/docs/content/docs/en/index.mdx +++ b/apps/docs/content/docs/en/index.mdx @@ -8,14 +8,18 @@ ServerBee is a lightweight, self-hosted VPS monitoring probe system built from t ## Key Features -- **Real-time monitoring** -- CPU, memory, disk, network, load average, temperature, and GPU metrics streamed over WebSocket -- **Alert rules** -- Flexible threshold-based alerts with support for 14+ metric types, debounce logic, and multiple notification channels -- **Web terminal** -- Secure browser-based shell access to your servers through a full PTY session -- **Ping monitoring** -- ICMP, TCP, and HTTP probes to track endpoint availability and latency -- **GPU support** -- Optional NVIDIA GPU monitoring (utilization, memory, temperature per device) -- **Lightweight footprint** -- Single binary server and agent, SQLite database, no external dependencies required -- **OAuth login** -- GitHub, Google, and generic OIDC provider support -- **GeoIP detection** -- Automatic server region identification via MaxMind MMDB +- **Real-time monitoring** -- CPU, memory, disk, network, load average, temperature, GPU, disk I/O, and traffic metrics streamed over WebSocket +- **Custom dashboards** -- Build multiple dashboard layouts with server cards, charts, maps, uptime timelines, Markdown notes, and service status widgets +- **Alert rules** -- Flexible threshold and event-based alerts with debounce logic, maintenance suppression, and multiple notification channels +- **Service monitors** -- SSL certificate, DNS, HTTP keyword, TCP, and 
WHOIS checks with history and notifications +- **Web terminal and remote tasks** -- Browser-based PTY sessions, one-shot commands, and scheduled cron tasks with retries +- **File manager** -- Controlled remote browse/read/write/upload/download operations with path sandboxing and audit logs +- **Ping and network quality monitoring** -- ICMP, TCP, HTTP probes, preset network targets, packet loss/latency charts, CSV export, and traceroute +- **Docker management** -- Container list, stats, events, logs, networks, volumes, and container actions when enabled +- **Public status pages** -- Share service health with custom slugs, incidents, maintenance windows, uptime timelines, themes, and custom CSS +- **Branding and themes** -- Preset/custom OKLCH themes plus white-label title, logo, favicon, and footer text +- **Lightweight footprint** -- Single binary server and agent, SQLite database, no external database required +- **OAuth and mobile support** -- GitHub, Google, generic OIDC login, mobile sessions, pairing, and push notification support ## Tech Stack @@ -31,11 +35,11 @@ ServerBee is a lightweight, self-hosted VPS monitoring probe system built from t ServerBee follows a hub-and-spoke architecture: -1. The **Server** is the central dashboard that runs the web UI, REST API, and manages all WebSocket connections +1. The **Server** is the central dashboard that runs the web UI, REST API, background jobs, and manages all WebSocket connections 2. **Agents** are installed on each VPS you want to monitor -- they collect system metrics and report back to the server every few seconds 3. The **Frontend** is a React SPA embedded into the server binary, so there is nothing extra to deploy -All communication between agents and the server happens over WebSocket with JSON-encoded messages. Terminal sessions use the same WebSocket connection with base64-encoded PTY data. +All communication between agents and the server happens over WebSocket with JSON-encoded messages. 
Terminal sessions and some streaming features use dedicated WebSocket routes with binary or structured frames. ## Next Steps @@ -44,13 +48,17 @@ All communication between agents and the server happens over WebSocket with JSON + + + - - + + + diff --git a/apps/docs/content/docs/en/meta.json b/apps/docs/content/docs/en/meta.json index 2db43d95..8e7f8450 100644 --- a/apps/docs/content/docs/en/meta.json +++ b/apps/docs/content/docs/en/meta.json @@ -10,8 +10,11 @@ "configuration", "---Features---", "monitoring", + "dashboards", "alerts", + "service-monitors", "terminal", + "file-manager", "ping", "capabilities", "status-page", diff --git a/apps/docs/content/docs/en/mobile.mdx b/apps/docs/content/docs/en/mobile.mdx index b89e76b8..0ecf81a4 100644 --- a/apps/docs/content/docs/en/mobile.mdx +++ b/apps/docs/content/docs/en/mobile.mdx @@ -138,7 +138,7 @@ To enable push notifications, configure APNs credentials in the web app: ### WebSocket Connection The iOS app maintains a WebSocket connection for real-time updates: -- Connection URL: `wss://your-server/api/ws/browser` +- Connection URL: `wss://your-server/api/ws/servers` - Authentication via `Authorization: Bearer` header during handshake - Automatic reconnection with exponential backoff - Connection auto-closes when access token expires (reconnect triggers refresh) diff --git a/apps/docs/content/docs/en/monitoring.mdx b/apps/docs/content/docs/en/monitoring.mdx index 3d4ee7dd..d761e89f 100644 --- a/apps/docs/content/docs/en/monitoring.mdx +++ b/apps/docs/content/docs/en/monitoring.mdx @@ -19,9 +19,15 @@ The main dashboard shows all registered servers with their current status at a g Servers are organized by groups and sorted by weight. You can filter, search, and batch-operate on servers from this view. +For custom operations views, use [Dashboards & Widgets](/en/docs/dashboards) to create additional dashboard layouts with charts, maps, service status widgets, Markdown notes, and uptime timelines. 
+ +### GeoIP Display + +Region/country labels and the Server Map widget require GeoIP data. You can configure a custom MaxMind-compatible MMDB path with `geoip.mmdb_path`, or download the DB-IP Lite database from **Settings → GeoIP Database**. The status endpoint is `GET /api/geoip/status`, and administrators can trigger a download with `POST /api/geoip/download`. + ## Real-Time Updates -The browser connects to the server via WebSocket at `/ws/browser`. The communication flow works as follows: +The browser connects to the server via WebSocket at `/api/ws/servers`. The communication flow works as follows: 1. On initial connection, the server sends a `FullSync` message containing the current state of all servers 2. As agents report new metrics, the server broadcasts `Update` messages to all connected browsers @@ -381,6 +387,26 @@ Click a server card to see `/network/:serverId` with: - **Statistics bar** -- Average latency, availability percentage, target count - **CSV export** -- Download probe data for the selected time range +### Traceroute + +The network detail page can run traceroute from the selected agent to a target host or IP. This is useful for diagnosing routing changes and packet loss outside regular probe intervals. + +- Requires the agent's effective `CAP_PING_ICMP` capability. +- The target may contain only letters, numbers, dots, hyphens, and colons. +- The server sends `max_hops = 30` to the agent. +- The agent times out the command after 60 seconds. +- On Linux/macOS, the agent tries `traceroute` first and falls back to `mtr` if `traceroute` is missing. +- On Windows, the agent uses the platform traceroute command. + +API flow: + +```http +POST /api/servers/{id}/traceroute +GET /api/servers/{id}/traceroute/{request_id} +``` + +The first request returns a `request_id`; poll the second endpoint until `completed` is true. 
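The start-then-poll flow above can be sketched as a small helper. This is a minimal illustration, not ServerBee client code: `start` and `fetch` are hypothetical callables that would wrap `POST /api/servers/{id}/traceroute` and `GET /api/servers/{id}/traceroute/{request_id}`, and the sketch assumes both return a decoded JSON object with `request_id` and `completed` at the top level (the real responses may be wrapped in an envelope). The 90-second client deadline is also an assumption, chosen only to leave margin over the agent's 60-second command timeout.

```python
import time
from typing import Any, Callable, Dict


def poll_traceroute(
    start: Callable[[], Dict[str, Any]],
    fetch: Callable[[str], Dict[str, Any]],
    interval: float = 1.0,
    deadline: float = 90.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a traceroute, then poll until the result reports `completed`.

    `start` and `fetch` are hypothetical wrappers around the two HTTP
    endpoints; their return shape is an assumption, not the documented API.
    """
    request_id = start()["request_id"]
    waited = 0.0
    while True:
        result = fetch(request_id)
        if result.get("completed"):
            return result
        if waited >= deadline:
            raise TimeoutError(f"traceroute {request_id} did not complete in time")
        # Wait before the next poll and track the total time spent waiting.
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays; in normal use the default `time.sleep` applies.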
+ ### Data Retention Network probe records follow the same two-tier storage as system metrics: diff --git a/apps/docs/content/docs/en/service-monitors.mdx b/apps/docs/content/docs/en/service-monitors.mdx new file mode 100644 index 00000000..da8f7540 --- /dev/null +++ b/apps/docs/content/docs/en/service-monitors.mdx @@ -0,0 +1,171 @@ +--- +title: Service Monitors +description: Monitor SSL certificates, DNS records, HTTP keywords, TCP ports, and WHOIS expiry from the ServerBee server. +icon: Radar +--- + +Service Monitors are synthetic checks that run from the ServerBee server. They are useful for monitoring public-facing services even when those services are not running on a ServerBee agent host. + +Unlike Ping Monitoring, which asks agents to probe network targets, Service Monitors are evaluated by the central server process. Results are stored in SQLite, displayed in the dashboard, and can send notifications through notification groups. + +## Supported Monitor Types + +| Type | Target format | What it checks | +|------|---------------|----------------| +| `ssl` | `example.com` or `example.com:443` | TLS handshake, certificate validity period, issuer/subject, SHA-256 fingerprint | +| `dns` | `example.com` | DNS records for `A`, `AAAA`, `CNAME`, `MX`, or `TXT`, optionally compared against expected values | +| `http_keyword` | `https://example.com/health` | HTTP status code and optional keyword presence/absence in the response body | +| `tcp` | `host:port` | TCP connection success and connection latency | +| `whois` | `example.com` or URL | Domain expiration date and registrar, using WHOIS lookup with a system-command fallback | + +## Creating a Monitor + +1. Open **Settings → Service Monitors**. +2. Click **New Monitor**. +3. Choose a monitor type and enter the target. +4. Set the check interval in seconds. +5. Configure type-specific options. +6. Optionally select a notification group. +7. Save the monitor. 
+ +The background checker wakes every 10 seconds, schedules due monitors based on their `interval`, and runs up to 20 checks concurrently. + +## Type-Specific Configuration + +### SSL Certificate + +```json +{ + "port": 443, + "warning_days": 14, + "critical_days": 7, + "timeout": 10 +} +``` + +- `port` defaults to `443` unless the target already includes a port. +- The check fails when remaining certificate lifetime is less than or equal to `critical_days`. +- A warning is included in the detail payload when remaining lifetime is less than or equal to `warning_days`. + +### DNS Record + +```json +{ + "record_type": "A", + "expected_values": ["203.0.113.10"], + "nameserver": "8.8.8.8" +} +``` + +- `record_type` defaults to `A` and supports `A`, `AAAA`, `CNAME`, `MX`, and `TXT`. +- If `expected_values` is omitted, the check passes when resolution returns at least one value. +- If `expected_values` is present, the sorted returned values must exactly match the sorted expected values. +- `nameserver` is optional. When omitted, the system resolver is used. + +### HTTP Keyword + +```json +{ + "method": "GET", + "expected_status": [200], + "keyword": "ok", + "keyword_exists": true, + "headers": { + "User-Agent": "ServerBee" + }, + "body": null, + "timeout": 10 +} +``` + +- `method` supports `GET` and `POST`. +- `expected_status` defaults to `[200]`. +- If `keyword` is set, `keyword_exists: true` requires it to appear; `false` requires it to be absent. +- `headers` values must be strings. +- `body` is used for `POST` requests. + +### TCP Port + +```json +{ + "timeout": 10 +} +``` + +The target must be `host:port`. The check succeeds if a TCP connection can be established before timeout. + +### WHOIS Expiry + +```json +{ + "warning_days": 30, + "critical_days": 7 +} +``` + +- The target is normalized to a domain name, so `https://example.com/path` becomes `example.com`. +- The check fails when the domain expires within `critical_days`. 
+- Some TLDs such as `.app`, `.dev`, and `.page` do not expose a standard WHOIS service for this monitor. Use an SSL monitor for those domains instead. + +## Notifications and Retries + +Each monitor can link to a notification group. On every check: + +1. A record is written with `success`, `latency`, `detail_json`, `error`, and `time`. +2. The monitor updates `last_status`, `last_checked_at`, and `consecutive_failures`. +3. Failure notifications are sent only after `consecutive_failures > retry_count`. +4. Recovery notifications are sent when a monitor was failing and the next check succeeds. + +If a monitor is associated with servers and any of those servers is currently in an active maintenance window, notifications are skipped for that check. The check record is still stored. + +## History and Retention + +Open a monitor detail page to view: + +- Latest status and latency +- Recent success/failure history +- Latency chart over time +- Raw detail payload and error message per check + +Service monitor records are retained for 30 days by default (`retention.service_monitor_days`). 
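The rules under "Notifications and Retries" can be sketched as a pure function. This is an illustration under stated assumptions rather than the server's actual logic: the function name `evaluate_check` is hypothetical, "was failing" is taken to mean `consecutive_failures > 0` before the check, and the sketch reports a failure event on every failed check past the threshold because the text does not say whether repeated failures are deduplicated.

```python
from typing import Optional, Tuple


def evaluate_check(
    success: bool,
    consecutive_failures: int,
    retry_count: int,
    in_maintenance: bool = False,
) -> Tuple[int, Optional[str]]:
    """Return (new consecutive_failures, notification event or None)."""
    was_failing = consecutive_failures > 0  # assumption: any prior failure counts
    if success:
        new_failures = 0
        event = "recovery" if was_failing else None
    else:
        new_failures = consecutive_failures + 1
        # Notify only once failures exceed the configured retry_count.
        event = "failure" if new_failures > retry_count else None
    if in_maintenance:
        # The counter is still updated, but the notification is skipped.
        event = None
    return new_failures, event
```

With `retry_count = 1`, the first failure is silent and the second produces a failure event, which matches the `consecutive_failures > retry_count` condition in the text.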
+ +## API + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/service-monitors` | List monitors; optional `?type=ssl` filter | +| GET | `/api/service-monitors/{id}` | Get one monitor with its latest record | +| POST | `/api/service-monitors` | Create a monitor | +| PUT | `/api/service-monitors/{id}` | Update a monitor | +| DELETE | `/api/service-monitors/{id}` | Delete a monitor and its records | +| GET | `/api/service-monitors/{id}/records` | Query records with optional `from`, `to`, and `limit` | +| POST | `/api/service-monitors/{id}/check` | Trigger an immediate check | + +Create request example: + +```json +{ + "name": "Website SSL", + "monitor_type": "ssl", + "target": "example.com", + "interval": 300, + "config_json": { + "warning_days": 14, + "critical_days": 7 + }, + "notification_group_id": "notification-group-id", + "retry_count": 1, + "server_ids_json": ["server-id"], + "enabled": true +} +``` + + +`server_ids_json` is used to associate a service monitor with servers, mainly for maintenance-window suppression and dashboard context. The monitor itself still runs from the central ServerBee server. + + + + + + + diff --git a/apps/docs/content/docs/en/status-page.mdx b/apps/docs/content/docs/en/status-page.mdx index 438c64bf..0baa1bc8 100644 --- a/apps/docs/content/docs/en/status-page.mdx +++ b/apps/docs/content/docs/en/status-page.mdx @@ -1,100 +1,167 @@ --- -title: Status Page -description: A public page showing server online status without authentication. +title: Status Pages +description: Publish public server health pages with incidents, maintenance windows, uptime history, and custom themes. icon: Globe --- -ServerBee provides a public status page that shows server online status and basic metrics without requiring login. It is ideal for sharing service health with users or teams. 
+ServerBee provides two public status experiences: -## Access +- **Default status page:** `https://your-server/status`, backed by `GET /api/status`, showing all non-hidden servers. +- **Configurable status pages:** `https://your-server/status/{slug}`, backed by `GET /api/status/{slug}`, showing the specific servers and options selected by an administrator. -The status page is available at: `https://your-server/status` +Both are public and do not require authentication. -No authentication is required — anyone can access this page. +## Default `/status` Page -## Page Content +The default status page is useful when you want a quick public overview without creating a custom page. -The status page displays: +It shows: -- **Online/Total count**: Current number of online servers vs total -- **Server list**: All non-hidden servers organized by group -- **Online status**: Each server shows an online/offline indicator -- **Live metrics**: Online servers display CPU, memory, and disk usage progress bars -- **90-day uptime timeline**: Each server shows a color-coded bar chart of daily uptime over the past 90 days -- **Uptime percentage**: Overall uptime percentage calculated from daily uptime data -- **Auto-refresh**: The page automatically fetches updated data every 10 seconds +- Online/total server count +- All servers where `hidden = false` +- Server group labels +- Online/offline status +- Live metrics for online servers: CPU, memory, disk, network speed/transfer, uptime, and load +- Public server remarks where configured -## Display Rules +Public API: -- Only **non-hidden** servers are shown (admins can set the `hidden` flag in server settings) -- Servers are organized by their group -- Offline servers show only the offline status without metric data - -## API Endpoint - -``` +```http GET /api/status ``` -Public endpoint, no authentication required. 
Response structure: +## Configurable Status Pages -```json -{ - "data": { - "servers": [ - { - "id": "server-uuid", - "name": "Web Server 1", - "hostname": "web1.example.com", - "is_online": true, - "group_name": "Production", - "cpu": 45.2, - "mem_used": 8589934592, - "mem_total": 17179869184, - "disk_used": 53687091200, - "disk_total": 107374182400 - } - ], - "groups": [ - { "id": "group-uuid", "name": "Production" } - ], - "online_count": 8, - "total_count": 10 - } -} +Create and manage pages in **Settings → Status Pages**. Each page has its own slug and can be shared independently. + +### Page Settings + +| Setting | API field | Description | +|---------|-----------|-------------| +| Title | `title` | Public page title | +| Slug | `slug` | URL segment for `/status/{slug}` | +| Description | `description` | Optional introductory text | +| Servers | `server_ids_json` | Servers shown on this page | +| Group by server group | `group_by_server_group` | Organize servers by their ServerBee group | +| Show values | `show_values` | Show numeric uptime/status values on the public page | +| Custom CSS | `custom_css` | Extra CSS applied to this page | +| Enabled | `enabled` | Disabled pages return 404 | +| Yellow uptime threshold | `uptime_yellow_threshold` | Days below this percentage show as degraded | +| Red uptime threshold | `uptime_red_threshold` | Days below this percentage show as major outage | +| Theme | `theme_ref` | Preset/custom theme reference, or `null` to follow admin default | + + +The current API request fields use `server_ids_json` and `status_page_ids_json` for selected IDs. These fields accept JSON arrays in request bodies. 
+ + +### Public Page Data + +```http +GET /api/status/{slug} ``` +The response includes: + +- `page` -- page metadata and display options +- `theme` -- resolved theme variables +- `servers` -- selected server statuses, uptime percentages, and 90-day daily uptime data +- `active_incidents` -- unresolved incidents linked to the page +- `planned_maintenances` -- active/upcoming maintenance windows linked to the page +- `recent_incidents` -- resolved incidents from the recent history window + +Server entries include `server_id`, `server_name`, region/country, OS, group, `online`, `uptime_percent`, `uptime_daily`, and `in_maintenance`. + ## Uptime Timeline -Each server on the status page displays a 90-day uptime timeline. The timeline is a horizontal bar chart where each bar represents one day: +Each server on a configurable status page can show a 90-day uptime timeline. Each bar represents one day: -- **Green** -- 100% uptime (fully online all day) -- **Yellow** -- Degraded uptime (below the yellow threshold, default 100%) -- **Red** -- Major outage (below the red threshold, default 95%) -- **Gray** -- No data available for that day +- **Green** -- healthy uptime +- **Yellow** -- uptime below the page's yellow threshold +- **Red** -- uptime below the page's red threshold +- **Gray** -- no data -Hovering over a bar shows a tooltip with the date, uptime percentage, and online/total minutes. +Uptime data comes from the `uptime_daily` table, populated by the server's background aggregation tasks. Missing dates are gap-filled so the timeline remains continuous. -The uptime data comes from the `uptime_daily` table, which is populated by the server's background aggregation task. The status page API returns daily entries with gap-filling for dates where no data was recorded (shown as gray bars with 0 minutes). +## Incidents -### Uptime Thresholds +Incidents are public announcements for outages or degraded service. They can be linked to specific servers, status pages, or both. 
-Admins can customize the color thresholds per status page in **Settings > Status Pages**: +### Fields -- **Yellow threshold** (default: 100%) -- Days below this percentage are shown in yellow -- **Red threshold** (default: 95%) -- Days below this percentage are shown in red +| Field | Description | +|-------|-------------| +| `title` | Incident title | +| `status` | `investigating`, `identified`, `monitoring`, or `resolved` | +| `severity` | `minor`, `major`, or `critical` | +| `server_ids_json` | Optional affected servers | +| `status_page_ids_json` | Optional affected status pages | -These thresholds are stored per status page and returned in the public API response, allowing different status pages to have different sensitivity levels. +An incident can have multiple updates. Adding an update records a message and moves the incident to the update's status. Setting status to `resolved` also sets `resolved_at`. -## Hiding Servers +### API -To exclude servers from the status page: +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/incidents` | List incidents; supports status filtering | +| POST | `/api/incidents` | Create an incident | +| PUT | `/api/incidents/{id}` | Update an incident | +| DELETE | `/api/incidents/{id}` | Delete an incident | +| POST | `/api/incidents/{id}/updates` | Add an incident update | + +## Maintenance Windows + +Maintenance windows announce planned work and also suppress notifications for associated servers while active. 
+ +### Fields + +| Field | Description | +|-------|-------------| +| `title` | Maintenance title | +| `description` | Optional details | +| `start_at` | UTC start time | +| `end_at` | UTC end time; must be after `start_at` | +| `server_ids_json` | Optional affected servers | +| `status_page_ids_json` | Optional affected status pages | +| `active` | Whether the window is enabled | + +### API + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/maintenances` | List maintenance windows | +| POST | `/api/maintenances` | Create a maintenance window | +| PUT | `/api/maintenances/{id}` | Update a maintenance window | +| DELETE | `/api/maintenances/{id}` | Delete a maintenance window | + +## Admin API + +| Method | Path | Description | +|--------|------|-------------| +| GET | `/api/status-pages` | List configured status pages | +| POST | `/api/status-pages` | Create a status page | +| PUT | `/api/status-pages/{id}` | Update a status page | +| DELETE | `/api/status-pages/{id}` | Delete a status page | + +Create example: + +```json +{ + "title": "Production Status", + "slug": "production", + "description": "Public health for production services", + "server_ids_json": ["server-id-1", "server-id-2"], + "group_by_server_group": true, + "show_values": true, + "enabled": true, + "uptime_yellow_threshold": 99.9, + "uptime_red_threshold": 95 +} +``` -1. Go to the server detail page → Edit -2. Enable the **Hidden** option -3. The server will no longer appear in the `/api/status` response +To set a custom theme during update, pass `theme_ref`, for example `"preset:default"` or a custom theme reference. Use `null` to follow the admin default. - - + + +
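The `uptime_yellow_threshold` and `uptime_red_threshold` fields in the create example correspond to the timeline colors described under "Uptime Timeline". A minimal sketch of that mapping follows. The function name `uptime_color` is hypothetical, the defaults are simply the values from the create example, and representing a day without data as `None` is an assumption about how gaps might be modeled on the client side.

```python
from typing import Optional


def uptime_color(
    uptime_percent: Optional[float],
    yellow_threshold: float = 99.9,
    red_threshold: float = 95.0,
) -> str:
    """Map one day's uptime percentage to a timeline color."""
    if uptime_percent is None:
        return "gray"  # no data recorded for that day (assumed representation)
    if uptime_percent < red_threshold:
        return "red"  # below the page's red threshold
    if uptime_percent < yellow_threshold:
        return "yellow"  # below the yellow threshold but not the red one
    return "green"
```

The comparisons are strict, following the "below this percentage" wording, so a day exactly at a threshold falls into the healthier band.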