Skip to content

refactor(sdk-compat): centralize SDK deps, add startup probe, fix bugs (#168)#169

Open
lml2468 wants to merge 2 commits intodevelopfrom
fix/sdk-compat-shim
Open

refactor(sdk-compat): centralize SDK deps, add startup probe, fix bugs (#168)#169
lml2468 wants to merge 2 commits intodevelopfrom
fix/sdk-compat-shim

Conversation

@lml2468
Copy link
Copy Markdown
Collaborator

@lml2468 lml2468 commented Apr 11, 2026

Summary

Closes #168 — Centralize all OpenClaw Plugin SDK runtime dependencies into src/sdk-compat.ts, add startup probing for required SDK methods, and fix known bugs.

Changes

New: src/sdk-compat.ts — single SDK runtime contact point

  • Runtime singleton (setRuntime/getRuntime) replaces src/runtime.ts
  • DEFAULT_ACCOUNT_ID with safe fallback if SDK import fails
  • Optional SDK export loaders with built-in fallbacks (clearHistoryIfEnabled, getDefaultGroupHistoryLimit)
  • Startup probe: checks 9 required PluginRuntime deep methods exist before allowing bot to start

Deleted: src/runtime.ts

Functionality merged into sdk-compat.ts.

Bug fixes

  • P0: Removed openclaw.plugin.json configSchema (additionalProperties: false with empty properties rejected all config)
  • P1: capabilities.threadsfalse (threads implemented via agent-tools, not SDK ChannelThreadingAdapter)
  • P2: Removed non-standard listActions method from actions object

Cleanup

  • Removed scattered SDK compat code from inbound.ts (ensureSdkLoaded(), manual null checks) — now handled centrally by startup probe
  • SDK import consolidation: runtime value imports now go through sdk-compat.ts; import type statements remain in place (zero runtime risk)

Tests

  • New src/sdk-compat.test.ts: 5 contract tests covering probe on empty/complete/partial/null runtime mocks
  • All 457 existing tests pass, TSC clean

SDK Import Topology (after)

sdk-compat.ts ──→ openclaw/plugin-sdk  (sole runtime contact point)
    ↑
    ├── index.ts       setRuntime
    ├── channel.ts     DEFAULT_ACCOUNT_ID, probeRuntimeMethods, loadOptionalSdkExports
    ├── accounts.ts    DEFAULT_ACCOUNT_ID
    └── inbound.ts     getRuntime, clearHistoryIfEnabled, getDefaultGroupHistoryLimit

All modules retain: import type { ... } from "openclaw/plugin-sdk"  ← compile-time only

Diff stats

  • 9 files changed, +237 / -87 lines

#168)

- Create src/sdk-compat.ts: single runtime contact point for all SDK
  runtime dependencies (setRuntime/getRuntime, DEFAULT_ACCOUNT_ID with
  fallback, optional SDK export loaders, startup probe for 9 required
  PluginRuntime methods)

- Delete src/runtime.ts: functionality merged into sdk-compat

- Add startup probe in gateway.startAccount: checks all 9 deep
  PluginRuntime methods exist before allowing bot to start. Missing
  methods = clear error message instead of cryptic runtime crash.

- Fix P0: remove openclaw.plugin.json configSchema that had
  additionalProperties:false with empty properties (rejected all config)

- Fix P1: capabilities.threads false (threads implemented via
  agent-tools, not SDK ChannelThreadingAdapter)

- Fix P2: remove non-standard listActions method from actions object

- Remove scattered SDK compat code from inbound.ts (ensureSdkLoaded,
  manual null checks) — now handled centrally

- Add contract tests (src/sdk-compat.test.ts): probe correctness on
  empty/complete/partial/null runtime mocks

All 457 tests pass. TSC clean.
Copy link
Copy Markdown
Collaborator

@Jerry-Xin Jerry-Xin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean refactor — centralizing SDK deps into sdk-compat.ts with the startup probe is a solid pattern. The contract tests for probeRuntimeMethods are thorough (empty, complete, partial, null/undefined). A few things I noticed below, mostly minor.

let _DEFAULT_ACCOUNT_ID = "__default__";
try {
// eslint-disable-next-line @typescript-eslint/no-require-imports
const sdk = require("openclaw/plugin-sdk");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

require() at module scope will work if this always compiles to CJS, but will throw ReferenceError in a pure ESM runtime (e.g., if package.json has "type": "module" and there's no CJS shim). The try/catch catches a missing-module error but won't catch require itself being undefined.

If the build target is always CJS, might be worth a quick comment noting that assumption. Otherwise you could defer this to loadOptionalSdkExports() and fold it into the async init:

let _DEFAULT_ACCOUNT_ID = "__default__";
// resolved in loadOptionalSdkExports()

Not a blocker if you know it's always CJS — just flagging in case the module target changes in the future.

* Load optional SDK exports that may not exist in older versions.
* Call once during startAccount after probe passes.
*/
export async function loadOptionalSdkExports(): Promise<void> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: the old ensureSdkLoaded() in inbound.ts had a _sdkLoaded flag to skip redundant re-imports. This function doesn't have an idempotency guard. If startAccount is called multiple times (hot reload / account restart), it'll re-import and re-assign every time.

Not harmful, but a one-liner guard would be cheap insurance:

let _optionalLoaded = false;
export async function loadOptionalSdkExports(): Promise<void> {
  if (_optionalLoaded) return;
  _optionalLoaded = true;
  // ...
}

return;
}
const core = getRuntime();
// SDK method availability is guaranteed by probeRuntimeMethods() at startup
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the startup probe is a better pattern than scattering null checks everywhere. Just want to note the tradeoff: the old guard here was the last line of defense before dispatchReplyWithBufferedBlockDispatcher — if something went wrong, it logged detailed diagnostics and returned gracefully (message silently dropped).

With the probe, if the runtime is somehow in a bad state at message-handling time (unlikely but possible — e.g., setRuntime called again with a different object), you'd get an unhandled TypeError instead of a graceful skip.

Failing loudly is arguably the right call for production — silent message drops are worse to debug. Just worth being aware of the change in failure mode.

threads: false,
},
reload: { configPrefixes: ["channels.dmwork"] },
actions: {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Worth a brief comment or a note in the PR description about why threads: false. The PR body explains it (threads via agent-tools, not SDK ChannelThreadingAdapter), but future readers of the code might wonder why it was flipped. A one-line comment like // threading handled via agent tools, not ChannelThreadingAdapter would help.

- Remove top-level require() — defer DEFAULT_ACCOUNT_ID resolution to
  loadOptionalSdkExports() for ESM compatibility
- Add idempotency guard to loadOptionalSdkExports() (hot reload safety)
- Add code comment on threads: false explaining why
Copy link
Copy Markdown
Collaborator Author

@lml2468 lml2468 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All feedback addressed in 7cc9fa4:

  1. require() ESM compat — Removed top-level require(). DEFAULT_ACCOUNT_ID now starts as "__default__" and gets resolved in loadOptionalSdkExports() via await import(). Exported as a mutable binding (export { _DEFAULT_ACCOUNT_ID as DEFAULT_ACCOUNT_ID }) so consumers see the updated value after init.

  2. Idempotency guard — Added _optionalLoaded flag. Safe for hot reload / multi-account restart.

  3. Null check removal tradeoff — Agreed. Failing loudly with a TypeError is better than silently dropping messages. The startup probe ensures this path is only reachable if the methods existed at boot time, so the only scenario for runtime failure is if someone calls setRuntime() with a degraded object mid-flight — a bug worth surfacing.

  4. threads: false comment — Added inline comment.

TSC clean, 457 tests pass.

@Jerry-Xin
Copy link
Copy Markdown
Collaborator

Thanks for the quick turnaround @lml2468 — all 4 points addressed cleanly.

  • await import() with the _optionalLoaded idempotency guard looks solid. The mutable re-export (export { _DEFAULT_ACCOUNT_ID as DEFAULT_ACCOUNT_ID }) correctly leverages ESM live bindings.
  • Dropping the null checks in favor of failing loudly is the right call — probeRuntimeMethods at startup catches missing methods early. Test coverage there is thorough (empty/complete/partial/null/undefined cases).
  • configSchema and listActions removal is reasonable cleanup.
  • threads: false comment appreciated.

LGTM 👍

Copy link
Copy Markdown
Collaborator Author

@lml2468 lml2468 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: 质量好,可优先合入,有一个确认项

sdk-compat.ts 集中 SDK 依赖、启动探测 9 个必要方法、懒加载设计合理。删除 runtime.ts 统一入口,方向正确。测试覆盖到位(sdk-compat.test.ts + temp-utils.test.ts)。

一个需要确认的改动:

channel.tsthreads: false(线程走 agent-tools 不走 SDK)——这个改动影响面较大,是否有配套说明或通知给下游使用方?请在 PR description 里补充说明,或确认已通知相关团队。

确认后可优先合入(#132 依赖此 PR rebase)。

@Jerry-Xin
Copy link
Copy Markdown
Collaborator

Good catch on the threads: false concern. Agree it warrants explicit documentation since it changes the channel's advertised capabilities.

The change itself makes sense — threading via agent-tools gives more control than ChannelThreadingAdapter — but any consumer inspecting capabilities.threads will now see false where it used to be true. If there's downstream logic branching on that flag (e.g., UI thread indicators, routing decisions), it could silently change behavior.

Suggestion: add a brief note in the PR description explaining the rationale and confirm no downstream service relies on capabilities.threads === true from this channel. If that's confirmed, this is fine to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RFC: 跨 OpenClaw 版本兼容架构——SDK Shim 层 + 契约测试 + 降级策略

2 participants