fix(llm): add exponential backoff retry for API calls #38

Open
fryrice2000 wants to merge 2 commits into Narcooo:master from fryrice2000:fix/llm-retry-backoff

@fryrice2000 fryrice2000 commented Mar 16, 2026

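What this fixes

The LLM call layer (chatCompletion / chatWithTools) has no retry mechanism,
so a transient API error such as 429, 502, or ECONNRESET crashes the pipeline outright.

Changes

  • Add [packages/core/src/llm/retry.ts]: a backoff retry utility
    • Defaults to 3 retries with delays of 1s → 2s → 4s (plus random jitter)
    • 429/502/503/ECONNRESET and similar errors are retried; 401/403/400 are not
  • Wrap the two public functions in [provider.ts] with [withRetry]

Testing

  • Add [retry.test.ts] with 15 test cases
  • All 177 tests pass; typecheck reports zero errors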
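
The backoff schedule described above can be sketched as follows. This is a minimal illustration and not the code in retry.ts: the option names, the injectable `sleep` and `random` hooks, and the 25% jitter ratio are all assumptions, since the PR description only states the defaults (3 retries, 1s → 2s → 4s, with jitter).

```typescript
// Hypothetical sketch; names and option shapes are assumptions, not the PR's code.
export interface RetryOptions {
  maxRetries?: number; // retries after the first attempt (3 per the PR description)
  baseDelayMs?: number; // delay before the first retry (1s per the PR description)
  jitterRatio?: number; // assumed: adds up to +25% random delay
  shouldRetry?: (err: unknown) => boolean; // decides whether an error is transient
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
  random?: () => number; // injectable source of jitter
}

// Delay before retry number `retry` (0-based): base * 2^retry, plus jitter.
export function backoffDelayMs(
  retry: number,
  baseDelayMs = 1000,
  jitterRatio = 0.25,
  random: () => number = Math.random,
): number {
  const exponential = baseDelayMs * 2 ** retry;
  return Math.round(exponential * (1 + jitterRatio * random()));
}

// Runs `fn`, retrying on failure until it succeeds, the error is judged
// non-retryable, or `maxRetries` retries have been used up.
export async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions = {},
): Promise<T> {
  const {
    maxRetries = 3,
    baseDelayMs = 1000,
    jitterRatio = 0.25,
    shouldRetry = () => true,
    sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
    random = Math.random,
  } = opts;

  for (let retry = 0; ; retry++) {
    try {
      return await fn();
    } catch (err) {
      // Give up immediately on non-retryable errors or when retries run out.
      if (retry >= maxRetries || !shouldRetry(err)) throw err;
      await sleep(backoffDelayMs(retry, baseDelayMs, jitterRatio, random));
    }
  }
}
```

Under these assumptions a call that keeps failing is attempted four times in total, and the last error is rethrown unchanged so callers see the original failure.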

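✅ Scenarios covered

| Error type | Example | Behavior |
|---|---|---|
| Rate limiting | 429 Rate Limit | Retried ✅ |
| Gateway error | 502 Bad Gateway | Retried ✅ |
| Service unavailable | 503 Service Unavailable | Retried ✅ |
| Connection reset | ECONNRESET | Retried ✅ |
| Connection timeout | ETIMEDOUT | Retried ✅ |
| DNS resolution failure | ENOTFOUND | Retried ✅ |
| Socket disconnected | socket hang up | Retried ✅ |
| fetch-layer failure | TypeError: fetch failed | Retried ✅ |
| 429 wrapped in a Chinese message | "请求过多" ("too many requests", wrapLLMError output) | Retried ✅ |
| Authentication failure | 401 Unauthorized | Not retried ✅ |
| Permission denied | 403 Forbidden | Not retried ✅ |
| Malformed request | 400 Bad Request | Not retried ✅ |
| Invalid API key | invalid_api_key | Not retried ✅ |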
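
One way to classify the scenarios listed above is sketched below. The function name `isRetryableError`, the error shape (`status`, `code`, `message`), and the matching strategy are assumptions for illustration; the PR description does not show how retry.ts actually inspects errors.

```typescript
// Hypothetical classifier; the name and the error shape are assumptions.
const RETRYABLE_STATUS = new Set([429, 502, 503]);
const NON_RETRYABLE_STATUS = new Set([400, 401, 403]);
const RETRYABLE_PATTERNS = [
  /ECONNRESET/,
  /ETIMEDOUT/,
  /ENOTFOUND/,
  /socket hang up/i,
  /fetch failed/i,
  /请求过多/, // Chinese "too many requests" text, as produced by wrapLLMError
];
const NON_RETRYABLE_PATTERNS = [/invalid_api_key/i];

export function isRetryableError(err: unknown): boolean {
  const e = (typeof err === "object" && err !== null ? err : {}) as {
    status?: unknown;
    code?: unknown;
    message?: unknown;
  };
  // Combine the error code and message into one searchable string.
  const text = `${String(e.code ?? "")} ${String(e.message ?? err)}`;

  // Explicit client-side failures win over everything else.
  if (NON_RETRYABLE_PATTERNS.some((p) => p.test(text))) return false;

  // Prefer a numeric status field; otherwise look for a 4xx/5xx code in the text.
  const status =
    typeof e.status === "number"
      ? e.status
      : Number(/\b[45]\d\d\b/.exec(text)?.[0]);
  if (NON_RETRYABLE_STATUS.has(status)) return false;
  if (RETRYABLE_STATUS.has(status)) return true;

  // Fall back to network-level error signatures; unknown errors are not retried.
  return RETRYABLE_PATTERNS.some((p) => p.test(text));
}
```

Checking the non-retryable cases first is a deliberate choice in this sketch: it keeps an authentication failure from being retried just because its message also happens to mention a transient-looking string.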

chatCompletion and chatWithTools now auto-retry on transient errors (429/502/503/ECONNRESET/ETIMEDOUT) with exponential backoff + jitter. Client errors (401/403/400) fail immediately without retry.

- New: packages/core/src/llm/retry.ts (withRetry utility)

- New: packages/core/src/__tests__/retry.test.ts (15 test cases)

- Modified: packages/core/src/llm/provider.ts (wrap both public functions)

@fryrice2000 fryrice2000 force-pushed the fix/llm-retry-backoff branch from 3b894b1 to f126843 on March 19, 2026 14:52
@fryrice2000 fryrice2000 force-pushed the fix/llm-retry-backoff branch from f126843 to 615e33b on March 19, 2026 14:56