feat(provider): Thinking capability integration #560
Merged
phantom5099 merged 7 commits into 1024XEngineer:main on May 6, 2026
Generated with [codeagent](https://github.com/qbox/codeagent) Co-authored-by: phantom5099 <245659304+phantom5099@users.noreply.github.com>
test: increase thinking coverage
/code Raise test coverage to 100%
Completed: implemented the prompt update in English so that user-driven task switches are handled explicitly.
fix(prompt): handle todo cleanup on task switches
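I. Existing problems
1. Missing Thinking/Reasoning capability
- … `reasoning_content`, `reasoning_details`, `<think>` tags, etc.); all reasoning content returned by models was discarded.
2. Contradictory prompts exposed to the model
- The Capabilities section of `agent_identity.md` claims "Read, search, write, and edit files".
- `plan_mode_plan.md` nevertheless says "Do not perform any write action in this stage".
3. Plan mode blocked `todo_write`
- … (`mode_filter.go`) contains only 6 tools, and `todo_write` is not among them.
4. Stale todos block verification on plan switches
- … (`final_acceptance.go`) blocks completion as soon as it sees any non-terminal todo, which stops the user from moving forward.
5. Insufficient provider coverage
- … `thinking.type` vs `enable_thinking` vs `chat_template_kwargs` vs `<think>` tags)
II. Solution and rationale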
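The parameter split named in problem 5 above (`thinking.type` vs `enable_thinking` vs `chat_template_kwargs`) can be made concrete with a small sketch. The following Go snippet shows how one "enable thinking" intent might serialize into three different request shapes. The style constants, the function `thinkingParams`, the `"enabled"`/`"disabled"` values, and the exact nesting under `chat_template_kwargs` are illustrative assumptions, not this PR's code and not any vendor's documented contract.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// paramStyle is a hypothetical label for the request shape a vendor expects.
type paramStyle int

const (
	styleThinkingType   paramStyle = iota // nested object: {"thinking":{"type":...}}
	styleEnableThinking                   // flat boolean: {"enable_thinking":...}
	styleTemplateKwargs                   // extra layer: {"chat_template_kwargs":{...}}
)

// thinkingParams returns the extra request fields for one style.
// The literal values are assumptions made for this sketch.
func thinkingParams(style paramStyle, enabled bool) map[string]any {
	switch style {
	case styleThinkingType:
		t := "disabled"
		if enabled {
			t = "enabled"
		}
		return map[string]any{"thinking": map[string]any{"type": t}}
	case styleEnableThinking:
		return map[string]any{"enable_thinking": enabled}
	case styleTemplateKwargs:
		return map[string]any{
			"chat_template_kwargs": map[string]any{"enable_thinking": enabled},
		}
	default:
		return map[string]any{}
	}
}

// mustJSON marshals v and panics on error; map keys are emitted in sorted order.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	styles := []paramStyle{styleThinkingType, styleEnableThinking, styleTemplateKwargs}
	for _, s := range styles {
		fmt.Println(mustJSON(thinkingParams(s, true)))
	}
}
```

Keeping the shape decision in one function per vendor family is what lets the rest of the pipeline work with a single vendor-neutral config value.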
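2.1 Thinking type system layer
New fields:
- `ModelCapabilityHints`: `Thinking`, `ThinkingEfforts`, `ThinkingDefaultEffort`, `ThinkingForceEnabled`
- `Message`: `ThinkingMetadata json.RawMessage`
- `StreamEvent`: `StreamEventThinkingDelta` + `ThinkingDeltaPayload`
- `GenerateRequest`: `ThinkingConfig`
`ThinkingForceEnabled` is designed for Gemini 3 and MiniMax: these two models cannot reliably turn thinking off (they may still emit reasoning after it is disabled), so instead of showing users a switch that does not actually switch anything off, the control is forcibly greyed out.
2.2 Prompt architecture refactor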
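Before the split:
After the split:
Rationale:
`capabilitiesSource` sits after `corePromptSource` and before `rulesPromptSource` in the prompt assembly order. In Plan mode the capability statement the model sees is itself read-only, so correct behavior does not depend on the model reasoning about priority across sections.
2.3 Plan mode tool whitelist fix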
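Add `ToolNameTodoWrite` to the `isReadOnlyVisibleTool` whitelist.
Rationale:
`summary_candidate.active_todo_ids` references the task items the model creates through `todo_write`. If the plan stage cannot write todos, those references dangle and the build stage has to recreate them. Mainstream agents (Claude Code, Devin) allow todo management in plan mode.
2.4 Five new drivers + Kimi vendor hint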
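Following the layering strategy in RFC §6.10, each vendor gets the implementation level that fits it:
- `deepseek` driver: `thinking.type` + `reasoning_effort`; V4 requires `reasoning_content` to be passed back on every turn of a multi-turn conversation
- `openaicompat` + vendor hint: … `reasoning` (not `reasoning_content`)
- `openaicompat` qwen submodule: `enable_thinking` is a flat boolean rather than a nested object, which is a parameter-structure difference
- `openaicompat` glm submodule: `chat_template_kwargs` adds an extra nesting layer, which is a parameter-structure difference
- `mimo` driver
- `minimax` driver: … `<think>` tags), a parameter-structure difference (`reasoning_split`), and unpredictable default behavior (`enable_thinking` is unreliable)
2.5 Session schema migration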
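SQLite schema v6→v7: the `messages` table gains the column `thinking_metadata_json TEXT NOT NULL DEFAULT ''`.
Rationale:
`ThinkingMetadata` is opaque, message-level JSON. It is serialized and deserialized automatically with the Message and is covered automatically by checkpoints, so no extra session-level state is needed.
2.6 Runtime Thinking integration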
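- `resolveThinkingConfig` (`runtime/thinking.go`): … `CapabilityHints`, `ThinkingOverride` (reserved), the global switch, and `ThinkingForceEnabled`; produces a `*ThinkingConfig`
- `EventThinkingDelta` (`runtime/events.go`)
- `OnThinkingDelta` hook (`streaming/handler.go`)
- `ErrThinkingNotSupported` retry (`run.go`, `callProvider`): … `ThinkingConfig` …
- `SetThinkingEnabled` / `IsThinkingEnabled` (`runtime/runtime.go`): … `true`; not persisted
2.7 Confirmed vendor adaptation dimensions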
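During implementation, the actual APIs of the 6 newly added vendors were surveyed across 8 dimensions (2026-05), and the results were filled into the builtin model list:
- `thinking.type` | `thinking.type` | `enable_thinking` (bool) | `enable_thinking` + `chat_template_kwargs` | `thinking.type` | `enable_thinking` + `reasoning_split`
- `reasoning_content` | `reasoning` | `reasoning_content` | `reasoning_content` | `reasoning_content` | `reasoning_details` / `<think>`
- `clear_thinking`
2.8 Automatic cleanup of stale todos on plan switches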
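Problem: when Plan creates a new revision, todos left over from the old revision in the pending/in_progress/blocked states are not cleaned up. The verifier rejects completion as soon as it sees a non-terminal todo.
Solution: in `applyCurrentPlanRevision` (the single entry point for plan revision switches), call `agentsession.CancelNonTerminalTodos` when a revision increment is detected, marking the old non-terminal todos as `canceled`.
Rationale:
The `canceled` status of `todo_write` already existed but was unused. Todos in a terminal state (completed/failed) are left unchanged because they may still be useful records across revisions. Non-terminal todos (pending/in_progress/blocked) have no meaning under the new revision, and cancelling them automatically keeps them from blocking verification.
III. Scope of changes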
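The stale-todo cleanup described in 2.8 can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions: the `Todo` type, `applyRevision`, and the lowercase helper `cancelNonTerminalTodos` are hypothetical stand-ins, not the PR's `applyCurrentPlanRevision` or `agentsession.CancelNonTerminalTodos`. Only the status names are taken from the description above.

```go
package main

import "fmt"

// Todo is a hypothetical, reduced todo record for this sketch.
type Todo struct {
	ID     string
	Status string
}

// isTerminal reports whether a status is final. completed and failed are
// the terminal states named in 2.8; canceled is treated as final as well.
func isTerminal(status string) bool {
	switch status {
	case "completed", "failed", "canceled":
		return true
	}
	return false
}

// cancelNonTerminalTodos marks every pending/in_progress/blocked todo as
// canceled in place and returns how many entries changed.
func cancelNonTerminalTodos(todos []Todo) int {
	changed := 0
	for i := range todos {
		if !isTerminal(todos[i].Status) {
			todos[i].Status = "canceled"
			changed++
		}
	}
	return changed
}

// applyRevision cancels stale todos only when the revision number increases.
// It returns the revision now in effect and the number of canceled todos.
func applyRevision(current, next int, todos []Todo) (int, int) {
	if next <= current {
		return current, 0
	}
	return next, cancelNonTerminalTodos(todos)
}

func main() {
	todos := []Todo{
		{ID: "a", Status: "pending"},
		{ID: "b", Status: "completed"},
		{ID: "c", Status: "blocked"},
	}
	rev, n := applyRevision(1, 2, todos)
	fmt.Println(rev, n, todos)
}
```

Gating the cleanup on a strict revision increase means re-applying the same revision leaves in-flight todos alone.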
3.1 Provider layer (core implementation)
- `provider/types/model.go`
- `provider/types/message.go`: `ThinkingMetadata json.RawMessage`
- `provider/types/event.go`: `StreamEventThinkingDelta`, payload, constructor, accessor
- `provider/types/request.go`: `ThinkingConfig` struct
- `provider/errors.go`: `ErrThinkingNotSupported`, `IsThinkingNotSupportedError`
- `provider/stream_events.go`: `EmitThinkingDelta`
- `provider/constants.go`
- `provider/generate_attempt.go`: `IsEffectiveGeneratePayloadEvent` now includes ThinkingDelta
- `provider/builtin/builtin.go`
- `provider/deepseek/`
- `provider/mimo/`
- `provider/minimax/`: … `<think>` fallback) + tests
- `provider/openaicompat/qwen/`: … `enable_thinking` flat boolean + sampling parameters) + tests
- `provider/openaicompat/glm/`: … `chat_template_kwargs`) + tests
- `provider/openaicompat/chatcompletions/adapter.go`: … `reasoning_content` + `reasoning` (Kimi K2.6); ConsumeStream emits thinking_delta
- `provider/openaicompat/chatcompletions/types.go`: … `ReasoningContent`
- `provider/openaicompat/chatcompletions/request.go`: `toOpenAIMessageWithBudget` restores continuity from ThinkingMetadata
3.2 Config layer
- `config/provider.go`: `builtinCapabilitiesV2`, `supportsChatAPIMode`, `isOpenAICompatLike`, `hintsAreZero`
- `config/provider_custom_normalize.go`: `hintsAreZero` helper function
- `config/defaults_test.go`
- `config/provider_test.go`
- `config/state/service_test.go`
3.3 Session layer
- `session/store.go`
- `session/sqlite_store.go`: `thinking_metadata_json` column, migration V6→V7, updates to Scan/INSERT/SELECT/`buildMessageFromRow`
3.4 Runtime layer
- `runtime/thinking.go`: `resolveThinkingConfig`, `modelCapabilityHintsForRequest`, `containsEffort`
- `runtime/streaming/handler.go`: `Hooks.OnThinkingDelta` + a new thinking branch in `HandleEvent` (does not enter the accumulator)
- `runtime/events.go`: `EventThinkingDelta` event type
- `runtime/runtime.go`: `thinkingEnabled` field + `SetThinkingEnabled` / `IsThinkingEnabled`, default `true`
- `runtime/run.go`: … `ThinkingConfig`; `callProvider` injects `OnThinkingDelta` + `ErrThinkingNotSupported` retry
- `runtime/state.go`: `runState` reserves a `thinkingOverride` field
3.5 Prompts & Gateway
- `promptasset/templates/core/agent_identity.md`
- `promptasset/templates/core/capabilities.md`
- `promptasset/templates/core/capabilities_plan.md`
- `promptasset/assets.go`: `CapabilitiesPrompt(stage)`
- `context/source_capabilities.go`
- `context/builder.go`: `capabilitiesSource{}`
- `gateway/contracts.go`: `ModelEntry` + `CapabilityHints`
- `cli/gateway_runtime_bridge.go`: `ListModels` fills in `CapabilityHints`
3.6 Bug fixes
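- `tools/mode_filter.go`: `todo_write` was blocked …
- `session/todo.go` + `runtime/planning.go`: … old todos are cancelled automatically in `applyCurrentPlanRevision`
- `provider/types/model_test.go`: `[]string` cannot be compared with `!=`; … `reflect.DeepEqual`
- `provider/catalog/service_test.go`
IV. Expected benefits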
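The `model_test.go` fix listed under 3.6 follows from a Go language rule: a struct that contains a slice field is not comparable, so `a != b` fails to compile once a `[]string` field such as `ThinkingEfforts` is added. A minimal sketch, assuming a hypothetical `capabilityHints` struct rather than the PR's real type:

```go
package main

import (
	"fmt"
	"reflect"
)

// capabilityHints is a hypothetical stand-in for a hints struct that
// gained a slice field.
type capabilityHints struct {
	Thinking        bool
	ThinkingEfforts []string
}

// hintsEqual compares two values field by field.
// Writing `a == b` here would not compile because of the slice field.
func hintsEqual(a, b capabilityHints) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	a := capabilityHints{Thinking: true, ThinkingEfforts: []string{"low", "high"}}
	b := capabilityHints{Thinking: true, ThinkingEfforts: []string{"low", "high"}}
	fmt.Println(hintsEqual(a, b))
}
```

One caveat worth keeping in mind in tests: `reflect.DeepEqual` treats a nil slice and an empty non-nil slice as different values.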
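4.1 For users
- … `EventThinkingDelta`, showing the model's reasoning process
- … `ThinkingMetadata` is persisted with the message and continues automatically after a checkpoint restore
4.2 For developers
- … the differences between `thinking.type`, `enable_thinking`, and `<think>` tags
- `ThinkingForceEnabled` avoids fake switches: if a newly added vendor cannot turn thinking off reliably, mark it `true` and the UI greys the control out automatically
V. Follow-up integration points for TUI / GUI / Gateway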
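As a sketch of how an upper layer might turn the capability hints into a toggle that is never a fake switch, the snippet below derives a UI state from the four hint fields. The field types, the `ToggleState` struct, and `toggleFor` are assumptions made for illustration. The PR only names the fields and leaves the UI to later work.

```go
package main

import "fmt"

// CapabilityHints mirrors the field names in this PR; the types are assumed.
type CapabilityHints struct {
	Thinking              bool
	ThinkingEfforts       []string
	ThinkingDefaultEffort string
	ThinkingForceEnabled  bool
}

// ToggleState is a hypothetical view model for a thinking switch.
type ToggleState struct {
	Visible  bool // the model supports thinking at all
	Checked  bool // the switch is shown as on
	Disabled bool // the switch is greyed out
}

// toggleFor derives the switch state from the hints and the user's preference.
func toggleFor(h CapabilityHints, userWantsThinking bool) ToggleState {
	if !h.Thinking {
		return ToggleState{} // nothing to show
	}
	if h.ThinkingForceEnabled {
		// Thinking cannot be turned off reliably: show it as on and locked.
		return ToggleState{Visible: true, Checked: true, Disabled: true}
	}
	return ToggleState{Visible: true, Checked: userWantsThinking}
}

func main() {
	forced := CapabilityHints{Thinking: true, ThinkingForceEnabled: true}
	fmt.Printf("%+v\n", toggleFor(forced, false))
}
```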
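The following interfaces are reserved in this PR but have no TUI interaction yet; upper layers can hook into them at any time:
5.1 Thinking toggle control
- The `ModelEntry.CapabilityHints` returned by `gateway.listModels` already includes `Thinking`, `ThinkingEfforts`, `ThinkingDefaultEffort`, and `ThinkingForceEnabled`
5.2 Thinking content display
- … `EventThinkingDelta` pushes thinking text deltas upstream
5.3 User ThinkingOverride
- `PrepareInput.ThinkingOverride` (`Enabled *bool` + `Effort string`) is defined
- … `ThinkingOverride` passed into Runtime Submit
- The `ThinkingOverride` field lives in `runState`; its default follows the global switch (`true`)
5.4 Global Thinking switch
- `Service.SetThinkingEnabled(bool)` / `IsThinkingEnabled()` are implemented
- … `true`; effective for the lifetime of the process, not persisted
VI. Test coverage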
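The pieces from 5.3 and 5.4 (a per-request override, a process-level switch, and the capability hints) have to be merged into one effective config. The sketch below shows one plausible precedence: force-enabled wins, then the override, then the global switch. That ordering, the lowercase types, and the use of `atomic.Bool` (Go 1.19+) are assumptions of this example. The PR's `resolveThinkingConfig` may differ.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type hints struct {
	Thinking              bool
	ThinkingEfforts       []string
	ThinkingDefaultEffort string
	ThinkingForceEnabled  bool
}

type override struct {
	Enabled *bool
	Effort  string
}

type thinkingConfig struct {
	Enabled bool
	Effort  string
}

// service holds the process-level switch; it is never persisted.
type service struct{ thinkingEnabled atomic.Bool }

func newService() *service {
	s := &service{}
	s.thinkingEnabled.Store(true) // default is on
	return s
}

func (s *service) SetThinkingEnabled(v bool) { s.thinkingEnabled.Store(v) }
func (s *service) IsThinkingEnabled() bool   { return s.thinkingEnabled.Load() }

func containsEffort(efforts []string, e string) bool {
	for _, x := range efforts {
		if x == e {
			return true
		}
	}
	return false
}

// resolve merges hints, an optional override, and the global switch.
// It returns nil when the model has no thinking capability.
func resolve(h hints, o *override, global bool) *thinkingConfig {
	if !h.Thinking {
		return nil
	}
	enabled := global
	if o != nil && o.Enabled != nil {
		enabled = *o.Enabled
	}
	if h.ThinkingForceEnabled {
		enabled = true
	}
	effort := h.ThinkingDefaultEffort
	if o != nil && containsEffort(h.ThinkingEfforts, o.Effort) {
		effort = o.Effort
	}
	return &thinkingConfig{Enabled: enabled, Effort: effort}
}

func main() {
	s := newService()
	h := hints{Thinking: true, ThinkingEfforts: []string{"low", "high"}, ThinkingDefaultEffort: "low"}
	fmt.Printf("%+v\n", *resolve(h, nil, s.IsThinkingEnabled()))
}
```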
6.1 New test files
- `deepseek/request_test.go`
- `deepseek/driver_test.go`
- `mimo/provider_test.go`
- `minimax/provider_test.go`
- `qwen/provider_test.go`
- `glm/provider_test.go`
6.2 Updated existing tests
- `model_test.go`: struct comparison → `reflect.DeepEqual`
- `catalog/service_test.go`: model count (6→7→8)
- `config/defaults_test.go`: provider count (4→10)
- `config/provider_test.go`: same as above
- `config/state/service_test.go`: expected providers map updated
6.3 Test run results