From e064c023be19c0acd4d7d86195a2072dd91f109f Mon Sep 17 00:00:00 2001 From: fujie Date: Tue, 19 May 2026 23:57:09 +0800 Subject: [PATCH] fix(async-context-compression): rebase response parsing on v1.6.4 - accept responses-style summary payloads without a choices array - ignore reasoning-only output and surface compact empty-summary diagnostics - sync README, docs, indexes, and release notes for v1.6.4 --- README.md | 2 +- README_CN.md | 2 +- .../filters/async-context-compression.md | 13 +- .../filters/async-context-compression.zh.md | 13 +- docs/plugins/filters/index.md | 2 +- docs/plugins/filters/index.zh.md | 2 +- .../async-context-compression/README.md | 13 +- .../async-context-compression/README_CN.md | 13 +- .../async_context_compression.py | 109 +++++++++- .../test_async_context_compression.py | 198 ++++++++++++++++++ .../async-context-compression/v1.6.4.md | 20 ++ .../async-context-compression/v1.6.4_CN.md | 20 ++ 12 files changed, 371 insertions(+), 36 deletions(-) create mode 100644 plugins/filters/async-context-compression/v1.6.4.md create mode 100644 plugins/filters/async-context-compression/v1.6.4_CN.md diff --git a/README.md b/README.md index 4f33312..4f2cf75 100644 --- a/README.md +++ b/README.md @@ -24,7 +24,7 @@ A collection of enhancements, plugins, and prompts for [open-webui](https://gith | Rank | Plugin | Version | Downloads | Views | 📅 Updated | | :---: | :--- | :---: | :---: | :---: | :---: | | 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | ![v](https://img.shields.io/badge/v-1.0.1-blue?style=flat) | ![p1_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_dl.json&style=flat) | ![p1_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_vw.json&style=flat) | 
![updated](https://img.shields.io/badge/2026--04--24-gray?style=flat) | -| 🥈 | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![v](https://img.shields.io/badge/v-1.6.3-blue?style=flat) | ![p2_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_dl.json&style=flat) | ![p2_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--05--19-gray?style=flat) | +| 🥈 | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![v](https://img.shields.io/badge/v-1.6.4-blue?style=flat) | ![p2_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_dl.json&style=flat) | ![p2_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--05--19-gray?style=flat) | | 🥉 | [Smart Infographic](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | ![v](https://img.shields.io/badge/v-1.6.2-blue?style=flat) | ![p3_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_dl.json&style=flat) | ![p3_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--04--25-gray?style=flat) | | 4️⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | ![v](https://img.shields.io/badge/v-1.2.8-blue?style=flat) | 
![p4_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_dl.json&style=flat) | ![p4_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--04--15-gray?style=flat) | | 5️⃣ | [OpenWebUI Skills Manager Tool](https://openwebui.com/posts/openwebui_skills_manager_tool_b4bce8e4) | ![v](https://img.shields.io/badge/v-0.3.2-blue?style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--05--12-gray?style=flat) | diff --git a/README_CN.md b/README_CN.md index 2308554..3564c3d 100644 --- a/README_CN.md +++ b/README_CN.md @@ -21,7 +21,7 @@ OpenWebUI 增强功能集合。包含个人开发与收集的插件、提示词 | 排名 | 插件 | 版本 | 下载 | 浏览 | 📅 更新 | | :---: | :--- | :---: | :---: | :---: | :---: | | 🥇 | [Smart Mind Map](https://openwebui.com/posts/turn_any_text_into_beautiful_mind_maps_3094c59a) | ![v](https://img.shields.io/badge/v-1.0.1-blue?style=flat) | ![p1_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_dl.json&style=flat) | ![p1_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p1_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--04--24-gray?style=flat) | -| 🥈 | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![v](https://img.shields.io/badge/v-1.6.3-blue?style=flat) | 
![p2_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_dl.json&style=flat) | ![p2_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--05--19-gray?style=flat) | +| 🥈 | [Async Context Compression](https://openwebui.com/posts/async_context_compression_b1655bc8) | ![v](https://img.shields.io/badge/v-1.6.4-blue?style=flat) | ![p2_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_dl.json&style=flat) | ![p2_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p2_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--05--19-gray?style=flat) | | 🥉 | [Smart Infographic](https://openwebui.com/posts/smart_infographic_ad6f0c7f) | ![v](https://img.shields.io/badge/v-1.6.2-blue?style=flat) | ![p3_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_dl.json&style=flat) | ![p3_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p3_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--04--25-gray?style=flat) | | 4️⃣ | [Markdown Normalizer](https://openwebui.com/posts/markdown_normalizer_baaa8732) | ![v](https://img.shields.io/badge/v-1.2.8-blue?style=flat) | ![p4_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_dl.json&style=flat) | 
![p4_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p4_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--04--15-gray?style=flat) | | 5️⃣ | [OpenWebUI Skills Manager Tool](https://openwebui.com/posts/openwebui_skills_manager_tool_b4bce8e4) | ![v](https://img.shields.io/badge/v-0.3.2-blue?style=flat) | ![p5_dl](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_dl.json&style=flat) | ![p5_vw](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_p5_vw.json&style=flat) | ![updated](https://img.shields.io/badge/2026--05--12-gray?style=flat) | diff --git a/docs/plugins/filters/async-context-compression.md b/docs/plugins/filters/async-context-compression.md index ce3d0cb..cfe63f2 100644 --- a/docs/plugins/filters/async-context-compression.md +++ b/docs/plugins/filters/async-context-compression.md @@ -1,6 +1,6 @@ # Async Context Compression Filter -| By [Fu-Jie](https://github.com/Fu-Jie) · v1.6.3 | [⭐ Star this repo](https://github.com/Fu-Jie/openwebui-extensions) | +| By [Fu-Jie](https://github.com/Fu-Jie) · v1.6.4 | [⭐ Star this repo](https://github.com/Fu-Jie/openwebui-extensions) | | :--- | ---: | | ![followers](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_followers.json&label=%F0%9F%91%A5&style=flat) | ![points](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_points.json&label=%E2%AD%90&style=flat) | ![top](https://img.shields.io/badge/%F0%9F%8F%86-Top%20%3C1%25-10b981?style=flat) | 
![contributions](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_contributions.json&label=%F0%9F%93%A6&style=flat) | ![downloads](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_downloads.json&label=%E2%AC%87%EF%B8%8F&style=flat) | ![saves](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_saves.json&label=%F0%9F%92%BE&style=flat) | ![views](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_views.json&label=%F0%9F%91%81%EF%B8%8F&style=flat) | @@ -21,11 +21,11 @@ When the selection dialog opens, search for this plugin, check it, and continue. > [!IMPORTANT] > If the official OpenWebUI Community version is already installed, remove it first. After that, Batch Install Plugins can keep this plugin updated in future runs. -## What's new in 1.6.3 +## What's new in 1.6.4 -- **Graceful Summary Failure by Default**: Added the `SUMMARY_FAIL_MODE` valve and made transient summary-model failures non-blocking by default so the chat continues even when the background summary call fails. -- **Opt-in Strict Raise Mode**: Operators can set `SUMMARY_FAIL_MODE="raise"` to preserve the previous hard-failure behavior for debugging or strict observability workflows. -- **Regression Coverage for Both Modes**: Added direct tests for the new silent default and the explicit raise path so future summary error handling changes do not regress silently. +- **Robust summary response parsing**: Background summary generation now extracts text from several provider response shapes, including standard `choices[].message.content`, content-part arrays with `output_text`, and Responses-style `output` message items. 
+- **Reasoning-safe persistence**: Reasoning-only fields such as `reasoning_content`, `thinking`, and reasoning output items are ignored so private chain-of-thought is not persisted as chat memory. +- **Clearer empty-summary diagnostics**: If the summary provider returns no usable text, the filter now reports the compact response shape instead of surfacing a misleading generic format error. ## What's new in 1.6.0 @@ -184,10 +184,11 @@ If this plugin has been useful, a star on [OpenWebUI Extensions](https://github. - **Compression effect is weak**: Raise `compression_threshold_tokens` or lower `keep_first` / `keep_last` to allow more aggressive compression. - **A referenced chat summary fails**: The current request should continue with a direct-context fallback. Check the browser console (`F12`) if you need the upstream failure details. - **A background summary silently seems to do nothing**: Important failures now surface in chat status and the browser console (`F12`). +- **`Summary generation returned empty result` appears after the LLM call succeeds**: Update or reinstall the filter so the database-stored function content matches v1.6.4. This release can parse alternate provider response shapes, but it intentionally ignores reasoning-only output. If the model returns only `reasoning_content` / `thinking` without a final answer, the browser console will show the response shape and nothing will be saved as memory. - **Submit an Issue**: If you encounter any problems, please submit an issue on GitHub: [OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues) ## Changelog -See [`v1.6.3` Release Notes](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/v1.6.3.md) for the release-specific summary. +See [`v1.6.4` Release Notes](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/v1.6.4.md) for the release-specific summary. 
See the full history on GitHub: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) diff --git a/docs/plugins/filters/async-context-compression.zh.md b/docs/plugins/filters/async-context-compression.zh.md index ae68ca8..68a4074 100644 --- a/docs/plugins/filters/async-context-compression.zh.md +++ b/docs/plugins/filters/async-context-compression.zh.md @@ -1,6 +1,6 @@ # 异步上下文压缩过滤器 -| 作者:[Fu-Jie](https://github.com/Fu-Jie) · v1.6.3 | [⭐ 点个 Star 支持项目](https://github.com/Fu-Jie/openwebui-extensions) | +| 作者:[Fu-Jie](https://github.com/Fu-Jie) · v1.6.4 | [⭐ 点个 Star 支持项目](https://github.com/Fu-Jie/openwebui-extensions) | | :--- | ---: | | ![followers](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_followers.json&label=%F0%9F%91%A5&style=flat) | ![points](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_points.json&label=%E2%AD%90&style=flat) | ![top](https://img.shields.io/badge/%F0%9F%8F%86-Top%20%3C1%25-10b981?style=flat) | ![contributions](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_contributions.json&label=%F0%9F%93%A6&style=flat) | ![downloads](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_downloads.json&label=%E2%AC%87%EF%B8%8F&style=flat) | ![saves](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_saves.json&label=%F0%9F%92%BE&style=flat) | ![views](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_views.json&label=%F0%9F%91%81%EF%B8%8F&style=flat) | @@ -22,11 +22,11 @@ > [!IMPORTANT] > 如果你已经安装了 OpenWebUI 
官方社区里的同名版本,请先删除旧版本,否则重新安装时可能报错。删除后,Batch Install Plugins 后续就可以继续负责更新这个插件。 -## 1.6.3 版本更新 +## 1.6.4 版本更新 -- **默认静默处理摘要失败**:新增 `SUMMARY_FAIL_MODE` 配置项,默认在摘要模型出现瞬时错误时只记录错误并跳过本轮摘要,不再打断当前聊天。 -- **保留可选的严格抛错模式**:如需保留旧行为,可将 `SUMMARY_FAIL_MODE="raise"`,用于调试或希望强制暴露摘要链路故障的场景。 -- **补齐双模式回归测试**:新增 silent 默认路径与 raise 显式路径的直接测试,避免后续摘要错误处理回归时无提示。 +- **更稳健的摘要响应解析**:后台摘要现在会从多种 provider 返回结构中提取文本,包括标准的 `choices[].message.content`、带 `output_text` 的 content parts,以及 Responses 风格 `output` 里的 message 内容。 +- **更安全的 reasoning 过滤**:`reasoning_content`、`thinking` 和 reasoning 类型 output 会被明确忽略,避免把模型思考过程写入聊天记忆。 +- **更清晰的空摘要诊断**:如果摘要模型没有返回任何可用文本,过滤器现在会报告精简后的响应结构,而不是抛出含义模糊的通用格式错误。 ## 1.6.0 版本更新 - **修正 `keep_first` 逻辑**:重新定义了 `keep_first` 的功能,现在它负责保护前 N 条**非系统消息**(以及它们之前的所有系统提示词)。这确保了初始对话背景(如身份设定、任务说明)能被正确保留。 @@ -218,10 +218,11 @@ flowchart TD - **压缩效果不明显**:提高 `compression_threshold_tokens`,或降低 `keep_first` / `keep_last` 以增强压缩力度。 - **引用聊天摘要失败**:当前请求现在应该会继续执行,并回退为直接注入上下文。如果要看上游失败原因,请打开浏览器控制台 (`F12`)。 - **后台摘要看起来“没反应”**:重要失败现在会同时出现在状态提示和浏览器控制台 (`F12`) 中。 +- **LLM 调用成功后仍出现 `Summary generation returned empty result`**:请更新或重新安装过滤器,确保数据库里保存的 function 内容已经是 v1.6.4。这个版本可以解析多种 provider 响应结构,但会刻意忽略 reasoning-only 输出;如果模型只返回 `reasoning_content` / `thinking` 而没有最终答案文本,浏览器控制台会显示响应结构,插件不会把思考过程保存为记忆。 - **提交 Issue**: 如果遇到任何问题,请在 GitHub 上提交 Issue:[OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues) ## 更新日志 -请查看 [`v1.6.3` 版本发布说明](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/v1.6.3_CN.md) 获取本次版本的独立发布摘要。 +请查看 [`v1.6.4` 版本发布说明](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/v1.6.4_CN.md) 获取本次版本的独立发布摘要。 完整历史请查看 GitHub 项目: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) diff --git a/docs/plugins/filters/index.md b/docs/plugins/filters/index.md index fa1ade0..c4e8412 100644 --- a/docs/plugins/filters/index.md +++ b/docs/plugins/filters/index.md @@ -22,7 +22,7 @@ 
Filters act as middleware in the message pipeline: Reduces token consumption in long conversations with safer summary fallbacks and clearer failure visibility. - **Version:** 1.6.3 + **Version:** 1.6.4 [:octicons-arrow-right-24: Documentation](async-context-compression.md) diff --git a/docs/plugins/filters/index.zh.md b/docs/plugins/filters/index.zh.md index b2804d2..0091098 100644 --- a/docs/plugins/filters/index.zh.md +++ b/docs/plugins/filters/index.zh.md @@ -22,7 +22,7 @@ Filter 充当消息管线中的中间件: 通过更稳健的摘要回退和更清晰的失败提示,降低长对话的 token 消耗并保持连贯性。 - **版本:** 1.6.3 + **版本:** 1.6.4 [:octicons-arrow-right-24: 查看文档](async-context-compression.zh.md) diff --git a/plugins/filters/async-context-compression/README.md b/plugins/filters/async-context-compression/README.md index 2eb9797..7acd55e 100644 --- a/plugins/filters/async-context-compression/README.md +++ b/plugins/filters/async-context-compression/README.md @@ -1,6 +1,6 @@ # Async Context Compression Filter -| By [Fu-Jie](https://github.com/Fu-Jie) · v1.6.3 | [⭐ Star this repo](https://github.com/Fu-Jie/openwebui-extensions) | +| By [Fu-Jie](https://github.com/Fu-Jie) · v1.6.4 | [⭐ Star this repo](https://github.com/Fu-Jie/openwebui-extensions) | | :--- | ---: | | ![followers](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_followers.json&label=%F0%9F%91%A5&style=flat) | ![points](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_points.json&label=%E2%AD%90&style=flat) | ![top](https://img.shields.io/badge/%F0%9F%8F%86-Top%20%3C1%25-10b981?style=flat) | ![contributions](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_contributions.json&label=%F0%9F%93%A6&style=flat) | 
![downloads](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_downloads.json&label=%E2%AC%87%EF%B8%8F&style=flat) | ![saves](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_saves.json&label=%F0%9F%92%BE&style=flat) | ![views](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_views.json&label=%F0%9F%91%81%EF%B8%8F&style=flat) | @@ -21,11 +21,11 @@ When the selection dialog opens, search for this plugin, check it, and continue. > [!IMPORTANT] > If the official OpenWebUI Community version is already installed, remove it first. After that, Batch Install Plugins can keep this plugin updated in future runs. -## What's new in 1.6.3 +## What's new in 1.6.4 -- **Graceful Summary Failure by Default**: Added the `SUMMARY_FAIL_MODE` valve and made transient summary-model failures non-blocking by default so the chat continues even when the background summary call fails. -- **Opt-in Strict Raise Mode**: Operators can set `SUMMARY_FAIL_MODE="raise"` to preserve the previous hard-failure behavior for debugging or strict observability workflows. -- **Regression Coverage for Both Modes**: Added direct tests for the new silent default and the explicit raise path so future summary error handling changes do not regress silently. +- **Robust summary response parsing**: Background summary generation now extracts text from several provider response shapes, including standard `choices[].message.content`, content-part arrays with `output_text`, and Responses-style `output` message items. +- **Reasoning-safe persistence**: Reasoning-only fields such as `reasoning_content`, `thinking`, and reasoning output items are ignored so private chain-of-thought is not persisted as chat memory. 
+- **Clearer empty-summary diagnostics**: If the summary provider returns no usable text, the filter now reports the compact response shape instead of surfacing a misleading generic format error. ## What's new in 1.5.0 @@ -178,10 +178,11 @@ If this plugin has been useful, a star on [OpenWebUI Extensions](https://github. - **Compression effect is weak**: Raise `compression_threshold_tokens` or lower `keep_first` / `keep_last` to allow more aggressive compression. - **A referenced chat summary fails**: The current request should continue with a direct-context fallback. Check the browser console (`F12`) if you need the upstream failure details. - **A background summary silently seems to do nothing**: Important failures now surface in chat status and the browser console (`F12`). +- **`Summary generation returned empty result` appears after the LLM call succeeds**: Update or reinstall the filter so the database-stored function content matches v1.6.4. This release can parse alternate provider response shapes, but it intentionally ignores reasoning-only output. If the model returns only `reasoning_content` / `thinking` without a final answer, the browser console will show the response shape and nothing will be saved as memory. - **Submit an Issue**: If you encounter any problems, please submit an issue on GitHub: [OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues) ## Changelog -See [`v1.6.3` Release Notes](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/v1.6.3.md) for the release-specific summary. +See [`v1.6.4` Release Notes](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/v1.6.4.md) for the release-specific summary. 
See the full history on GitHub: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) diff --git a/plugins/filters/async-context-compression/README_CN.md b/plugins/filters/async-context-compression/README_CN.md index 0e32c47..5113083 100644 --- a/plugins/filters/async-context-compression/README_CN.md +++ b/plugins/filters/async-context-compression/README_CN.md @@ -1,6 +1,6 @@ # 异步上下文压缩过滤器 -| 作者:[Fu-Jie](https://github.com/Fu-Jie) · v1.6.3 | [⭐ 点个 Star 支持项目](https://github.com/Fu-Jie/openwebui-extensions) | +| 作者:[Fu-Jie](https://github.com/Fu-Jie) · v1.6.4 | [⭐ 点个 Star 支持项目](https://github.com/Fu-Jie/openwebui-extensions) | | :--- | ---: | | ![followers](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_followers.json&label=%F0%9F%91%A5&style=flat) | ![points](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_points.json&label=%E2%AD%90&style=flat) | ![top](https://img.shields.io/badge/%F0%9F%8F%86-Top%20%3C1%25-10b981?style=flat) | ![contributions](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_contributions.json&label=%F0%9F%93%A6&style=flat) | ![downloads](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_downloads.json&label=%E2%AC%87%EF%B8%8F&style=flat) | ![saves](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_saves.json&label=%F0%9F%92%BE&style=flat) | ![views](https://img.shields.io/endpoint?url=https%3A%2F%2Fgist.githubusercontent.com%2FFu-Jie%2Fdb3d95687075a880af6f1fba76d679c6%2Fraw%2Fbadge_views.json&label=%F0%9F%91%81%EF%B8%8F&style=flat) | @@ -23,11 +23,11 @@ > [!IMPORTANT] > 如果你已经安装了 OpenWebUI 
官方社区里的同名版本,请先删除旧版本,否则重新安装时可能报错。删除后,Batch Install Plugins 后续就可以继续负责更新这个插件。 -## 1.6.3 版本更新 +## 1.6.4 版本更新 -- **默认静默处理摘要失败**:新增 `SUMMARY_FAIL_MODE` 配置项,默认在摘要模型出现瞬时错误时只记录错误并跳过本轮摘要,不再打断当前聊天。 -- **保留可选的严格抛错模式**:如需保留旧行为,可将 `SUMMARY_FAIL_MODE="raise"`,用于调试或希望强制暴露摘要链路故障的场景。 -- **补齐双模式回归测试**:新增 silent 默认路径与 raise 显式路径的直接测试,避免后续摘要错误处理回归时无提示。 +- **更稳健的摘要响应解析**:后台摘要现在会从多种 provider 返回结构中提取文本,包括标准的 `choices[].message.content`、带 `output_text` 的 content parts,以及 Responses 风格 `output` 里的 message 内容。 +- **更安全的 reasoning 过滤**:`reasoning_content`、`thinking` 和 reasoning 类型 output 会被明确忽略,避免把模型思考过程写入聊天记忆。 +- **更清晰的空摘要诊断**:如果摘要模型没有返回任何可用文本,过滤器现在会报告精简后的响应结构,而不是抛出含义模糊的通用格式错误。 ## 1.5.0 版本更新 - **外部聊天引用摘要**: 新增对引用聊天上下文的摘要支持。现在可以复用缓存摘要、直接注入较小引用聊天,或先为较大的引用聊天生成摘要再注入。 @@ -213,10 +213,11 @@ flowchart TD - **压缩效果不明显**:提高 `compression_threshold_tokens`,或降低 `keep_first` / `keep_last` 以增强压缩力度。 - **引用聊天摘要失败**:当前请求现在应该会继续执行,并回退为直接注入上下文。如果要看上游失败原因,请打开浏览器控制台 (`F12`)。 - **后台摘要看起来“没反应”**:重要失败现在会同时出现在状态提示和浏览器控制台 (`F12`) 中。 +- **LLM 调用成功后仍出现 `Summary generation returned empty result`**:请更新或重新安装过滤器,确保数据库里保存的 function 内容已经是 v1.6.4。这个版本可以解析多种 provider 响应结构,但会刻意忽略 reasoning-only 输出;如果模型只返回 `reasoning_content` / `thinking` 而没有最终答案文本,浏览器控制台会显示响应结构,插件不会把思考过程保存为记忆。 - **提交 Issue**: 如果遇到任何问题,请在 GitHub 上提交 Issue:[OpenWebUI Extensions Issues](https://github.com/Fu-Jie/openwebui-extensions/issues) ## 更新日志 -请查看 [`v1.6.3` 版本发布说明](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/v1.6.3_CN.md) 获取本次版本的独立发布摘要。 +请查看 [`v1.6.4` 版本发布说明](https://github.com/Fu-Jie/openwebui-extensions/blob/main/plugins/filters/async-context-compression/v1.6.4_CN.md) 获取本次版本的独立发布摘要。 完整历史请查看 GitHub 项目: [OpenWebUI Extensions](https://github.com/Fu-Jie/openwebui-extensions) diff --git a/plugins/filters/async-context-compression/async_context_compression.py b/plugins/filters/async-context-compression/async_context_compression.py index be6bdff..d5356a2 100644 --- 
a/plugins/filters/async-context-compression/async_context_compression.py +++ b/plugins/filters/async-context-compression/async_context_compression.py @@ -5,7 +5,7 @@ author_url: https://github.com/Fu-Jie/openwebui-extensions funding_url: https://github.com/open-webui description: Reduces token consumption in long conversations while maintaining coherence through intelligent summarization and message compression. -version: 1.6.3 +version: 1.6.4 openwebui_id: b1655bc8-6de9-4cad-8cb5-a6f7829a02ce license: MIT @@ -4783,6 +4783,99 @@ def _extract_provider_error(self, response: Any) -> Optional[str]: return None + def _extract_summary_text_from_response(self, response: Any) -> str: + """Extract assistant text from chat-completions and Responses-style payloads.""" + + def collect_text(value: Any) -> str: + if isinstance(value, str): + return value + + if isinstance(value, dict): + item_type = str(value.get("type") or "") + attributes = value.get("attributes") + attribute_type = ( + str(attributes.get("type") or "") + if isinstance(attributes, dict) + else "" + ) + if item_type in { + "reasoning", + "reasoning_text", + "reasoning_summary_text", + } or attribute_type == "reasoning_content": + return "" + + for key in ("text", "output_text", "content"): + text = collect_text(value.get(key)) + if text.strip(): + return text + return "" + + if isinstance(value, list): + parts = [] + for item in value: + text = collect_text(item) + if text: + parts.append(text) + return "".join(parts) + + return "" + + if not isinstance(response, dict): + return "" + + choices = response.get("choices") + if isinstance(choices, list) and choices: + first_choice = choices[0] if isinstance(choices[0], dict) else {} + message = first_choice.get("message") + if isinstance(message, dict): + text = collect_text(message.get("content")) + if text.strip(): + return text.strip() + + text = collect_text(first_choice.get("text")) + if text.strip(): + return text.strip() + + for key in ("output_text", 
"text", "content", "message", "response"): + text = collect_text(response.get(key)) + if text.strip(): + return text.strip() + + output = response.get("output") + if isinstance(output, list): + text = collect_text(output) + if text.strip(): + return text.strip() + + return "" + + def _summarize_response_shape(self, response: Any) -> str: + """Build a compact description of the payload when no summary text is found.""" + if not isinstance(response, dict): + return type(response).__name__ + + parts = [f"keys={sorted(response.keys())}"] + choices = response.get("choices") + if isinstance(choices, list): + parts.append(f"choices={len(choices)}") + if choices and isinstance(choices[0], dict): + first_choice = choices[0] + parts.append(f"choice0_keys={sorted(first_choice.keys())}") + message = first_choice.get("message") + if isinstance(message, dict): + parts.append(f"message_keys={sorted(message.keys())}") + content = message.get("content") + parts.append(f"message_content_type={type(content).__name__}") + if isinstance(content, str): + parts.append(f"message_content_len={len(content)}") + + output = response.get("output") + if isinstance(output, list): + parts.append(f"output={len(output)}") + + return " | ".join(parts) + async def _call_summary_llm( self, new_conversation_text: str, @@ -4879,12 +4972,7 @@ async def _call_summary_llm( f"Full response:\n{response_repr}" ) - if ( - not response - or not isinstance(response, dict) - or "choices" not in response - or not response["choices"] - ): + if not response or not isinstance(response, dict): try: response_repr = json.dumps(response, ensure_ascii=False, indent=2) except Exception: @@ -4894,7 +4982,12 @@ async def _call_summary_llm( f"Full response:\n{response_repr}" ) - summary = response["choices"][0]["message"]["content"].strip() + summary = self._extract_summary_text_from_response(response) + if not summary: + raise ValueError( + "LLM response did not contain summary text. 
" + f"Response shape: {self._summarize_response_shape(response)}" + ) await self._log( f"[🤖 LLM Call] ✅ Successfully received summary", diff --git a/plugins/filters/async-context-compression/test_async_context_compression.py b/plugins/filters/async-context-compression/test_async_context_compression.py index 9d9c445..2393d30 100644 --- a/plugins/filters/async-context-compression/test_async_context_compression.py +++ b/plugins/filters/async-context-compression/test_async_context_compression.py @@ -752,6 +752,204 @@ async def fake_event_call(payload): self.assertIn("console.error", frontend_calls[0]["data"]["code"]) self.assertIn("context too long", frontend_calls[0]["data"]["code"]) + def test_extract_summary_text_supports_alternate_response_shapes(self): + self.assertEqual( + self.filter._extract_summary_text_from_response( + { + "choices": [ + { + "message": { + "content": [ + { + "type": "output_text", + "text": "", + }, + { + "type": "output_text", + "text": "test", + }, + ] + } + } + ] + } + ), + "test", + ) + self.assertEqual( + self.filter._extract_summary_text_from_response( + { + "choices": [ + { + "message": { + "content": "", + "reasoning_content": "reasoning must be ignored", + } + } + ] + } + ), + "", + ) + self.assertEqual( + self.filter._extract_summary_text_from_response( + { + "output": [ + { + "type": "message", + "content": [ + { + "type": "output_text", + "text": "responses api", + } + ], + } + ] + } + ), + "responses api", + ) + self.assertEqual( + self.filter._extract_summary_text_from_response( + { + "output": [ + { + "type": "reasoning", + "content": [ + { + "type": "output_text", + "text": "reasoning output ignored", + } + ], + }, + { + "type": "message", + "content": [ + { + "type": "output_text", + "text": "final answer only", + } + ], + }, + ] + } + ), + "final answer only", + ) + + def test_call_summary_llm_accepts_output_only_response(self): + self.filter.valves.summary_model = "fake-summary-model" + self.filter.valves.show_debug_log = 
False + + async def fake_generate_chat_completion(request, payload, user): + return { + "output": [ + { + "type": "message", + "content": [ + { + "type": "output_text", + "text": "responses api", + } + ], + } + ] + } + + async def noop_log(*args, **kwargs): + return None + + original_generate = module.generate_chat_completion + original_get_user = getattr(module.Users, "get_user_by_id", None) + + module.generate_chat_completion = fake_generate_chat_completion + module.Users.get_user_by_id = staticmethod( + lambda user_id: types.SimpleNamespace(email="user@example.com") + ) + self.filter._log = noop_log + self.filter._get_model_thresholds = lambda model_id: { + "max_context_tokens": 8192 + } + self.filter._build_summary_prompt = ( + lambda conversation_text, previous_summary=None: conversation_text + ) + + try: + summary = asyncio.run( + self.filter._call_summary_llm( + "conversation", + {"model": "fake-summary-model"}, + {"id": "user-1"}, + ) + ) + finally: + module.generate_chat_completion = original_generate + if original_get_user is None: + delattr(module.Users, "get_user_by_id") + else: + module.Users.get_user_by_id = original_get_user + + self.assertEqual( + summary, + "responses api", + ) + + def test_call_summary_llm_rejects_empty_message_content(self): + self.filter.valves.summary_model = "fake-summary-model" + self.filter.valves.show_debug_log = False + self.filter.valves.SUMMARY_FAIL_MODE = "raise" + + async def fake_generate_chat_completion(request, payload, user): + return { + "choices": [ + { + "message": { + "role": "assistant", + "content": "", + }, + "finish_reason": "stop", + } + ] + } + + async def noop_log(*args, **kwargs): + return None + + original_generate = module.generate_chat_completion + original_get_user = getattr(module.Users, "get_user_by_id", None) + + module.generate_chat_completion = fake_generate_chat_completion + module.Users.get_user_by_id = staticmethod( + lambda user_id: types.SimpleNamespace(email="user@example.com") + ) + 
self.filter._log = noop_log + self.filter._get_model_thresholds = lambda model_id: { + "max_context_tokens": 8192 + } + self.filter._build_summary_prompt = ( + lambda conversation_text, previous_summary=None: conversation_text + ) + + try: + with self.assertRaises(Exception) as exc_info: + asyncio.run( + self.filter._call_summary_llm( + "conversation", + {"model": "fake-summary-model"}, + {"id": "user-1"}, + ) + ) + finally: + module.generate_chat_completion = original_generate + if original_get_user is None: + delattr(module.Users, "get_user_by_id") + else: + module.Users.get_user_by_id = original_get_user + + self.assertIn( + "LLM response did not contain summary text", str(exc_info.exception) + ) + def test_generate_summary_async_status_guides_user_to_browser_console(self): self.filter.valves.keep_first = 1 self.filter.valves.keep_last = 1 diff --git a/plugins/filters/async-context-compression/v1.6.4.md b/plugins/filters/async-context-compression/v1.6.4.md new file mode 100644 index 0000000..c38ff4b --- /dev/null +++ b/plugins/filters/async-context-compression/v1.6.4.md @@ -0,0 +1,20 @@ +# Async Context Compression v1.6.4 Release Notes + +## Overview + +This patch release broadens summary-response parsing so the filter can accept both classic chat-completions payloads and Responses-style output payloads. It also improves empty-summary diagnostics without persisting reasoning-only fields. + +## Bug Fixes + +- **Alternate summary payload support**: `_call_summary_llm()` now accepts summary text from `choices[].message.content`, `output_text` content parts, and Responses-style `output` message items. +- **Stale choices-only gate removed**: The summary call path no longer rejects valid provider payloads just because they omit `choices`. +- **Clearer empty-summary errors**: When no final summary text is present, the filter now reports a compact response-shape summary instead of a misleading generic format error. 
+
+## Behavior Notes
+
+- **Reasoning-only output is ignored**: `reasoning_content`, `thinking`, and reasoning output items are not treated as summary text, so private chain-of-thought is not written into chat memory.
+- **No change to 1.6.3 fail-mode behavior**: `SUMMARY_FAIL_MODE` continues to control whether upstream summary-call errors are silent or raised.
+
+## Migration Notes
+
+No breaking changes. If a provider returns only reasoning fields and no final answer text, the filter will skip saving a summary for that turn and log the response shape for debugging.
\ No newline at end of file
diff --git a/plugins/filters/async-context-compression/v1.6.4_CN.md b/plugins/filters/async-context-compression/v1.6.4_CN.md
new file mode 100644
index 0000000..b436d83
--- /dev/null
+++ b/plugins/filters/async-context-compression/v1.6.4_CN.md
@@ -0,0 +1,20 @@
+# Async Context Compression v1.6.4 Release Notes
+
+## Overview
+
+This patch release extends summary-response parsing so the filter can accept both the classic chat-completions structure and the Responses-style `output` structure. It also improves empty-summary diagnostics and ensures that reasoning-only fields are not written into chat memory.
+
+## Bug Fixes
+
+- **Support for more summary response structures**: `_call_summary_llm()` can now extract summary text from `choices[].message.content`, content parts carrying `output_text`, and message items inside a Responses-style `output`.
+- **Outdated choices-only gate removed**: The summary call path no longer rejects an otherwise valid payload early just because the provider response has no `choices`.
+- **Clearer empty-summary errors**: When the response contains no final summary text, the filter now reports a compact description of the response structure instead of an ambiguous generic format error.
+
+## Behavior Notes
+
+- **Reasoning-only output is ignored**: `reasoning_content`, `thinking`, and reasoning-type output items are not treated as summary text, so the model's thinking process is not written into chat memory.
+- **No change to the 1.6.3 fail-mode control**: `SUMMARY_FAIL_MODE` still decides whether an upstream summary-call failure is skipped silently or raised.
+
+## Migration Notes
+
+No breaking changes. If a provider returns only reasoning fields and no final answer text, the filter skips saving a summary for that turn and logs the response structure for debugging.
\ No newline at end of file
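
The parsing behavior described in the release notes above can be sketched as follows. This is a minimal standalone illustration, not the plugin's actual code: the function name `extract_summary_text`, the `REASONING_TYPES` set, and the body of `collect_text` are assumptions, since the patch shows only part of the real `_extract_summary_text_from_response` helper. The payload shapes are taken from the test cases in this patch.

```python
from typing import Any

# Assumption: output items tagged with these types carry reasoning, not the answer.
REASONING_TYPES = {"reasoning", "thinking"}


def collect_text(value: Any) -> str:
    """Recursively gather final-answer text, skipping reasoning items."""
    if isinstance(value, str):
        return value
    if isinstance(value, list):
        pieces = [collect_text(item) for item in value]
        return "\n".join(piece for piece in pieces if piece.strip())
    if isinstance(value, dict):
        if value.get("type") in REASONING_TYPES:
            return ""
        # Only "text" and "content" are read, so sibling fields such as
        # "reasoning_content" never reach the summary.
        for key in ("text", "content"):
            text = collect_text(value.get(key))
            if text.strip():
                return text
    return ""


def extract_summary_text(response: Any) -> str:
    """Return summary text from a chat-completions or Responses-style payload."""
    if not isinstance(response, dict):
        return ""

    # Classic chat-completions shape: choices[0].message.content
    choices = response.get("choices")
    if isinstance(choices, list) and choices and isinstance(choices[0], dict):
        message = choices[0].get("message")
        if isinstance(message, dict):
            text = collect_text(message.get("content"))
            if text.strip():
                return text.strip()

    # Top-level fallback keys, as listed in the patch.
    for key in ("text", "content", "message", "response"):
        text = collect_text(response.get(key))
        if text.strip():
            return text.strip()

    # Responses-style shape: a list of output items.
    output = response.get("output")
    if isinstance(output, list):
        text = collect_text(output)
        if text.strip():
            return text.strip()

    return ""
```

An empty return value is what lets the caller raise the "LLM response did not contain summary text" error with a response-shape description, rather than failing on a missing `choices` key.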