block structured text payloads from user replies by jamiepine · Pull Request #209 · spacedriveapp/spacebot

jamiepine · 2026-02-25T05:18:12Z

Summary

Add a shared should_block_user_visible_text guard to detect tool-like/structured payloads (JSON, bracket tool syntax, wrapper tags) before anything user-visible is sent.
Update channel turn handling to retry with a correction prompt up to 2 times when blocked output appears, nudging the model to call reply with plain text or skip.
Enforce the same guard inside the reply tool and expand the correction prompt wording so malformed payloads are blocked consistently across fallback and tool paths.

Example

if crate::tools::should_block_user_visible_text(text) {
    tracing::warn!(channel_id = %self.id, "blocked fallback output containing structured or tool syntax");
}

Testing

cargo test blocks_json_bracket_and_tool_syntax_output
cargo test allows_normal_plaintext_output

Note

This PR prevents LLM-generated structured payloads (JSON, bracket syntax, XML-like tags) from reaching end users as plain text. The solution introduces a shared validation function should_block_user_visible_text that filters output across three paths: the channel recovery loop (up to 2 retries), the fallback handler when retrigger produces text, and the reply tool directly. The correction prompt is now more explicit about what constitutes invalid output. All changes are in agent loop and tool enforcement with comprehensive test coverage.
_{Written by Tembo for commit 1d35d12.}

coderabbitai · 2026-02-25T05:18:22Z

Walkthrough

The changes implement safeguards to prevent LLMs from outputting tool syntax or structured payloads to end users. Updates include a revised system prompt, a new detection function, retry logic for recovery attempts, and a runtime guard in the reply tool to block problematic content.

Changes

Cohort / File(s)	Summary
System Prompt Update `prompts/en/fragments/system/tool_syntax_correction.md.j2`	Updated system message to explicitly disallow tool call syntax, structured payloads, and JSON/XML-like output, requiring plain natural-language replies only.
Content Detection & Blocking `src/tools.rs`, `src/tools/reply.rs`	Added `should_block_user_visible_text()` function to detect structured/tool payloads with unit tests. Integrated a runtime guard in ReplyTool::call to block and log detected problematic content before sending.
Tool Syntax Recovery Mechanism `src/agent/channel.rs`	Implemented retry loop (up to 2 attempts) for recovering from malformed tool syntax output. Added conditional branches to differentiate blocked vs. safe content, improved logging with channel_id and attempt counters, and adjusted fallback send paths accordingly.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'block structured text payloads from user replies' directly and clearly summarizes the main objective of the PR, which is to prevent LLM-generated structured payloads from reaching end users.
Description check	✅ Passed	The description is directly related to the changeset, providing a clear summary of the changes, examples of implementation, testing instructions, and detailed notes about how the solution works across three code paths.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/block-structured-channel-output

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

tembo · 2026-02-25T05:45:16Z

src/tools.rs

+    let trimmed = value.trim_start();
+    if trimmed.is_empty() {
+        return false;
+    }
+
+    if trimmed.starts_with('{') || trimmed.starts_with('[') {
+        return true;
+    }
+
+    let lower = trimmed.to_ascii_lowercase();
+    if TOOL_PREFIXES.iter().any(|prefix| lower.starts_with(prefix)) {
+        return true;
+    }
+
+    lower.starts_with("<system-reminder>") || lower.starts_with("<path>")


should_block_user_visible_text currently blocks any output starting with [ (so it’ll reject legit plaintext like [FYI] ...), and it also makes the "[reply]"/etc entries in TOOL_PREFIXES unreachable because the early starts_with('[') returns first. Suggestion: only block [ when it matches known tool prefixes, or when it parses as JSON (so bracketed prose is allowed).

Suggested change

let trimmed = value.trim_start();

if trimmed.is_empty() {

return false;

}

if trimmed.starts_with('{') || trimmed.starts_with('[') {

return true;

}

let lower = trimmed.to_ascii_lowercase();

if TOOL_PREFIXES.iter().any(|prefix| lower.starts_with(prefix)) {

return true;

}

lower.starts_with("<system-reminder>") || lower.starts_with("<path>")

let trimmed = value.trim_start();

if trimmed.is_empty() {

return false;

}

if trimmed.starts_with('{') {

return true;

}

let lower = trimmed.to_ascii_lowercase();

if TOOL_PREFIXES.iter().any(|prefix| lower.starts_with(prefix)) {

return true;

}

if trimmed.starts_with('[') && serde_json::from_str::<serde_json::Value>(trimmed).is_ok() {

return true;

}

lower.starts_with("<system-reminder>") || lower.starts_with("<path>")

Might also be worth adding a small regression test like assert!(!should_block_user_visible_text("[FYI] hello")); so this doesn’t accidentally get re-introduced.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

src/agent/channel.rs (1)

1421-1455: Consolidate repeated fallback-send logic to prevent policy drift.

The same block-check → extract/normalize → log/send flow appears in three branches. Centralizing this into one helper would make future guard changes consistent across retrigger and non-retrigger paths.

♻️ Refactor sketch

- if crate::tools::should_block_user_visible_text(text) {
-     ...
- } else {
-     let extracted = extract_reply_from_tool_syntax(text);
-     ...
-     if !final_text.is_empty() {
-         ...
-         self.response_tx.send(OutboundResponse::Text(final_text)).await
-     }
- }
+ self.send_safe_fallback_text(text, is_retrigger, skipped).await;

async fn send_safe_fallback_text(&self, text: &str, is_retrigger: bool, skipped: bool) {
    // single place for block check, extraction, mention normalization, logging, and send
}

Also applies to: 1474-1506, 1520-1552

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel.rs` around lines 1421 - 1455, There are three duplicated
flows (block check via crate::tools::should_block_user_visible_text,
extract_reply_from_tool_syntax,
crate::tools::reply::normalize_discord_mention_tokens,
conversation_logger.log_bot_message, and sending via
response_tx.send(OutboundResponse::Text)) scattered across retrigger and
non-retrigger branches; consolidate them into a single async helper method
(e.g., send_safe_fallback_text(&self, text: &str, is_retrigger: bool, skipped:
bool)) that runs the block check, performs extraction with
extract_reply_from_tool_syntax, normalizes mentions with
normalize_discord_mention_tokens (using the same source lookup from
self.conversation_id), logs warnings when extraction occurred, logs the final
bot message via self.state.conversation_logger.log_bot_message, and sends via
self.response_tx.send, preserving the existing tracing.warn/error messages and
returned behavior; replace the three duplicated blocks with calls to this helper
so future guard/policy changes are centralized.

src/tools.rs (1)

534-550: Add regression tests for bracket-prefixed plaintext and generic wrapper tags.

Current tests validate core cases, but they don’t protect against the Line 209 false-positive class or generic wrapper-tag detection gaps.

🧪 Suggested test additions

     #[test]
     fn allows_normal_plaintext_output() {
         assert!(!should_block_user_visible_text("hello team"));
         assert!(!should_block_user_visible_text("- first\n- second"));
+        assert!(!should_block_user_visible_text("[FYI] deployment finished"));
     }
+
+    #[test]
+    fn blocks_generic_wrapper_tag_output() {
+        assert!(should_block_user_visible_text(
+            "<file name=\"notes.txt\">hello</file>"
+        ));
+    }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/tools.rs` around lines 534 - 550, Add regression tests to strengthen
should_block_user_visible_text: include negative assertions that plaintext
starting with bracket-prefixed labels (e.g., "[note] hello" or "[reply] hello"
where the content is plain text, not JSON/array) should NOT be blocked, and
include negative assertions for generic wrapper-like tags (e.g.,
"<wrapper>visible</wrapper>" or common tags like "<div>visible</div>") to ensure
the generic tag heuristic doesn't produce false positives; update the existing
test module (the functions blocks_json_bracket_and_tool_syntax_output and
allows_normal_plaintext_output) by adding these specific
assert!(!should_block_user_visible_text(...)) cases referencing
should_block_user_visible_text so future changes that reintroduce the Line 209
false-positive class or wrapper-tag detection gaps will be caught.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/tools.rs`:
- Around line 209-219: The current matcher over-blocks any text starting with
'[' and only allows two hardcoded XML tags, causing false positives/negatives;
update the helper that uses trimmed and TOOL_PREFIXES so it: (1) only treats
leading '{' as a tool if the text is valid JSON/object (attempt a quick
serde_json::from_str check), (2) treat leading '[' as a tool only when it
matches a tool invocation pattern like '^\[\s*[A-Za-z0-9_-]+.*\]' (use a small
regex or manual parse to ensure the bracketed token looks like a tool name, not
a generic bracketed heading), and (3) replace the hardcoded tag checks (the
current lower.starts_with("<system-reminder>") || lower.starts_with("<path>"))
with a generic XML-wrapper detection that matches a single opening tag like
'^<\s*[A-Za-z0-9_-]+>' (use a case-insensitive regex on trimmed or lower) so
arbitrary XML-like wrapper tags are caught; keep the existing TOOL_PREFIXES
check intact. Ensure to reference TOOL_PREFIXES and the trimmed variable when
editing this helper.

---

Nitpick comments:
In `@src/agent/channel.rs`:
- Around line 1421-1455: There are three duplicated flows (block check via
crate::tools::should_block_user_visible_text, extract_reply_from_tool_syntax,
crate::tools::reply::normalize_discord_mention_tokens,
conversation_logger.log_bot_message, and sending via
response_tx.send(OutboundResponse::Text)) scattered across retrigger and
non-retrigger branches; consolidate them into a single async helper method
(e.g., send_safe_fallback_text(&self, text: &str, is_retrigger: bool, skipped:
bool)) that runs the block check, performs extraction with
extract_reply_from_tool_syntax, normalizes mentions with
normalize_discord_mention_tokens (using the same source lookup from
self.conversation_id), logs warnings when extraction occurred, logs the final
bot message via self.state.conversation_logger.log_bot_message, and sends via
self.response_tx.send, preserving the existing tracing.warn/error messages and
returned behavior; replace the three duplicated blocks with calls to this helper
so future guard/policy changes are centralized.

In `@src/tools.rs`:
- Around line 534-550: Add regression tests to strengthen
should_block_user_visible_text: include negative assertions that plaintext
starting with bracket-prefixed labels (e.g., "[note] hello" or "[reply] hello"
where the content is plain text, not JSON/array) should NOT be blocked, and
include negative assertions for generic wrapper-like tags (e.g.,
"<wrapper>visible</wrapper>" or common tags like "<div>visible</div>") to ensure
the generic tag heuristic doesn't produce false positives; update the existing
test module (the functions blocks_json_bracket_and_tool_syntax_output and
allows_normal_plaintext_output) by adding these specific
assert!(!should_block_user_visible_text(...)) cases referencing
should_block_user_visible_text so future changes that reintroduce the Line 209
false-positive class or wrapper-tag detection gaps will be caught.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between faf5a9f and 1d35d12.

📒 Files selected for processing (4)

prompts/en/fragments/system/tool_syntax_correction.md.j2
src/agent/channel.rs
src/tools.rs
src/tools/reply.rs

coderabbitai · 2026-02-25T05:47:04Z

src/tools.rs

+    if trimmed.starts_with('{') || trimmed.starts_with('[') {
+        return true;
+    }
+
+    let lower = trimmed.to_ascii_lowercase();
+    if TOOL_PREFIXES.iter().any(|prefix| lower.starts_with(prefix)) {
+        return true;
+    }
+
+    lower.starts_with("<system-reminder>") || lower.starts_with("<path>")
+}


⚠️ Potential issue | 🟠 Major

Refine matcher scope to avoid both over-blocking and wrapper-tag bypasses.

Line 209 blocks any leading [ text, so valid replies like bracketed headings can be suppressed. Lines 218-219 only check two hardcoded tags, so other XML-like wrappers can still pass. This helper is shared by fallback + reply, so these mismatches affect all user-visible paths.

🔧 Proposed fix

- if trimmed.starts_with('{') || trimmed.starts_with('[') { + if (trimmed.starts_with('{') || trimmed.starts_with('[')) + && serde_json::from_str::<serde_json::Value>(trimmed).is_ok() + { return true; } @@ - lower.starts_with("<system-reminder>") || lower.starts_with("<path>") + if lower.starts_with("<system-reminder>") || lower.starts_with("<path>") { + return true; + } + + // Block generic XML-like wrapper tags at start of message, e.g. <file ...>, <path>, <system-reminder>. + if let Some(rest) = trimmed.strip_prefix('<') { + if let Some(end_index) = rest.find('>') { + let tag_name = rest[..end_index] + .split_whitespace() + .next() + .unwrap_or(""); + if !tag_name.is_empty() + && tag_name + .chars() + .all(|character| character.is_ascii_alphanumeric() || character == '_' || character == '-') + { + return true; + } + } + } + + false }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/tools.rs` around lines 209 - 219, The current matcher over-blocks any text starting with '[' and only allows two hardcoded XML tags, causing false positives/negatives; update the helper that uses trimmed and TOOL_PREFIXES so it: (1) only treats leading '{' as a tool if the text is valid JSON/object (attempt a quick serde_json::from_str check), (2) treat leading '[' as a tool only when it matches a tool invocation pattern like '^\[\s*[A-Za-z0-9_-]+.*\]' (use a small regex or manual parse to ensure the bracketed token looks like a tool name, not a generic bracketed heading), and (3) replace the hardcoded tag checks (the current lower.starts_with("<system-reminder>") || lower.starts_with("<path>")) with a generic XML-wrapper detection that matches a single opening tag like '^<\s*[A-Za-z0-9_-]+>' (use a case-insensitive regex on trimmed or lower) so arbitrary XML-like wrapper tags are caught; keep the existing TOOL_PREFIXES check intact. Ensure to reference TOOL_PREFIXES and the trimmed variable when editing this helper.

block structured text replies and retry with correction

1d35d12

jamiepine marked this pull request as ready for review February 25, 2026 05:40

jamiepine merged commit ee8b6dd into main Feb 25, 2026
3 of 4 checks passed

tembo bot reviewed Feb 25, 2026

View reviewed changes

coderabbitai bot reviewed Feb 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

block structured text payloads from user replies#209

block structured text payloads from user replies#209
jamiepine merged 1 commit intomainfrom
fix/block-structured-channel-output

jamiepine commented Feb 25, 2026 •

edited by tembo bot

Loading

Uh oh!

coderabbitai bot commented Feb 25, 2026 •

edited

Loading

Uh oh!

Uh oh!

tembo bot Feb 25, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jamiepine commented Feb 25, 2026 • edited by tembo bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Example

Testing

Uh oh!

coderabbitai bot commented Feb 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

Uh oh!

tembo bot Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jamiepine commented Feb 25, 2026 •

edited by tembo bot

Loading

coderabbitai bot commented Feb 25, 2026 •

edited

Loading