feat: add external tool plugins, runtime tool policies, and OpenAI tool-name aliasing #15

Open
iamthehobbit wants to merge 9 commits into ShinMegamiBoson:main from iamthehobbit:op-pr3-ext-plugins-policy-openai

Conversation

@iamthehobbit

Summary

This PR adds three related capabilities needed for safe external tool integrations:

  1. External plugin loading (OPENPLANTER_TOOL_MODULES)
  2. Runtime tool policy guardrails (allowlist + confirmation metadata hooks)
  3. OpenAI compatibility for namespaced/dotted tool names via aliasing

It also includes a small OpenAI compatibility retry improvement for reasoning_effort.

What’s included

External plugins

  • Adds plugin loader (agent/plugin_loader.py)
  • Loads external plugin modules from OPENPLANTER_TOOL_MODULES
  • Registers external plugins into the existing tool registry
  • Updates tests for loader/config/engine integration
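
For reviewers who want a concrete picture of the loading flow listed above, here is a minimal sketch. It is not the code in agent/plugin_loader.py. The comma-separated format, the register_tools hook name, the PluginLoadError class, and the dict-based registry are all assumptions made for illustration.

```python
import importlib
import os
from typing import Callable, Dict, List, Optional

ENV_VAR = "OPENPLANTER_TOOL_MODULES"  # variable name taken from the PR summary


class PluginLoadError(RuntimeError):
    """Hypothetical error type for a plugin module that cannot be loaded."""


def parse_module_list(raw: str) -> List[str]:
    """Split a comma-separated list (assumed format), dropping blanks and duplicates."""
    names: List[str] = []
    for part in raw.split(","):
        name = part.strip()
        if name and name not in names:
            names.append(name)
    return names


def load_plugins(registry: Dict[str, Callable], raw: Optional[str] = None) -> List[str]:
    """Import each configured module and let it register its tools.

    Assumes each plugin module exposes a callable named register_tools(registry).
    That hook name is hypothetical and not confirmed by the PR.
    """
    if raw is None:
        raw = os.environ.get(ENV_VAR, "")
    loaded: List[str] = []
    for module_name in parse_module_list(raw):
        try:
            module = importlib.import_module(module_name)
        except ImportError as exc:
            raise PluginLoadError(
                f"cannot import plugin module {module_name!r}: {exc}"
            ) from exc
        hook = getattr(module, "register_tools", None)
        if not callable(hook):
            raise PluginLoadError(
                f"plugin module {module_name!r} has no register_tools() hook"
            )
        hook(registry)
        loaded.append(module_name)
    return loaded
```

Failing loudly on a bad module name is one possible design. The actual loader may instead log and skip; the test list only says that both success and error paths are covered.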

Runtime tool policy guardrails

  • ToolPlugin.policy metadata support in registry
  • Engine enforcement hooks:
    • allowlist via OPENPLANTER_ALLOWED_TOOLS (glob patterns)
    • confirmation callback for tools marked requires_confirmation
  • Tests for allowlist + confirmation behavior
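
The two enforcement hooks above can be sketched as a single pre-dispatch check. This is an illustration under stated assumptions, not the engine code: ToolPolicy here is a stand-in dataclass, check_tool_call and parse_allowlist are hypothetical names, the allowlist is assumed to be comma-separated, and an empty allowlist is assumed to mean "allow everything".

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class ToolPolicy:
    """Stand-in for the policy metadata attached to a tool (assumed shape)."""
    requires_confirmation: bool = False


def parse_allowlist(raw: str) -> List[str]:
    """Split an OPENPLANTER_ALLOWED_TOOLS-style value into glob patterns."""
    return [p.strip() for p in raw.split(",") if p.strip()]


def check_tool_call(
    name: str,
    policy: ToolPolicy,
    allowed_patterns: Sequence[str],
    confirm: Optional[Callable[[str], bool]] = None,
) -> Optional[str]:
    """Return None if the call may proceed, otherwise a reason for blocking it."""
    # Allowlist: the tool name must match at least one glob pattern.
    # Assumption: an empty allowlist disables the check entirely.
    if allowed_patterns and not any(fnmatchcase(name, p) for p in allowed_patterns):
        return f"tool {name!r} is not in the allowlist"
    # Confirmation: tools marked requires_confirmation need an approving callback.
    if policy.requires_confirmation:
        if confirm is None or not confirm(name):
            return f"tool {name!r} requires confirmation and was not approved"
    return None
```

For example, with the pattern list ["foo.*"], a call to foo.bar passes the allowlist while other.tool is blocked. In this sketch a missing callback is treated as a denial, which is the conservative choice; the PR does not say how the engine handles that case.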

OpenAI tool-name aliasing (fix)

  • OpenAI APIs reject dotted tool names (e.g. foo.bar)
  • Adds provider-side aliasing for OpenAI-facing tool definitions
  • Maps aliased OpenAI tool calls back to canonical tool names before dispatch
  • Maps tool result messages back to OpenAI-safe names
  • Adds alias collision detection
  • Tests for alias conversion and parser/result mapping
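
A rough sketch of the aliasing round trip described above, for illustration only. The function name build_aliases, the choice of "_" as the replacement character, and the exact set of characters treated as unsafe are assumptions; the PR only states that dotted names are rejected and that collisions are detected.

```python
import re
from typing import Dict, Iterable, Tuple

# Assumption: anything outside letters, digits, "_" and "-" is treated as unsafe.
_UNSAFE = re.compile(r"[^a-zA-Z0-9_-]")


def build_aliases(names: Iterable[str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Build canonical->alias and alias->canonical maps, failing on collisions."""
    to_alias: Dict[str, str] = {}
    to_canonical: Dict[str, str] = {}
    for name in names:
        alias = _UNSAFE.sub("_", name)
        existing = to_canonical.get(alias)
        if existing is not None and existing != name:
            # Two distinct canonical names would become indistinguishable.
            raise ValueError(
                f"alias collision: {existing!r} and {name!r} both map to {alias!r}"
            )
        to_alias[name] = alias
        to_canonical[alias] = name
    return to_alias, to_canonical


# Outbound: tool definitions and tool-result messages use to_alias[name].
# Inbound: tool calls from the model are mapped back via to_canonical[alias].
to_alias, to_canonical = build_aliases(["foo.bar", "read_file"])
```

The collision check matters because a simple substitution is lossy: under this scheme foo.bar and foo_bar would both become foo_bar, so the reverse lookup could dispatch to the wrong tool.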

OpenAI reasoning retry compatibility

  • Retries without reasoning_effort for additional API error wording:
    • "Unrecognized request argument supplied: reasoning_effort"
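
The retry behavior can be pictured as below. This is a sketch, not the provider code: ApiError, call_with_reasoning_fallback, and the send callable are hypothetical stand-ins for whatever the model layer actually uses, and only the error string is taken from this PR.

```python
from typing import Any, Callable, Dict

# Error wording named in this PR. Wordings the code already handled before
# this change are not listed here.
UNSUPPORTED_MARKERS = (
    "Unrecognized request argument supplied: reasoning_effort",
)


class ApiError(Exception):
    """Hypothetical stand-in for the provider's request error type."""


def call_with_reasoning_fallback(
    send: Callable[[Dict[str, Any]], Any],
    payload: Dict[str, Any],
) -> Any:
    """Send the request; on a known reasoning_effort rejection, retry once without it."""
    try:
        return send(payload)
    except ApiError as exc:
        rejected = any(marker in str(exc) for marker in UNSUPPORTED_MARKERS)
        if "reasoning_effort" not in payload or not rejected:
            raise
        retry_payload = {k: v for k, v in payload.items() if k != "reasoning_effort"}
        return send(retry_payload)
```

Matching on specific error text keeps unrelated failures (auth, rate limits, malformed requests) from being silently retried.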

Tests Added / Coverage

  • tests/test_plugin_loader.py (new)
    • module list parsing
    • plugin module loading success/error paths
  • tests/test_coverage_gaps.py
    • AgentConfig.from_env() parsing for OPENPLANTER_TOOL_MODULES
    • AgentConfig.from_env() parsing for OPENPLANTER_ALLOWED_TOOLS
  • tests/test_engine.py
    • external plugin registration/invocation
    • allowlist blocks disallowed tools
    • confirmation-policy block/approval paths
  • tests/test_tool_defs.py
    • OpenAI alias generation for dotted names
    • alias collision detection
  • tests/test_model.py
    • OpenAI aliased tool-call names map back to canonical names
    • OpenAI tool-result messages use aliased names
    • reasoning_effort retry on “Unrecognized request argument” error

Why

This enables real external integrations (e.g. namespaced plugin tools) while preserving safety and compatibility across providers.

Notes

This PR combines these features because they were implemented together across overlapping engine/registry/provider files during a real integration effort. I can split further if preferred.

ShinMegamiBoson and others added 9 commits February 21, 2026 19:58
… and tests

New data sources cataloged with fetch scripts and validation tests:
- FEC federal campaign finance (API + bulk)
- USASpending.gov federal contracts (API)
- SAM.gov contractor registrations (API)
- SEC EDGAR public company filings (API)
- FDIC BankFind institution data (API)
- ProPublica Nonprofit Explorer / IRS 990 (API)
- Senate lobbying disclosures LD-1/LD-2 (bulk XML)
- EPA ECHO enforcement & compliance (API)
- OSHA inspection data (API)
- OFAC SDN sanctions list (bulk CSV)
- ICIJ Offshore Leaks database (bulk CSV)
- US Census Bureau ACS (API)

All scripts use Python stdlib only. 104 new tests pass (18 skip
without API keys or network). Wiki index updated with new categories.

- Replace _ThinkingDisplay with _ActivityDisplay supporting three modes:
  thinking (cyan), streaming response (green), tool execution (yellow)
- Auto-transition from thinking→streaming on first text delta
- Show step counter (Step N/max) in activity spinner
- Show tool name and key argument during tool execution
- Add engine cancellation via threading.Event (_cancel flag)
- Run agent in background thread so user can type next question
- Queue typed input during agent execution (FIFO)
- ESC key binding cancels the running agent
- Add 13 tests covering ActivityDisplay, cancellation, and queuing
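
The cancellation mechanism in the commit message above (a threading.Event checked by the engine while it runs in a background thread) could look roughly like this. SketchEngine and its methods are hypothetical; only the use of threading.Event and the _cancel attribute name come from the commit text.

```python
import threading
from typing import Callable


class SketchEngine:
    """Hypothetical engine loop that checks a cancel flag between steps."""

    def __init__(self, max_steps: int = 10) -> None:
        self._cancel = threading.Event()
        self.max_steps = max_steps
        self.steps_done = 0

    def cancel(self) -> None:
        """Request cancellation; safe to call from another thread."""
        self._cancel.set()

    def run(self, step: Callable[[int], None]) -> str:
        """Run up to max_steps steps, stopping early if cancel() was called."""
        self._cancel.clear()
        self.steps_done = 0
        for n in range(1, self.max_steps + 1):
            if self._cancel.is_set():
                return "cancelled"
            step(n)
            self.steps_done = n
        return "finished"


# Usage: run in a background thread so the caller stays free to accept input.
engine = SketchEngine(max_steps=5)
worker = threading.Thread(target=engine.run, args=(lambda n: None,))
worker.start()
worker.join()
```

Because the flag is only checked between steps, a long-running step finishes before cancellation takes effect in this sketch.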
Extends ToolResult with optional ImageData payload and adds a read_image
tool that reads PNG/JPEG/GIF/WebP files, base64-encodes them, and passes
them through to the model layer in provider-specific formats (Anthropic
content blocks, OpenAI data URI in user messages).
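
As an illustration of the data URI form mentioned in the commit message above, the sketch below base64-encodes an image file into a data URI. The function name image_to_data_uri and the suffix-based type detection are assumptions; the actual read_image tool may detect formats differently.

```python
import base64
from pathlib import Path

# Formats listed in the commit message, keyed by file suffix (assumed mapping).
_MEDIA_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_to_data_uri(path: str) -> str:
    """Read an image file and return it as a base64 data URI."""
    p = Path(path)
    media_type = _MEDIA_TYPES.get(p.suffix.lower())
    if media_type is None:
        raise ValueError(f"unsupported image type: {p.suffix!r}")
    encoded = base64.b64encode(p.read_bytes()).decode("ascii")
    return f"data:{media_type};base64,{encoded}"
```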
Add 5 reusable analysis scripts:
- quickstart_investigation.py: starter template for investigations
- scripts/entity_resolution.py: entity linking pipeline
- scripts/cross_link_analysis.py: cross-referencing engine
- scripts/build_findings_json.py: report synthesis utility
- scripts/timing_analysis.py: statistical timing correlation

Fix TUI flicker by making _ActivityDisplay a Rich renderable (__rich__
protocol) so Live's 8fps auto-refresh polls state instead of feed()
forcing update() on every token delta.

Tell the agent about replay.jsonl, events.jsonl, and state.json in its
session directory so it can read its own prior transcripts and recall
earlier work within a session.

…ape sequences

Root cause: prompt_toolkit's patch_stdout() wraps sys.stdout with StdoutProxy
which corrupts Rich's ANSI escape sequences — replacing ESC bytes (0x1b) with
'?' (0x3f). This caused raw escape codes like ?[2K?[1A?[2K to appear as
visible text instead of being interpreted by the terminal.

Fix: scope patch_stdout() to only wrap session.prompt(), not the entire main
loop. During agent execution, Rich's Live writes directly to the real stdout
with correct escape sequences. Also remove the secondary prompt loop (which
compounded the issue) and switch cancellation from ESC to Ctrl+C.

Verified via PTY test: 36 correct ESC sequences, 0 corrupted (was 0/16).
@iamthehobbit

This PR is part of a tested integration series. Suggested review/merge order: #14 -> #16 -> #15 -> #17.

Recommended to review after #16 (it builds on the plugin-primary registry/tool export work). Opened against main due to upstream base-branch constraints for fork-only stacked branches.
