fix: support multimodal image content in OpenAI provider by buuzzy · Pull Request #14 · codeany-ai/open-agent-sdk-typescript

buuzzy · 2026-04-17T01:19:05Z

The OpenAI provider drops { type: 'image' } content blocks during message conversion, so images are never sent to the model when using OpenAI-compatible endpoints.

Changes:

convertUserMessage() now converts Anthropic-style image blocks (source: { type: 'base64', media_type, data }) to OpenAI's image_url format
Also supports URL-based image sources
Text-only messages keep simple string format for backward compatibility
query() prompt type widened to string | any[] for multimodal content arrays
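
For illustration, the conversion described above could look roughly like the sketch below. It is not the code from this PR: the type names and the `toOpenAIContent` helper are hypothetical, `tool_result` handling is left out, and joining text-only blocks with a newline is an assumption. The data-URL form for base64 sources and `detail: 'high'` follow the PR description.

```typescript
// Hypothetical block types for this sketch; the SDK's real definitions may differ.
type TextBlock = { type: 'text'; text: string };
type ImageSource =
  | { type: 'base64'; media_type: string; data: string }
  | { type: 'url'; url: string };
type ImageBlock = { type: 'image'; source: ImageSource };
type ContentBlock = TextBlock | ImageBlock;

// OpenAI chat content parts (text and image_url only).
type OpenAIPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string; detail: 'high' } };

// Hypothetical helper, not the PR's convertUserMessage().
function toOpenAIContent(content: string | ContentBlock[]): string | OpenAIPart[] {
  // Plain string prompts pass through unchanged.
  if (typeof content === 'string') return content;

  // Text-only messages keep the simple string format (separator is an assumption).
  const hasImage = content.some((block) => block.type === 'image');
  if (!hasImage) {
    return content
      .filter((block): block is TextBlock => block.type === 'text')
      .map((block) => block.text)
      .join('\n');
  }

  // Mixed content becomes an array of OpenAI content parts.
  return content.map((block): OpenAIPart => {
    if (block.type === 'text') return { type: 'text', text: block.text };
    const src = block.source;
    // base64 sources are wrapped in a data URL; URL sources are passed as-is.
    const url =
      src.type === 'base64' ? `data:${src.media_type};base64,${src.data}` : src.url;
    return { type: 'image_url', image_url: { url, detail: 'high' } };
  });
}
```

With this shape, a message containing one text block and one base64 PNG yields a two-element array whose second part carries a `data:image/png;base64,...` URL, while a message with only text blocks still produces a plain string.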

The OpenAI provider's convertUserMessage() only handled 'text' and 'tool_result' content blocks, silently dropping 'image' blocks.

Changes:
- Convert Anthropic-style image blocks to OpenAI image_url format
- Support base64 and URL image sources
- Use detail:'high' for best recognition quality
- Fall back to string content when no images present
- Accept string | any[] prompt in query() for multimodal content

buuzzy mentioned this pull request Apr 17, 2026

fix: pass images as multimodal content instead of file-based Read tool · workany-ai/workany#54 (Open)

buuzzy wants to merge 1 commit into codeany-ai:main from buuzzy:fix/openai-multimodal-image
