- V1 scope is image uploads only.
- V1 document types are invoices, receipts, and unknown.
- Direct OpenAI hosting is not assumed.
- Together AI is the initial provider target through its OpenAI-compatible chat completions API.
- The initial model default should be configuration-driven, with
google/gemma-4-31B-itas the first candidate because Together currently lists it with image input, JSON mode, and function-calling support. - Semantic Kernel remains a first-class requirement, not a decorative wrapper.
- Native C# Semantic Kernel plugins should handle deterministic business policy checks such as vendor matching and approval limits.
- Created a temporary console spike at
spikes/MilestoneZeroSpike. - Added
Microsoft.SemanticKernel.Connectors.OpenAIversion1.75.0. - Configured Semantic Kernel with:
- model id:
google/gemma-4-31B-it - endpoint:
https://api.together.xyz/v1 - service id:
together-vision
- model id:
- Used a fake HTTP handler to capture the outbound request without requiring a Together API key.
- The spike compiled and ran successfully.
- Semantic Kernel posted to
https://api.together.xyz/v1/chat/completions. - The request body used OpenAI-compatible multimodal chat content:
- text item
image_urlitem containing adata:image/png;base64,...URL
OpenAIPromptExecutionSettings { ResponseFormat = "json_object" }serialized as:
{
"response_format": {
"type": "json_object"
}
}- After
TOGETHER_API_KEYwas configured, the live Together call succeeded through Semantic Kernel usinggoogle/gemma-4-31B-it. - The live response returned valid JSON. The category was
Unknown, as expected for the synthetic one-pixel blank PNG used by the transport spike.
- The preferred path is viable enough to proceed into Milestone 1.
- The live provider transport path has been verified.
- The application should still isolate model calls behind app-owned classifier/extractor interfaces so that provider quirks do not leak through the codebase.
- If the live Together request rejects SK's image payload shape, the fallback is to keep SK for orchestration/plugins and use a narrow
TogetherVisionClientonly for the image call.
- Pin package versions rather than relying on "latest".
- Make provider endpoint, model id, and API key environment variable configurable.
- Add an early integration-test path that can be skipped when
TOGETHER_API_KEYis not set.