fix: eval tool_call_response uses correct event field names#2302
Merged
fix: eval tool_call_response uses correct event field names#2302
Conversation
SessionFromEvents and buildTranscript were looking for a non-existent "tool_call" nested map in tool_call_response events. The actual ToolCallResponseEvent serializes with "tool_call_id" (top-level string) and "tool_definition" (top-level object), so the type assertions always failed silently — tool call results were never written to sessions and transcript tool names were empty. Fix both functions to use the correct field names and update tests to match the real ToolCallResponseEvent JSON format. Assisted-By: docker-agent
Assisted-By: docker-agent
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
This PR correctly updates the event parsing logic to match the new tool_call_response event structure. The changes are well-documented with inline comments explaining the structural change, and all test fixtures have been updated consistently.
Key improvements:
- ✅ Proper nil guard added for
tool_definitionineval.goto prevent panics - ✅ Consistent field access pattern across both
eval.goandsave.go - ✅ Test fixtures updated to match the new structure
- ✅ Clear explanatory comments added
No bugs found in the changed code.
dgageot
approved these changes
Apr 1, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix incorrect parsing of
tool_call_responseevents in the evaluation package.Problem
The
tool_call_responseevent structure was updated so thattool_call_idandtool_definitionare top-level fields, but the parsing code was still reading from the old nestedtool_callstructure. This caused:save.go: tool response messages were silently dropped from reconstructed sessionseval.go: tool names showed up as empty in evaluation transcriptsAdditionally,
buildTranscripthad no nil guard ontool_definition, which could cause a runtime panic if the field was missing.Changes
save.go: readtool_call_idfrom the top-level event field instead oftool_call.ideval.go: readtool_definition.namefrom the top-level event field instead of viagetToolCallInfo, with a nil guard to prevent panicseval_test.go,save_test.go: updated test fixtures to match the correct event structure