fix(api/task): guard empty results list in process_results#1305

Draft
Luodian wants to merge 1 commit into
EvolvingLMMs-Lab:mainfrom
Luodian:fix/guard-empty-generate-until-results
@Luodian Luodian commented Apr 23, 2026

Summary

For generate_until / generate_visual_cot tasks, when a sample's generation fails upstream (model raised, retries exhausted, returned []), the results list handed to ConfigurableTask.process_results can be empty. The existing condition then evaluates results[0] while checking for the list-of-lists shape and raises IndexError, which aborts the whole postprocess loop for that task, not just the missing sample.
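
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the library's code: `normalize_unguarded` and `postprocess_all` are hypothetical stand-ins that only mirror the pre-fix condition and a per-sample loop.

```python
def normalize_unguarded(results):
    """Hypothetical stand-in mirroring the pre-fix condition."""
    # results[0] is evaluated inside the condition, so an empty list
    # raises IndexError before either branch is chosen.
    if isinstance(results, list) and isinstance(results[0], list):
        return [res.strip() for res in results[0]]
    return [res.strip() for res in results]


def postprocess_all(per_sample_results):
    """Hypothetical per-sample loop: one exception stops every later sample."""
    return [normalize_unguarded(results) for results in per_sample_results]


if __name__ == "__main__":
    # The second sample stands for a generation that failed upstream.
    batch = [[" ok "], [], [" also ok "]]
    try:
        postprocess_all(batch)
    except IndexError as exc:
        print(f"postprocess aborted: {exc!r}")
```

In this sketch the third sample is never processed, which is the "takes down the whole task" behavior described above.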

Change

Add a leading `results and` to the condition so that an empty list short-circuits before results[0] is evaluated and routes to the else branch (which does [res.strip() for res in results] and produces []). Downstream task-level process_results then receives an empty list and can decide how to score the missing sample without taking down the whole task.

if self.OUTPUT_TYPE in ("generate_until", "generate_visual_cot"):
-   if isinstance(results, list) and isinstance(results[0], list):
+   # Guard empty results so results[0] below does not IndexError for
+   # samples whose generation failed. Downstream process_results then
+   # receives an empty list and can decide how to score the miss.
+   if results and isinstance(results, list) and isinstance(results[0], list):
        results = [res.strip() for res in results[0]]
    else:
        results = [res.strip() for res in results]
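
To make the effect of the guard concrete, here is a small standalone sketch of the patched condition. `normalize_guarded` is a hypothetical helper name used only for illustration; in the PR the logic lives inline in ConfigurableTask.process_results.

```python
def normalize_guarded(results):
    """Hypothetical helper mirroring the patched condition."""
    # `results and ...` short-circuits on an empty list, so results[0]
    # is only evaluated when at least one element exists.
    if results and isinstance(results, list) and isinstance(results[0], list):
        return [res.strip() for res in results[0]]
    return [res.strip() for res in results]


if __name__ == "__main__":
    print(normalize_guarded([]))                 # failed generation
    print(normalize_guarded([[" a ", "b\n"]]))   # list-of-lists shape
    print(normalize_guarded([" a "]))            # flat list shape
```

The guard relies only on Python's truthiness of lists, so it changes behavior for the empty list alone; non-empty inputs take the same branch as before.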

Test plan

  • Any task run where at least one generation produces [] — verify the task finishes and the missing sample is scored by downstream process_results.
  • Happy-path run — verify no change in scores.
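
Both checks in the test plan could be approximated without a full task run. The sketch below is an assumption-heavy illustration: `normalize_guarded`, `score_sample`, and `run_task` are hypothetical names, and the exact-match scoring rule (a missing generation counts as 0.0) is one possible downstream policy, not something the PR prescribes.

```python
def normalize_guarded(results):
    """Hypothetical helper mirroring the patched condition."""
    if results and isinstance(results, list) and isinstance(results[0], list):
        return [res.strip() for res in results[0]]
    return [res.strip() for res in results]


def score_sample(doc, results):
    """Hypothetical task-level scorer: an empty list is scored as a miss."""
    if not results:
        return {"exact_match": 0.0}
    return {"exact_match": float(results[0] == doc["answer"])}


def run_task(docs, raw_results):
    """Score every sample; a failed generation no longer stops the loop."""
    return [
        score_sample(doc, normalize_guarded(results))
        for doc, results in zip(docs, raw_results)
    ]


if __name__ == "__main__":
    docs = [{"answer": "cat"}, {"answer": "dog"}]
    # One generation produced []: the task should still finish.
    print(run_task(docs, [[" cat "], []]))
    # Happy path: scores should match the pre-fix behavior.
    print(run_task(docs, [[" cat "], ["dog\n"]]))
```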

For generate_until / generate_visual_cot tasks, when a sample's
generation failed upstream (model raised, retries exhausted, etc.)
the results list can be empty. The existing isinstance check then
falls through to `results[0]` on the list-of-list branch and raises
IndexError — which aborts the full postprocess loop for that task,
not just the missing sample.

Add a leading `results and` so an empty list falls through to the
else branch that does `[res.strip() for res in results]` (empty
output) — downstream task-level process_results then receives an
empty list and can decide how to score the missing sample without
taking down the whole task.