Open
Description
While reviewing evaluation.py, I noticed that only the submissions from index 0 to config.length are considered. When a question has more submissions than that, wouldn't the performance measurement for each question become imbalanced, since the later submissions are excluded? I also see frequent cases in the dataset where the number of submissions exceeds 50, so I wanted to ask whether this is intended.
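To make the concern concrete, here is a minimal sketch. It is not the repository's code: the function name `truncation_coverage`, the toy `submissions` data, and the assumption that the cutoff behaves like a Python slice `[:length]` are all hypothetical, used only to illustrate how a fixed cutoff treats questions with different submission counts.

```python
from typing import Dict, List


def truncation_coverage(
    submissions: Dict[str, List[int]], length: int
) -> Dict[str, float]:
    """Return, per question, the fraction of submissions kept by a [:length] slice.

    Hypothetical helper: assumes the evaluation keeps only the first
    `length` submissions of each question and drops the rest.
    """
    coverage: Dict[str, float] = {}
    for question, results in submissions.items():
        if not results:
            continue  # skip questions with no submissions to avoid dividing by zero
        kept = results[:length]  # later submissions are silently excluded
        coverage[question] = len(kept) / len(results)
    return coverage


# Toy data (hypothetical): 1 = correct submission, 0 = incorrect submission.
submissions = {
    "q1": [1] * 30,             # fewer than 50 submissions, fully evaluated
    "q2": [1] * 50 + [0] * 50,  # 100 submissions, second half never evaluated
}

coverage = truncation_coverage(submissions, length=50)
print(coverage)
```

In this toy case q1 is measured on all of its submissions while q2 is measured on only half of them, which is the kind of imbalance I am asking about.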