Question about some excluded LongMemEval questions #1

@Litmeb

Thanks for releasing the code and evaluation configs for OBLIVION. I have a question about the number of LongMemEval questions used in the evaluation.

In README.md, full_eval_20260202/ is described as "Full-scale evaluation configs (488 samples, various strategies)", but the official LongMemEval repo describes the cleaned benchmark files as containing 500 evaluation instances.

I also noticed that the code excludes some questions via a blacklist, but that list does not seem to be provided in this repo. Could you clarify whether the LongMemEval results in the paper were computed on the official 500-instance cleaned split or on a filtered 488-instance subset? And if a filtered 488-instance subset was used, is that common practice?
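To make the question concrete, here is a minimal sketch of the kind of filtering I assume is happening. The function names, the toy question IDs, and the blacklist contents are all hypothetical and are not taken from this repo or from LongMemEval. The sketch only shows how a 12-entry blacklist would turn 500 instances into 488.

```python
from typing import Iterable


def filtered_ids(official_ids: Iterable[str], blacklist: Iterable[str]) -> list[str]:
    """Keep the official question IDs that are not in the blacklist, preserving order."""
    blocked = set(blacklist)
    return [qid for qid in official_ids if qid not in blocked]


def excluded_ids(official_ids: Iterable[str], blacklist: Iterable[str]) -> list[str]:
    """Return the official question IDs that the blacklist removes, sorted."""
    return sorted(set(official_ids) & set(blacklist))


# Toy stand-ins (hypothetical): 500 official IDs and a 12-entry blacklist.
official = [f"q{i:03d}" for i in range(500)]
blacklist = [f"q{i:03d}" for i in range(12)]

kept = filtered_ids(official, blacklist)
print(len(official), len(kept), len(official) - len(kept))
```

If the real blacklist were published, running something like excluded_ids against the official question IDs would show exactly which instances were dropped.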

I may have misunderstood the setup, so please correct me if I missed something.
