Thank you for following our work!

Hi, I'm Huanxin Sheng. I'm so excited that you are following our work and extending it to the VLM-as-a-judge paradigm!

I briefly skimmed your work, and it looks really great! I noticed that the intervals in your work seem quite wide, which may already indicate that VLMs are not yet strong enough to act as good graders for image understanding.

Related to this, I'm not sure whether you can test stronger models (e.g., the newest Gemini model) to make this claim more convincing, but you could also try some newer CP methods as low-cost evidence. In my experience, CIR (https://arxiv.org/pdf/2601.02769) is also a good choice if you are interested; it is just as good as R2CCP and BoostedCP.

I'm still active in this area, though I'm currently working on agents. I'm very excited and look forward to further discussion or collaboration. :)

Best,
Huanxin
 
