Score Baseline leaderboard with 2fe #190

We use the market Brier as an estimate of question difficulty for market questions.
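As a rough illustration of the current approach, the per-question market Brier score can be computed as the squared error between the market's probability and the resolved outcome. This is only a sketch: the question ids, probabilities, and outcomes below are made up, and the helper name `brier` is hypothetical rather than taken from the codebase.

```python
def brier(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome."""
    return (forecast - outcome) ** 2


# Hypothetical resolved market questions: id -> (market probability, outcome).
questions = {
    "q1": (0.9, 1),
    "q2": (0.5, 0),
    "q3": (0.2, 1),
}

# Market Brier per question, read as a difficulty estimate:
# the worse the market did, the harder the question is taken to be.
market_brier = {qid: brier(p, y) for qid, (p, y) in questions.items()}

# The question the market got most wrong is treated as the hardest.
hardest = max(market_brier, key=market_brier.get)
print(market_brier, hardest)
```

A higher value means the market itself forecast the question poorly, which is why it serves as a difficulty proxy for models that saw the market forecast.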

For models that don't use tools and were _not_ provided the crowd forecast, that metric is not apt, because these models face additional difficulty from lacking context when forecasting.

Hence, use 2fe to estimate question difficulty for the Baseline leaderboard.
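A minimal sketch of what this could look like, assuming "2fe" refers to a two-way fixed effects model in which each observed Brier score is modelled as a model effect plus a question effect, with the question effect read as difficulty. The function name, the alternating-means fitting procedure, and all data below are illustrative assumptions, not the project's implementation.

```python
from collections import defaultdict


def two_way_fixed_effects(observations, n_iter=200):
    """Fit score ~ model_effect + question_effect by alternating means.

    observations: iterable of (model, question, score) tuples.
    Returns (model_fx, question_fx); question effects are centered at zero,
    since the two sets of effects are only identified up to a constant.
    """
    by_model = defaultdict(list)
    by_question = defaultdict(list)
    for model, question, score in observations:
        by_model[model].append((question, score))
        by_question[question].append((model, score))

    model_fx = {m: 0.0 for m in by_model}
    question_fx = {q: 0.0 for q in by_question}

    for _ in range(n_iter):
        # Update each model effect given the current question effects.
        for m, rows in by_model.items():
            model_fx[m] = sum(s - question_fx[q] for q, s in rows) / len(rows)
        # Update each question effect given the current model effects.
        for q, rows in by_question.items():
            question_fx[q] = sum(s - model_fx[m] for m, s in rows) / len(rows)

    # Center question effects and move the constant into the model effects.
    shift = sum(question_fx.values()) / len(question_fx)
    question_fx = {q: v - shift for q, v in question_fx.items()}
    model_fx = {m: v + shift for m, v in model_fx.items()}
    return model_fx, question_fx


# Hypothetical Brier scores; model "C" did not forecast question "q2".
observations = [
    ("A", "q1", 0.15), ("A", "q2", 0.35), ("A", "q3", 0.25),
    ("B", "q1", 0.20), ("B", "q2", 0.40), ("B", "q3", 0.30),
    ("C", "q1", 0.10), ("C", "q3", 0.20),
]

model_fx, question_fx = two_way_fixed_effects(observations)
print(question_fx)  # larger value = harder question under this model
```

Under this assumption, difficulty is estimated from the scores of the models themselves rather than from the market, so it does not depend on information the Baseline models never saw. It also stays comparable when some models skip some questions, as long as the model-question pairs remain connected.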

Consequently, drop Baseline models from the Tournament leaderboard.
