William Morris DePue says LM Arena Elo scores can be inflated by about 200 points simply by increasing Markdown formatting and response length, and argues that the leaderboard is not a good evaluation.
can we not post lm arena elo score.
it’s literally worth 200 points if you just crank Markdown and response length.
this is not a good eval
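To put the 200-point figure in perspective, here is a minimal sketch that assumes the standard Elo expected-score formula (base 10, scale 400). LM Arena's exact rating procedure may differ, and the function name `expected_win_rate` is hypothetical, chosen only for illustration.

```python
def expected_win_rate(rating_gap: float) -> float:
    """Expected win probability for the higher-rated model.

    Assumes the standard Elo formula with base 10 and scale 400;
    this is an illustrative assumption, not LM Arena's published code.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


if __name__ == "__main__":
    # A 200-point gap, the size of the effect claimed in the post.
    print(f"{expected_win_rate(200):.2f}")  # prints 0.76
```

Under that assumption, a 200-point gap corresponds to the higher-rated model being preferred in roughly 76% of head-to-head comparisons, which is why a formatting-driven gain of that size would matter.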