Also, I've been hearing a lot of complaints that Chatbot Arena tends to favor:
- Lots of bullet points in every response.
- Emoji.
...even at the expense of accurate answers. And I'm beginning to wonder if the sycophantic behavior of recent models ("That's a brilliant and profound idea") is also being driven by Arena scores.
Perhaps LLM users actually do want lots of bullets, emoji and fawning praise. But this seems like a perverse dynamic, similar to the way that social media users often engage more with content that outrages them.
More to the point: at this stage it feels to me that arenas are overly focused on fitting user preferences rather than measuring actual model quality.
In reality I prefer different models for different tasks, and quite often that's because model X is tuned to return more of what I happen to prefer. E.g. Gemini is usually the best for me in non-English, ChatGPT works better for me personally on health questions, ...
Interesting idea, I think I'm on board with this correlation hypothesis. Obviously it's complicated, but it does seem like over-reliance on arbitrary opinions from average people would result in valuing "feeling" over correctness.