Yes. Absolutely.
The running joke in the research community is that current LLMs are trained directly on benchmarks and on the common prompts people try in LM-Arena, like the "how many r's in strawberry" question. This isn't speculation: Meta got caught red-handed doing exactly this, running a separate finetune just to look good on LM-Arena. And some benchmarks like MMLU contain erroneous answer keys that many LLMs nonetheless "answer correctly" by reproducing the wrong labels.
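For reference, the strawberry question became a meme precisely because the ground truth is trivially checkable in one line of code, yet models kept getting it wrong until it was (presumably) trained in:

```python
# Count occurrences of 'r' in "strawberry" — the famous test question.
count = "strawberry".count("r")
print(count)  # → 3
```

Once a question like this circulates widely enough, its appearance in a model's answers says more about training data than about reasoning.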
It’s not that any single person is collecting all of these, though.