OpenAI researchers say large language models continue to hallucinate because current evaluation methods encourage them to guess rather than admit uncertainty.
Hallucinations, defined as confident but false statements, persist despite advances in models such as GPT-5. Low-frequency facts, such as specific dates or names, are especially likely to be fabricated.
The study argues that while pretraining, which predicts the next word without true-or-false labels, seeds some errors, the real problem lies in accuracy-based testing. Under binary accuracy scoring, a lucky guess earns a point while admitting uncertainty earns nothing, so evaluations discourage models from saying 'I don't know'.
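As a rough illustration of that incentive (not a worked example from the article), consider a model that believes its best candidate answer is correct with probability p. Under accuracy-only grading, guessing always has a higher expected score than abstaining, however small p is:

```python
# Expected score under binary (accuracy-only) grading, assuming a model
# that believes its best candidate answer is correct with probability p.
# Guessing always beats abstaining, no matter how low p is.

def expected_score_guess(p: float) -> float:
    """Guess: 1 point if right (probability p), 0 points if wrong."""
    return p * 1.0 + (1.0 - p) * 0.0

def expected_score_abstain() -> float:
    """Say 'I don't know': always 0 points under accuracy-only grading."""
    return 0.0

for p in (0.1, 0.3, 0.5):
    print(f"p={p:.1f}  guess={expected_score_guess(p):.2f}  "
          f"abstain={expected_score_abstain():.2f}")
```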
Researchers suggest penalising confident errors more heavily than expressions of uncertainty, and awarding partial credit when AI models acknowledge the limits of their knowledge, as the sketch below illustrates. They argue that only by reforming evaluation methods can hallucinations be meaningfully reduced.
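A minimal sketch of such a scoring rule, with the penalty and partial-credit values chosen purely for illustration rather than taken from the paper: once a wrong answer costs more than an 'I don't know', guessing only pays off when the model is sufficiently confident.

```python
# Illustrative scoring rule (the numbers are assumptions, not from the paper):
# +1 for a correct answer, -1 for a confident error, +0.25 partial credit
# for explicitly abstaining. Under this rule a model should only answer
# when its confidence p satisfies p*1 + (1-p)*(-1) > 0.25, i.e. p > 0.625.
from typing import Optional

CORRECT, WRONG, ABSTAIN = 1.0, -1.0, 0.25

def score(answer: Optional[str], truth: str) -> float:
    """Grade one response; answer=None means the model abstained."""
    if answer is None:
        return ABSTAIN
    return CORRECT if answer == truth else WRONG

def expected_score_if_answering(p: float) -> float:
    """Expected score of answering with confidence p under this rule."""
    return p * CORRECT + (1.0 - p) * WRONG

for p in (0.3, 0.6, 0.7):
    best = "answer" if expected_score_if_answering(p) > ABSTAIN else "abstain"
    print(f"confidence={p:.1f}: expected if answering="
          f"{expected_score_if_answering(p):+.2f} -> {best}")
```

The exact numbers matter less than the direction of the incentive: the penalty for confident errors sets a confidence threshold below which abstaining becomes the rational choice.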