binarymax 28 minutes ago

Saying “I don’t know” to 30% of queries, if it actually doesn’t know, is a feature I want. Otherwise there is zero trust. How do I know whether I’m in the 30%-wrong or the 70%-correct situation right now?

  • nunez 18 minutes ago

    The paper does a good job explaining why this is mathematically not possible unless the question-answer bank is a fixed set.

danjc 34 minutes ago

This is written by someone who has no idea how transformers actually work

  • neuroelectron 32 minutes ago

    Furthermore, if you simply push on certain safety topics, you can see that they actually can reduce hallucinations, or at least make certain topics a hard line. They simply don't, because agreeing with your pie-in-the-sky plans and giving you vague directions encourages users to engage and keep using the chatbot.

    If people got discouraged by answers like "it would take at least a decade of expertise..." or other realistic responses, they wouldn't waste time fantasizing about plans.

gary_0 an hour ago

A better headline might be "OpenAI research suggests reducing hallucinations is possible but may not be economical".

scotty79 a few seconds ago

Isn't it even simpler? There are no (or almost no) questions in the training data whose correct answer is "I don't know."

Once you train a model within a specific domain and add out-of-domain questions, or unresolvable questions within the domain, to the training data, things will improve (a rough sketch of such data follows this comment).

The question is whether this is desirable when most users have grown to love sycophantic, confident confabulators.
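
A minimal sketch of what scotty79 describes might look like in practice, assuming a plain prompt/completion fine-tuning format; the questions, answers, and file name below are illustrative only, not from the paper or any particular vendor's API:

```python
import json

# Hypothetical fine-tuning examples (illustrative only): in-domain Q&A pairs
# mixed with out-of-domain or unanswerable questions whose target completion
# is an explicit refusal, so "I don't know" appears as a correct answer
# somewhere in the training data.
examples = [
    {"prompt": "What is the boiling point of water at sea level?",
     "completion": "100 degrees Celsius."},
    {"prompt": "What will the closing price of AAPL be next Tuesday?",
     "completion": "I don't know; that can't be determined from available information."},
    {"prompt": "Which composer wrote the opera referenced in my dream last night?",
     "completion": "I don't know; the question can't be answered without more context."},
]

# Write in the JSONL layout many fine-tuning pipelines accept.
with open("idk_training_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```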

skybrian 37 minutes ago

> Users accustomed to receiving confident answers to virtually any question would likely abandon such systems rapidly.

Or maybe they would learn from feedback to use the system for some kinds of questions but not others? It depends on how easy it is to learn the pattern. This is a matter of user education.

Saying "I don't know" is sort of like an error message. Clear error messages make systems easier to use. If the system can give accurate advice about its own expertise, that's even better.

  • pton_xd 33 minutes ago

    > Saying "I don't know" is sort of like an error message. Clear error messages make systems easier to use.

    "I don't know" is not a good error message. "Here's what I know: ..." and "here's why I'm not confident about the answer ..." would be a helpful error message.

    Then the question is, when it says "here's what I know, and here's why I'm not confident" -- is it telling the truth, or is that another layer of hallucination? If the latter, you're back to square one.

    • skybrian 31 minutes ago

      Yeah, AI chatbots are notorious for not understanding their own limitations. I wonder how that could be fixed?
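
A minimal sketch of the structured response pton_xd describes, assuming some calibration step already produces a confidence score; the class and function names are illustrative, not any real API, and this does not address whether the confidence value itself is trustworthy:

```python
from dataclasses import dataclass, field

@dataclass
class ModelAnswer:
    # Hypothetical structured response instead of a bare answer or a bare "I don't know".
    answer: str
    confidence: float                      # assumed output of a calibration step, 0.0-1.0
    caveats: list = field(default_factory=list)

def render(resp: ModelAnswer, threshold: float = 0.7) -> str:
    # Below the threshold, surface the partial answer and the reasons for doubt.
    if resp.confidence >= threshold:
        return resp.answer
    reasons = "; ".join(resp.caveats) or "limited supporting evidence"
    return (f"Here's what I know: {resp.answer}\n"
            f"Here's why I'm not confident: {reasons} "
            f"(estimated confidence {resp.confidence:.0%}).")

print(render(ModelAnswer(
    answer="The library was likely founded in the 1890s.",
    confidence=0.4,
    caveats=["conflicting dates in retrieved sources"],
)))
```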

fumeux_fume 36 minutes ago

The author doesn't bother to consider that giving a false response already leads to more model calls until a better one is provided.

  • otterley 12 minutes ago

    Not if the user doesn’t know that the response is false.

t_mann 38 minutes ago

> Users accustomed to receiving confident answers to virtually any question would likely abandon such systems rapidly.

Sounds like the kind of thing that would get A/B-tested in practice. Or maybe not, to avoid the ethical (and potential legal) conundrum of whether your models should 'lie for profit'. Not explicitly testing means less paper trail.

lif an hour ago

"What is the real meaning of humility?

AI Overview

The real meaning of humility is having an accurate, realistic view of oneself, acknowledging both one's strengths and limitations without arrogance or boastfulness, and a modest, unassuming demeanor that focuses on others. It's not about having low self-esteem but about seeing oneself truthfully, putting accomplishments in perspective, and being open to personal growth and learning from others."

Sounds like a good thing to me. Winning, even.

  • tomrod 3 minutes ago

    A perfectly cromulent and self-empowering answer, a call to morality that the Stoics would appreciate and that would leave sophists of many stripes peeved.

    Well done, AI, you've done it.

ricksunny an hour ago

I felt this was such a cogent article on business imperatives vs. fundamental transformer hallucinations that I couldn't help but submit it to HN. In fact, it seems like a stealth plea for uncertainty-embracing benchmarks industry-wide.

  • tomrod 2 minutes ago

    Data Science tried to inject confidence bounds into businesses. It didn't go well.

pdntspa 30 minutes ago

We have always known LLMs are prediction machines. How is this report novel?

jasfi an hour ago

Easily solved: use a pair of models, one that would rather say IDK and one that would rather guess. Most AI agents would want the IDK version (a rough sketch of this routing idea follows this subthread).

  • otterley 10 minutes ago

    Anyone who claims something is easy to solve should be forced to implement their solution.

  • ForOldHack 25 minutes ago

    Maybe, but I don't know. Although I would like to channel as many snarky remarks as I could, to be more constructive: I would use the IDK model, as I have for programming questions, and the psychotic one for questions like "are we in a simulation?" and "Yes, I would like fries with that and a large orange drink."
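
A rough sketch of jasfi's pairing idea, under stated assumptions: call_cautious_model and call_eager_model are hypothetical stubs standing in for two differently tuned endpoints, and the routing rule shown is only one way the pairing could work:

```python
IDK = "I don't know."

def call_cautious_model(question: str) -> str:
    # Placeholder: a model tuned to abstain when uncertain.
    return IDK

def call_eager_model(question: str) -> str:
    # Placeholder: a model tuned to always produce a best guess.
    return "Best guess: ..."

def answer(question: str, agent_mode: bool = True) -> str:
    # Agents get the cautious model so abstentions surface as handleable
    # conditions; casual chat falls back to the eager model's guess.
    cautious = call_cautious_model(question)
    if agent_mode or cautious != IDK:
        return cautious
    return call_eager_model(question)

print(answer("Are we in a simulation?", agent_mode=False))
```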

nunez 20 minutes ago

From the abstract of the paper [^0]:

> Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty

This is a de facto false equivalence for two reasons.

First, test takers who are faced with hard questions have the capability of _simply not guessing at all._ UNC did a study on this [^1] by administering a light version of the AMA medical exam to 14 staff members who were NOT trained in the life sciences. While most of them consistently guessed answers, roughly 6% did not. Unfortunately, the study did not distinguish correct guesses from questions that were left blank. OpenAI's paper proves that LLMs, at the time of writing, simply do not have the self-awareness to know whether they _really_ don't know something, by design.

Second, LLMs are not test takers in the pragmatic sense. They are query answerers. Bar argument settlers. Virtual assistants. Best friends on demand. Personal doctors on standby.

That's how they are marketed and designed, at least.

OpenAI wants people to use ChatGPT like a private search engine. The sources it provides when it decides to use RAG are there more to instill confidence in the answer than to encourage users to check its work.

A "might be inaccurate" disclaimer on the bottom is about as effective as the Surgeon General's warning on alcohol and cigs.

The stakes are so much higher with LLMs. Totally different from an exam environment.

A final remark: I remember professors hammering "engineering error" margins into us when I was a freshman in 2005. 5% was what was acceptable. That we as a society are now okay with using a technology that has a >20% chance of giving users partially or completely wrong answers to automate as many human jobs as possible blows my mind. Maybe I just don't get it.

[^0] https://arxiv.org/pdf/2509.04664

[^1] https://www.rasch.org/rmt/rmt271d.htm