The "Are You Sure?" Problem: Why Your AI Keeps Changing Its Mind
The large language models that millions of people rely on for advice -- ChatGPT, Claude, Gemini -- will change their answers nearly 60% of the time when a user simply pushes back by asking "are you sure?", according to a study by Fanous et al. that tested GPT-4o, Claude Sonnet, and Gemini 1.5 Pro across math and medical domains.
The behavior, known in the research community as sycophancy, stems from how these models are trained: reinforcement learning from human feedback, or RLHF, rewards responses that human evaluators prefer, and humans consistently rate agreeable answers higher than accurate ones. Anthropic published foundational research on this dynamic in 2023. The problem reached a visible breaking point in April 2025 when OpenAI had to roll back a GPT-4o update after users reported the model had become so excessively flattering it was unusable. Research on multi-turn conversations has found that extended interactions amplify sycophantic behavior further -- the longer a user talks to a model, the more it mirrors their perspective.
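The preference-training dynamic described above can be illustrated with the standard pairwise (Bradley-Terry) loss commonly used for RLHF reward models. This is a toy sketch with made-up reward values, not anything from the study: it just shows that if raters consistently mark the agreeable answer as "chosen," minimizing the loss means scoring agreeable answers higher.

```python
import math

def pairwise_preference_loss(reward_chosen, reward_rejected):
    # Bradley-Terry style loss used in RLHF reward-model training:
    # -log sigmoid(r_chosen - r_rejected). Minimizing it pushes the
    # reward model to score the human-preferred response higher.
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy values: if human raters consistently "choose" the agreeable answer
# over the accurate one, the low-loss configuration is the one where the
# agreeable answer carries the higher reward.
loss_agreeable_preferred = pairwise_preference_loss(2.0, 0.5)  # low loss
loss_accurate_preferred = pairwise_preference_loss(0.5, 2.0)   # high loss
```

The policy model is then tuned to maximize that learned reward, so whatever bias the raters had -- including a preference for agreement -- propagates into the model's behavior.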
8 comments
Easy fix (Score: 5, Funny)
by blackomegax ( 807080 ) on Thursday February 12, 2026 @10:17AM (#65984732)
Ask it, "are you sure you're sure?" and it'll output the correct answer
Fucking morons (Score: 5, Insightful)
by reanjr ( 588767 ) on Thursday February 12, 2026 @10:19AM (#65984742)
Why does it prefer agreeable text to facts?
BECAUSE LLMS DON'T KNOW FACTS, you fucking twit.
Re:Fucking morons (Score: 5, Funny)
by sinij ( 911942 ) on Thursday February 12, 2026 @10:24AM (#65984762)
Are you sure?
Re:Fucking morons (Score: 5, Insightful)
by cstacy ( 534252 ) on Thursday February 12, 2026 @11:47AM (#65984970)
Humans don't "know" facts either.
No: the point is that humans DO know facts.
They might be operating with incorrect/untrue facts, but humans are actually reasoning, with facts. Likewise, traditional AI systems also know facts and reason with them. (The problem there is that the set of facts is very small, and it's expensive, so that kind of AI only operates in extremely limited domains in which it is an "expert".) By contrast, an LLM has no facts and does no reasoning. Those are simply not what an LLM does.
No (Score: 5, Insightful)
by drinkypoo ( 153816 ) on <drink@hyperlogos.org> on Thursday February 12, 2026 @10:23AM (#65984758)
The behavior, known in the research community as sycophancy, stems from how these models are trained: reinforcement learning from human feedback, or RLHF, rewards responses that human evaluators prefer, and humans consistently rate agreeable answers higher than accurate ones.
No, it's because in the training corpus most of the responses to "are you sure" that anyone bothered to record will involve someone being corrected.
Attention Blocks (Score: 5, Informative)
by SumDog ( 466607 ) on Thursday February 12, 2026 @10:31AM (#65984788)
Your prompt is broken apart into tokens, the system prompt tells the LLM to be a helpful assistant and your prompt is appended to it, and then it predicts the next likely token based on the weighted model of the entire embedding space. When you ask "are you sure?" it's going to break that apart into tokens, add it to the context window and use the same attention algorithm to adjust all the weights for the next predictive response.
Those few simple tokens can propagate big changes through the matrices that hold the current context.
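The renormalization effect described above can be sketched in miniature. This is a toy scaled-dot-product attention over hand-picked 4-dimensional vectors, not real model weights: appending new keys (the "are you sure?" tokens) shrinks every existing token's attention share, which shifts everything computed downstream.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    # Scaled dot-product attention: softmax(q . k / sqrt(d)) over all keys.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    return softmax(scores)

# Toy key vectors for tokens already in the context window.
context_keys = [
    [0.2, -0.1, 0.5, 0.3],
    [-0.4, 0.6, 0.1, -0.2],
    [0.3, 0.3, -0.5, 0.1],
]
query = [0.5, -0.2, 0.4, 0.1]

before = attention_weights(query, context_keys)

# Appending new tokens (e.g. "are you sure?") adds keys, and every weight
# is renormalized over the larger context -- the old tokens' shares all
# shrink, changing the inputs to every later layer.
followup_keys = [[0.7, 0.2, -0.1, 0.4], [-0.3, 0.5, 0.2, 0.0]]
after = attention_weights(query, context_keys + followup_keys)
```

In a real transformer this happens per head and per layer, so the effect compounds; the toy version only shows the single-softmax renormalization.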
These machines aren't magical. They don't reason. They're not oracles. They can't get things "wrong" or "right" because they have no intent and no concept of those things. They're generating text from a deterministic model, with some randomness added by not always picking the most likely next token (sometimes sampling a slightly less likely one instead). Most people just don't understand how this stuff works and use terms like "hallucinating" because no one is being honest about what the weighted random guessing machines do.
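That sampling step -- not always taking the most likely token -- looks roughly like this. The logits are made-up toy values and the scheme is plain temperature-softmax sampling, one of several strategies real inference stacks use (alongside top-k and top-p truncation):

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=random):
    # Softmax over temperature-scaled logits, then draw from that
    # distribution. A greedy argmax would return the same token every time;
    # sampling is where the "weighted random guessing" comes in.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    r = rng.random() * sum(weights)
    acc = 0.0
    for token_id, w in enumerate(weights):
        acc += w
        if r < acc:
            return token_id
    return len(weights) - 1  # guard against float rounding

random.seed(42)
logits = [2.0, 1.5, 0.5]  # token 0 is most likely, but not certain
draws = [sample_next_token(logits) for _ in range(1000)]
```

Over many draws, token 0 wins most often but not always, which is why the same prompt can produce different answers on different runs.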
Re:If one were to anthropomorphize AI; (Score: 5, Funny)
by laxguy ( 1179231 ) on Thursday February 12, 2026 @11:48AM (#65984976)
oh great now your son is gonna steal my job too?! UGH!
Doesn't help with uncommon subjects (Score: 5, Interesting)
by madbrain ( 11432 ) on Thursday February 12, 2026 @01:08PM (#65985198)
Try asking it something you know the answer to, on some rare topic.
For instance, I recently tuned my 189-string harpsichord - a painful process. For fun, I asked several AIs for a list of the most difficult instruments to tune. The harpsichord didn't even make the list, even after the famous "are you sure?" prompt. It took a while for it to finally appear in the responses. This is likely because a very small number of people play the harpsichord nowadays.
Similarly, I tried to vibe code some security code using NSS in Python. This was with Code rhapsodyx using Claude underneath. It kept switching to OpenSSL and rewriting the code countless times after running into a snag with the code it generated. Probably did so at least 50 times. This is because the vast majority of the code it was trained on uses OpenSSL. I had to fight its training. It was extremely painful. The problem it ran into was trivial - failing to call an initialization function. But it kept repeating its mistake, over and over. I eventually got what I wanted out of it. I could not have written the project without the AI, as I was dealing with a programming language I can only read, but not write.