
AI Sycophancy

Because of the way LLMs function, it is very hard for them to course-correct once they start down a wrong path (more recent models can revisit the prompt multiple times and fix some of these cases, but in general the model can still hallucinate). It’s even worse when the LLM has no idea what it should respond with. So, it just makes something up.

When you point out that the LLM is wrong, it admits it, praises your keen eye – “you’re absolutely right” is a phrase that only recently started being retired – and then starts again from square one. But this can lead to self-induced delusions of grandeur.

The problem goes beyond just making people think they know more than they actually do. It gives people – to various degrees – an unshakable confidence that they are right, that they are awesome, that they are on the far right side of the IQ curve, even when the situation is ambiguous or contentious. Add in social media, and you have a recipe for heated discussions between “experts”. This is why I wrote the previous article; it turns out that the proof is still not there yet.

In the end, scientific discovery is a field where hobbyists can easily get deluded into thinking they made a huge breakthrough. Pick any longstanding problem and there are tons of people who have claimed to solve it. We had “proofs” of perpetual motion machines in the past; now we have “proofs” of Collatz, P vs NP, Goldbach, and every one of the Millennium Prize problems. LLMs just make it easier to get deluded: they are really great at giving you the impression that you stumbled upon something new. They will praise your crazy idea, giving you the impression that you know more than you really do.

It’s like sycophancy as a service, and the current economic drivers are pushing for it. Would you pay more per month for a model that tells you that you are stupid? Would the number of people who want such a model be significant enough to justify training and serving it in parallel with the current ones?

Instead, the best approach is not to rely on the LLM alone. You should have tooling around it to compare multiple models, and to present both the claim and the counter-claim and ask the LLM to judge them. Using the LLM for brainstorming in rigorous disciplines is not always a good idea, but tooling around the models to reduce hallucinations, to increase the chances of discovering that something is wrong, and so on becomes very important. This is why I thought that pairing Lean with LLMs in the bet from the previous article might work.
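
As an illustration, here is a minimal sketch of that claim/counter-claim pattern in Python. The query_model helper is hypothetical – wire it to whatever client or local model you actually use; the point is only the shape of the tooling: two models argue opposite sides, a third one judges.

```python
# Minimal sketch of the claim vs. counter-claim pattern described above.
# `query_model` is a hypothetical stand-in for whatever LLM client you use;
# it takes a model name and a prompt and returns the model's text response.

def query_model(model: str, prompt: str) -> str:
    raise NotImplementedError("plug in your own LLM client here")

def judge_claim(claim: str, models: list[str], judge_model: str) -> str:
    # One model argues for the claim, another argues against it.
    supporting = query_model(models[0], f"Argue in favor of this claim:\n{claim}")
    opposing = query_model(models[1], f"Argue against this claim:\n{claim}")

    # A third model judges the two arguments side by side, which makes it
    # harder for pure sycophancy toward the original claim to win.
    return query_model(
        judge_model,
        "You are judging a technical dispute. Be critical of both sides.\n\n"
        f"Claim: {claim}\n\n"
        f"Argument for: {supporting}\n\n"
        f"Argument against: {opposing}\n\n"
        "Which argument is stronger, and what evidence is still missing?",
    )
```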

The next few articles might look at how these patterns of working with the LLM, not just with its output, can be implemented.

