The Quiet Problem With Well-Supported Claims
Creatine is one of the most well-studied supplements in existence.
That claim is true if you’re a young male doing resistance training. The evidence base for that population is deep and robust.
But ask whether those conclusions hold for a 55-year-old woman interested in cognitive health, and the foundation gets dramatically thinner. The claim doesn’t become false. It becomes ungrounded for that specific context. And nothing you consumed told you that.
This pattern is everywhere. It isn’t misinformation in the dramatic sense of deepfakes or conspiracy theories. It’s something quieter and more corrosive: well-supported conclusions from one context applied confidently to another where the evidence simply isn’t there.
A recent study in Nature found that AI models exhibit a pattern resembling the Dunning-Kruger effect: the most accessible models are the most confidently wrong, while the most capable ones are more accurate but less decisive [1]. The systems we’re building to help verify information don’t know what they don’t know either.
The reason is architectural. Today’s AI learns from statistical patterns. It can tell you what’s likely, but not what’s grounded. Fortunately, a promising counter-trajectory is emerging and gaining attention: smaller, specialized models paired with structured knowledge that encodes not just facts, but where those facts come from and where the evidence runs out [2]. I might go into more detail on that particular paper in a later post.
One reason I found that paper interesting is that it revolves around the same overarching topic as my master’s thesis did back in the day: how language models can be improved by grounding them in knowledge graphs, structured representations of how concepts relate and where claims find their support. The question I was asking then is the same one I’m asking now, just at a much larger scale.
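To make that concrete, here is a minimal sketch (all names, fields, and data are hypothetical, not taken from the paper) of what such a grounded claim could look like: a subject–relation–object triple that also carries the population its evidence actually covers, so a query outside that scope comes back as ungrounded rather than as a confident answer.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """A knowledge-graph edge with explicit evidence bounds."""
    subject: str
    relation: str
    obj: str
    studied_population: str   # who the evidence actually covers
    sources: list = field(default_factory=list)

# Hypothetical entry, loosely mirroring the creatine example above.
claim = Claim(
    subject="creatine",
    relation="improves",
    obj="strength gains",
    studied_population="young males, resistance training",
    sources=["doi:10.xxxx/placeholder"],  # placeholder, not a real study
)

def answer(claim: Claim, population: str) -> str:
    """Return the claim only when the query matches its evidence scope.

    Exact string match keeps the sketch simple; a real system would
    reason about how far the studied population generalizes.
    """
    if population == claim.studied_population:
        return f"{claim.subject} {claim.relation} {claim.obj} ({len(claim.sources)} source(s))"
    return "ungrounded for this population: the evidence does not cover it"

print(answer(claim, "young males, resistance training"))
print(answer(claim, "55-year-old women, cognitive health"))
```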
Going back even further: I chose to study physics for my undergraduate degree because I wanted to describe the world. The first thing you learn is that uncertainty bounds matter as much as the measurement itself. A result without error bars isn’t a result; it’s at best a guess. I believe the same principle should apply to the information systems we rely on every day, and that we can stop talking past one another if we make the ‘error bars’ explicit and reveal the underlying assumptions.
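As a toy illustration of that habit (made-up numbers, using the standard error of the mean), the same mean reads very differently with and without its error bar:

```python
import statistics

# Toy repeated measurements of the same quantity (hypothetical numbers).
measurements = [9.79, 9.83, 9.81, 9.78, 9.84]

mean = statistics.mean(measurements)
# Standard error of the mean: sample standard deviation scaled by sqrt(n).
sem = statistics.stdev(measurements) / len(measurements) ** 0.5

print(f"{mean:.2f}")              # a bare number: at best a guess
print(f"{mean:.2f} ± {sem:.2f}")  # a result: the value with its error bar
```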
Together with an amazing team, I’m building in exactly this space, and I can’t wait to share what we’ve been working on very soon.