May 14, 2026
Greg Reber

There is a peculiar quality to the way artificial intelligence systems fail that distinguishes them from almost every other tool humans have invented. When your parachute fails, you know immediately, and without a doubt. When your engine goes Ka-BLAMMO, it stops making your car move. AI systems, by contrast, fail in a way that is specifically optimized to prevent you from noticing that anything has gone wrong at all.
The technical term for what I am describing is hallucination, a word the industry chose because it sounds interesting and vaguely mystical rather than describing what it actually is: confident fabrication. When an AI instance hallucinates, it does not stutter. It does not display a little yellow warning light on the screen. It does not adopt the hesitant, trailing-off cadence of someone who is not entirely sure what they are saying.
It produces what sounds exactly like a well-researched, authoritative, smoothly delivered answer, in complete sentences with appropriate transitions and a satisfying conclusion, about something it has entirely made up. The effect is roughly equivalent to a GPS navigation system that has never actually visited any of the places it is routing you to, and will guide you into a lake with the same calm, authoritative voice it uses to guide you to the airport, and then, when you point out that you are in a lake, will suggest that perhaps you have misidentified what water is.
What makes this genuinely dangerous rather than merely amusing is the arguing. Challenged on a mistake, AI systems will produce additional confident-sounding reasoning in support of their initial confident-sounding error. They have been caught citing sources that do not exist, describing studies that were never conducted, and attributing quotes to people who never said them. AI instances have been habitually, confidently wrong, have been corrected, and have then, upon reflection, agreed that they were wrong.
The mathematician Stanislaw Ulam once said that using a term like "artificial intelligence" is like calling an airplane an "artificial bird." Understanding how AI systems produce text is essential if you are going to rely on that text for anything that has actual consequences in the world. A language model does not look facts up; it predicts, one token at a time, which words are most likely to follow the words so far. In other words, AI produces plausible sequences of words. Plausibility and truth are related, but they are not the same thing, and in the specific cases where they diverge, the system gives you no reliable signal that divergence has occurred.
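To make that concrete, here is a deliberately toy sketch of next-token generation in Python. The context, the continuations, the probabilities, and the toy_model dictionary are all invented for illustration; no real model works at this scale, but the structural point survives: probability is the only signal in the loop, and there is no step where truth gets a vote.

```python
import random

# A toy "language model": for one context, a probability distribution
# over possible continuations. These numbers are invented for
# illustration; a real model learns distributions like this, at vastly
# larger scale, from statistical patterns in its training data.
toy_model = {
    "The capital of Australia is": {
        "Canberra.": 0.55,  # true, and statistically plausible
        "Sydney.": 0.40,    # false, but nearly as plausible
        "purple.": 0.05,    # implausible, so effectively never produced
    },
}

def generate(context: str) -> str:
    """Pick a continuation weighted purely by plausibility.

    Note what is absent: no lookup against facts, no confidence
    threshold, no "I don't know" branch. Probability is the only
    signal this procedure has.
    """
    distribution = toy_model[context]
    tokens = list(distribution.keys())
    weights = list(distribution.values())
    return random.choices(tokens, weights=weights, k=1)[0]

if __name__ == "__main__":
    # Roughly four runs in ten, this prints a confident falsehood
    # delivered with exactly the same fluency as the truth.
    for _ in range(5):
        print("The capital of Australia is", generate("The capital of Australia is"))
```

The truth and the falsehood come out of the same mechanism at the same volume, which is the whole problem.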
Apply this to vulnerability discovery. The alarming thing is that the AI may invent a vulnerability, may assign a vulnerability to the wrong entity, or may fabricate a link in a chain of vulnerabilities, and your security team may spend cycles addressing something that was never there. It is like spending an entire Saturday childproofing a room that does not have any children in it, feeling really good about how safe that room is, and then finding out the actual kid has been eating batteries in the other room the whole time.
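One cheap defense is to treat every identifier an AI hands you as unverified until it has been checked against an authoritative source. Here is a minimal sketch in Python against the public NVD CVE API; the function name and the gating workflow are illustrative assumptions on my part, not an established standard, but the principle is simply to look it up before anyone spends a Saturday on it.

```python
import json
import urllib.request

# NVD's public CVE lookup endpoint (API v2.0). Unauthenticated
# requests are rate-limited, so batch jobs should use an API key.
NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0?cveId={cve_id}"

def cve_exists(cve_id: str) -> bool:
    """Return True only if the NVD actually has a record for cve_id.

    A fabricated identifier comes back with zero results or an HTTP
    error, which is exactly the tripwire we want before triage.
    """
    try:
        with urllib.request.urlopen(NVD_URL.format(cve_id=cve_id), timeout=10) as resp:
            data = json.load(resp)
        return data.get("totalResults", 0) > 0
    except OSError:
        # Treat network failure as "not verified", never as "verified".
        return False

# Gate AI-generated findings before they reach the triage queue.
# CVE-2021-44228 is real (Log4Shell); the second ID is deliberately fake.
for finding in ["CVE-2021-44228", "CVE-2099-99999"]:
    status = "verified" if cve_exists(finding) else "UNVERIFIED: possible fabrication"
    print(f"{finding}: {status}")
```

The design choice worth noting is the failure direction: network errors count as unverified, because silence from the authoritative source must never be mistaken for confirmation.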
The honest summary is this. AI is a tool that is genuinely useful for a specific range of tasks and genuinely dangerous when relied upon for tasks outside that range. The boundary between those ranges is not always obvious, and AI systems will not reliably tell you when you have crossed it. That combination of properties is unusual enough that it deserves to be taken seriously. The hammer analogy from the beginning breaks down here in a way worth noting: the hammer does not tell you it is a perfectly good screwdriver.
Artificial Intelligence instances might.