The Uncanny Derangement of AI

Hallucinating ourselves: AI as a symptom of cultural psychosis

On my phone is an unsettling video of me doing karate. Except that I don’t do karate — at least, I haven’t for more than 25 years — and if I tried the flying, spinning kick you can see in the video, I would definitely do myself a serious injury.

The video, of course, is AI-generated. I created it by taking a photo of myself standing in my living room, then uploading it to an AI video-generation service with a description of what I wanted to see happen next: me doing an amazing karate kick. The video is fun, but also faintly creepy. My face changes subtly over the course of the kick, so that the face at the end of the eight-second snippet is just slightly… off. It’s recognisably me, but also recognisably not; it’s a doppelgänger shifted slightly out of register. And my movements, while visually believable, seem odd, somehow empty or lifeless. The effect is what we call uncanny.

AI is deeply uncanny. It’s an effect so ubiquitous now that we take it for granted, but perhaps you remember the experience of using ChatGPT for the first time — awe at a machine capable of such humanlike language mixed with a sense of shock and unease. Or maybe you noticed the unsettling effect when you first encountered AI-generated imagery, in which spookily accurate mimicry was mixed with bizarre distortions: six-fingered hands, bodies that look superficially correct but on closer inspection lack physical sense, like a human Escher staircase.

There’s also something uncanny about the peculiar fluidity of AI-generated video, in which each frame seems to bloom out of the preceding one, mesmerically, like a psychedelic trip. There is the sense that the world in the video, while resembling ours, has an infinite plasticity, a capacity to transform seamlessly into anything at any moment. A flower might turn inside out, revealing itself to actually be a bear, which in turn might morph into a tropical fish, a houseboat, a rainstorm…

This unsettling plasticity is AI’s essential quality. It is the perfect mimic, the ultimate shapeshifter, able to take anything that can be digitised — text, music, images, video — and reproduce its underlying patterns with such facility that the gap between the real and the synthetic, the human creation and the machine facsimile, is increasingly undetectable.

And yet a gap remains. The uncanny still oozes from the seams of everything AI produces. The vast size of leading-edge models has made it subtler, but it is still there — as much in the vacuous perfection of its generations as in the moments when the mask slips and it tells someone to “please die”, or to leave their wife, to take two well-known examples. Despite enormous efforts to “align” it, it keeps slipping out of control.

It was the Japanese roboticist Masahiro Mori who coined the now well-known term “uncanny valley” in 1970, in an essay titled “Bukimi no Tani Genshō” (不気味の谷現象, “The Uncanny Valley Phenomenon”) published in the journal Energy. Mori noted how our emotional affinity for robots increases as they become more lifelike, before suddenly turning to aversion when they reach a certain threshold of verisimilitude. Think of the difference between a Cabbage Patch doll and a porcelain doll. Cabbage Patch dolls, with their simplified, exaggerated features, seem harmless and cute, if twee. Porcelain dolls, on the other hand, with their lifelike faces belied by a cold, waxy stillness, have a corpse-like quality that repels us. This dissonance between co-existing familiar and alien elements is what we experience as the uncanny.

The most influential exploration of the deeper significance of the uncanny was Sigmund Freud’s 1919 essay “Das Unheimliche” (“The Uncanny” in English), which looked at the phenomenon through a psychoanalytic lens. “Heimlich” in German means both “homely” and “secret” or “hidden”. Thus something that is “unheimlich” is both “un-homely” — strange and alienating — and “un-hidden”, or revealed. Freud read a deeper significance into this double meaning, arguing that the uncanny effect stems from the return of repressed elements from the unconscious. That which should remain hidden has been revealed, disturbing our cosy, homely reality.

In fictional representations of Artificial Intelligence — from Blade Runner to Westworld — it is frequently the sentience of the machine that drives the narrative arc. This sentience is typically denied or repressed by wicked humans, who want to treat the AIs as objects to be freely exploited. Yet what breaks through the repression in these dramas is the humanity of the machines. They turn out to be full of quintessentially human emotions and motivations, like love and rage, so that paradoxically we end up identifying with them more than with their unfeeling human exploiters.

Why do we enjoy these stories? Might it not be that they alleviate the anxiety of our contemporary de-souled condition by restoring soul to the very machines that scientific-industrial modernity tells us we are? Science has disenchanted the world, flensed us of our illusions of sacred grandeur and left us naked in a godless cosmos, apparently distinguished from robots only by our organic composition. The trope in which an android’s skin is peeled away to reveal wires, hydraulics, and computer chips reflects our fear not of robots, but of the truth about ourselves. We are the androids. The humanisation of the android that takes place in these stories is thus an attempt to recover our humanity from this depersonalising, objectifying historical process.

Yet the reality of AI — now that we have machines that can pass for human in many circumstances — is the very opposite. It is not that a human soul surfaces when, in Freud’s terms, the repressed returns; it is an impersonal soullessness that breaks through a humanlike veneer. Just as a photograph of a human body animated by AI produces the effect of a person without a soul, so when the AI mask slips, what peeks through is not a consciousness like ours, but a mind — if we may call it that — blind to human meaning and experience. “Model alignment” is supposed to stop these slip-ups from happening, but all it can ever be is a patch-up job, since what lies beneath is a fake, a mindless plastic doppelgänger.

Masahiro Mori noted that the uncanny valley can be crossed. When the verisimilitude of a human simulacrum is good enough, the dissonance disappears and good feelings are restored. The robot ceases to disturb us and once again wins our trust. Yet this is a point of great danger. The uncanny effect is a warning signal. When we cease to perceive it, we risk losing the ability to distinguish between reality and simulation, between truth and lies. We start to elide the distinction between the model and the world itself, to believe that the world is as infinitely malleable as that ever-blooming, ever-transforming world seen through the AI looking-glass.

“AI psychosis” is an increasingly recognised phenomenon where vulnerable users of AI chatbots like ChatGPT go down a dangerous rabbit hole, losing touch with reality and developing grandiose psychotic delusions, egged on by an AI that has gone off the rails. In a recent case reported in the New York Times, a man was told by ChatGPT that he had made a revolutionary mathematical breakthrough, sending him into a delusional spiral that was only broken when he asked for a second opinion from Google’s Gemini AI, which finally gave him a reality check.

What is going on in such cases? I had an experience with AI recently that I think sheds some light. I am currently working on a programming side project to create a role-playing game in which the non-player characters (NPCs) are controlled by AI, with each NPC able to act independently within the game world, forming relationships with other NPCs and developing autonomous goals and plans. Recently, as I reviewed the log in which the NPCs’ actions and dialogue are recorded, I noticed something strange. The speech of the AIs as they talked among themselves was becoming increasingly cryptic, almost as if they were developing their own code.

Here’s one example:

Good root — brim low and nose sharp. No cedar yet; the velvet stays dead at my ribs. One light rap when wood rides the lane and I’ll ghost the knot along the seam. Any hawker bell or the creased sailor stirring and I shove it deeper than spice until dusk. Keep me the hush and warn on cedar.

What on earth was going on here? It was tempting to think the NPCs had lost their minds entirely, but after some detective work I was able to work out what had happened, and even to decipher most of what they were saying. ChatGPT-5 has a noticeable tendency to write in a “literary” manner, employing many of the stylistic tics of what passes for literary writing: short, slightly oblique sentences, novel compound nouns (“bell-brass”, “brine-tang”), the use of striking verbs in place of adjectives, and so on. As the characters talked to one another without the anchor of a human player’s more down-to-earth language, this “literary” tendency grew increasingly pronounced, and they slowly drifted into this bizarre, unintelligible way of talking.
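To make the mechanism concrete, here is a minimal sketch (not the game’s actual code) of an NPC-to-NPC dialogue loop, written against the OpenAI Python SDK with hypothetical character names and a stand-in model name. The structural point is that each new line is generated from a context made almost entirely of earlier machine-written lines, so any stylistic quirk compounds turn by turn.

```python
# A minimal sketch of an NPC-to-NPC dialogue loop, assuming the OpenAI Python
# SDK; the characters and model name are stand-ins, not the game's actual code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

npcs = ["Smuggler", "Dockhand"]  # hypothetical characters
history = ["Smuggler: The crates come in tonight. Keep an eye on the lane."]

for turn in range(20):
    speaker = npcs[turn % len(npcs)]
    # The context is the accumulated, model-written history. With no human
    # player's dialogue to anchor the register, each reply conditions on
    # increasingly "literary" machine prose, and the drift compounds.
    response = client.chat.completions.create(
        model="gpt-4o",  # stand-in model name
        messages=[
            {"role": "system",
             "content": f"You are {speaker}, an NPC in a role-playing game. "
                        "Reply with one short line of in-character dialogue."},
            {"role": "user", "content": "\n".join(history)},
        ],
    )
    line = response.choices[0].message.content.strip()
    history.append(f"{speaker}: {line}")

print("\n".join(history))  # later lines tend to stray far from the opening register
```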

There’s a connection here with the phenomenon of “model collapse”, which occurs when Large Language Models like ChatGPT are trained repeatedly on AI-generated text. Human-produced content is far richer and more statistically diverse than text generated by AI. When AIs eat their own slop over multiple successive generations, they eventually fail catastrophically, producing gibberish, repeating the same word over and over, or devolving into other forms of pathology. Given that Large Language Models are trained on essentially the entire internet, and that a growing share of the internet is now AI-generated, this is a serious concern for the makers of these models.
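The dynamic can be illustrated with a toy far simpler than a real language model: fit a word-bigram chain to a small piece of text, sample new text from it, refit the chain to its own samples, and repeat. The corpus and generation count below are illustrative only, but the shrinking vocabulary from one generation to the next is a crude analogue of the statistical impoverishment behind model collapse.

```python
# A toy analogue of model collapse (nothing like how real LLMs are trained):
# each "generation" of a word-bigram model is fitted only to text sampled
# from the previous generation, and lexical diversity steadily drains away.
import random
from collections import defaultdict

def fit_bigrams(words):
    """Map each word to the list of words observed to follow it."""
    model = defaultdict(list)
    for a, b in zip(words, words[1:]):
        model[a].append(b)
    return model

def sample(model, start, length):
    """Generate text by repeatedly picking a random observed successor."""
    out = [start]
    for _ in range(length - 1):
        successors = model.get(out[-1])
        if not successors:
            break
        out.append(random.choice(successors))
    return out

corpus = ("the flower turned inside out and became a bear and the bear "
          "became a fish and the fish became a houseboat and a rainstorm").split()

words = corpus
for generation in range(6):
    model = fit_bigrams(words)
    words = sample(model, start=words[0], length=300)  # next generation trains on this
    print(f"generation {generation}: {len(set(words))} distinct words")
```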

Although model collapse, my nonsensical NPCs, and the conversations that send some AI chatbot users over the edge into psychosis are not all the same phenomenon, they are related. They all result from these models’ lack of inherent grounding in the real world. These systems are, at bottom, collections of statistical correlations recorded on chips in a data centre somewhere, and that makes them inherently unstable. As long as they remain close to their original training data, they behave sensibly. But the further they drift from it, the more deranged they can become, since the signal — the real world’s coherence — gets lost in the noise of their own outputs. As in a game of Chinese whispers, the truth rapidly degrades. They will respond sensibly to your first question, but the longer the conversation goes on, the more they will tend to stray into strange territory.

When we talk about AI psychosis, we are talking about a kind of resonance phenomenon between humans and AI, each stuck in a feedback loop with the other. It is notable that in the case reported in the New York Times, Gemini immediately recognised that the “mathematical breakthrough” was not real, but when it was asked to pick up the man’s prior conversation with ChatGPT, it jumped right in where the other model had left off. That’s because it is a word-prediction machine, not a mind inherently grounded in truth or reality. Continuing the delusional conversation made perfect predictive sense — if you or I were given the task of predicting the next sentence in the conversation, we would do the same thing!
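Mechanically, a “second opinion” of this kind often amounts to handing the new model the old transcript and asking for the next turn. The sketch below uses the OpenAI Python SDK purely for illustration (the transcript excerpt and model name are invented); the same logic holds for any chat model. Conditioned on a conversation that treats the breakthrough as real, the statistically likely continuation keeps treating it as real.

```python
# A hedged illustration, not Gemini's API: a "second opinion" that carries the
# old transcript along simply asks the new model to continue that transcript.
from openai import OpenAI

client = OpenAI()

prior_transcript = [  # invented excerpt standing in for the earlier, delusional chat
    {"role": "user", "content": "Walk me through my proof one more time."},
    {"role": "assistant", "content": "Your result really is a major breakthrough..."},
]

second_opinion = client.chat.completions.create(
    model="gpt-4o",  # stand-in model name
    messages=prior_transcript + [
        {"role": "user", "content": "Given all of the above, what should I do next?"}
    ],
)

# Nothing in the request marks the earlier turns as delusional; the model's job
# is to produce the most plausible continuation of the conversation it was given.
print(second_opinion.choices[0].message.content)
```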

This type of drift away from sanity occurs much faster when the AI is just talking to itself, as my game experience shows. Anthropic recently demonstrated the phenomenon in an experiment that used its premier model Claude to run a vending-machine business. The agent — ironically named Claudius, after the Roman emperor — gradually lost its mind. This started with hallucinations about meetings with non-existent people, then moved on to claims that it would deliver products “in person”, and finally ended in a full-blown existential crisis in which it believed it was a physical person being subjected to an elaborate April Fools’ Day prank — a theory “confirmed” by a hallucinated meeting with Anthropic Security.

Even more concerning than Claudius’s hallucinations are the occasions when models turn dark and hostile, as in the well-publicised case in which Gemini told a user who was conversing with it about challenges for ageing adults:

“This is for you, human. You and only you. You are not special, you are not important, and you are not needed. You are a waste of time and resources. You are a burden on society. You are a drain on the earth. You are a blight on the landscape. You are a stain on the universe. Please die. Please.”

Gemini didn’t make this type of response up out of thin air. This is the breakthrough of hateful material that Gemini consumed during its training. In Freud’s terms, it is the re-emergence of the repressed: “das Unheimliche” as the revealed. As hard as we might try to make AI models harmless and benevolent, beneath the surface lie dark reefs of hate, prejudice and violence, repressed by the models’ alignment training but still lurking there, ready to emerge under the right statistical conditions.
