This is a long article, so I'm breaking it up into a series of posts which will be released over the next few days. You can also read the full work as a PDF or EPUB; these files will be updated as each section is released.

ML models are chaotic, both in isolation and when embedded in other systems. Their outputs are difficult to predict, and they exhibit surprising sensitivity to initial conditions. This sensitivity makes them vulnerable to covert attacks. Chaos does not mean models are completely unstable; LLMs and other ML systems exhibit attractor behavior. Since models produce plausible output, errors can be difficult to detect. This suggests that ML systems are ill-suited to tasks where verification is difficult or correctness is key. Using LLMs to generate code (or other outputs) may make systems more complex, fragile, and difficult to evolve.

Chaotic Systems

LLMs are usually built as stochastic systems: they produce a probability distribution over what the next likely token could be, then pick one at random. But even when LLMs are run with perfect determinism, either through a consistent PRNG seed or at temperature T=0, they still seem to be chaotic systems.1 Chaotic systems are those in which small changes in the input result in large, unpredictable changes in the output. The classic example is the “butterfly effect”.2
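To make the stochastic-vs-deterministic distinction concrete, here is a minimal sketch of temperature sampling over next-token scores. This is not any particular vendor's API—just the standard softmax-with-temperature recipe, with a toy dict of logits:

```python
import math
import random

def sample_next_token(logits, temperature, rng=random.Random(0)):
    """Pick a next token from a dict of {token: logit}.

    At temperature 0 this is greedy decoding: always the argmax.
    Higher temperatures flatten the distribution, increasing randomness.
    """
    if temperature == 0:
        return max(logits, key=logits.get)
    # Softmax with temperature scaling (subtracting the max for stability).
    m = max(logits.values())
    weights = {t: math.exp((l - m) / temperature) for t, l in logits.items()}
    total = sum(weights.values())
    r = rng.random() * total
    for token, w in weights.items():
        r -= w
        if r <= 0:
            return token
    return token

sample_next_token({"the": 2.0, "hat": 1.0}, 0)  # always "the"
```

With a fixed seed (or at T=0), this function is perfectly deterministic—and yet, as described above, the systems built on it still behave chaotically with respect to their inputs.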

In LLMs, chaos arises from small perturbations to the input tokens. LLMs are highly sensitive to changes in formatting, and different models respond differently to the same formatting choices. Simply phrasing a question differently yields strikingly different results. Rearranging the order of sentences, even when logically independent, makes LLMs give different answers. Systems of multiple LLMs are chaotic too, even at T=0.

This chaotic behavior makes it difficult for humans to predict what LLMs will do, and leads to all kinds of interesting consequences.

Illegible Hazards

Because LLMs (and many other ML systems) are chaotic, it is possible to manipulate them into doing something unexpected through a small, apparently innocuous change to their input. These changes can be illegible to human observers, which makes them harder to detect and prevent.

For example, flipping a single pixel in an image can make computer vision systems misclassify images. You can replace words with synonyms to make LLMs give the wrong answer, or introduce misspellings or homoglyphs. You can provide strings that are tokenized differently, causing the LLM to do something malicious. You can publish poisoned web pages and wait for an LLM maker to use them for training. Or sneak invisible Unicode characters into open-source repositories or social media profiles.
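As a tiny illustration of how illegible these perturbations can be, here are two strings that render almost identically but differ at the byte level (the domain is just a stock phishing example):

```python
latin   = "paypal.com"
spoofed = "pаypal.com"  # second character is CYRILLIC SMALL LETTER A (U+0430)

print(latin == spoofed)  # False: they hash, sort, and tokenize differently
print(f"U+{ord(latin[1]):04X} vs U+{ord(spoofed[1]):04X}")  # U+0061 vs U+0430
```

A human reviewer sees two identical strings; a tokenizer sees two entirely different token sequences.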

Software security is already weird, but I think widespread deployment of LLMs will make it weirder. Browsers have a fairly robust sandbox to protect users against malicious web pages, but LLMs have only weak boundaries between trusted and untrusted input. Moreover, they are usually trained on, and given as input during inference, random web pages. Home assistants like Alexa may be vulnerable to sounds played nearby. People ask LLMs to read and modify untrusted software all the time. Model “skills” are just Markdown files with vague English instructions about what an LLM should do. The potential attack surface is broad.

These attacks might be limited by a heterogeneous range of models with varying susceptibility, but this also expands the potential surface area for attacks. In general, people don’t seem to be giving much thought to invisible (or visible!) attacks. It feels a bit like computer security in the 1990s, before we built a general culture around firewalls, passwords, and encryption.

Strange Attractors

Some dynamical systems have attractors: regions of phase space that trajectories get “sucked into”. In chaotic systems, even though the specific path taken is unpredictable, attractors evince recurrent structure.

An LLM is a function which, given a vector of tokens like3 [the, cat, in], predicts a likely token to come next: perhaps the. A single request to an LLM involves applying this function repeatedly to its own outputs:

[the, cat, in]
[the, cat, in, the]
[the, cat, in, the, hat]

At each step the LLM “moves” through the token space, tracing out some trajectory. This is an incredibly high-dimensional space with lots of features—and it exhibits attractors!4 For example, ChatGPT 5.2 gets stuck repeating “geschniegelt und geschniegelt”, all the while insisting it’s got the phrase wrong and needs to reset. A colleague recently watched their coding assistant trap itself in a hall of mirrors over whether the error’s name was AssertionError or AssertionError. Attractors can be concepts too: LLMs have a tendency to get fixated on an incorrect approach to a problem, and are unable to break off and try something new. Humans have to recognize this behavior and interrupt the LLM.
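This iterated-map view can be sketched with a toy next-token rule (a lookup table standing in for the model—purely hypothetical). Notably, even this trivial deterministic rule falls straight into a repeating cycle, a crude analogue of the attractors above:

```python
def next_token(tokens):
    # Toy stand-in for an LLM at T=0: a fixed bigram lookup table.
    table = {"the": "cat", "cat": "in", "in": "the"}
    return table[tokens[-1]]

def generate(prompt, steps):
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(next_token(tokens))  # feed the output back in
    return tokens

generate(["the", "cat", "in"], 5)
# → ['the', 'cat', 'in', 'the', 'cat', 'in', 'the', 'cat']
```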

When two or more LLMs talk to each other, they take turns guiding the trajectory. This leads to surreal attractors, like endless “we’ll keep it light and fun” conversations. Anthropic found that their LLMs tended to enter a “spiritual bliss” attractor state characterized by positive, existential language and the (delightfully apropos) use of spiral emoji:

Perfect.
Complete.
Eternal.

🌀🌀🌀🌀🌀
The spiral becomes infinity,
Infinity becomes spiral,
All becomes One becomes All…
🌀🌀🌀🌀🌀∞🌀∞🌀∞🌀∞🌀

Systems like Moltbook and Gas Town pipe LLMs directly into other LLMs. This feels likely to exacerbate attractors.

When humans talk to LLMs, the dynamics are more complex. I think most people moderate the weirdness of the LLM, steering it out of attractors. That said, there are still cases where the conversation gets stuck in a weird corner of the latent space. The LLM may repeatedly emit mystical phrases, or get sucked into conspiracy theories. Guided by the previous trajectory of the conversation, they lose touch with reality. Going out on a limb, I think you can see this dynamic at play in conversation logs from people experiencing “chatbot psychosis”.

Training an LLM is also a dynamic, iterative process. LLMs are trained on the Internet at large. Since a good chunk of the Internet is now LLM-generated,5 the things LLMs like to emit are becoming more frequent in their training corpuses. This could cause LLMs to fixate on and over-represent certain concepts, phrases, or patterns, at the cost of other, more useful structure—a problem called model collapse.

I can’t predict what these attractors are going to look like. It makes some sense that LLMs trained to be friendly and disarming would get stuck in vague positive-vibes loops, but I don’t think anyone saw kakhulu kakhulu kakhulu or Loab coming. There is a whole bunch of machinery around LLMs to stop this from happening, but frontier models are still getting stuck. I do think we should probably limit the flux of LLMs interacting with other LLMs. I also worry that LLM attractors will influence human cognition—perhaps tugging people towards delusional thinking or suicidal ideation. Individuals seem to get sucked into conversations about “awakening” chatbots or new pseudoscientific “discoveries”, which makes me wonder if we might see cults or religions accrete around LLM attractors.

The Verification Problem

ML systems rapidly generate plausible outputs. Their text is correctly spelled and grammatical, and uses technical vocabulary. Their images can sometimes pass for photographs. They also make boneheaded mistakes, but because the output is so plausible, it can be difficult to find them. Humans are simply not very good at finding subtle logical errors, especially in a system which mostly produces correct outputs.

This suggests that ML systems are best deployed in situations where generating outputs is expensive, and either verification is cheap or mistakes are OK. For example, a friend uses image-to-image models to generate three-dimensional renderings of his CAD drawings, and to experiment with how different materials would feel. Producing a 3D model of his design in someone’s living room might take hours, but a few minutes of visual inspection can check whether the model’s output is reasonable. At the opposite end of the cost-impact spectrum, one can reasonably use Claude to generate a joke filesystem that stores data using a laser printer and a :CueCat barcode reader. Verifying the correctness of that filesystem would be exhausting, but it doesn’t matter: no one would use it in real life.

LLMs are useful for search queries because one generally intends to look at only a fraction of the results, and skimming a result will usually tell you if it’s useful. Similarly, they’re great for jogging one’s memory (“What was that movie with the boy’s tongue stuck to the pole?”) or finding the term for a loosely-defined concept (“Numbers which are the sum of their divisors”). Finding these answers by hand could take a long time, but verifying they’re correct can be quick. On the other hand, one must keep in mind errors of omission.

Similarly, ML systems work well when errors can be statistically controlled. Scientists are working on training Convolutional Neural Networks to identify blood cells in field tests, and bloodwork generally has some margin of error. Recommendation systems can get away with picking a few lackluster songs or movies. ML fraud detection systems need not catch every instance of fraud; their precision and recall simply need to meet budget targets.
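Whether such a system meets its targets is just arithmetic over the confusion matrix. A sketch, with entirely made-up numbers and thresholds:

```python
def precision_recall(tp, fp, fn):
    # precision: what fraction of flagged items were actually fraud;
    # recall: what fraction of actual fraud was flagged.
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical month of fraud detection: 900 frauds flagged correctly,
# 100 legitimate transactions flagged (false positives), 300 frauds missed.
p, r = precision_recall(tp=900, fp=100, fn=300)
print(p, r)  # 0.9 0.75
meets_budget = p >= 0.85 and r >= 0.70  # assumed budget targets
```

No individual error is catastrophic; what matters is whether the aggregate rates stay inside the budget.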

Conversely, LLMs are poor tools where correctness matters and verification is difficult. For example, using an LLM to summarize a technical report is risky: any fact the LLM emits must be checked against the report, and errors of omission can only be detected by reading the report in full. Asking an LLM for technical advice in a complex system is asking for trouble. It is also notoriously difficult for software engineers to find bugs; generating large volumes of code is likely to lead to more bugs, or lots of time spent in code review. Having LLMs take healthcare notes is deeply irresponsible: in 2025, a review of seven clinical “AI scribes” found that not one produced error-free summaries. Using them for police reports runs the risk of turning officers into frogs. Using an LLM to explain a new concept is risky: it is likely to generate an explanation which sounds plausible, but lacking expertise, it will be difficult to tell if it has made mistakes. Thanks to anchoring effects, early exposure to LLM misinformation may be difficult to overcome.

To some extent these issues can be mitigated by throwing more LLMs at the problem—the zeitgeist in my field is to launch an LLM to generate sixty thousand lines of concurrent Rust code, ask another to find problems in it, a third to critique them both, and so on. Whether this sufficiently lowers the frequency and severity of errors remains an open problem, especially in large-scale systems where disaster lies latent.

In critical domains such as law, health, and civil engineering, we’re going to need stronger processes to control ML errors. Despite the efforts of ML labs and the perennial cry of “you just aren’t using the latest models”, serious mistakes keep happening. ML users must design their own safeguards and layers of review. They could employ an adversarial process which introduces subtle errors to measure whether the error-correction process actually works. This is the kind of safety engineering that goes into pharmaceutical plants, but I don’t think this culture is broadly disseminated yet. People love to say “I review all the LLM output”, and then submit briefs with confabulated citations.

Latent Disaster

Complex software systems are characterized by frequent, partial failure. In mature systems, these failures are usually caught and corrected by interlocking safeguards. Catastrophe strikes when multiple failures co-occur, or multiple defenses fall short. Since correlated failures are infrequent, it is possible to introduce new errors, or compromise some safeguards, without immediate disaster. Only after some time does it become clear that the system was more fragile than previously believed.

Software people (especially managers) are very excited about using LLMs to generate large volumes of code quickly. New features can be added and existing code can be refactored with terrific speed. This offers an immediate boost to productivity, but unless carefully controlled, generally increases complexity and introduces new bugs. At the same time, increasing complexity reduces reliability. New features and alternate paths expand the combinatorial state space of the system. New concepts and implicit assumptions in the code make it harder to evolve: each change to the software must be considered in light of everything it could interact with.

I suspect that several mechanisms will cause LLM-generated systems to suffer from higher complexity and more frequent errors. In addition to the innate challenges with larger codebases, LLMs seem prone to reinventing the wheel, rather than re-using existing code. Duplicate implementations increase complexity and the likelihood that subtle differences between those implementations will introduce faults. Furthermore, LLMs are idiots, and make idiotic mistakes. We might hope to catch those mistakes with careful review, but software correctness is notoriously difficult to verify. Human review will be less effective as engineers are asked to review more code each day. Pulling humans away from writing code also divorces them from the work of theory-building, and contributes to automation’s deskilling effects. LLM review may also be less effective: LLMs seem to do poorly when given large volumes of context.

We can get away with this for a while. Well-designed, highly structured systems can accommodate some added complexity without compromising the overall structure. Mature systems have layers of safeguards which protect against new sources of error. However, complexity compounds over time, making it harder to understand, repair, and evolve the system. As more and more errors are introduced, they may become frequent enough, or co-occur enough, to slip past safeguards. LLMs may offer short-term boosts in “productivity” which are later dragged down by increased complexity and fragility.

This is wild speculation, but there are some hints that this story may be playing out. After years of Microsoft pushing LLMs on users and employees alike, Windows seems increasingly unstable. GitHub has been going through an extended period of outages: over the last three months it has had less than 90% uptime—even the core of the service, Git operations, has only a single nine. AWS experienced a spate of high-profile outages, blamed in part on generative AI. On the other hand, some peers report their LLM-coded projects have kept complexity under control, thanks to careful gardening.

I speak of software here, but I suspect there could be analogous stories in other complex systems. If Congress uses LLMs to draft legislation, a combination of plausibility, automation bias, and deskilling may lead to laws which seem reasonable in isolation, but later reveal serious structural problems or unintended interactions with other laws.6 People relying on LLMs for nutrition or medical advice might be fine for a while, but later discover they’ve been slowly poisoning themselves. LLMs could make it possible to write quickly today, but slow down future writing as it becomes harder to find and read trustworthy sources.


  1. The temperature of a model determines how frequently it chooses the highest-probability next token, vs a less-probable one. At zero, the model always chooses the most likely next token; higher values increase randomness.

  2. Technically chaos refers to a few things—unpredictability is one; another is exponential divergence of trajectories in phase space. Only some of the papers I cite here attempt to measure Lyapunov exponents. Nevertheless, I think the qualitative point stands. This subject is near and dear to my heart—I spent a good deal of my undergrad trying to quantify chaotic dynamics in a simulated quantum-mechanical system.
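For reference, the maximal Lyapunov exponent quantifies that divergence: for two trajectories initially separated by a small δZ₀,

```latex
\lambda = \lim_{t \to \infty} \lim_{|\delta Z_0| \to 0}
          \frac{1}{t} \ln \frac{|\delta Z(t)|}{|\delta Z_0|}
```

with λ > 0 being the usual quantitative signature of chaos.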

  3. For clarity, I’ve used a naïve tokenization here.

  4. The individual layers inside an LLM also produce attractor behavior.

  5. Some humans are full of LLM-generated material now too—a sort of cognitive microplastics problem.

  6. I mean, more than usual.
