Stephan Schmidt - May 26, 2026
You Already Live in an Illusion. Get Over It.
AI is the next absorption event, and the thing being absorbed is code
TL;DR: Senior engineers refuse AI-generated code because they cannot ship code they do not understand. But they never understood the code -- not the bytecode, not the query planner, not the SIMD instruction the compiler folds their loop into. They trusted stable abstractions, not understanding. Every layer of our stack absorbed the one below it: assembly into C, malloc into garbage collection, threads into async runtimes. AI codegen is the next absorption event, and code is the thing being absorbed. The failure modes that look catastrophic today will stabilize the way GC pauses did. What developers are really protecting is not understanding but control -- and control can move to a new layer without losing what mattered.
My journey started in the 1980s as a coder. From then on the journey was up, up, up, not only as a career, but up in abstraction layers. I started writing Z80/68k machine code, then assembler, then C, Java, JavaScript. Each transition was a climb up the abstraction ladder.
I see one pattern repeating itself again and again. A new abstraction layer arrives, the coders whose expertise sits at the old layer resist it, and ten years later nobody remembers what the fight was about - or do you remember the malloc vs. GC fight? We are in that fight again now, but this time it is about AI-generated code.
When I talk to senior developers in companies I help transition to AI, they tell me some version of the same thing: “I cannot ship code I do not understand” and “I can’t be responsible for code I didn’t write.” It sounds responsible, it sounds like the position of someone who cares about quality, but it is actually the position of someone who has not noticed that they have lived in an illusion for decades and never understood the code in the first place.
Never understood the code?
Last week I encountered a paper by Shiffrin, Stigler, and Keil called “Illusions of Understanding in the Sciences” [1], and although it is about scientists, not developers, every page of it applies to our industry. The gist is simple: scientists almost always believe they understand more than they do, and this is a structural feature of doing intellectual work in a world that is infinitely complex.
The authors point out that this is not laziness on the part of scientists, and I think the same is true for developers. The paper leads with George Box’s line “all models are wrong, but some are useful” [2] and extends it: every causal model we create is only an approximation, and the question is not whether our understanding is complete, but whether the model is useful. Creating causal models is core to software development too; one might say that it’s the essential skill.
Ask people how well they understand how a toilet works, and they will say that they understand it quite well. Then ask them to actually explain it, and they cannot. Their understanding collapses the moment you ask for the explanation, because a toilet is more complicated than it looks from the outside. Still, they can use the toilet.
Psychologists have a name for this: the illusion of explanatory depth [3], demonstrated in 2002 with questions about everyday mechanical objects. The effect turns out to be stronger for explanations than for other kinds of knowledge, like remembering a movie, which suggests that our sense of how things work is special: we systematically overestimate how well we grok it.
Ask a developer how their Python code runs, and the same thing happens as with the toilet. They will tell you about list comprehensions and async, but ask them about the bytecode the interpreter actually runs, or how reference counting interacts with the GIL, and most will be unable to answer. They live in an illusion of understanding what is going on.
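A minimal sketch of that gap, using Python's standard `dis` module. The function `squares` is a made-up example, and the exact instruction names differ between CPython versions, so treat whatever listing you get as a property of your interpreter, not as a fixed fact.

```python
import dis


def squares(numbers):
    """The code a developer reads: one tidy list comprehension."""
    return [n * n for n in numbers]


def opnames(func):
    """Return the names of the bytecode instructions compiled for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    print(squares([1, 2, 3]))  # the level the developer thinks at
    print(opnames(squares))    # the level the interpreter works at
    dis.dis(squares)           # full disassembly, version dependent
```

Most Python developers ship for years without ever running the last line, and their code works anyway.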
“Understandings may produce particularly strong illusions when they are satisfying, even when scientists are aware the explanations are incomplete.”
“Illusions of Understanding in the Sciences” makes another point that I find insightful and had never thought about: prediction is not understanding. The authors give the example of Isaac Newton, who in the 1690s was asked to calculate the probability of a dice outcome and got the right answer. The explanation he gave was wrong, although the math was right. [4] We can predict what happens and therefore think we understand the system. But we might be predicting things for the wrong reasons - or for the right ones without understanding them.
I’d argue this is what developers do today. They ship working code, the code runs, the tests pass, and every available signal says the developers understand what is happening, but I think they don’t. They have a story about the code that is sufficient for delivering, but the story has no relationship to what the machine is actually doing - not the bytecode, the ORM, the SQL, the database connections or the REST calls that actually happen, and in what order. The story just helps them write the code and discuss it, even if it is wrong.
Most striking to me is a Rust example I’ve read about. An (inexperienced) Rust developer writes a double loop over an array, looks at it, and thinks to themselves: “Ah, two nested loops, that means O(n²) complexity.” That is what they think the machine will do, and therefore it’s their mental model. The actual machine, running the code the compiler created, executes that double loop as a single SIMD instruction. Not two loops with O(n²), but one single instruction. Their mental model of execution, cycles and memory usage is totally off - but it doesn’t matter.
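Rust and SIMD are hard to show in a few lines, but the same kind of mismatch is visible in Python on a much smaller scale. This is a sketch, assuming CPython, which folds constant expressions at compile time. It is not the Rust case from above, and `seconds_per_day` is a made-up function.

```python
import dis


def seconds_per_day():
    # Mental model: two multiplications happen on every call.
    return 60 * 60 * 24


def binary_ops(func):
    """List the arithmetic instructions left in the compiled function."""
    return [
        instruction.opname
        for instruction in dis.get_instructions(func)
        if instruction.opname.startswith("BINARY_")
    ]


if __name__ == "__main__":
    # On CPython the compiler has already done the arithmetic.
    print(seconds_per_day.__code__.co_consts)  # includes the folded result
    print(binary_ops(seconds_per_day))         # no multiplications left to run
```

The source says "multiply twice", the machine loads one precomputed number, and the developer's story about what happens at runtime is wrong without any consequence.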
This happens everywhere, because every layer of the stack does this to every layer above it. The Python developer does not know what bytecode really runs for their code, the C programmer does not know which instructions the optimizer reordered, and the Java developer does not know which garbage collector is running, or when, or whether escape analysis has stack-allocated half their “heap” objects - with different performance characteristics.
Developers think they want determinism, but what they really want is predictability and stable failure modes - or at least the illusion of them. They think they live in a deterministic world, until they encounter a Heisenbug or flaky tests. Determinism is king, but at the same time “It works on my machine”. Many developers have been bitten by hash maps that return items in a different order every time. Go does this on purpose: map iteration order is deliberately randomized. SQL is especially wild because every query is rewritten by the planner based on non-deterministic statistics the developer does not know or care about (every developer knows to add a sort for a predictable order, with SQL and with a map). And non-determinism doesn’t stop with SQL: JavaScript engines inline and JIT-compile code based on non-deterministic user access patterns and paths.
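The planner point can be made concrete with a small sketch. It assumes SQLite from Python's standard library as a stand-in for a bigger database, and it uses an added index rather than changed statistics to trigger the different plan. The `users` table and the `idx_users_email` index are made up for the illustration. The query text never changes, but what the engine does with it does.

```python
import sqlite3

QUERY = "SELECT id FROM users WHERE email = 'user42@example.com'"


def query_plan(connection, sql):
    """Return the planner's description of how it would execute sql."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each row.
    return [row[-1] for row in rows]


def plans_before_and_after_index():
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    connection.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [(f"user{i}@example.com",) for i in range(1000)],
    )
    before = query_plan(connection, QUERY)
    # Someone else adds an index; the developer's query stays untouched.
    connection.execute("CREATE INDEX idx_users_email ON users (email)")
    after = query_plan(connection, QUERY)
    connection.close()
    return before, after


if __name__ == "__main__":
    before, after = plans_before_and_after_index()
    print("before:", before)  # a full table scan
    print("after: ", after)   # a lookup through idx_users_email
```

The developer who wrote `QUERY` holds one story about it. The database held two different ones, and never asked.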
All developers ship working software, and every one of them holds a mental model of the system that is wrong about the actual system, in its details at least. It works anyway, because the illusion is stable enough to be useful even though it is not true.
When a senior engineer tells me they cannot use AI coding because they will not understand the code, my challenge is that they do not understand the code that is really running now either, and they never did. They have been operating on an illusion, and the illusion worked because the abstractions underneath were stable enough to make their illusion predictive. This made them trust their illusion and, by extension, the machine. They have faith in stable abstractions, not in the compiler generating code.
Going back to the paper: it has a section on this, framed in terms of scientists who use statistical procedures or simulation code without understanding the internals. It argues modern science could not function without this kind of trust: no individual scientist has time to deeply learn every method they use, so they rely on the procedure being accepted and consistent. It is the same with developers: the trust is not in understanding, the trust is in the stability of the abstraction. This is what the seniors I meet are doing when they trust the V8 JavaScript runtime but not Claude Code. The difference is not understanding vs. non-understanding, but a familiar stable abstraction versus an unfamiliar (not yet) stable abstraction.
But it might not have been about understanding and trust from the beginning.
It might be about control. What developers might actually be protecting today is control.
When you read code line by line, you are establishing a control surface (See my theory of control). A control surface you can review, revise, blame, defend. Reading and writing code was not about understanding but a mechanism to exercise control over the system and the outcome.
We can see this in a reflex reaction to AI: Spec-Driven Development is gaining traction with the same engineers who resist letting AI just write code, but not because specs give them deeper understanding, because they don’t. The spec does not tell you what code the AI generated or how the generated code actually works. What the spec gives you is a new control surface at a level developers can work with, something you can review and version and attribute - and make someone responsible for. It is a relocation of control, not a fix for understanding. The understanding was already gone once developers used a high-level language; they just did not notice because the control mechanism - code - was still in their hands.
Back in 2008 I was managing developers who spent serious time tuning Java garbage collection, and there were books about it, conference talks, and consultants who would charge you tens of thousands of euros (or D-Marks?) to come in, read your GC logs and tune your GC settings. The flags were arcane and changed from Java version to Java version: NewRatio and SurvivorRatio, CMS versus parallel collectors, when to use what. When our website was hanging for 10 seconds under load, I wanted someone who knew that stuff.
Today nobody cares about GC, not because anyone got smarter, but because G1 (and later ZGC etc.) got good enough and hardware became 10x faster. The failure modes that used to define our all-nighters stopped happening at that frequency. Although the expertise of these consultants is still correct, for most people it is no longer relevant. The same thing happened to Tomcat (HTTP server) thread pool tuning. Figuring out maxThreads and acceptCount, and why our connection pool was exhausted, used to be the operational concern, not only in the startups I worked for but in many startups. It is now mostly invisible because frameworks and platforms absorbed the failure mode.
And this has been happening for the last 50+ years. New platforms arrive - Cloud, Kubernetes, Serverless - and each new platform absorbs the dominant failure modes of the previous one. Inherent complexity does not go away; it’s like water that can’t be compressed. Instead it gets pushed down one layer into the work of a small number of developers who maintain the absorbing layer, and their work is useful to the millions of developers above them. The rituals and dances that used to protect against the failure become unnecessary, and the people who built their identity on those rituals (like malloc or GC tuning) become specialists in an ever shrinking niche - the same will happen with AI.
And developers forget failure modes of the past and live in the illusion of understanding the system.
From my career (but keep Joel Spolsky’s leaky abstractions in mind [5]): Assembly was absorbed by C, and memory layout became somebody else’s problem (SEP, Someone Else’s Problem, from THGTTG [6]). Manual memory management with malloc/free was absorbed by garbage collection, and malloc tuning also became somebody else’s problem (for most people). Threads were absorbed by async runtimes and goroutines, so thread pool sizing became somebody else’s problem. Network protocols and TCP windows were absorbed by HTTP libraries, deployment was absorbed by containers and then by orchestrators and then by serverless, and at every step the operational concerns of the previous developer generation became invisible to the next one, who didn’t experience them.
All those absorptions were resisted by developers whose expertise was being absorbed. The C programmers told the Java evangelists that garbage collection would never be fast enough for serious work, and I remember those arguments because I made some of them myself (as in the earlier fight, when people used C while we used hand-tuned machine code). SQL users told ORM users that they were building castles on sand. Each time the trajectory won and the holdouts became specialists in niches that still exist (embedded systems, game engines, HFT, Fortran!) - where money can be made but the number of jobs becomes (extremely) limited.
What I find interesting, read through the lens of the paper, is that the resistance in each case was framed as a quality or understanding argument. “You cannot really know what your program does if you let GC manage memory.” “You cannot reason about performance if the ORM writes your queries instead of you writing SQL by hand.” The paper calls this a confusion of induction and deduction [7]: developers were doing deduction (from this code, this behaviour) and treating the result as if it were induction (and therefore I understand the causes). Their familiar abstraction lets them deduce well enough to ship code, but it does not give them the causal model. The upcoming stable abstraction of LLMs that write code will do the same for them.
Don’t view AI through the wrong lens. AI coding is not another tool, it is not a productivity boost, and it is not a way to write the same code faster. Though writing the same code faster is exactly what CEOs wrongly care about.
AI is the next absorption event, and the thing being absorbed is code.
The same way the JVM absorbed memory management, AI codegen will absorb the activity of typing characters into an IDE in a syntax that the compiler will accept. A small number of people at the platform layer, at Anthropic and OpenAI, will continue to care deeply about how this works, while everyone else operates one level up and stops thinking about it. And just like GCs became better and better, and their failure modes collapsed and stabilized, AI-generated code will get better and better and its failure modes will stabilize, fold and collapse. The app generated with no auth is funny and tragic today, but as a failure mode of AI-generated code it will go away. What is left will stabilize and be worked around in a new dance and ritual.
Claude Code today has failure modes, and seniors who point at these problems are not wrong. But they miss the forest for the trees. They are making the same mistake the Java skeptics made when they pointed at GC pauses: the failure modes are real now, but they will not be real in five years, or even earlier. The dances and rituals that protect against today’s AI failure modes, like prompt engineering, SDD, plan mode and /goal, will become unnecessary, the same way the GC dances and rituals became unnecessary.
When that happens, developers will trust AI-generated code the same way they trust Rust to compile a double loop - without knowing about the SIMD instruction, the same way they trust Postgres to plan their queries. They will build an illusion around how it works (think toilet!), and the illusion will be wrong, but every other illusion in the stack is wrong too, and the illusion will be stable enough to be useful, which is all an illusion ever needs to be.
If you accept this framing, you do not need to convince anyone that AI-generated code is perfect today, because it is not. You also do not need to convince them that they will deeply understand what the AI produces, because they will not, just as they do not deeply understand what the compiler produces either - and they are happy with that. How you can help is to make them understand what they are defending. “I will not ship code I do not understand” is not the position they think it is, because for their entire careers they have been shipping code and building complex systems they do not understand. They have the illusion that they could understand if they climbed down the abstraction ladder, but that is true for AI-generated code too. They didn’t do it for JS bytecode, and they don’t need to do it for AI-generated code. The only thing changing is the layer at which the not-understanding, and the depending on stable failure modes, happens.
The skill that made them senior was never deep understanding of the machine, it was managing uncertainty. Sitting in front of a broken system at 3am after a pager call, forming a hypothesis about the failure mode, testing it, forming the next hypothesis, and keeping going until the system came back. You are not understanding the system in some deep sense, you are forming and testing hypotheses about where it broke and its failure mode, and the loop is the same with AI.
The paper calls this induction under uncertainty. You observe a phenomenon (the system is broken), you form the best causal account you can given everything you know, and you test it. The authors argue that this is what scientists actually do most of the time, even though they pretend they are doing something more rigorous. The paper is clear that induction is hard, that humans are bad at it, and that experience does not eliminate the difficulty but does build better priors about where to look first. And today an AI is a valuable multiplier for these skills if you are able to direct it. People tell me “debugging AI code” is hard; I’d argue that AIs are much better at debugging than humans. Don’t resist, let go.
How to pitch this to skeptical engineers? The hard expertise that currently makes you senior is on the same trajectory as GC tuning expertise was, just different. You can be early to operating at the new layer where all the value is moving, or you can be a specialist in the absorbed layer, which will continue to exist but will not be where most careers are built. You can write a novel on a typewriter, but it reduces your chances if you’re not Stephen King. The choice is yours, but the trajectory is not.
As always something is lost in this transition. I love to hand tune machine code, optimizing size, replacing a 3 byte instruction with a 1 byte instruction, replacing two instructions that run 6 cycles with one that runs 4 cycles. “Z80 Assembly Language Subroutines” is still my favorite computer book. There was beauty in it, and skill, and craftsmanship and a deep feeling of accomplishment.
This is in one way sad, but it is not a reason to stay out. The transition will not be smooth: there will be 2am incidents, AI code that brings down the system, data leaks, production fires whose root causes are unknown to you, and there will probably be a generation of consultants who get paid tens of thousands of euros to come in and explain how you can exert control by adding control surfaces. Yes - me. I became one of those consultants I used to lament about back in the GC days. You might not want this, because as a developer you grew up in a simple and stable world of gigantic hardware and stable platforms, and because the last rough transition, e.g. to GC, was before your time. But your wish for stability doesn’t stop the absorption event. Then, like in every previous absorption event, the platform will get good enough that the consultants stop getting paid (me again!), the dances and rituals stop being necessary, and the layer becomes invisible.
That is how this always works, that is how it worked with assembly and malloc and threads and deployment, and that is how it will work with code generation or agents. The only question is whether you and your team want to be on the inside of that transition, or on the outside watching it happen - and becoming unemployable, looking for a job in an ever smaller niche of hand-written code.
The understanding of code was always an illusion. What developers are protecting when they say they need to read every line that goes to production is not understanding but control, and control can be relocated to a new layer without losing what mattered about it. Every previous absorption event in our industry was resisted by the people whose expertise was being absorbed, and every previous resistance lost on a roughly ten-year timescale. AI codegen is the next absorption event, and the failure modes that look catastrophic today will stabilize the same way GC failure modes stabilized, became predictable and went away in the end.
[1] Shiffrin, R., Stigler, S., & Keil, F. (2026). “Illusions of Understanding in the Sciences.” Computational Brain & Behavior. The paper argues that partial and incomplete understanding is universal in science, that scientists routinely confuse prediction with causal explanation, and that this matters for how science is designed, communicated, and taught. The core sections include a long worked example on linear regression (Simpson’s paradox, regression to the mean, Lord’s paradox, Stein’s paradox) showing that even something most scientists believe they understand is genuinely hard to understand.
[2] Box, G.E.P. (1976). “Science and Statistics.” Journal of the American Statistical Association 71(356), 791–799. The full quote is often shortened, but Box’s point was specifically about statistical models: all are wrong, the practical question is whether they are useful enough for the purpose at hand. Shiffrin et al. quote Giorgio Parisi, the 2021 physics Nobel laureate, making the same point: “Scientific truths are always approximations to the truth.”
[3] Rozenblit, L., & Keil, F. (2002). “The Misunderstood Limits of Folk Science: An Illusion of Explanatory Depth.” Cognitive Science 26(5), 521–562. The original studies used everyday objects like zippers, locks, and toilets. The effect was specifically larger for explanations of mechanisms than for other types of knowledge. Frank Keil, one of the original authors of this paper, is also one of the authors of the “Illusions of Understanding” paper I am drawing on here.
[4] Stigler, S. (2006). “Isaac Newton as a Probabilist.” Statistical Science 21, 400–403. Stephen Stigler, one of the authors of the paper I’m citing, has written extensively on the history of statistics. The Newton example matters because it shows that getting the right answer and having the right explanation are independent: Newton’s calculation worked, his story about why it worked did not. The paper uses this to argue that prediction is not understanding, and that mathematical or computational models that predict well can produce a particularly strong illusion that the predictor understands what is going on.
[5] Spolsky, J. (2002). “The Law of Leaky Abstractions.” Joel on Software. The argument is that every non-trivial abstraction leaks: the abstraction works most of the time, but the underlying complexity surfaces eventually, and when it does you have to understand the layer below. This is in tension with the “absorption” argument I’m making, but not in conflict with it: absorptions reduce the frequency at which the lower layer matters, they do not eliminate it, which is why a small number of specialists at the absorbing layer continue to matter.
[6] SEP = Someone Else’s Problem, from Douglas Adams’ Life, the Universe and Everything. An SEP field is a way of making things invisible: not by camouflage, but by making them so obviously someone else’s problem that nobody looks at them. This is a remarkably accurate description of what successful platform abstractions do.
[7] The paper makes a careful distinction between deduction (deriving conclusions from agreed premises using logic and math) and induction (forming the best causal account given everything you know about the data). Shiffrin et al. argue that scientists routinely use deduction (this model predicts this) and then quietly treat the result as if it were induction (and therefore this model explains the causes). This conflation is at the heart of most illusions of understanding, including the developer’s confidence that “my code runs, therefore I understand what is happening.”
About me: Hey, I'm Stephan, I help CTOs with Coaching, with 40+ years of software development and 25+ years of engineering management experience. I've coached and mentored 80+ CTOs and founders. I've founded 3 startups. 1 nice exit. I help CTOs and engineering leaders grow, scale their teams, gain clarity, lead with confidence and navigate the challenges of fast-growing companies.