Tommy Jepsen
Supervised versus unsupervised AI-generated code

2026-06-30 — by Tommy Jepsen

"100% of our code is written by AI" is something I feel I am hearing everywhere right now. It sounds like the future arrived early. But the claim hides a question I keep returning to: was there no human in the loop then?

Because I feel there are two very different sentences hiding inside that one: "AI wrote all our code, and an engineer reviewed it" and "AI wrote all our code, and we just shipped it."

Both can be the right way to do it, but to me it really depends on the product.

Supervised means a human still owns the diff

The model drafts, a person reads the diff, runs the required tests, questions the parts that look too confident, and takes responsibility for what merges. The AI moved the work from typing to judging. It did not remove the judge.

I don't mean that engineers read every line. It means reading the changes that matter, the files where a mistake would actually hurt, closely enough to understand what the diff does and to keep a mental picture of what is going on in the codebase.

Automation helps you get there: a separate model can critique the diff first, a test suite can run, and so on, so the obvious problems are caught before a person looks. But that only narrows what the human reviews; it doesn't replace the understanding. Someone still has to grasp the impact and be accountable for what merges.
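To make that narrowing concrete, here is a minimal Python sketch of such a review gate. Everything in it is an assumption for illustration: the `Diff` fields, the path patterns in `HIGH_STAKES_PATTERNS`, and the function names are hypothetical, not taken from any real tool or from a specific team's setup.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical patterns for code where a mistake would actually hurt.
HIGH_STAKES_PATTERNS = ["auth/*", "payments/*", "workflows/*"]


@dataclass
class Diff:
    changed_files: list[str]
    tests_passed: bool            # result of the automated test suite
    model_critique_passed: bool   # result of a separate model's review


def files_needing_human_review(diff: Diff) -> list[str]:
    """Return the changed files a person should still read."""
    if not (diff.tests_passed and diff.model_critique_passed):
        # Automation flagged a problem: hand the whole diff to a human.
        return list(diff.changed_files)
    # Automation is green: only the high-stakes files remain for review.
    return [
        path
        for path in diff.changed_files
        if any(fnmatchcase(path, pattern) for pattern in HIGH_STAKES_PATTERNS)
    ]


def can_merge(diff: Diff, approved_by_human: set[str]) -> bool:
    """Allow the merge only once every flagged file has a human approval."""
    return all(
        path in approved_by_human for path in files_needing_human_review(diff)
    )
```

In this sketch, a diff touching `auth/login.py` and `landing/index.html` with green checks would leave only `auth/login.py` for a person to read. The automation shrinks the pile, but the merge still waits on a named human approval for the parts that matter.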

This is the version that scales to high-stakes code in my opinion. Not because the model isn't trusted, but because something deliberate stands between the diff and production when it matters.

Unsupervised means the model is the last set of eyes

Unsupervised is the other reading. Prompt to product: you judge the result by whether it looks right when you click around. This is what most people mean by vibecoding, and the instinct behind it is sound: if the model can produce something that works, reading every line it wrote is often a waste of your time.

For a landing page, simple UI changes, or a throwaway prototype, ship it. The blast radius is small, the worst case is embarrassment, and the time you'd spend reviewing is better spent building the next thing. Plenty of genuinely good software gets made this way.

What shifts, in my opinion, is when the unsupervised code handles real data that is critical for real businesses: auth, payments, automated workflows, anything with personally identifiable data. "It works when I click around" just doesn't feel good enough for me, even when other LLMs review the code and automated test suites pass in support of it.

Are the models good enough to remove the human?

Honestly, I don't know. The models improve fast enough that a confident "no" ages badly.

But even if the models were good enough to remove the human from the loop completely, I'm still not sure I'm comfortable handing my data to, or basing my entire business on, a product whose code no engineer reviewed and approved.

That's not a verdict on model quality, and it doesn't get weaker as the models get stronger. It just feels more natural to me to know that all the critical parts of the software I depend on are reviewed and approved by a human.

So when companies say "Our code is 100% AI generated", I still ask myself before signing up and handing over my data: but was it reviewed?

Tommy Jepsen - design engineer in Copenhagen
