Last June I wrote a reflection on how vibe coding had made it possible to create some fireworks that helped bring our vision for the future of public employment support to life. In the months since, those fireworks have become a portfolio of provocatypes, a lot of learning and a cascade of new ideas.

And it all has the feeling of standing close to fast machinery. You’ll know what I mean if you’ve ever been to an old mill turned industrial museum where the looms are up and running. Powerful, amazing, machinery. That could absolutely rip your arm off.

It’s hard to imagine what it would have been like when Edmund Cartwright first showed off his power loom, let alone what it was like when mill after mill was filled with them and all you’d ever known was hand looms and spinning wheels.

AI-assisted delivery and the craft of confidence

But I think if you’ve spent any time with AI-assisted development tools over the last year and a bit then you’ll know what I’m talking about. Maybe you’ve only just started with the new models in the last few weeks and you’ve read Matt Shumer’s widely shared Something Big is Happening piece which says things I can absolutely identify with. Or maybe, like me, this has been a slower burn and for a while now you’ve had that intoxicating mixture of exhilaration and dizziness.

Either way, I do think we are past an inflexion point like the one that greeted the start of the 19th century: a kairos moment for digital delivery where the nature of what it is to work in digital government has changed shape.

There’s a popular term for this: vibe-coding. But as I explore in the chapter why vibe-coding matters (and what it isn’t), the phrase is really the hook, not the substance. It names a world where we can describe what we want in natural language and have working software generated moments later. But there is a tension in that language. It immediately sounds like flimsy demo-ware, and like a rejection of discipline or craft.

Vibe coding: the hook, not the point

And when it comes to building services that respond to the needs of the public, vibes aren’t enough.

So some months ago now I started to try and write about what this might mean for our disciplines and our craft. It has taken me longer than I wanted, but I’m happy enough with the result to publish my reflections on what changes for product management in a vibe-coded world, along with the broader argument collected at https://vc-product.wel.by/.

A small warning: it’s long. I think it deserves your time (which will be much less than the time I gave it), because it’s trying to take seriously what AI-assisted delivery changes, and what it doesn’t. So I’ve woven in links to take you into the dedicated chapters where I try and unpack each idea. You’ll see there’s breadth but that’s because the implications are broad: for how we work today, and how we’ll need to work tomorrow.

But while vibe-coding is the hook, I think we are really talking about something that needs its own name.

Andrej Karpathy, who coined the original phrase, has suggested ‘agentic engineering’, and that’s useful because it insists that this is becoming a professional workflow: you orchestrate agents, you scrutinise output, and you keep the quality bar intact. That makes it serious rather than a good party trick.

But it’s not the frame we need when it comes to our public services.

‘Agentic engineering’ names how software gets written. But we’re grappling with how public value gets added. In government (and more than likely elsewhere too), the unit of delivery is not an individual with a clever workflow – the unit of delivery is the multidisciplinary team, and the enabling environment wrapped around it. It’s policy, ops, analysis, content, design, engineering, and their leaders getting closer to runnable reality sooner, together. That’s why in my writing I’m using AI-assisted delivery.

Which brings me to a game of telephone.

The old telephone game

For decades, the substance of government delivery has been a game of telephone. Policy intent is translated into requirements, which are translated into tickets, which are eventually translated into code. That’s a translation layer where good ideas go to die.

One of the early pieces of dogma I absorbed at GDS was the importance of blending policy, digital and operations into multidisciplinary teams that break that cycle. It’s a theme I came back to multiple times across written papers and spoken advice at the OECD, and in the paper I discuss it in reimagining the team and its roles. And yet my reflection on returning to UK government was that in some places those gaps are bigger than they’ve ever been.

So a big reason why the application of AI to how we make policy and design services excites me is that it becomes harder to sustain that cycle. It becomes possible, if not natural, for the people whose day to day is policy, product or operations to express intent as something runnable, not just describable.

But…

Speed without safety is just a faster way to fail

So the paper is my attempt to hold both truths at once.

Yes, the tools collapse the gap between intent and software. They make it easier to show rather than tell. They let a small team get to something testable quickly, and learn in public sooner — particularly when you treat delivery as learning, testing and scaling responsibly.

But they also collapse the apparent gap between something that runs and something that is ready.

The hard parts don’t get any cheaper

Standards do not get softer just because building gets easier. In fact, they become more load-bearing — something I explore in steady standards in a fast world.

Governance cannot live as a set-piece moment. When the cadence speeds up, governance has to become part of the daily rhythm, and part of the machinery — it must become governance in the whole system. That implies something closer to a shared operating system — a highway code for digital — where guardrails are clear, visible and built into the flow of work.

“It runs” is not a maturity model. Readiness is still expensive: accessibility, security, lawful data handling, monitoring, support paths, and rollback. That makes user research in the age of instant prototypes even more critical.

Teams can move quickly with fewer hand-offs, but they cannot move quickly with fewer perspectives. No clever tooling is a substitute for genuine multidisciplinary work of the kind I start to tease out in this discipline by discipline discussion (but which lends itself to being given its own dedicated treatment).

Abundance for building means changing the operating model

There’s another shift in the paper that speaks to the underlying plumbing, which is where the real work is.

When building gets cheaper, the job becomes less about tending a ticket queue and more about learning quickly, and doing so in the open. What does organising delivery start to look like when building is cheap?

Backlogs start looking more like outcomes than tasks. Roadmaps look more like hypotheses than commitments, and commissioning becomes about buying change, not a plan.

Evidence becomes the thing that earns you permission to scale.

Earning the right to go faster

So the question that I’ve been sitting with is not “how do we go faster?” It’s “how do we earn the right to go faster?”

The phrase I landed on by the concluding chapter is the craft of confidence.

Not confidence as personality, or bluster, or a strong opinion delivered at speed. Confidence as a trail you can point to. Evidence, standards, operability, and governance that is close enough to the work to be real. Not your own confidence, but the confidence others can reasonably have in your work.

And if you read that and think “so what”, that probably means your experience is one where the fundamentals of good multidisciplinary product work are already normal. In places where they aren’t, the value of AI isn’t in new principles; it’s in making it harder to keep saying the right things while shipping the old reality.

If any of this resonates, have a read and leave your feedback – the paper is here.

Updated on 04/03/26 to reference Andrej Karpathy’s post discussing agentic engineering and to be clear about why AI-assisted delivery is my choice of words.