Thoughts from the mind of Ben Welby

Tag: Vibe coding

Beyond the vibes

Last June I wrote a reflection on how vibe coding had made it possible to create some fireworks that helped bring our vision for the future of public employment support to life. In the months since, those fireworks have become a portfolio of provocatypes, a lot of learning, and a cascade of new ideas.

And it all has the feeling of standing close to fast machinery. You’ll know what I mean if you’ve ever been to an old mill turned industrial museum where the looms are up and running. Powerful, amazing, machinery. That could absolutely rip your arm off.

It’s hard to imagine what it would have been like when Edmund Cartwright first showed off his power loom, let alone what it was like when mill after mill was filled with them and all you’d ever known was hand looms and spinning wheels.

AI-assisted delivery and the craft of confidence

But I think if you’ve spent any time with AI-assisted development tools over the last year and a bit then you’ll know what I’m talking about. Maybe you’ve only just started with the new models in the last few weeks and you’ve read Matt Shumer’s widely shared Something Big is Happening piece which says things I can absolutely identify with. Or maybe, like me, this has been a slower burn and for a while now you’ve had that intoxicating mixture of exhilaration and dizziness.

Either way, I do think we are past an inflexion point like the one that greeted the start of the 19th century: a kairos moment for digital delivery where the nature of what it is to work in digital government has changed shape.

There’s a popular term for this: vibe-coding. But as I explore in the chapter why vibe-coding matters (and what it isn’t), the phrase is really the hook, not the substance. It names a world where we can use natural language to describe what we want and have working software generated moments later. But there is a tension in that language. It immediately sounds like flimsy demo-ware, and like a rejection of discipline or craft.

Vibe coding: the hook, not the point

And when it comes to building services that respond to the needs of the public, vibes aren’t enough.

So some months ago now I started to try and write about what this might mean for our disciplines and our craft. It has taken me longer than I wanted, but I’m happy enough with the result to publish my reflections on what changes for product management in a vibe-coded world, and the broader argument collected at https://vc-product.wel.by/.

A small warning: it’s long. I think it deserves your time (which will be much less than the time I gave it), because it’s trying to take seriously what AI-assisted delivery changes, and what it doesn’t. So I’ve woven in links to take you into the dedicated chapters where I try and unpack each idea. You’ll see there’s breadth but that’s because the implications are broad: for how we work today, and how we’ll need to work tomorrow.

But while vibe-coding is the hook, I think we are really talking about something that needs its own name.

Andrej Karpathy, who coined the original phrase, has suggested ‘agentic engineering’, and that’s useful because it insists that this is becoming a professional workflow: you orchestrate agents, you scrutinise output, and you keep the quality bar intact. That makes it serious rather than a good party trick.

But it’s not the frame we need when it comes to our public services.

‘Agentic engineering’ names how software gets written. But we’re grappling with how public value gets added. In government (and more than likely elsewhere too), the unit of delivery is not an individual with a clever workflow – the unit of delivery is the multidisciplinary team, and the enabling environment wrapped around it. It’s policy, ops, analysis, content, design, engineering, and their leaders getting closer to runnable reality sooner, together. That’s why in my writing I’m using AI-assisted delivery.


Pocket, Pavement, Platform: Government in the App Store and on the High Street – Part 3

This was one big post, and now it’s five smaller pieces thinking about what public service really means in a digital age, and the risks of mistaking convenience for coherence. I started by wondering about how far fitting government into our pockets offers real transformation. In the last post, the topic was the underlying plumbing that makes everything else possible. The next piece will argue for an omnichannel approach that designs for every doorway. And when you make it to the end then your reward is a piece that is all about Goths.

But now, in this third part, I want you to think about the future (which isn’t too far from being the present), where the interface melts away altogether. What happens when services are no longer tapped, but summoned? As AI agents emerge, does that realise the dream of transformation, or does it just keep complexity out of sight?

Disappearing interfaces don’t disappear the problem

If apps promise pocket government, AI now promises agentic government: services summoned through conversation, no forms or websites needed, just a natural interface that handles everything for you. It’s an appealing vision, and maybe not far off in some domains. But abstraction without foundation risks leaving people behind.

Apps, when done right, can be transformative. They can bring government closer, offering convenience and speed for those who want it. The GOV.UK App and the wider GOV.UK ecosystem could deliver that promise. Pick the most recent government service you interacted with and imagine its app-enabled future.

For me that’s renewing my driving licence: a push notification from the GOV.UK App, thumbprint authentication (GOV.UK One Login), reusing a passport photo (Home Office), paying via GOV.UK Pay, confirming via GOV.UK Notify and a renewed credential in my GOV.UK Wallet. A seamless journey in seconds, where the user barely notices the machinery – DVLA, Home Office, GDS, or otherwise – because the ecosystem just works. Apps shine for tasks like these – quick, personal, and always on hand – when the infrastructure supports them.

That’s also the transactional promise GOV.UK has offered since 2012: one platform, one ecosystem with one consistent user journey. And, in 2012 and still today, that vision demands simple, integrated, permissioned services: plumbing that works and data that flows. 

But without that plumbing an app is just another channel, not a platform. Right now the GOV.UK App feels like a solution in search of a problem. In being distinct from the GOV.UK website, for which 100% of government services are built, it’s introducing friction – like requiring authentication to access a website that takes people to services using different ways to log in.

Back in 2013, GDS famously declared: “We’re not ‘appy. Not ‘appy at all.” The principle was clear: standalone apps must wait unless the core web service works as well on mobile, and even then, only by exception and driven by user need. Do not read my callback to that as clinging to an outmoded point of view. A decade on, as user needs have evolved and so has technology, apps have a clear and valuable role.

But for government they should always be additive to the web experience. Digital inclusion is not a solved problem and, while releasing early and failing fast has its merits, the deliberate decision to launch the GOV.UK App before the core web service meets that bar, with the open expectation that many features are going to be exclusive to the app, creates a walled garden, not open doors. And for me that runs counter to what made GOV.UK a global exemplar in the first place.

AI amplifies this challenge. An AI-led bit of government in your pocket might navigate complex services but it can’t fix contradictory policies, confusing eligibility, or poor service design. I’ve learnt so much from my vibe coded experiments, one of which was to create an AI-led experience of jobs and careers support. But that example also clearly showed that the value lies not in the interface but in the underlying service.

Anthropomorphising AI is obviously not the right thing to do, but thinking about an AI agent like a person might be. It’s the work of service design – figuring out how to best help someone achieve an outcome. When you design for people who reach the service not directly through a browser but indirectly, via their children or a support worker, that delegated experience also reflects something of the experience of those whose interface of choice is AI.

Indeed, over time, some people will experience a disappearing interface. Entire service journeys will be handled by agents. But right now, no UK government service is designed with that in mind. They’re designed, and as long as the Service Manual and the Service Standard exist, will continue to be designed, to lower the barriers to entry and include everyone. They’re rightly not locked behind an app layer or forcing you to authenticate before you get to the content you need. That safeguards the state as service-shaped, interoperable, and testable, paving the way for an AI-mediated future without excluding anyone today.

Whether or not it’s what Martha Lane Fox had in mind, this is really the embodiment of what it means for government to go wholesale. After creating the digital centre, fixing publishing, and fixing services, the final task was to build the state as a platform: a network of capabilities, not a stack of destinations. Open APIs, shared infrastructure, and services that can flow into the places people already are. Useful then but now essential in this potentially agentic world of ours.

So AI definitely has a role. But it’s a layer, not a solution. A reflection of good service design, not a replacement for it. And any AI-led experience must be one of many. Because for all those who talk to bots, there are plenty who need a human to sit with them on the sofa over a cup of tea.

This is where transformed government shines: services designed for everyone. And that discipline must extend to every channel, digital or physical, to keep the state inclusive. A state built for everyone doesn’t retreat behind an app icon, or vanish into AI. It shows up: for real lives, in real time, across real channels.

You’ve made it past the halfway point of my five-part series. Next, in Part 4, it’s all about the real world and exploring how public services must meet people where they are, not just through screens, but through every available entrance.

In Part 1 I interrogated the appeal of “government in your pocket” and whether it is more valuable than simply being a good soundbite. Part 2 went beneath that surface: to the plumbing that makes service delivery possible. And in the last part we’ll talk about putting GOV.UK on the High Street.

Vibe Coding, Fireworks and the Mortar of Government

A few days ago, I lit the fuse on a working prototype of a government service. No team, no procurement cycle, no waiting for approval. Just me, a few prompts, and a handful of AI tools. And honestly? Fireworks. (Though if you’re looking for what happens after the display ends and want to reflect on how we turn these sparks into sustainable, governed public services, then you might be interested in my follow-up – Beyond the Vibes).

Vibe coding (or vibecoding) is an approach to producing software by using artificial intelligence (AI), where a person describes a problem in a few sentences as a prompt to a large language model (LLM) tuned for coding. The LLM generates software based on the description, shifting the programmer’s role from manual coding to guiding, testing, and refining the AI-generated source code.

Vibe coding, Wikipedia

I’m not new to what’s now being called vibe coding. Over the last year ChatGPT has helped me to bring a few random ideas to life. Last weekend I thought I’d see what Codex CLI could do and I was again blown away. I mentioned this at work and in the conversation that followed we mused on whether some of the frustrations we’d been feeling could be shifted by trying the same thing there.

So I sat down with a laptop, some product instinct, and a handful of different AI tools. I wanted to see whether we could finally conjure the ‘fireworks’ we’d been waiting weeks to set off. I started with ChatGPT and the scale of the task was a bit intimidating. But then I remembered Firebase and in minutes had something to show off. As I did, one colleague responded by asking if I’d seen Stitch, and another said I should check out Jules.

And once I discovered Jules, that was when things got really interesting. Very quickly I had something live. Not a sketch or simulation, but something real. It’s up and running on Render (and I’d love to give you the link but I probably shouldn’t let it escape into the wild; at least not yet).

Obviously it’s just a prototype. But that also seems to do it a disservice. What is true is that it absolutely appears to do the job we had in mind. No engineers. No designers. Just me, some prompts and decisions, and it works, and it works in a way that will absolutely elicit the right sort of oohs and aahs.

I suppose I ought to make one small confession. I really shouldn’t have done any of this. Inside the department, everything except Copilot is blocked (and even then you only get Copilot on a Windows machine, not a Mac). Which means this burst of delivery joy has happened off network, off platform, and probably against better judgment. But that, too, is part of the problem. When the path of least resistance leads outside the system, it’s the system that needs fixing, not the people finding their way around it. Well, I would say that, wouldn’t I?

Now, for our purposes as a team this exercise might be the perfect fireworks but more broadly for government, what are the repercussions?

I’m going to call it: Jules and Codex earn their hype.
