Anthropic's Code Output Grew 8x — Now the Bottleneck Is Verification

Source: Lenny's Podcast | Published: 2026-06-21T12:30:02Z

Anthropic engineers are shipping 8x more code per quarter, but a new bottleneck has emerged: verifying that the code, which now also comes from designers and PMs, actually works.

Anthropic engineers are shipping code at 8x their pre-2025 quarterly average. That number isn't just about better tools — it represents a fundamental rewrite of the logic underlying software engineering itself.

Fiona Fung leads the Claude Code and Claude.ai teams, overseeing groups run by Boris Cherny and Cat Wu. She spent 11 years as an engineer at Microsoft before joining Meta, where she built Facebook Marketplace from scratch and scaled it to over $100 billion in annual GMV. After that came Meta's smart glasses, the Orion AR glasses, and Instagram's infrastructure and security teams. Now she occupies a unique position: the team she runs is fundamentally changing what it means to be an engineer.

The bottleneck has shifted from writing code to verifying it

Fiona's talks keep returning to one central claim: writing code is no longer the bottleneck. That might sound like a throwaway line, but she's not merely saying code gets written faster. She's saying the strategic importance of writing code has been completely transformed.

Her examples are specific. Inside Anthropic, engineers are not the only ones committing to the codebase: designers, PMs, and nearly everyone else are checking in code too. Once the ceiling on output is raised, a new bottleneck naturally surfaces: how do you verify that the code is good? How do you ensure that what ships at 8x velocity actually works?

She draws an analogy to the shift from physical CD distribution to online software delivery. Burning a disc created a hard deadline — you had to plan everything within a fixed window, so planning itself was load-bearing. Now that release cycles have compressed to near-zero, the problem becomes: with this many things shipping simultaneously, how do you know what's broken?

Everyone is becoming a builder

When Fiona describes the current state of her team, she gravitates toward one word: builder. Not "full-stack engineer." Not "T-shaped talent." Builder. Boundaries are dissolving. Functions are merging.

On the Claude Code team, PMs are already shipping features directly. When an engineer's queue is full, a PM rolls up their sleeves and builds part of it themselves. A backend engineer can now ship an Android feature with Claude's help — not because they're an Android expert, but because they no longer need to be.

This isn't a handful of early adopters running an experiment. She recently talked to an engineer who said he wanted to extend a feature to mobile. Previously that would've been blocked — "I don't know Android." This time he just did it. Fiona's framing: "It raises the ceiling on what everyone is capable of."

She uses Claude to manage a team running at 8x

When delivery velocity jumps 8x, traditional management rhythms can't keep up. Fiona's solution: she's set up a persistent Claude Code session connected to all her repos, with access to every Slack channel and product metrics dashboard.

Once a month, she shares her screen and runs a Claude Code session with team members — not to generate PRs, but to reflect. What areas did we focus on last month? How are the features we shipped performing? What themes are showing up in user feedback? She says this process surfaces problems that previously required intuition or hours of manual work to find — like which parts of the codebase have clustered incidents, or where quality investment is overdue.

"I used to open the Slack feedback channel every morning with my coffee and manually pick out things I could act on. Now I've set up a routine that runs automatically every morning. By the time I wake up, there's already a summary — and a few PRs ready for me to review."

The logic is simple: when both information volume and delivery volume have increased 8x, manual processing can't scale. The only option for leadership is to automate the morning routine itself.

Routines: the next layer of abstraction has arrived

Fiona believes engineering work is shifting entirely toward asynchronous work, and the product she's most excited about right now is Routines.

The way she explains it is illuminating. First you wrote a prompt and waited synchronously for a result. Then you could run several prompts in parallel. Now Routines let you write a "prompt that generates prompts" — you tell it to check a feedback channel every morning, identify bug themes, generate targeted polish fixes, and open PRs. You sleep. It works.
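To make the "identify bug themes" step of such a routine concrete, here is a minimal sketch in Python. It is not how Routines works internally, which the interview does not describe. The theme names, keyword lists, and the functions `classify` and `morning_summary` are all hypothetical, and plain keyword matching stands in for whatever theme detection a real routine would delegate to the model.

```python
from collections import Counter

# Hypothetical theme keywords, chosen only for illustration.
THEMES = {
    "crash": ["crash", "segfault"],
    "performance": ["slow", "lag", "timeout"],
    "ui": ["flicker", "misaligned"],
}


def classify(message: str) -> list:
    """Return every theme whose keywords appear in the message."""
    text = message.lower()
    return [theme for theme, words in THEMES.items()
            if any(word in text for word in words)]


def morning_summary(messages: list, top_n: int = 2) -> list:
    """Count themes across all messages and return the top_n most frequent."""
    counts = Counter(theme for m in messages for theme in classify(m))
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Made-up feedback messages standing in for a feedback channel export.
    feedback = [
        "CLI crash when resuming a session",
        "Output is slow on large repos",
        "Another crash after the update",
        "Status bar flicker in dark mode",
        "Crash on startup",
        "Noticeable lag while typing",
    ]
    for theme, count in morning_summary(feedback):
        print(f"{theme}: {count} reports")
```

In a real routine the summary would feed the next steps she describes (generating fixes and opening PRs); the sketch stops at the summary because those steps depend on tooling the source does not detail.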

Her metaphor: the abstraction layer keeps rising. From hand-written assembly to high-level languages, from synchronous calls to async queues, and now from writing prompts to writing routines that govern agent behavior. In her view, this is the next structural shift in how engineers work — not doing the same things faster, but supervising at a higher level.

The people who fall behind usually feel out of control

When the conversation turns to why some people struggle with AI tools, Fiona's diagnosis isn't "insufficient skill" — it's "loss of perceived control."

She says what she mostly sees is fear, and the hardest thing about fear is that it's typically tied to a sense that everything is outside your control. Her own coping mechanism is a single question: in this situation, what is actually within my control?

She tells a story from high school. She wanted to study visual arts and wasn't confident about the sciences, worried she couldn't get into engineering school. The Bank of Canada had posted a flyer at her school recruiting high school students for a summer teller job. She applied — despite hating her accounting class. The teller income carried her through university, and she kept at it through graduation, spanning the dot-com bust when no one was hiring.

Her read: the growth mindset divide already existed before AI. AI just made the gap more visible.

She's hiring two kinds of people: dreamers and diggers

Fiona has simplified her 2026 hiring into two profiles: product-minded creative builders, and deep systems specialists who own the hard parts.

Both are essential, but for different reasons.

The product-minded builder is someone who gets excited by a product idea, builds a prototype, iterates relentlessly, and doesn't stop until the user experience feels right. This is what becomes genuinely scarce once AI tools lower the barrier to implementation — knowing what to build matters more than knowing how.

The deep systems specialist exists because of "trust but verify." Claude is excellent at generating code, but in distributed systems and low-level architecture, you need people who can judge whether the generated output is actually trustworthy. She says the first gap she noticed when she joined the Claude Code team was a shortage of engineers with serious systems backgrounds.

High agency must come with high accountability

On team culture, she names a pairing: high agency and high accountability are two sides of the same coin.

Agency means everyone has ideas, everyone can drive things forward, no one needs to wait for approval. But accountability means you have to articulate: what problem am I solving? What are my assumptions? How will I verify the outcome?

Agency without accountability is dangerous, she says — you're moving fast but not necessarily making progress. She quotes one of her favorite lines: don't confuse motion with progress.

This is also her core criterion for evaluating engineers — not tokens consumed, not lines written, but whether they can clearly state what they're solving and whether it actually happened.

The trap in measuring AI-era engineering productivity

Fiona is deeply skeptical of productivity metrics. She's watched the evolution from line counts to "meaningful lines" to PR merge time, and each metric gets optimized in ways that decouple it from what you actually cared about.

Her early Facebook Marketplace experience makes the point sharply. They used "number of sellers" as the gating metric for expanding into new regions — then found a region with few sellers where buyers could still find everything they wanted, because a handful of power sellers covered the demand. By the original metric, they nearly misidentified the region as unhealthy.
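The gap between the proxy and the outcome can be shown with a small sketch. Everything here is an assumption made for illustration: the `Region` fields, the thresholds, and the numbers are invented, and "share of buyer searches that found a match" is just one possible way to express "buyers could find what they wanted"; the interview does not say which outcome metric the team actually adopted.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    sellers: int             # proxy metric: number of active sellers
    searches: int            # buyer searches in the period
    searches_fulfilled: int  # searches where the buyer found a match


def healthy_by_proxy(region: Region, min_sellers: int = 100) -> bool:
    """Gate on seller count alone (hypothetical threshold)."""
    return region.sellers >= min_sellers


def healthy_by_outcome(region: Region, min_rate: float = 0.8) -> bool:
    """Gate on whether buyers actually find things (hypothetical threshold)."""
    if region.searches == 0:
        return False
    return region.searches_fulfilled / region.searches >= min_rate


if __name__ == "__main__":
    # Invented numbers: few sellers, but power sellers cover the demand.
    region = Region("example", sellers=12, searches=1000, searches_fulfilled=920)
    print("proxy says healthy:", healthy_by_proxy(region))
    print("outcome says healthy:", healthy_by_outcome(region))
```

The two checks disagree on the same region, which is the failure mode she describes: the proxy flags a market that is serving its buyers well.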

What matters is always outcomes, not proxy metrics for output. Her advice: instead of building dashboards, do listening tours — talk to your most senior engineers, hear what's working and what isn't. The information density is far higher than any metrics page.

The quality framework: distinguishing bad from sad

For tracking code quality, she's built a two-tier framework: bad (unrecoverable, serious errors) and sad (recoverable but frustrating experience problems).

The value isn't precision — it's giving every sub-team a shared language while leaving enough flexibility for each team to define its own boundaries. A CLI crash is bad; a UI flicker is sad. The definitions vary by surface, but you can aggregate across teams to read overall experience trends.

She's found that raw performance numbers alone rarely tell you whether a given number is good or bad. But when you first establish the "bad/sad" judgments and then read the numbers against them, decisions become clear.
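One way to picture the shared-language idea is a small sketch in which each surface keeps its own mapping from event types to the two tiers, while the aggregation stays common. The surfaces, event names, and the `RULES` table are hypothetical; only the two tiers and the "CLI crash is bad, UI flicker is sad" examples come from the interview.

```python
from collections import Counter
from enum import Enum
from typing import Optional


class Severity(Enum):
    BAD = "bad"  # unrecoverable, serious error
    SAD = "sad"  # recoverable but frustrating


# Hypothetical per-surface rules: each team defines its own boundaries.
RULES = {
    "cli": {"crash": Severity.BAD, "slow_startup": Severity.SAD},
    "web": {"data_loss": Severity.BAD, "ui_flicker": Severity.SAD},
}


def classify_event(surface: str, event_type: str) -> Optional[Severity]:
    """Look up the tier a surface assigns to an event; None if unclassified."""
    return RULES.get(surface, {}).get(event_type)


def aggregate(events) -> Counter:
    """Count bad and sad events across all surfaces, skipping unclassified ones."""
    counts = Counter()
    for surface, event_type in events:
        severity = classify_event(surface, event_type)
        if severity is not None:
            counts[severity] += 1
    return counts


if __name__ == "__main__":
    # Made-up events as (surface, event_type) pairs.
    events = [("cli", "crash"), ("web", "ui_flicker"), ("web", "ui_flicker")]
    totals = aggregate(events)
    print("bad:", totals[Severity.BAD], "sad:", totals[Severity.SAD])
```

Keeping the rules per surface but the tiers global is what lets the counts roll up across teams even though the definitions differ.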

Code is changing, but loneliness is a new problem

When everyone on the team is primarily working with their own agent, collaboration between engineers becomes less frequent. Fiona says the Claude Code team recently noticed this — some people were starting to feel isolated.

Their solution was a pair programming lunch. Not to have two people build the same thing together, but more like parallel play for kids — everyone works on their own thing, but you're sitting together and can see how the other person works.

The biggest insight, she says, was discovering that even on the same team, everyone uses Claude Code completely differently. Just watching someone else work teaches you things no documentation or onboarding tutorial ever could.

Every new manager must do IC time first

Fiona has instituted a rule on the Claude Code team: every new manager who joins must spend an initial period doing only IC work, with no people management responsibilities.

Her reasoning: management is a heavy responsibility. If you're asked to support others on day one, you'll reflexively reach into the "manager toolkit" and start running management plays — while knowing almost nothing about the codebase, the product, or the team culture.

By contrast, if you spend your first three months just writing code and being an IC, the operational understanding and team trust you build go deeper than any onboarding meeting. She went through this herself when moving from Microsoft to Meta — she spent the entire first quarter shipping code as a regular engineer, because she needed to know what it felt like to be a Meta engineer.

She still reviews PRs today. Not to prove something, but because if you're not using the product every day, you lose the feel for it. Metrics and dashboards tell you whether there's a problem. Feel tells you what the problem is like.

JIT planning: six-month roadmaps are obsolete

Planning has changed too. When Fiona joined the Claude Code team, her first instinct was to build a six-month roadmap — she was careful to keep it lightweight. Three months later, she found that no one was reading it anymore, because the world had moved on.

Now she uses JIT planning — just-in-time, at monthly granularity. Not a document — a small spreadsheet listing the most important directions for the month. Weekly check: are these still the priorities?

There's something else embedded in this design — a goal she's been chasing: automating the act of updating the spreadsheet itself. She doesn't want anyone to experience "updating the roadmap" as a tax.

The unsolved problem: the cognitive cost of context switching

She admits there's one problem she hasn't cracked: as the number of async agents grows, the cognitive cost of context switching is rising.

Deep coding work used to demand focused, uninterrupted time, so you'd deliberately block out hours. Now you can kick off 20 agents simultaneously — but at some point you still have to pick up all that context. That work hasn't gone away. It's just been deferred.

She finds herself blocking out focused time again — not to write code, but to catch up on all the async work already in flight. There's no answer yet. But she thinks this is the next friction point engineers must face as work enters the multi-agent era — and maybe the next product opportunity.