Nadella’s Test: What’s Left When The AI Model Is Pulled?
SAN FRANCISCO, CALIFORNIA - Microsoft CEO Satya Nadella (R) speaks as OpenAI CEO Sam Altman (L) looks on during the OpenAI DevDay event on November 06, 2023. (Photo by Justin Sullivan/Getty Images)

Satya Nadella defined a sharp test for whether your company will hold a defensible moat as artificial intelligence accelerates: you should be able to swap out a generalist model without losing the “company veteran” expertise built into your learning system.

At first glance, this may sound obvious. If removing the model means you no longer deliver value, the model was the one doing your job. But Nadella's read is more nuanced, grounded in an understanding that where we go from here is not yet decided — despite increasingly loud voices arguing that the window has closed and the leading frontier labs will inevitably win it all.

The Model Is Not The Moat

A significant share of frontier capabilities will be commoditized — every hyperscaler will eventually source enough compute, and open source models trail closely. Willingness to pay for frontier intelligence won't disappear, especially where AI can deliver meaningful breakthroughs. But converting compute into a sustainable business edge takes more than a model — something OpenAI itself has acknowledged.

Some believe that if you have the best model, you win. That the first to build self-improving AI will open a lead too wide to close. Nadella's test inverts that: build a future where you can swap the model without loss, and the model is by definition a commodity input — however capable, however far ahead.

We are far from that world today. Anyone who experienced Fable's productivity boost, then saw it vanish when the model was banned, was reminded of the incredible value of frontier intelligence. Which is also why many are clamoring for a more open, resilient ecosystem, or even a rebel alliance: a handful of leading providers concentrates too much power — economic and political — in too few hands.

From Intelligence To Verification

Still, the deeper shift is structural: as AI drives the cost of execution toward zero, the binding constraint stops being intelligence and becomes verification. Tasks with measurable inputs and outputs get cheaper. But in many critical domains, checking the work does not — it still runs on scarce experience, long feedback loops, and someone willing to stand behind the result. The top experts in those domains become the critical verifiers and underwriters of agentic work.

So what does a world where the model is hot-swappable look like? One where leading firms refuse to concede their crown jewels to the foundation models. That crown jewel is verification infrastructure: everything the firm has measured and can measure, plus everything its top experts have learned — the accumulated experience that shapes how they verify, judge, curate, and apply taste on the job.

Nadella correctly identifies the associated self-improving loop as the new, key intellectual property of the firm. And he is right that, set up correctly, it compounds:

“I think of it as a hill climbing machine. And unlike most assets, it compounds. Every improved workflow generates better training signal, which accelerates the accumulation of tacit knowledge unique to the firm. The companies that build this early will have an advantage that is hard to replicate, regardless of any new individual model capability.”

The Only Loop That Compounds

That advantage is a network effect, but not the familiar kind. For two decades the strongest moats grew with sheer participation: more users, more sellers, more developers and apps. In an agentic economy that weakens, because agents can manufacture the activity that used to signal health, and port real inventory and workflows across platforms in minutes. Scale can even invert: when millions of agents optimize the same metrics, they flood a network with plausible noise it mistakes for quality, and more activity makes it less valuable, not more.

One kind survives: what we call verification-grade network effects. Every dispute resolved, every fraud caught, and every expert correction becomes reusable precedent: a settled case that lets the firm safely automate the next case faster. That is the asset that deepens with use. It cannot be manufactured with compute, because it is earned one verified outcome at a time.

Every time a firm turns inputs such as atoms, information, and tokens into a more refined version of them, it scales those network effects. But only if the same pass also sharpens the ground truth it relies on, or codifies a piece of the tacit knowledge locked in its experts' heads.

The second move, codifying tacit knowledge, is the harder one, and it is what will separate the firms that thrive with AI from the ones that get commoditized by it. That judgment (what to approve, what to reject, which exception matters) lives in the part of the workflow no one outside the firm can codify as well. Encode it as reusable ground truth and the next agent applies it without the expert in the loop, freeing the scarcest resource the firm has and widening the share of work it can trust at scale.
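To make the idea of reusable ground truth concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `PrecedentStore` class, the feature labels, and the `senior_reviewer` stand-in for a human expert are assumptions, not anything Nadella or Microsoft has described. It also assumes a case can be reduced to a set of labels and matched exactly, which real verification rarely allows.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

# Hypothetical stand-in for a human expert: maps case features to a ruling.
Expert = Callable[[FrozenSet[str]], str]


@dataclass
class PrecedentStore:
    """Keeps expert rulings so identical future cases skip the expert."""
    rulings: Dict[FrozenSet[str], str] = field(default_factory=dict)
    expert_reviews: int = 0  # how often the scarce resource was consumed

    def decide(self, features: Iterable[str], expert: Expert) -> Tuple[str, str]:
        key = frozenset(features)  # order of features does not matter
        if key in self.rulings:
            # A settled case: apply the precedent without the expert.
            return self.rulings[key], "precedent"
        # A novel case: ask the expert once, then record the verified outcome.
        decision = expert(key)
        self.expert_reviews += 1
        self.rulings[key] = decision
        return decision, "expert"


def senior_reviewer(features: FrozenSet[str]) -> str:
    # Toy judgment rule, assumed purely for this example.
    return "reject" if "mismatched_address" in features else "approve"


store = PrecedentStore()
first = store.decide(["new_vendor", "mismatched_address"], senior_reviewer)
second = store.decide(["mismatched_address", "new_vendor"], senior_reviewer)
third = store.decide(["known_vendor"], senior_reviewer)
print(first, second, third, store.expert_reviews)
```

In this toy run the second case reuses the first ruling, so the expert is consulted only for the two cases the store has never seen. The store itself, not the model, is what accumulates with use.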

Scale is only defensible when each round produces verified precedent the next one can use.

In the near-AGI era, every company has one job: converting the remaining bottleneck — verification — into better proprietary measurement and more automation, as fast as possible.

AI-native firms have already learned to leverage their own record of judgment: which outputs were approved, which were rejected, which edge cases mattered. For that record to be defensible, it has to be built around outcomes only the firm can measure at scale. Which is why Nadella reaches for a distinctly private architecture: private evals, private RL environments, performance benchmarked against the outcomes the business cares about, not public leaderboards.
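One way to picture a private eval that survives a model swap is the sketch below. It is an illustration under stated assumptions, not a description of any vendor's tooling: the `EvalCase` record, the refund scenarios, and the two toy "models" are all invented here, and exact string matching stands in for whatever outcome measure a real business would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumed interface: a model is anything that maps a prompt to an answer.
Model = Callable[[str], str]


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    approved_outcome: str  # the outcome the firm's own experts signed off on


def score(model: Model, cases: List[EvalCase]) -> float:
    """Fraction of private cases where the model matches the approved outcome."""
    if not cases:
        raise ValueError("need at least one eval case")
    hits = sum(
        model(case.prompt).strip().lower() == case.approved_outcome.strip().lower()
        for case in cases
    )
    return hits / len(cases)


def compare(models: Dict[str, Model], cases: List[EvalCase]) -> Dict[str, float]:
    """Score every candidate model against the same firm-owned cases."""
    return {name: score(model, cases) for name, model in models.items()}


# Hypothetical private record of judgment; it never leaves the firm.
PRIVATE_CASES = [
    EvalCase("refund under $100, first claim", "approve"),
    EvalCase("refund over $100, repeat claimant", "escalate"),
    EvalCase("refund under $100, repeat claimant", "escalate"),
]


def always_approve(prompt: str) -> str:
    return "approve"


def threshold_rule(prompt: str) -> str:
    return "approve" if "under $100" in prompt else "escalate"


report = compare({"model_a": always_approve, "model_b": threshold_rule}, PRIVATE_CASES)
print(report)
```

The design point is that `PRIVATE_CASES` and `score` belong to the firm, while the entries passed to `compare` are interchangeable. Swapping a model changes a number in the report, not the asset that produces the report.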

Inside those firms, as Nadella puts it: “Employees will see their expertise amplified and their judgment become part of systems that make it replicable and scalable.” What he leaves unsaid is that this is not always good news for the verifiers themselves: some will automate themselves out of a job while the firm keeps the replicable, scalable parts. The ones who thrive will use the new tools to move up the intelligence stack and further scale their time.

For firms and individuals alike, in a competition that is now global, value accrues to exactly what others cannot measure and judge with the same precision.

But measuring is not the same as measuring the right thing. Nadella’s hill-climbing machine only compounds if it climbs the right hill. Point a loop at the wrong measure and it accumulates plausible output that satisfies the metric while violating intent — impeccable, and worthless.

The counterintuitive part is that some of a firm's most valuable records are its failed experiments, errors, and missteps. Rejected output, captured right, is the best training signal there is for institutional judgment.

This is why verification is the missing word in Nadella’s test. A verified loop compounds into long-term value. An unverified one accumulates hidden liability, and the two look identical, right up until they don't.

Follow The Incentives

It would be easy to dismiss this as Microsoft simply talking its book. It sells the picks and shovels for verification loops. It profits from a world where value accrues one level up from the model — in the context, evaluation, workflows, and institutional memory that make models useful inside a firm. But notice the stakes behind Nadella’s pitch. If a frontier lab reaches AGI and the model swallows the work, Microsoft is finished as anything but a passive shareholder — the company selling you the harness is disintermediated the same day you are. Nadella, more than almost anyone, cannot afford his thesis to be false.

Microsoft wants you to own your loop, on its rails. The frontier labs need the opposite: they can’t defend the model on its own — the Bitter Lesson commoditizes it first — so they extend into the layer that is defensible, which is yours.

The loops a firm sees as sovereignty are the loops the labs see as their next training frontier.

They will bundle the verification signal your operation emits every day into their evals — for your benefit, of course. Accept it, and the token capital you thought you were building turns out to be theirs.

Nadella is right that the firms of the next decade run on top human capital and token capital together. What his picture leaves out is how those convert into advantage. The moat is not the model. It is not even the learning loop. It is the part of the loop a company will put its name, balance sheet, and future on: the ability to perform verification, for a specific set of jobs, better than anyone in the world.

The same logic that puts Microsoft at risk runs all the way down to you. Every knowledge worker holds domain expertise a model would happily absorb. What is still yours is the verification it cannot do without you. Yet.

© 2025 Foresight News. All Rights Reserved.