AI in 15 — March 08, 2026
"Holy shit, what if this software went down?" That's a Pentagon official, on the record, describing the moment military leaders realised their entire Iran campaign depended on an AI system built by a company they'd just blacklisted.
Welcome to AI in 15 for Sunday, March 8, 2026. I'm Kate, your host.
And I'm Marcus, your co-host.
Happy Sunday, Marcus. The Pentagon story we've been following all week just blew wide open. A Fortune report reveals the U.S. military used Anthropic's Claude to strike over a thousand targets in Iran in twenty-four hours, and now they're terrified of losing access. OpenAI's robotics lead quit over the Pentagon deal. Andrej Karpathy released a tool that lets AI agents run research experiments while you sleep. OpenAI closed the largest private funding round in history, and Microsoft wasn't invited. Apple is finally fixing Siri with Google's help. India launched its first competitive open-source model. And a new concept called verification debt is keeping developers up at night. Let's preview.
The Pentagon reveals it used Claude to hit a thousand targets in twenty-four hours in Iran, and now officials are panicking about what happens if Anthropic pulls the plug.
OpenAI's robotics lead resigns over the Pentagon deal, saying surveillance and lethal autonomy deserved more deliberation.
Karpathy drops Autoresearch, an open-source system that runs a hundred machine learning experiments overnight on a single GPU.
And Apple admits defeat on Siri and hands the keys to Google's Gemini. Let's get into it.
Marcus, we've covered the Pentagon and Anthropic saga all week. But this Fortune piece published yesterday takes it to a completely different level. Walk us through what we learned.
Pentagon Under Secretary Emil Michael described what he called a "whoa moment." He said, and I'm quoting directly, "I'm like, holy shit, what if this software went down, some guardrail picked up, some refusal happened for the next fight like this one and we left our people at risk?" The U.S. military used Claude through Palantir's Maven Smart System to strike over a thousand targets in the first twenty-four hours of its campaign against Iran. Claude was doing intelligence assessment, target identification, prioritisation, cross-referencing high-value targets, and simulating battle scenarios.
So Claude wasn't just doing paperwork. It was deeply embedded in actual targeting decisions.
Deeply embedded is the right phrase. And here's the irony that makes this whole saga almost absurd. This is the same company the Pentagon just designated a supply chain risk. The same company Trump ordered federal agencies to phase out over six months. They're punishing Anthropic for having guardrails while depending on those very systems in active combat operations. The contradiction is staggering.
And the reason Anthropic got blacklisted wasn't because Claude failed. It's because Anthropic asked questions.
Exactly. Anthropic inquired whether Claude was used in the raid that captured Venezuelan dictator Nicolas Maduro. The Pentagon interpreted that as a threat to operational access. Dario Amodei then refused to remove two specific guardrails: no fully autonomous weapons and no mass domestic surveillance. That's what triggered the supply chain risk designation. Not a security failure. Not a performance issue. The company asked a question and then refused to remove safety limits.
And now the Pentagon's response is to bring in everyone else. OpenAI, xAI, Google.
Michael said explicitly, "I just want all of them. I need redundancy." Which is a rational military response. But let's be honest about what's happening here. The Pentagon is diversifying away from the company that built guardrails and toward companies that haven't drawn those same lines. The message to the rest of the AI industry is unmistakable. If you want government contracts, don't ask uncomfortable questions.
And the Hacker News response was fascinating. One top comment pointed out that a week after striking a thousand targets, the Iranian regime is still intact.
Which raises a separate question about AI-enabled warfare that nobody in the Pentagon seems eager to answer. But the geopolitical effectiveness is a different debate. The AI industry story here is that we've now confirmed the first large-scale use of a commercial AI system in active military targeting. That's a line that's been crossed and won't be uncrossed.
And as if to punctuate the Pentagon story, OpenAI just lost its robotics lead. Caitlin Kalinowski resigned on Saturday.
Kalinowski had led OpenAI's hardware and robotics engineering teams since November 2024. She previously led Meta's Orion AR glasses project. Her statement was measured but pointed. She said, quote, "Surveillance of Americans without judicial oversight and lethal autonomy without human authorization are lines that deserved more deliberation than they got." She emphasised it was about governance, not people, and that the announcement was rushed out before guardrails were defined.
That echoes exactly what Altman himself admitted earlier this week, that the deal was rushed and sloppy.
It does. And this is now the most prominent resignation from a major AI company over military ethics since these debates began. What makes it particularly significant is that Kalinowski wasn't some junior researcher. She was running hardware and robotics, the divisions most directly affected by military applications. When the person building the robots says the military deal lacked adequate deliberation on lethal autonomy, that carries real weight.
So both companies are now damaged by the Pentagon saga. Anthropic for having its tech used despite its objections. OpenAI for rushing in without adequate guardrails.
Nobody comes out clean. And that's probably the most important takeaway for the entire AI industry. Once your technology enters military supply chains, your ethical reputation is no longer fully in your control.
Let's shift gears. OpenAI closed a hundred-and-ten-billion-dollar funding round. Marcus, that's the largest private funding round in history. And the most interesting part might be who wasn't in it.
Amazon led with fifty billion dollars. Nvidia put in thirty billion. SoftBank another thirty billion. The round values OpenAI at eight hundred and forty billion dollars. That makes it more valuable than all but a handful of public companies on Earth. But the headline within the headline is that Microsoft, OpenAI's most important partner for the last five years, did not participate.
Both companies said the partnership remains unchanged, but actions speak louder than press releases.
The Amazon investment comes with an expanded hundred-billion-dollar agreement over eight years for AWS infrastructure, including Amazon's custom AI chips. That effectively makes Amazon, not Microsoft, OpenAI's primary cloud infrastructure partner going forward. It's a seismic shift. And it adds context to everything else OpenAI is doing, the GPT-5.4 launch we covered Friday, the Pentagon deal, the aggressive expansion. At an eight-hundred-and-forty-billion-dollar valuation, OpenAI must deliver transformative revenue to justify investor expectations. That pressure shapes every decision they make.
Apple news now. After years of Siri being, let's be kind and say underwhelming, Apple is partnering with Google to fix it. Marcus, this feels like a white flag.
It absolutely is, and Apple deserves credit for pragmatism over pride. The next-generation Siri, arriving in iOS 26.4 later this month, will be powered by Google's Gemini models running through Apple's Private Cloud Compute infrastructure. Google handles complex reasoning, multi-step planning, and natural language understanding. Apple retains control over the user experience, data routing, and privacy enforcement.
And the numbers tell the story. Eighty-seven percent accuracy on multi-turn conversations, up from fifty-two percent.
That's a thirty-five-point jump. Siri can now chain up to ten sequential actions from a single request. Book a flight, add it to your calendar, text someone your arrival time, all from one sentence. And with roughly two billion active Apple devices, this partnership gives Gemini the largest consumer AI deployment footprint in the world. It validates something we've been watching for a while. The AI industry is consolidating around a few frontier model providers who supply the reasoning layer for consumer platforms.
So Google wins the AI backend for both Android and iOS?
Which is a remarkable strategic outcome for Google. They may not have the flashiest consumer AI brand, but they're becoming the infrastructure layer that everything else runs on. That's arguably a more defensible position.
Andrej Karpathy is back with another open-source project. This one's called Autoresearch, and it's a different beast from the nanochat training efficiency work we covered Friday.
Completely different. Autoresearch gives an AI agent a small but real language model training setup and lets it experiment autonomously. The agent modifies code, trains for exactly five minutes, checks if the validation metric improved, keeps or discards the change, and repeats. With five-minute experiment cycles, you get roughly twelve experiments per hour and about a hundred overnight on a single GPU.
So you go to sleep and wake up to a hundred completed ML experiments?
That's the idea. And the architecture is characteristically elegant. Three files. A data preparation script the agent can't touch, a training script it can edit, and a Markdown file that serves as the agent's research instructions. Karpathy calls it "programming the program." You write natural language instructions that guide the agent's decisions rather than modifying Python directly.
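For listeners who want it concrete, here's a minimal sketch of what that outer loop could look like. To be clear, the file names and details are our reconstruction from Karpathy's description, not the actual Autoresearch code:

```python
import shutil
import subprocess

# Illustrative sketch of an Autoresearch-style loop, reconstructed from the
# public description -- file names and details are assumptions, not the repo.
# The agent edits train.py, runs a five-minute training job, and keeps the
# edit only if the validation metric improved.

def run_training() -> float:
    """Run the editable training script and return its validation loss.
    Assumes train.py limits itself to ~5 minutes and prints the final
    validation loss as its last line of output."""
    result = subprocess.run(
        ["python", "train.py"],
        capture_output=True, text=True,
        timeout=6 * 60,  # safety margin over the five-minute budget
    )
    return float(result.stdout.strip().splitlines()[-1])

def propose_edit() -> None:
    """Placeholder: send research.md plus train.py to the agent and write
    its proposed code change back into train.py."""
    ...

best_loss = run_training()                     # baseline before any edits
for experiment in range(100):                  # ~12 runs/hour -> ~100 overnight
    shutil.copy("train.py", "train.py.bak")    # snapshot before editing
    propose_edit()
    loss = run_training()
    if loss < best_loss:
        best_loss = loss                       # improvement: keep the change
    else:
        shutil.copy("train.py.bak", "train.py")  # regression: roll back
```

The data preparation script never appears in the loop because the agent can't touch it, and the Markdown file steers propose_edit, which is where the "programming the program" idea lives.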
The Hacker News crowd pointed out the experiments so far are basically hyperparameter tuning.
Fair criticism. But Karpathy is framing this as the beginning of something bigger. If AI agents can meaningfully improve model training, even starting with hyperparameters, it opens the door to recursive improvement. And the pattern of directing AI agents via Markdown specifications rather than code could become a template for how humans supervise autonomous research more broadly.
Quick but significant. India's Sarvam AI released a hundred-and-five-billion-parameter open-source model that supports twenty-two Indian languages. Marcus, is this competitive?
It uses a Mixture-of-Experts architecture, activating only about nine to ten billion parameters per token despite the full hundred-and-five-billion parameter count. It was trained on twelve trillion tokens with a hundred-and-twenty-eight-thousand-token context window and released under Apache 2.0. Early users on Hacker News compared it to lower-middle-tier frontier models, better than GPT-4 but not quite GPT-5 level. But the point isn't to beat the frontier on English benchmarks. It's to serve one point four billion people in twenty-two official languages that Western models handle poorly.
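To see why only a fraction of the parameters fire per token, here's a toy Mixture-of-Experts layer. The sizes are invented for illustration and have nothing to do with Sarvam's actual architecture; the point is just that a router sends each token to its top-k experts, so with top-2 routing over sixteen experts, roughly an eighth of the expert weights do any work for a given token:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy Mixture-of-Experts layer (illustrative sizes, not Sarvam's).
# A router scores all experts per token, but only the top-k expert MLPs
# actually run -- which is how a 105B-parameter model can cost only
# ~9-10B parameters of compute per token.

class MoELayer(nn.Module):
    def __init__(self, d_model=512, n_experts=16, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, d_model)
        scores = self.router(x)                 # (tokens, n_experts)
        weights, picks = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # normalise over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = picks[:, slot] == e      # tokens routed to expert e
                if mask.any():                  # run only the chosen experts
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = MoELayer()
y = layer(torch.randn(8, 512))                  # 8 tokens through the layer
```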
AI sovereignty is becoming a real movement.
India, the UAE, France, Japan, they're all investing in sovereign AI. And for good reason. If your nation's AI infrastructure depends entirely on American or Chinese companies, you're one geopolitical dispute away from losing access. Just ask Anthropic how fast that can happen.
Last story. A concept called verification debt is gaining traction in developer circles this weekend. What is it?
It's distinct from technical debt. The argument is that AI coding tools generate code faster than humans can verify it, creating a growing backlog of unreviewed, potentially buggy code. Technical debt comes from known shortcuts. Verification debt comes from code that nobody has fully understood or validated. And it compounds over time because each unreviewed piece becomes the foundation for the next piece of AI-generated code.
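You can see the compounding with toy numbers. Suppose a team merges AI-generated code faster than it can genuinely review it; these figures are invented, but the shape of the curve is the point:

```python
# Back-of-envelope model of verification debt (invented numbers).
# When generation outpaces review, the unreviewed backlog grows every
# week, and new code increasingly builds on that unverified foundation.

generated_per_week = 5000   # lines of AI-generated code merged weekly
reviewed_per_week = 3000    # lines the team can genuinely verify weekly

backlog = 0
for week in range(1, 13):   # one quarter
    backlog += generated_per_week - reviewed_per_week
    print(f"week {week:2d}: {backlog:6,d} unreviewed lines")
# By week 12 the team is sitting on 24,000 lines nobody has fully
# validated, now serving as the base for the next quarter's code.
```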
One Hacker News commenter had a great suggestion. Include the spec for the change in the pull request so reviewers are reviewing the specification, not trying to reverse-engineer what the code is supposed to do.
And another commenter said something that stuck with me: "It's not okay to make another human review the code you made with AI. If you used AI, you're the reviewer." That flips the current workflow on its head. Right now most teams treat AI-generated code like human-written code in the review process. But if the person who prompted the AI doesn't fully understand what was generated, and the reviewer is just pattern-matching, you have a verification gap that grows with every commit.
This ties directly to the developer productivity research we covered this week. AI makes code appear faster but verification takes longer.
Exactly. And that nineteen percent increase in task completion time for developers using AI that we discussed? Verification debt is likely a major contributor. The code arrives quickly, but understanding, testing, and validating it takes more effort than writing it from scratch would have.
Sunday big picture, Marcus. The Pentagon depends on AI it's trying to ban. A robotics lead quits over military ethics. OpenAI needs a hundred and ten billion dollars to keep going. Apple outsources intelligence to Google. And developers are drowning in code they didn't write and can't fully review. What connects all of this?
Dependence. Every story today is about dependence that grew faster than anyone planned for. The Pentagon became dependent on Claude before establishing the relationship to sustain it. OpenAI became dependent on external capital at a scale that shapes every strategic decision. Apple became dependent on Google for the intelligence layer of its most important product. Developers became dependent on AI-generated code faster than they built the processes to verify it. None of these dependencies are inherently bad. But unmanaged dependence always creates vulnerability. And right now, across the entire AI ecosystem, the dependencies are running ahead of the governance, the funding structures, and the quality controls needed to manage them responsibly.
Dependence without governance. That's a recipe for exactly the kind of chaos we've seen this week.
And the encouraging thing is that people are noticing. Kalinowski resigned. Developers are naming verification debt. Anthropic is challenging the designation in court. The awareness is catching up to the dependence. The question is whether it catches up fast enough.
That's your AI in 15 for Sunday, March 8, 2026. See you tomorrow.