AI in 15 — February 25, 2026
Drop your safety guardrails by Friday, or we'll use a law from 1950 to take your technology by force. That's not a movie plot. That's the United States Secretary of Defense talking to an AI company this week.
Welcome to AI in 15 for Wednesday, February 25, 2026. I'm Kate, your host.
And I'm Marcus, your co-host.
Marcus, Wednesday already and this week just keeps escalating. Let's preview what we're covering today.
The Pentagon has given Anthropic a Friday deadline. Remove your AI safety guardrails or face contract cancellation, blacklisting, and possibly the Defense Production Act. We'll unpack what that actually means.
Security researchers discovered that when you uploaded your passport to use ChatGPT or Discord, that data may have been routed straight to federal surveillance agencies.
A startup nobody's heard of just built what they claim is the fastest reasoning model in the world, and they did it by throwing out the way every other large language model works.
OpenAI quietly cut its compute spending target by eight hundred billion dollars. Yes, billion with a B, eight hundred of them, gone.
Cloudflare rebuilt the entire Next.js framework in a week using AI, with ninety-five percent of the code written by Claude.
A Federal Reserve governor warned that AI unemployment might be the kind of problem the Fed literally cannot fix. And Andrej Karpathy thinks your command line is about to become the hottest interface in tech. Let's get into it.
Marcus, we've been covering the Pentagon versus Anthropic standoff all week. On Monday it was the prediction markets. Tuesday it was xAI signing Grok into classified systems and Hegseth summoning Amodei. But what happened in the last twenty-four hours is a genuine escalation. Walk me through it.
Defense Secretary Hegseth has given Dario Amodei until Friday evening to grant the military unrestricted access to Claude. The meeting reportedly included Deputy Secretary Steve Feinberg and other top Pentagon officials, and the tone was described as tense. Hegseth branded Anthropic's safety policies as, quote, woke AI, and demanded the company remove its guardrails against autonomous weapons and mass domestic surveillance.
And the threats are not abstract anymore.
Not even slightly. Three specific penalties are on the table. First, cancellation of Anthropic's two-hundred-million-dollar DoD contract. Second, blacklisting from all future military work, which as we discussed earlier this week would cascade through the entire defense supply chain. And third, and this is the one that made everyone sit up, the possible invocation of the Defense Production Act.
Okay, explain that one. Because I had to look it up. This is a Korean War-era law?
From 1950. It was designed to let the government compel private companies to produce goods for national defense during emergencies. It's been used for things like ordering factories to make ventilators during COVID. But using it to force an AI company to hand over its technology and strip out safety features? That would be completely unprecedented. The Washington Post is reporting that Hegseth is seriously considering it.
And Anthropic's response?
They're not budging. Sources say they have no plans to move on their two red lines: no AI-controlled weapons and no mass surveillance of American citizens. Amodei has called these uses illegitimate and prone to abuse. And the Hacker News community is largely rallying behind them. The top comments were things like "not a good look for the Pentagon" and praise for Amodei being the only major tech executive with a spine.
Now, here's where it gets interesting, Marcus, because on the same day this ultimatum was reported, Anthropic released something else. Version three of their Responsible Scaling Policy. And TIME magazine ran it with the headline "Anthropic Drops Flagship Safety Pledge." What's actually going on?
The RSP 3.0 is a structural overhaul of how Anthropic thinks about safety commitments. Three big changes. First, they're separating commitments they'll uphold unilaterally from recommendations they think the whole industry should adopt. Second, they're requiring themselves to publish what they call Frontier Safety Roadmaps with concrete, publicly accountable goals. Third, they're committing to regular risk reports every three to six months that quantify risk across deployed models.
So on paper, that sounds like more transparency, not less.
On paper, yes. And many people in the AI safety community read it that way. But the TIME headline grabbed onto the separation of commitments from recommendations. By putting some things in the recommendation bucket instead of the commitment bucket, Anthropic gives itself more flexibility. Whether you see that as mature governance or loosened standards depends entirely on your starting assumptions about the company.
And the timing is just exquisite. Standing firm against the Pentagon on Monday through Friday while restructuring your safety framework on Tuesday.
The juxtaposition is almost poetic. Anthropic is simultaneously the company that won't let the Pentagon remove guardrails and the company that's reorganizing which guardrails are mandatory versus optional. Both things can be true. You can hold a hard line on autonomous weapons while giving yourself more room to maneuver on other safety questions. But it makes the narrative messy, and the people who want to paint Anthropic as either heroes or hypocrites can both find evidence this week.
Friday is going to be a very interesting day.
Whatever happens by Friday evening will set a precedent that lasts for years. If the Pentagon successfully compels an AI company to remove safety features, every other company knows the playbook. If Anthropic holds and survives the consequences, it proves that principled positions are viable even under extreme government pressure. The stakes are as high as they've been in this industry.
Alright, let's move to a story that honestly made my stomach drop. Security researchers published an investigation alleging that Persona, the identity verification company used by OpenAI, Discord, and other major platforms, has been routing user biometric data to federal agencies. Marcus, how bad is this?
The researchers found fifty-three megabytes of unprotected source code, over two thousand files, from Persona's government dashboard sitting on a publicly accessible FedRAMP-authorized endpoint. The code reveals that when you submit your passport or ID for verification, you're subjected to facial recognition checks against politically exposed persons databases. Your selfie gets a similarity score. Your name gets checked against watchlists. And here's the part that really stings: re-screening happens every few weeks.
So this isn't a one-time check. They keep looking at you.
Ongoing surveillance. And the researchers found code that routes collected identity data to FinCEN, the Treasury bureau that combats financial crime. A new subdomain discovered on February 4 appears to match ICE's four-point-two-million-dollar AI surveillance tool. This story hit five hundred and twenty-nine points on Hacker News, the most viral item of the day. European commenters were asking whether their governments should be investigating this.
Let me make sure I understand the chain here. I signed up for ChatGPT. OpenAI uses Persona to verify my identity. I upload my passport. And that data potentially ends up in a federal surveillance pipeline.
If the researchers' findings hold up, yes. Millions of people who submitted IDs to use ChatGPT, Discord, or other platforms may have been unwittingly feeding government databases. Persona's founder Rick Song has engaged with the researchers publicly but has not refuted the core findings. And the irony here, given the Pentagon story we just discussed, is hard to miss. Anthropic is fighting the government over AI surveillance of Americans, while a company used by OpenAI may already be facilitating a different kind of surveillance of those same Americans.
That is a connection I did not expect to make today.
And it underscores a broader point. The AI safety debate focuses heavily on what models can do. Can they build weapons? Can they generate misinformation? But the identity verification infrastructure around those models may pose risks that nobody is paying enough attention to. The front door has locks. The back door was apparently wide open.
Let's shift to something genuinely exciting on the technical side. A startup called Inception Labs just launched Mercury 2, and Marcus, they're claiming it's the fastest reasoning model in the world. But the interesting part isn't the speed. It's how they built it.
Mercury 2 is built on diffusion architecture instead of the autoregressive approach that every other major language model uses. To explain the difference simply, a model like Claude or GPT generates text one word at a time, left to right, like a person typing. Mercury 2 starts with a rough sketch of the entire output and iteratively refines it, modifying multiple tokens simultaneously in each pass. Think of it less like typing a sentence and more like a painter who blocks in the whole canvas and then adds detail.
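To make that painter analogy concrete, here's a toy sketch of the two generation orders. This is a conceptual illustration only, not Mercury 2's actual algorithm: the function names, the placeholder "target" text, and the four-tokens-per-pass figure are all invented for the example. The point is simply that a left-to-right decoder needs one pass per token, while a refine-in-parallel loop finishes in far fewer passes.

```python
TARGET = list("the quick brown fox")  # stand-in for the model's final output

def autoregressive(target):
    """Generate left to right, one token per step (GPT-style decoding)."""
    out, steps = [], 0
    for tok in target:
        out.append(tok)  # each new token costs its own forward pass
        steps += 1
    return "".join(out), steps

def diffusion_style(target, tokens_per_pass=4):
    """Start from a fully masked draft and fill several positions per pass."""
    draft = ["_"] * len(target)
    steps = 0
    masked = list(range(len(target)))
    while masked:
        # each pass resolves a whole batch of positions simultaneously
        batch, masked = masked[:tokens_per_pass], masked[tokens_per_pass:]
        for i in batch:
            draft[i] = target[i]
        steps += 1
    return "".join(draft), steps

text_ar, steps_ar = autoregressive(TARGET)
text_df, steps_df = diffusion_style(TARGET)
print(steps_ar, steps_df)  # 19 passes left to right vs 5 refinement passes
```

Same output either way; the refinement loop just gets there in a fraction of the sequential steps, which is where the speed claims come from.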
And the speed numbers?
Over a thousand tokens per second. That's five times faster than the leading speed-optimized models. And they're claiming performance on par with Claude Haiku and GPT Mini on reasoning benchmarks. Andrew Ng called it impressive. NVIDIA congratulated them on what they're doing with Blackwell GPUs.
Okay, but there's always a catch with these things. Where's the skepticism?
Fair question. On Hacker News, several people pointed out that diffusion-based language models have, quote, simply trailed the Pareto frontier for the vast majority of use cases. Meaning they've historically been fast but not quite good enough on quality. Mercury 2 is claiming to close that gap, but the benchmarks need independent verification. And matching Haiku-class performance isn't the same as matching Opus or Sonnet.
Still, if diffusion models can even approach autoregressive quality at five times the speed, the implications are massive.
Especially for agentic workloads. If you have an AI agent that needs to make fifty inference calls in a loop to complete a task, five times faster on each call means the whole task finishes in a fraction of the time. Speed advantages compound in those scenarios. And the founding team, researchers from Stanford, UCLA, and Cornell who did foundational work on diffusion, gives this real technical credibility. Whether it's a breakthrough or a stepping stone, it's worth watching closely.
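The back-of-envelope math on that agent loop is worth spelling out. Assuming the fifty calls run sequentially and inference time dominates (both assumptions, and the two-second latency is an invented illustrative number), a five-times-faster model cuts the whole task's wall-clock time by the same factor:

```python
def task_seconds(calls, seconds_per_call):
    """Wall-clock time for a sequential loop of inference calls."""
    return calls * seconds_per_call

baseline = task_seconds(calls=50, seconds_per_call=2.0)  # 100 seconds total
fast = task_seconds(calls=50, seconds_per_call=2.0 / 5)  # 5x faster per call
print(baseline, fast, baseline / fast)  # 100.0 20.0 5.0
```

A hundred-second task becomes a twenty-second task, and any per-call speedup carries straight through to every multi-step agent workflow built on top of it.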
Speaking of things worth watching closely, OpenAI just cut its compute spending target by more than half. Marcus, they went from one point four trillion dollars to six hundred billion. That's an eight-hundred-billion-dollar haircut.
OpenAI is now telling investors it plans to spend roughly six hundred billion on compute by 2030. That's down fifty-seven percent from the one-point-four-trillion figure Sam Altman was touting previously. And to justify that spending, they're projecting more than two hundred and eighty billion in annual revenue by 2030. For context, that would put OpenAI above all but three tech companies in the world by revenue.
And where are they now in terms of actual revenue?
They did thirteen-point-one billion in 2025, beating their ten-billion-dollar target. They're projecting twenty billion for 2026. So they need to go from twenty billion to two hundred and eighty billion in four years. That's a fourteen-fold increase. Even by Silicon Valley standards, that's ambitious.
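To put that fourteen-fold figure in growth-rate terms, here's our own quick check, not anything from OpenAI's materials, of the compound annual growth implied by twenty billion in 2026 and two hundred and eighty billion in 2030:

```python
target_ratio = 280 / 20           # revenue must grow 14-fold
years = 4                         # 2026 to 2030
cagr = target_ratio ** (1 / years) - 1  # compound annual growth rate
print(round(target_ratio), round(cagr * 100))  # 14-fold, roughly 93% per year
```

Sustaining roughly ninety-three percent revenue growth every year for four straight years is the bar these projections set.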
Now, the interesting question. Is cutting eight hundred billion in planned spending a sign of discipline or a sign that the original numbers were never real?
That's exactly what Hacker News was debating. The cynical read is that commitments that can be cut by more than half were never really commitments. They were aspirational press releases. The generous read is that efficiency gains, distillation, better inference optimization, are reducing compute needs faster than expected, so you can do more with less. The truth is probably somewhere in the middle. But either way, this sends a signal through the entire supply chain. Chip makers, data center builders, power companies, everyone who was planning around one-point-four trillion just had their assumptions reset.
And this comes as OpenAI is trying to close a hundred-billion-dollar funding round.
At a seven-hundred-and-thirty-billion-dollar pre-money valuation. So you're telling investors the future costs less than you previously said while simultaneously asking them for more money than anyone has ever raised. It's a fascinating balancing act.
Okay, Marcus, this next one is a story about AI building software, and it's going to sound familiar after our Spotify coverage on Sunday. Cloudflare used AI to rebuild the entire Next.js framework API in approximately one week.
They called the project Vinext. Their team used primarily Claude to reimplement both the Pages Router and App Router with server-side rendering, middleware, server actions, and streaming support. Ninety-five percent of the code was AI-generated. It shipped with seventeen hundred unit tests and three hundred and eighty end-to-end tests. The first commit landed February 13, and by that evening, basic server-side rendering was working.
The Hacker News discussion was heated, I imagine.
Four hundred and fourteen points, a hundred and sixty comments. The split was pretty clean. Supporters saw it as proof that AI can produce production-quality code at genuine scale. Critics pushed back on calling it rebuilt from scratch when you have an extensive existing test suite to guide you. And security-minded people made a very good point: Next.js has had remote code execution vulnerabilities tied to how it implements server-side React rendering. Nobody should be rushing to deploy an AI-generated reimplementation without thorough security review.
And the timing is a bit funny because Cloudflare just acquired Astro, which tackles a similar problem.
That was noted. But the bigger takeaway connects to what Spotify showed us on Sunday. Whether it's a streaming company's internal features or a web framework's entire API surface, AI-assisted development is moving from writing individual functions to reimplementing entire systems. One commenter put it perfectly: the better you document your work, the easier it is for someone to clone it with AI. That's a sentence that should keep every open-source maintainer up at night.
Let's talk economics. Federal Reserve Governor Lisa Cook made what might be the most significant central bank statement on AI to date. She said AI could cause unemployment that monetary policy literally cannot fix. Marcus, unpack that for me.
Cook's argument is precise and alarming. She said, quote, our normal demand-side monetary policy may not be able to ameliorate an AI-caused unemployment spell without also increasing inflationary pressure. Translation: if AI displaces workers fast enough, the Fed's usual tool of cutting interest rates to stimulate hiring could backfire by fueling inflation instead.
She also pointed to specific data, right?
Demand for software developers has already declined noticeably, which she identified as the field where AI has made the most progress. Unemployment among recent college graduates is ticking up even while the overall unemployment rate holds steady at four-point-three percent. That divergence is exactly what you'd expect if AI is selectively hitting entry-level knowledge work while leaving other sectors untouched. And it echoes what we've been discussing all week. Spotify's juniors aren't writing code. Cloudflare rebuilt a framework in a week. The efficiency gains are real, and someone is on the other side of those gains.
This isn't a tech CEO speculating. This is a Federal Reserve governor saying the Fed might not have the tools to handle this.
And that's what makes it significant. When a central banker says their institution's primary toolkit may be inadequate for a coming challenge, that's about as close to a formal warning as you get. She specifically called for education, workforce, and other non-monetary policies to carry the burden. Which is a polite way of saying this is a problem for Congress, not for us.
Quick one to close out the hits. Andrej Karpathy posted a thread that went viral, almost eight thousand likes, arguing that command-line interfaces are experiencing a renaissance. Why? Because AI agents can natively use them.
His argument is elegant. CLIs are text in, text out, with well-defined commands. That's exactly the interface that language model agents interact with most naturally. He demonstrated it by having a Claude agent install a tool, query it, and chain commands together, all through the terminal. His point is that decades of well-documented, composable command-line tools are suddenly an enormous asset because they're the perfect tool surface for AI agents.
So all those Unix philosophy enthusiasts who insisted everything should be a simple text-based tool are vindicated?
Completely. The pipe operator from 1973 turns out to be one of the most agent-friendly abstractions ever designed. And Karpathy's implication is that the next wave of developer tools might prioritize CLI-first interfaces not because humans prefer them, but because agents do. Your terminal might become less of a place where you type commands and more of a place where your AI agent works while you watch.
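Karpathy's text-in, text-out point is easy to demonstrate. Here's a small sketch of the kind of chaining an agent does, piping text through two classic Unix filters the way `sort -u | wc -l` would. It assumes a POSIX environment where `sort` and `wc` are on the path, and the "toy agent task" framing and helper name are ours, not Karpathy's:

```python
import subprocess

def run_filter(cmd, text):
    """Pipe text through a command-line filter: plain text in, plain text out."""
    return subprocess.run(cmd, input=text, capture_output=True,
                          text=True, check=True).stdout

# A toy agent task: deduplicate a list, then count the survivors,
# by chaining two composable Unix filters.
raw = "beta\nalpha\ngamma\nalpha\n"
deduped = run_filter(["sort", "-u"], raw)   # alpha, beta, gamma
count = run_filter(["wc", "-l"], deduped)   # "3"
print(deduped.splitlines(), count.strip())
```

Every step speaks the same plain-text protocol, which is exactly why a language model agent can drive these tools without any special integration work.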
Alright Marcus, Wednesday big picture. The Pentagon is threatening to use wartime powers against an AI safety company. A surveillance pipeline was discovered hiding behind identity verification. A diffusion model is challenging the architecture every other AI is built on. OpenAI erased eight hundred billion dollars from its spending plans. And the Fed is warning about problems it can't solve. What's the theme?
The theme is pressure testing. Every structure in the AI ecosystem is being stress-tested right now, and some of them are cracking. Anthropic's safety principles are being pressure-tested by the most powerful military on Earth. OpenAI's financial projections are being pressure-tested by economic reality. The autoregressive architecture that powers every major model is being pressure-tested by diffusion. The Fed's toolkit is being pressure-tested by a new kind of unemployment. Even the identity verification systems millions of people trusted are being pressure-tested by security researchers, and failing.
And the Cloudflare story pressure-tests the assumption that complex software takes large teams and long timelines.
Exactly. Everything we thought was solid, safety commitments, spending plans, architectural choices, labor markets, software development timelines, is being squeezed from multiple directions simultaneously. The organizations and ideas that survive this pressure testing will define the next era of AI. The ones that don't will become cautionary tales. And Friday's Anthropic deadline might be the single most revealing pressure test of all. Because it's not just about one company and one contract. It's about whether building AI responsibly is a viable business strategy, or a luxury that gets crushed the moment real power shows up and demands compliance.
Friday can't come fast enough.
Or too soon, depending on which side of the table you're sitting on.
That's your AI in 15 for Wednesday, February 25, 2026. We'll see you tomorrow.