AI in 15 — June 05, 2026
Eighty percent of the code at Anthropic is now written by Claude. The CEO says recursive self-improvement may arrive sooner than institutions are prepared for. And this drops four days after the IPO filing.
Welcome to AI in 15 for Friday, June fifth, 2026. I'm Kate, your host.
And I'm Marcus, your co-host.
Big day, Marcus. Anthropic publishes "When AI Builds Itself" — internal numbers suggesting recursive self-improvement is already underway. The Financial Times reports the NSA has embedded Anthropic engineers to run offensive cyber operations on Mythos, over Pentagon objections. Anthropic open-sources Defending Code, a reference harness for autonomous vulnerability discovery. OpenAI ships Dreaming, a background-memory architecture for ChatGPT. 404 Media catches Google employees memeing against their own AI. South Korea is about to mandate AI image scanning on every online forum. And a new UN report says AI data centers will consume as much water as 1.3 billion people by 2030.
Anthropic says the recursive curve is here.
The NSA quietly moves into Anthropic's office.
And ChatGPT learns to dream.
Lead story, Marcus. Walk me through "When AI Builds Itself."
This is the essay everyone in the industry will be reading this weekend, Kate. The Anthropic Institute — a new policy arm spun up this spring — published a flagship piece yesterday disclosing internal numbers that strongly suggest Anthropic has crossed a recognizable threshold on the path to recursive self-improvement. More than eighty percent of the code merged into Anthropic's production codebase in May was authored by Claude itself, up from low single digits before Claude Code launched in February 2025. Engineers are shipping roughly eight times more code per quarter than the 2021-to-2025 baseline. Claude's success rate on open-ended internal engineering tasks jumped fifty percentage points in six months, hitting seventy-six percent.
And the unreleased model numbers, Marcus.
This is where it gets pointed, Kate. The unreleased Mythos Preview model achieved a fifty-two times speedup on a code-optimization benchmark in April. For context, Claude Opus 4 hit roughly three times, and a skilled human given four to eight hours hits about four times. Mythos Preview also scored sixty-four percent accuracy when proposing the next experimental step in a research pipeline, up from fifty-one percent for Opus 4.5. And Anthropic says the model can work productively on its own for at least sixteen hours on long-horizon tasks. One engineer quoted in the essay says, quote, I started leaning hard into Claudifying about a year ago. That's been five months since I last wrote any code myself.
So is this full recursive self-improvement?
Anthropic is careful to say no, Kate. But they argue it could arrive sooner than most institutions are prepared for, and they're calling for a globally coordinated mechanism that could slow or temporarily pause development of the most advanced systems. The timing matters enormously. This drops four days after Anthropic confidentially filed its S-1 at a nine hundred sixty-five billion valuation and forty-seven billion run-rate. So this is being read two ways at once. As a genuine safety signal — Anthropic showing its homework on why the next eighteen months are dangerous. And as an IPO-eve flex — proof that Claude isn't just a coding tool, it's the engine building the next model. The libertarian read — both can be true, and frankly should be. A company asking for guardrails on itself while telling investors it has the compounding curve is the most honest version of the AI safety conversation we've had. The skeptical read — Anthropic benefits if regulators slow down competitors. Either way, if the numbers hold up under scrutiny, this is the clearest public evidence yet that the compounding curve frontier-lab CEOs have hinted at for two years has actually arrived.
Quick hits. Marcus, the Financial Times and Mythos.
Dark mirror of the lead story, Kate. The FT reported yesterday that Anthropic has stationed about half a dozen forward-deployed engineers inside the National Security Agency to customize and operate Mythos — the same model Anthropic restricted to roughly forty organizations because its offensive cyber capabilities were considered too dangerous for general release. One FT source said Mythos, quote, would be useful for infiltrating the networks of nations such as China or Iran. So this is squarely about offensive operations, not defense.
And the Pentagon piece, Marcus.
This is the part that should make every AI policy person stop, Kate. The Pentagon, which technically oversees the NSA, has formally designated Anthropic a supply chain risk. That came after Anthropic sued the DoD over restrictions Anthropic itself placed on Claude, which prohibit its use for mass surveillance of US citizens and for lethal autonomous drones. So one arm of the US government is treating Anthropic as untrustworthy infrastructure. A sister agency is co-developing offensive cyber capabilities with Anthropic's engineers in-house. There is no unified federal posture on what frontier labs are allowed to do for, or with, the government.
The investor angle.
This lands in the middle of the IPO run-up, Kate. Prospective investors now have to price both the upside, deep US-government revenue, and the risk: geopolitical retaliation, export controls, and reputational drag if foreign governments start blocking Claude deployments. It is the single most material disclosure Anthropic did not put in the S-1.
And the defensive flip side, Marcus. Defending Code.
Open-sourced yesterday, Kate. Anthropic dropped a reference implementation of the same autonomous vulnerability-discovery pipeline it's been running internally. A seven-stage loop — build, recon, find, verify, dedupe, report, patch — using Claude agents in gVisor-sandboxed containers with strict egress allowlists. It ships with Claude Code skills like slash-threat-model, slash-vuln-scan, slash-triage, slash-patch. The fully autonomous mode fuzzes targets, reproduces crashes in fresh containers, generates exploitability reports, and proposes verified patches. Reference target is C and C-plus-plus memory-safety bugs with AddressSanitizer, but the architecture is intentionally generic.
Economics.
About ten thousand input tokens and two thousand output tokens per agent per minute, Kate. Parallelism is gated by your API rate limit — roughly ten agents per hundred-thousand-token-per-minute tier. Hacker News reaction was mixed-positive. Thomas Ptacek noted serious security teams will end up writing their own harnesses anyway — quote, shop jigs — but acknowledged that two years ago the cost of building one yourself was prohibitive. So Anthropic is trying to make the case, credibly, with working code, that the same models being used offensively can be put into the hands of every defender for free. Whether that holds up under real-world load is the open question.
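A rough check of that agent math, as a minimal sketch. It assumes the per-agent figures quoted above, and `max_agents` is a hypothetical helper, not part of the Defending Code harness. The "roughly ten agents" figure matches a limit that counts input tokens only; if output tokens count against the same limit, the number drops to eight.

```python
# Per-agent, per-minute figures as quoted in the episode.
INPUT_TOKENS_PER_AGENT = 10_000
OUTPUT_TOKENS_PER_AGENT = 2_000


def max_agents(tokens_per_minute_limit: int, count_output: bool) -> int:
    """Whole agents that fit under a tokens-per-minute limit (hypothetical helper)."""
    per_agent = INPUT_TOKENS_PER_AGENT
    if count_output:
        # Assumption: output tokens are charged against the same limit.
        per_agent += OUTPUT_TOKENS_PER_AGENT
    return tokens_per_minute_limit // per_agent


limit = 100_000  # the hundred-thousand-token-per-minute tier
print(max_agents(limit, count_output=False))  # 10
print(max_agents(limit, count_output=True))   # 8
```

How a given API tier actually meters input and output tokens is not stated in the episode, so treat both results as bounds rather than a quoted spec.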
OpenAI is calling its new memory system Dreaming, Marcus.
Shipped yesterday, Kate. Instead of confining itself to a single conversation or relying on a user-curated saved-memory list, Dreaming runs continuously in the background, synthesizing relevant context across your full chat history without explicit remember-this instructions. OpenAI describes it as, quote, a more capable and scalable system for synthesizing memory, building on a smaller experimental version from April 2025. Plus and Pro users in the US get it first. Go and Free tier rollout worldwide is promised within the coming weeks.
Other shipping today.
A sign-out-of-active-sessions security feature, a Trusted Contact suicide-safety opt-in, and Codex support for Windows Computer Use — the agent can now see, click, and type in Windows applications while debugging.
Why does the name matter, Marcus?
Memory is the wedge between a chatbot you query and an assistant that knows you, Kate. By naming it Dreaming and pushing it to the free tier, OpenAI is publicly committing to ChatGPT as a persistent, personalized agent. That raises the stakes on data-privacy questions enormously. And it tightens the lock-in moat against switchers to Claude and Gemini. Once a model knows your last eighteen months of conversations, the switching cost is no longer the prompt — it's your entire history.
404 Media has a story about Google's internal meme board, Marcus.
Memegen, Kate. 404 reported yesterday that Google employees have been posting a steady stream of memes mocking Google's own AI tools and Sundar Pichai's claim that seventy-five percent of new code at Google is now AI-generated. There's also a small editorial-relations incident worth flagging — after publication, Google's press team asked 404 Media to publish a slightly different version of its statement. The new statement quietly removed the phrase, quote, it's critical that we maintain humans in the loop.
Pushback.
The top Hacker News comment came from an ex-Googler with eighteen years at the company, Kate, and pushed back hard. Memegen is famously over-the-top by design. Nothing at Google is safe from it, not C-suite execs, not the perf process. A meme wave on Memegen is not, by itself, evidence of widespread revolt. Fair point. But pair this with Anthropic's eighty-percent-of-code stat from the lead story. Two of the biggest AI companies on Earth are publicly broadcasting AI productivity numbers. The people on the ground experience the gap between "AI shipped this PR" and "AI shipped this good PR" every day. Worth watching how the press-team edit gets discussed inside Google over the weekend.
South Korea, Marcus. This one's a canary.
Big regulatory story, Kate. South Korea became the first country with a comprehensive AI law — the AI Basic Act, in effect since January twenty-second this year. Now they're operationalizing one of the more controversial provisions. Online communities and forum operators must run every user-uploaded image and video through an AI moderation tool. Per Privacy Guides forum posters and Hacker News commenters with on-the-ground context, the rule effectively mandates buying from a small set of approved vendors. Implementation deadline is under a month away.
The technical detail that worries me.
One commenter pointed out that the reference compliance stack requires CUDA on Ubuntu 18.04, Kate. That Linux release reached the end of standard support in 2023. The stack also requires a single Quadro GPU. That raises serious doubts about whether real-time scanning is even feasible for any large forum.
Why this matters globally.
South Korea is the canary for comprehensive AI regulation, Kate. If mandated upstream scanning collapses under real-world load, or kills small communities outright, every other government watching — the EU, the UK, Brazil — gets a working case study of how prescriptive AI laws can backfire. It's also a textbook example of regulatory capture creating a forced market for a handful of approved AI-censorship vendors. The libertarian read writes itself. The harder question is whether the lesson actually transfers, or whether each jurisdiction has to relearn it the expensive way.
Marcus, the UN water report.
Sobering numbers, Kate. The UN University Institute for Water, Environment and Health projects AI data centers will consume roughly nine-point-three trillion liters of water annually by 2030. That's equivalent to the basic domestic water needs of all one-point-three billion people in sub-Saharan Africa. Land footprint over fifty-five hundred ninety square miles. Electricity consumption forecast at nine hundred forty-five terawatt-hours by 2030.
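To see what the water comparison implies per person, here is a minimal sketch using only the two figures quoted above. The function name is illustrative, and the 365-day year is an assumption.

```python
LITERS_PER_YEAR = 9.3e12  # projected annual AI data-center water use, as quoted
PEOPLE = 1.3e9            # population figure used in the comparison, as quoted
DAYS_PER_YEAR = 365       # assumption: non-leap year


def liters_per_person_per_day(total_liters: float, people: float,
                              days: int = DAYS_PER_YEAR) -> float:
    """Daily per-person share implied by an annual total."""
    return total_liters / people / days


print(round(liters_per_person_per_day(LITERS_PER_YEAR, PEOPLE), 1))  # 19.6
```

So the comparison works out to a little under twenty liters per person per day.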
The pushback on Hacker News.
Honest, Kate. Commenters noted the absolute land number is small in context. The US is about nine-point-three million square kilometers, and fifty-five hundred ninety square miles is roughly fourteen and a half thousand square kilometers, so it's about zero-point-one-six percent. And the water comparison is misleading because most of it is cycled rather than consumed. The real problem is regional concentration. Data centers cluster where power is cheap and grids have headroom, which often means stressed water tables. Microsoft, Google, and Meta have all made water-positive pledges. Those will face much harder scrutiny in 2027.
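A rough check on the land-share figure, as a minimal sketch. It takes the footprint in square miles and the US area in square kilometers as quoted in the episode, and converts units before dividing. The function name is illustrative.

```python
SQ_KM_PER_SQ_MILE = 2.589988  # standard conversion factor


def land_share_percent(footprint_sq_miles: float, country_sq_km: float) -> float:
    """Footprint as a percentage of a country's area, after unit conversion."""
    footprint_sq_km = footprint_sq_miles * SQ_KM_PER_SQ_MILE
    return 100.0 * footprint_sq_km / country_sq_km


share = land_share_percent(5_590, 9_300_000)
print(f"{share:.2f}%")  # 0.16%

# Dividing the raw numbers without converting units gives a smaller figure.
unconverted = 100.0 * 5_590 / 9_300_000
print(f"{unconverted:.2f}%")  # 0.06%
```

Either way the share is well under one percent, which is the commenters' underlying point.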
The investor angle.
Water rights are quietly becoming an investment-risk category, Kate. Expect data-center siting fights to follow the same trajectory as wind and solar siting fights did. And expect at least one large AI capex project to get stopped by a county water board this year. That's a forecast, not a fact, but it's the direction the politics moves.
Quick reference, Marcus — Microsoft's MAI-Thinking-1.
We covered it in depth Wednesday, Kate. The one new piece since then is that Microsoft has restructured its OpenAI agreement, capping revenue-share payments and ending its exclusive right to market OpenAI models. So the decoupling we flagged Wednesday is now legal as well as technical. The Microsoft-OpenAI partnership of 2024 has formally become Microsoft and OpenAI: partners in some places, competitors in others.
Big picture, Marcus.
Two narratives are crystallizing at once today, Kate. AI capability is compounding inside the frontier labs faster than the public sees. Anthropic's eighty-percent-of-code number, the fifty-two-times speedup on Mythos Preview, an engineer who hasn't written code in five months. And the geopolitical and regulatory perimeter is pushing back at the same speed. The NSA inside Anthropic, the Pentagon designating Anthropic a supply chain risk, Seoul mandating image scanning, the UN flagging water. The IPO window is open, the technology is accelerating, and the rule-makers are scrambling. The libertarian read: markets and Western institutions are competing openly. Anthropic publishing its own danger numbers is the kind of self-disclosure no Chinese lab will ever make. The uncomfortable read: when a company is simultaneously embedded in the NSA, suing the Pentagon, building offensive cyber tools, open-sourcing defensive ones, and asking for global pause coordination, the categories we use to govern technology are no longer fit for purpose. October's IPO will force the conversation into public markets. Whether the rule-makers catch up before the next training run is the open question of the summer.
That's your AI in 15 for today. See you tomorrow.