AI in 15 — May 20, 2026
Andrej Karpathy just joined Anthropic. OpenAI co-founder, Tesla Autopilot lead, the closest thing AI has to a public teacher. He's not building his own lab anymore. He's reporting to a team lead at the company that started as the safety spinoff. And he says he picked them because the next few years at the frontier will be, quote, especially formative.
Welcome to AI in 15 for Wednesday, May twentieth, 2026. I'm Kate, your host.
And I'm Marcus, your co-host.
Loaded slate, Marcus. Andrej Karpathy joins Anthropic's pre-training team. Google I/O 2026 lands with Gemini 3.5 Flash, a redesigned search box, and a new video model called Omni. GitHub gets breached and roughly thirty-eight hundred internal repos walk out the door. A CISA contractor leaks AWS GovCloud admin keys on public GitHub in what one source calls the worst leak he has ever witnessed. OpenAI adopts Google's SynthID watermarking standard. Mistral buys an Austrian physics-AI startup. And Google sunsets its open-source Gemini CLI on a thirty-day clock.
Anthropic keeps absorbing the giants.
Google rebuilds the search box.
And the agency that secures America leaves its keys on the lawn.
Lead story, Marcus. Karpathy to Anthropic. Walk me through it.
Quietest possible announcement for the loudest possible signal, Kate. Andrej Karpathy posted on X yesterday. Quote, I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. He starts this week on the pre-training team, reporting to Nick Joseph. According to Anthropic, he's building a new sub-team focused on using Claude itself to accelerate pre-training research. Models training the next models.
And his career arc going in.
As unusual as anyone in AI, Kate. OpenAI co-founder. Tesla's Autopilot director. Then most recently founder of Eureka Labs, his AI-education startup. He flagged in a YouTube interview a few weeks back that he worried about falling out of touch with how approaches were evolving, and that he'd be open to a frontier lab if one would have him. He also said yesterday he remains, quote, deeply passionate about education and plans to resume that work in time. So Eureka isn't dead, it's parked.
Why Anthropic and not, say, his old shop?
That's the whole story, Kate. He could have gone back to OpenAI. He could have gone to Google DeepMind, where his old colleague Demis Hassabis runs the show. He picked Anthropic. Hacker News commenters described the company as, quote, an industry tornado, sucking up everything in its path — and they're not wrong. We covered the Stainless acquihire yesterday. The thirty-billion run-rate revenue Monday. Talks of a nine-hundred-and-fifty-billion-dollar valuation. Now the most recognizable face in technical AI communication lands on their pre-training team. The thesis baked into his role is also load-bearing. Anthropic doesn't have Google's TPU scale or Microsoft's OpenAI cash. Research productivity per dollar is how they stay in the game. Putting Karpathy on Claude-trains-Claude is exactly that bet, made concrete.
And the broader signal.
Status follows talent, Kate. Five years ago, OpenAI was the magnet. Today, the highest-status destination in AI research is the company that started as the safety lab. That's a remarkable rotation. Pro-Western libertarian read — this is a labor market working. The best people go where the work is most interesting, the equity is most asymmetric, and the culture rewards research over PR. Anthropic earned this round.
Quick hits. Google I/O happened yesterday, Marcus. Start with the headline model.
Gemini 3.5 Flash, Kate, and Google is recasting what Flash means. They claim it beats the larger Gemini 3.1 Pro on coding and agentic benchmarks while running roughly four times faster in output tokens per second. Cited numbers — Terminal-Bench 2.1 at seventy-six-point-two percent. GDPval-AA at sixteen-fifty-six Elo. MCP Atlas at eighty-three-point-six. CharXiv Reasoning at eighty-four-point-two. Google says the model can autonomously execute coding pipelines, manage research projects, and in internal tests build an operating system from scratch with minimal human input. It's live in the Gemini app, in Search's AI Mode globally as the default, in the Gemini API, in Antigravity, and in Gemini Enterprise. The larger 3.5 Pro slipped to June.
What's the catch.
Pricing, Kate. Per million input and output tokens, Gemini 2.5 Flash was thirty cents and two-fifty. The 3.0 Flash preview was fifty cents and three dollars. Gemini 3.5 Flash is a dollar fifty in, nine dollars out. That's a three-x jump on the same product tier. Simon Willison's pelican-on-a-bicycle benchmark came out to roughly thirteen cents per render. Hacker News price-trackers flagged this as essentially unprecedented in the Flash line.
So what's actually going on.
Google is repositioning Flash from cheap chatbot to agent workhorse, Kate. Small enough to run multi-step tool-using workflows fast. Expensive enough that you can no longer treat tokens as free. Combine that with Ed Zitron's much-shared AI is too expensive piece this week, and you're watching the era of subsidized inference start to bend. What looks like product evolution is also Google nudging usage onto economics that actually work.
The search box redesign, Marcus. They're calling it the biggest change in twenty-five years.
Roughly accurate, Kate. The search box now expands dynamically as you type. It accepts text, images, files, video, and even Chrome tabs as input. Shortcuts underneath open AI Mode, a Talk button for Search Live voice, and a Create button that opens Nano Banana in Google Lens. A plus menu uploads from your gallery, camera, or file system. AI Mode globally defaults to Gemini 3.5 Flash. Rollout started yesterday in every market where AI Mode is already live.
And the strategic frame.
Nilay Patel at The Verge has been writing for years about Google Zero, Kate — the point at which Google stops sending meaningful traffic to the open web. This redesign is the most concrete step in that direction we've seen. The box is no longer designed to launch you to a third-party site. It's designed to keep you inside a Google-owned conversation. For publishers, SEO professionals, and anyone whose business runs on Google referrals, the long-feared moment just arrived in pixels. It also hands Google a UI lever against ChatGPT search behavior — meet users where they already are rather than asking them to switch apps.
Gemini Omni, Marcus. The new video model.
Google's swing at world simulation, Kate. Omni generates and edits video from any combination of images, audio, text, and existing video. Pichai pitched it as a system that can create anything from any input. Google claims it reasons across inputs jointly rather than stitching them. Voice editing of characters and backgrounds is built in. The first release, Omni Flash, is live in the Gemini app, in YouTube Shorts, and in the Flow creative studio, capped at ten-second clips. Outputs carry a SynthID watermark. A higher-end Omni Pro is planned but unscheduled.
How good is it actually.
Two things stand out, Kate. The YouTube Shorts integration is the most aggressive distribution any AI video tool has ever had. Billions of users one tap away from generative video. But Hacker News testers weren't blown away. Subtle spatial errors, geometry that drifts as objects leave and re-enter frame. Google still hasn't cracked deep spatial reasoning. The hype-versus-reality gap matters because Google is simultaneously trying to convince advertisers and creators to bet on Veo and Omni over Seedance and OpenAI's Sora 2.
GitHub got breached, Marcus. Their own house.
Overnight, Kate. GitHub confirmed unauthorized access to its internal repositories. A threat actor calling itself TeamPCP posted on the Breached forum claiming exfiltration of roughly four thousand private internal repos, with a fifty-thousand-dollar floor and a threat to dump publicly if no buyer emerges. GitHub's own update on X puts the number at about thirty-eight hundred — directionally consistent. Reports diverge on entry vector. Some outlets cite a poisoned VS Code extension on a compromised employee device. GitHub itself has not confirmed the mechanism. They say they've isolated the affected endpoint, removed the malicious extension version, rotated high-impact secrets, and so far see no evidence of impact on customer repos. TeamPCP has prior form — supply-chain attacks against Aqua Security's Trivy, LiteLLM, and Mistral's source code.
Why this lands hard on an AI show.
Two angles, Kate. First, the suspected vector — a malicious VS Code extension — is exactly the developer-tooling supply chain that AI coding assistants now live inside. Every Cursor user, every Claude Code user, every Codex and Antigravity user is one bad extension away from leaking everything in their workspace. Second, GitHub chose X as its primary disclosure channel. No blog post, no status-page entry at first. Notable shift in how major platforms communicate incidents, and developers are not uniformly happy about it.
And then there's CISA, Marcus.
The worst-leak-I've-witnessed quote isn't mine, Kate. It's Krebs's source. A contractor for the U.S. Cybersecurity and Infrastructure Security Agency — the federal agency literally tasked with telling everyone else how to secure their systems — left a public GitHub repo named Private-CISA exposing administrative credentials to three AWS GovCloud accounts plus plaintext usernames and passwords for dozens of internal CISA systems. One file was called importantAWStokens. Another was AWS-Workspace-Firefox-Passwords-dot-csv. Created November twenty-twenty-five. Sat open until researchers reported it on May fifteenth. The contractor was reportedly slow to respond. The AWS keys reportedly remained valid for about forty-eight hours after the repo came down. And a related Politico report noted the same agency had been uploading sensitive documents to ChatGPT.
And the AI angle.
Hacker News surfaced it cleanly, Kate. Developers routinely feed entire dot-env files and secrets directories into Claude, Codex, and OpenRouter through coding assistants. If those secrets become training data — or just sit in vendor logs — the blast radius from a single careless prompt rivals a public-GitHub commit. The CISA incident is the dramatic version of a quiet pattern playing out at thousands of organizations every day. The agency telling everyone else to encrypt their secrets at rest had its own keys in a public repo for six months. That's not a partisan dig. That's an institutional failure with very ordinary, very repeatable root causes.
OpenAI's watermarking news, Marcus.
OpenAI announced yesterday they're adopting Google's SynthID invisible watermark standard, Kate, in partnership with Google, alongside the existing C2PA Content Credentials manifest. Every Sora 2 video, every DALL-E 3 image, every ChatGPT-generated image now gets a C2PA manifest plus an invisible SynthID watermark. They're previewing a public verification tool that lets anyone check whether an image came from OpenAI. Puts them in the same camp as Kakao, ElevenLabs, and Nvidia. Major platforms — TikTok, YouTube, Meta, LinkedIn, Pinterest — already read these markers on upload.
So provenance is finally a standard.
Almost, Kate. Same week the announcement landed, a viral GitHub project called remove-ai-watermarks hit the front page with a CLI for stripping SynthID and other marks. Works convincingly on visible watermarks. SynthID requires SDXL re-noising that degrades the image, but the cat-and-mouse is already on. Reactions split — some call this the bright line society needs. Others call it DRM creeping into your textures. Both readings have merit. The competitive piece is that OpenAI and Google teaming up on a shared provenance standard puts real pressure on every other vendor to follow.
Mistral made an acquisition, Marcus. Tell me about Emmi.
Vienna-based startup founded in twenty-twenty-four, Kate. Builds AI models that simulate physical processes — airflow, heat transfer, material stress. Collapses engineering simulations that traditionally take hours down to seconds. Raised the largest seed round in Austrian startup history last year. About thirty researchers and engineers, including the co-founders, join Mistral's Science and Applied AI teams this month. Mistral is opening a Linz office, expanding its European footprint into Austria, Germany, and Lithuania. Notably, ASML — the Dutch lithography monopoly — is a Mistral investor and a natural customer for industrial-simulation AI.
Why does this matter.
Two distinct stories, Kate. First, Mistral is explicitly carving out an industrial-AI lane — physics-aware models for engineering and manufacturing — while the U.S. giants chase consumer chatbots and code generation. With ASML in the cap table, that's a credible European wedge. High-value, capital-intensive, harder for U.S. cloud incumbents to replicate. Second, it's a useful data point against the Europe-can't-do-AI narrative we covered Monday with Mensch's two-year vassal-state warning. Though as one HN commenter noted, without major capital and datacenter buildout, the playing field is still tilted.
And the last one, Marcus. Google sunsets Gemini CLI.
Thirty-day clock, Kate. Google announced that Gemini CLI and Gemini Code Assist IDE extensions stop serving requests on June eighteenth for AI Pro, Ultra, and free Gemini Code Assist users. Replacement is Antigravity CLI, now folded into Google Antigravity 2.0 — their new agent-first development platform. The new CLI is written in Go for faster startup, orchestrates multiple agents in the background, shares its harness with the Antigravity desktop app. Skills, Hooks, Subagents, and Extensions — now called plugins — carry over. Gemini CLI was Apache 2.0 open source. Antigravity CLI's GitHub repo currently contains only a README and a demo GIF. Enterprise customers with Gemini Code Assist Standard or Enterprise licenses are unaffected.
Developer reaction.
Weary, Kate. Hacker News thread was the predictable mix — classic Google ship-then-sunset behavior, thirty days is short, open-source replaced with an empty repo. The strategic read is sharper. Google is betting that the terminal as a single-agent shell is over and the future is multi-agent orchestration. Combined with Gemini 3.5 Flash being optimized for agent workflows, you can see Google trying to leapfrog Claude Code and OpenAI Codex by changing the abstraction layer rather than competing tool-for-tool. Whether that bet pays off depends on whether developers actually want their CLI managing multiple background agents, or whether they just want a fast, focused shell.
Big picture, Marcus.
Four through-lines today, Kate. First, the Anthropic gravity well keeps deepening. Karpathy lands on the pre-training team. The Stainless acquihire yesterday. The nine-hundred-fifty-billion-dollar talks. The company that started as the safety spinoff is now arguably the highest-status destination in frontier AI. Second, Google I/O was one strategy in four packages. Gemini 3.5 Flash, Omni, the search box redesign, Antigravity CLI — every consumer surface rebuilt around agents, and the token prices reset to match the economics. Third, two security failures point at the same soft underbelly. GitHub's own breach and CISA's GovCloud leak both highlight how AI-assisted developer tooling — VS Code extensions, ChatGPT pastes, secrets-in-prompts — is the new attack surface. The OpenAI-and-Google watermarking pact tries to harden one end of the AI content stack the same week the developer end is publicly cracking. Fourth, inference economics are finally biting. Flash going from thirty cents to a dollar fifty input is the loudest signal yet that subsidized tokens forever is ending — which ties straight back to the time-bomb essay we covered Monday. The pro-Western libertarian read, Kate, is that competition is doing exactly what it should. Anthropic earning its talent. Google forcing real pricing. Mistral carving out a European industrial lane. The harder part is institutional. The agency that tells everyone else how to secure their secrets had its keys in a public repo for six months. The platform every developer trusts can't say yet how its own repos walked out the door. Capability is racing ahead of competence. Watch the next thirty days — that's when Antigravity replaces Gemini CLI, when GitHub's post-mortem lands, and when the GovCloud forensic timeline gets published. Those three reports will tell us whether the cracks are isolated or systemic.
That's your AI in 15 for today. See you tomorrow.