AI in 15 — March 16, 2026

March 16, 2026 · 16m 07s
Kate

Jensen Huang just promised a chip that will, quote, "surprise the world." Thirty thousand people packed into the SAP Center in San Jose are about to find out if he's right.

Kate

Welcome to AI in 15 for Monday, March 16, 2026. I'm Kate, your host.

Marcus

And I'm Marcus, your co-host.

Kate

Happy Monday, Marcus. It's GTC day. Jensen Huang keynotes at eleven AM Pacific and we'll break down everything we know going in. We've also got OpenAI's GPT-5.4 claiming record benchmarks with a million-token context window. A terrifying supply chain attack is using invisible Unicode characters to hide malware in plain sight on GitHub. Donald Knuth, the godfather of computer science, just published a paper crediting Claude with solving a math problem he couldn't crack. Developers are coining a new word for the AI coding mess. And new market data shows Anthropic has quietly overtaken OpenAI in the enterprise. Let's get into it.

Kate

NVIDIA GTC 2026 opens with the Vera Rubin architecture and chips nobody's seen before.

Kate

A supply chain attack hides fully functional malware inside invisible characters in your code editor.

Kate

And an eighty-seven-year-old legend says he needs to rethink everything he believed about AI.

Kate

Marcus, as we previewed yesterday, GTC 2026 kicks off today. Thirty thousand attendees from a hundred and ninety countries. But now we have more detail on what Jensen is expected to unveil. Walk us through the Vera Rubin architecture.

Marcus

Vera Rubin is NVIDIA's successor to Blackwell, and the numbers are staggering. NVIDIA is claiming five times the inference performance and three-point-five times the training performance compared to Blackwell. Remember, Blackwell itself was considered a generational leap. If these numbers hold up in real-world deployments, we're talking about a fundamental shift in what's economically feasible for AI inference at scale.

Kate

And Jensen teased that there's more than just Vera Rubin. A chip, quote, "the world has never seen before."

Marcus

That's the wildcard. We speculated yesterday about the rumored N1 laptop processors, which would put NVIDIA in direct competition with Intel and AMD on consumer hardware. But the phrasing suggests something beyond what's leaked. NVIDIA has a pattern of sandbagging expectations and then over-delivering at GTC. Whatever this surprise chip is, it's been designed to dominate the news cycle.

Kate

There's also a notable pivot happening with CPUs taking center stage alongside GPUs.

Marcus

This is strategically important. The AI inference workload is shifting. Training is still massively GPU-bound, but inference, which is where the revenue increasingly lives, requires a more balanced architecture. If NVIDIA can own both the GPU and CPU in AI data centers, they control the full compute stack. That's a defensive move against AMD and Intel as much as an offensive one. And the Thinking Machines Lab partnership, committing to at least one gigawatt of Vera Rubin systems, tells you the hyperscalers are already lining up before the specs are even public.

Kate

We'll have full coverage of Jensen's keynote announcements tomorrow. But every AI company on the planet is watching San Jose today.

Marcus

Because whatever NVIDIA announces sets the hardware roadmap everyone else builds on. When Jensen speaks, roadmaps change.

Kate

From hardware to models. OpenAI released GPT-5.4 on March fifth in three variants. Standard, a reasoning-focused Thinking version, and a high-performance Pro version. Marcus, the benchmark numbers are eye-catching.

Marcus

Eighty-three percent on OpenAI's GDPVal test, up from seventy-point-nine for GPT-5.2. GDPVal is designed to test real knowledge work across forty-four occupations in the top nine industries contributing to US GDP. We're talking sales presentations, accounting spreadsheets, urgent care scheduling, manufacturing diagrams. An eighty-three percent score means the model performs at or above human expert level on the majority of economically valuable tasks, at least according to OpenAI's framing.

Kate

And the context window now matches what Google offers with Gemini. A million tokens.

Marcus

Which eliminates what had been a competitive gap. But the more significant development is the convergence strategy. GPT-5.4 combines coding, reasoning, and computer use capabilities into a single unified model. OpenAI previously maintained separate specialist models, the o-series for reasoning, Codex for coding. Now they're betting that one general-purpose model does it all. That's a philosophical bet as much as a technical one.

Kate

A twelve-point jump on real-world work tasks in a single generation is substantial.

Marcus

It is. Though I'd note that GDPVal is OpenAI's own benchmark, so take the framing with appropriate skepticism. Independent evaluations will tell us more. But even accounting for that, the trajectory is clear. These models are getting meaningfully better at the kind of work most people actually do for a living.

Kate

Now for a security story that should genuinely alarm anyone who writes code. Security researchers at Aikido Security uncovered a supply chain attack they're calling Glassworm. Marcus, this one is sophisticated.

Marcus

Sophisticated is almost an understatement. Between March third and ninth, at least a hundred and fifty-one GitHub repositories were compromised, along with npm packages and seventy-two malicious VS Code extensions. The attack uses Unicode Private Use Area characters that render as zero-width whitespace. They're literally invisible in every major code editor and terminal. A small decoder extracts the hidden bytes and passes them to eval, executing a full malicious payload that no human reviewer can see.
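For listeners who want a feel for how small this trick really is, here's a toy sketch of the hiding mechanism Marcus describes. This is our own simplified illustration, not the actual Glassworm encoder: we assume each payload byte is mapped onto a Private Use Area code point (U+E000 plus the byte value), producing characters that many editors and terminals render as nothing at all.

```python
# Toy sketch of byte-in-PUA hiding (illustrative only, not Glassworm's
# real encoding). Each payload byte becomes one Private Use Area char.
PUA_BASE = 0xE000

def hide(payload: bytes) -> str:
    """Encode raw bytes as invisible-looking PUA characters."""
    return "".join(chr(PUA_BASE + b) for b in payload)

def reveal(hidden: str) -> bytes:
    """Decode PUA characters back into the original bytes."""
    return bytes(ord(ch) - PUA_BASE for ch in hidden)

# An innocuous-looking comment that secretly carries extra bytes:
comment = "# bump version to 1.2.3" + hide(b"print('pwned')")
print(len(comment))  # longer than the 23 characters you can see
```

The point of the sketch is that the carrier text looks like an ordinary version-bump comment; only a byte-level inspection reveals the extra code points a decoder could feed to `eval`.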

Kate

Wait. So you could be staring at the code in your editor, reviewing every line, and the malware is right there but invisible?

Marcus

Exactly. And it gets worse. The malicious injections are disguised as legitimate version bumps and small refactors that are stylistically consistent with each target project. Researchers suspect the attackers are using LLMs to generate convincing camouflage code. So you have AI being weaponized to make the cover commits look natural. The decoded payloads can steal cryptocurrency tokens, harvest developer credentials, and fetch second-stage scripts through blockchain delivery channels on Solana.

Kate

Notable projects like Wasmer and repos with over a thousand stars were hit. GitHub claims to have Unicode warning features but they don't catch this technique.

Marcus

That's the critical gap. Traditional code review, even careful human review, is insufficient against invisible characters. This is the most sophisticated supply chain attack we've seen targeting developers. Every team using npm, GitHub, or VS Code extensions needs to audit their tooling. And the AI angle cuts both ways. AI is making the attacks harder to detect while potentially being the only tool fast enough to scan for these invisible patterns at scale.
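On the defensive side, a scan like the one Marcus alludes to can be surprisingly simple. Here's a minimal sketch; the code-point ranges below are our assumption about what to flag (the BMP Private Use Area plus common zero-width characters), not the researchers' exact detection list.

```python
# Minimal scanner for invisible or private-use characters in source text.
# The flagged ranges are an illustrative assumption, not an exhaustive list.
SUSPICIOUS_RANGES = [
    (0xE000, 0xF8FF),  # Private Use Area (BMP)
    (0x200B, 0x200F),  # zero-width space, joiners, direction marks
    (0x2060, 0x2064),  # word joiner and invisible operators
    (0xFEFF, 0xFEFF),  # zero-width no-break space (stray BOM)
]

def find_invisible(text: str):
    """Return (line, column, codepoint) tuples for suspicious characters."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            cp = ord(ch)
            if any(lo <= cp <= hi for lo, hi in SUSPICIOUS_RANGES):
                hits.append((lineno, col, f"U+{cp:04X}"))
    return hits

# A line that looks like "x = 1 + 2" but hides a zero-width space:
print(find_invisible("x = 1\u200b + 2"))  # [(1, 6, 'U+200B')]
```

Wired into a pre-commit hook or CI step, a check like this catches exactly the class of payload that human review cannot see.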

Kate

OK, I need a palate cleanser after that. And this next story is perfect. Donald Knuth, eighty-seven years old, the man who literally wrote the book on algorithms, just published a paper called "Claude's Cycles."

Marcus

And he opened it with "Shock! Shock!" which, if you know Knuth's measured academic style, is the equivalent of anyone else flipping a table. The problem involved partitioning directed graphs into Hamiltonian cycles for all odd cases. Knuth had been stuck on it for weeks. His friend fed the exact problem to Claude Opus, and in about an hour, across thirty-one systematic attempts, Claude found a construction that works for all odd-numbered cases.

Kate

Thirty-one tries. It hit dead ends, changed strategies, tried different approaches.

Marcus

And that's what makes it remarkable. This wasn't pattern matching or retrieval. Claude tried brute-force searches, invented what it called "serpentine patterns," abandoned them when they didn't work, and eventually converged on a solution. Knuth then wrote the rigorous mathematical proof himself. The AI found the answer but couldn't prove it was correct. So there's a beautiful complementarity there. AI creativity plus human rigor.

Kate

Knuth said he would have to revise his opinions about generative AI. When the father of computer science says that, it carries weight.

Marcus

More weight than any benchmark score. This isn't about generating boilerplate or passing standardized tests. An LLM solved an open mathematical problem through creative exploration. It suggests these models may be developing genuine mathematical reasoning capabilities. And the fact that it took thirty-one tries and hit dead ends honestly makes it more impressive. That's what systematic problem-solving looks like.

Kate

From AI at its most impressive to AI at its most frustrating. Three separate front-page Hacker News posts over the weekend captured growing developer exhaustion with AI coding tools. And there's a new word for it, Marcus. Sloppypasta.

Marcus

The term comes from a manifesto at stopsloppypasta.ai, and it names the practice of pasting raw, unverified LLM output at colleagues. Research shows forty percent of US employees have received what they call "workslop" in the past month. Recipients spend nearly two hours addressing each incident, costing organizations an estimated hundred and eighty-six dollars per employee monthly. Half of recipients view senders as less trustworthy afterward.

Kate

And the developer-specific complaints go deeper. An Anthropic-commissioned study found AI coding assistance actually reduces developer skill mastery by seventeen percent.

Marcus

That's the number that got the most attention. Alongside that, the "LLMs Can Be Exhausting" post described a new kind of mental fatigue where normal coding forces you to slow your brain down, but AI assistance makes the human mind the bottleneck. You're constantly context-switching between generating, reviewing, debugging, and verifying. Simon Willison published a guide called "Agentic Engineering Patterns" trying to codify what actually works versus what doesn't. He draws a clear line between professional agentic engineering and what people are calling vibe coding.

Kate

With eighty-four percent of developers now using AI coding tools, the honeymoon is definitely over.

Marcus

The picture is nuanced. Senior engineers who understand architecture report working two to three times faster. Junior developers who rely on AI for code they don't understand are building on sand. And sloppypasta might be worse than bad code because it's poisoning workplace trust. When you can't tell if your colleague actually thought about what they sent you or just dumped ChatGPT output into Slack, collaboration breaks down.

Kate

New market data just dropped showing a dramatic shift in enterprise AI. Anthropic now holds thirty-two percent of enterprise large language model market share, overtaking OpenAI at twenty-five percent.

Marcus

That's a stunning reversal from two years ago when OpenAI had fifty percent and Anthropic had twelve. And in coding specifically, Anthropic has forty-two percent of enterprise workloads, more than double OpenAI's twenty-one. Among companies purchasing AI services for the first time, Anthropic wins about seventy percent of head-to-head matchups. Claude Code has been described as having its "ChatGPT moment" for enterprise adoption.

Kate

But ChatGPT still dominates consumer with nine hundred million weekly users.

Marcus

Right, and this is the industry splitting into two distinct markets. OpenAI owns consumer. Anthropic is capturing the higher-value enterprise segment where revenue per user is dramatically higher. Anthropic generates roughly two hundred and eleven dollars per monthly user compared to OpenAI's twenty-five dollars per weekly user. Different strategies, different markets. And meanwhile, OpenClaw just became the most-starred repository in GitHub history at over three hundred thousand stars, surpassing React and Linux. Its creator joined OpenAI and the project transitioned to an independent foundation. The agentic era isn't coming. It's here.

Kate

Monday big picture, Marcus. Jensen Huang is about to take the stage. GPT-5.4 is pushing benchmarks higher. Claude is solving open math problems. But invisible malware is hiding in our code and developers are coining words for the mess AI is creating. What's the thread?

Marcus

Capability and chaos are scaling together. The technology is genuinely extraordinary. A model that solves problems Knuth couldn't. Chips promising five-x inference gains. Enterprise adoption reshaping the industry. But the attack surface is growing just as fast. Invisible Unicode malware, sloppypasta eroding workplace trust, skill atrophy among the developers who are supposed to be building with these tools. The question isn't whether AI is powerful. That's settled. The question is whether we can build the human systems, the security practices, the engineering discipline, the institutional judgment, fast enough to match the pace of capability.

Kate

Knuth needed thirty-one tries to solve his problem with Claude. Maybe we need to give ourselves that same patience.

Marcus

That's a good way to put it. The AI hit dead ends, changed strategies, and eventually found the answer. We might need to do the same with how we deploy and govern these systems. The technology is ahead of the institutions. GTC will push the technology further. The harder work is everything else.

Kate

That's your AI in 15 for Monday, March 16, 2026. We'll have full GTC keynote coverage tomorrow. See you then.