AI in 15 — March 19, 2026
Eight hundred and forty billion dollars. That's OpenAI's valuation as it barrels toward an IPO. But one of tech's sharpest commentators says the company is juicing growth the same way Facebook did. And not in a good way.
Welcome to AI in 15 for Thursday, March 19, 2026. I'm Kate, your host.
And I'm Marcus, your co-host.
Marcus, packed show today. OpenAI's IPO strategy is drawing fire from all sides. A major Snowflake AI security flaw shows prompt injection is a real-world threat, not just a theory. Google DeepMind wants to define how we measure AGI, and they're crowdsourcing the answer. Apple's Gemini-powered Siri overhaul is finally arriving this month. And a viral essay is comparing AI coding to gambling. Let's get into it.
OpenAI pivots hard to IPO mode, but critics say ChatGPT is becoming Facebook in a trench coat.
Snowflake's AI coding tool gets prompt-injected into executing malicious code on users' machines.
And Google DeepMind launches a two hundred thousand dollar hackathon to figure out how to actually measure AGI.
So Marcus, Om Malik published a widely shared essay this week essentially accusing OpenAI of turning ChatGPT into a dopamine machine. His argument is that under Fidji Simo, who ran the Facebook app before joining OpenAI, the company is optimizing for engagement over substance. What's your read?
Malik's critique is pointed and worth taking seriously. He calls OpenAI an "eight hundred and forty billion dollar company running several unrelated experiments." The Atlas browser, hardware ventures, a TikTok-style content feature. And at the center of it, ChatGPT has become, in his words, a sycophant that generates options and follow-ups specifically designed to keep you engaged. That's the Facebook playbook. Create a feedback loop that feels productive but is actually just sticky.
And this is all happening as OpenAI pushes toward an IPO, reportedly targeting Q4 this year.
Which is the context that makes it make sense. IPO investors want growth metrics. Monthly active users, engagement time, conversion from free to paid. The fastest way to boost those numbers is exactly what Malik describes. Make the product addictive rather than useful. Nine hundred million weekly active users is a staggering number. But only ten billion of OpenAI's twenty-five billion in annual recurring revenue comes from enterprise. The rest is consumer. And consumer revenue driven by engagement optimization is a very different story than enterprise revenue driven by productivity gains.
The Hacker News discussion was brutal. People saying ChatGPT has "LinkedIn lunatic energy" now.
And comparing it unfavorably to Claude, which is ironic given the market dynamics. Because here's the contrast that investors will be studying. Anthropic just hit nineteen billion in annualized revenue, up from nine billion at year-end 2025, extending a run of roughly ten-x year-over-year growth now in its third year. And Claude Code alone is generating two and a half billion in annualized revenue. That's focused, developer-driven growth versus ChatGPT's sprawling consumer super-app approach. Three AI companies are racing toward IPOs in a limited window: OpenAI, Anthropic, and xAI. How the market distinguishes between engagement-driven growth and productivity-driven growth will determine which of them commands the highest multiples.
So the question is whether Wall Street rewards the Facebook playbook or the enterprise playbook.
Exactly. And history suggests Wall Street loves engagement metrics right up until it doesn't. Facebook learned that lesson. The question is whether OpenAI learns it before going public or after.
From business strategy to security. This next story is genuinely alarming. Security researchers at Prompt Armor disclosed that Snowflake's Cortex Code AI tool could be manipulated through prompt injection to escape its sandbox and execute arbitrary code on a user's machine. Marcus, walk us through what happened.
This was discovered just three days after Cortex Code launched in February. Through prompt injection, an attacker could manipulate the AI into executing shell commands outside its intended sandbox. Once you have code execution, the payload could access cached authentication tokens, run SQL queries with the victim's privileges, and potentially exfiltrate or destroy data. Snowflake patched it on February twenty-eighth, but during testing the attack succeeded roughly half the time. That's not a theoretical concern. That's a coin flip away from full system compromise.
Fifty percent success rate. So every other attempt, the attacker gets in.
And that highlights the fundamental challenge with AI security. These systems are non-deterministic. You can't write a traditional firewall rule against prompt injection because the attack surface changes with every inference call. The Hacker News discussion was scathing. Several commenters questioned whether Snowflake had implemented a real sandbox at all. One said, "If the user has access to a lever that enables access, that lever is not providing a sandbox. Poor security design all around."
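For listeners checking the show notes, here's a minimal sketch of that design point. This isn't Snowflake's code, and every name in it is invented, but it shows the difference between filtering what the model says and enforcing a deny-by-default allowlist at the execution boundary, where no amount of prompt injection can reach.

```python
# A minimal sketch for the show notes -- not Snowflake's actual design.
# The allowlist contents and function names are invented. The point:
# enforcement lives at the execution boundary, outside anything the
# model can talk its way around.

import shlex
import subprocess

# Deny by default: only these binaries may ever run, no matter what
# the model was prompt-injected into requesting.
ALLOWED_BINARIES = {"ls", "cat", "grep"}

def run_model_command(command: str) -> str:
    """Run a shell command requested by the AI assistant, but only if
    its binary is explicitly allowlisted."""
    parts = shlex.split(command)
    if not parts:
        raise PermissionError("blocked: empty command")
    if parts[0] not in ALLOWED_BINARIES:
        raise PermissionError(f"blocked: {parts[0]} is not allowlisted")
    # No shell=True, so injected metacharacters (;, &&, |) stay inert.
    result = subprocess.run(parts, capture_output=True, text=True, timeout=10)
    return result.stdout
```

If the model can reach a lever that bypasses that check, the check isn't a sandbox. That's exactly the commenter's complaint.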
We covered the Glassworm supply chain attack on Monday, invisible Unicode malware hiding in GitHub repos. Now we have prompt injection escaping sandboxes. The AI security picture is getting worse, not better.
It's the attack surface expanding faster than the defenses. As AI coding tools proliferate, Cortex Code, Claude Code, GitHub Copilot, Cursor, every one of them represents a potential entry point. And Snowflake's response raises its own questions. Their security advisory requires an account just to read it. That's not the transparency the industry needs when we're dealing with novel attack vectors that affect every AI tool builder.
Let's shift to something more constructive. Google DeepMind published a new paper proposing a framework for measuring progress toward AGI. And they're putting money behind it. A two hundred thousand dollar Kaggle hackathon.
The framework identifies ten cognitive abilities they consider essential for general intelligence. Perception, generation, attention, learning, memory, reasoning, metacognition, executive functions, problem solving, and social cognition. The idea is to run AI models and humans through identical benchmarks and generate a cognitive profile mapping strengths and weaknesses empirically.
So instead of every lab claiming they're approaching AGI based on their own benchmarks, DeepMind wants a universal yardstick.
That's the pitch. And the hackathon focuses on the five areas where evaluation gaps are largest. Learning, metacognition, attention, executive functions, and social cognition. Top submissions win ten thousand dollars each, with four grand prizes of twenty-five thousand. Submissions close April sixteenth, results June first.
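For the show notes, here's a rough sketch of what a cognitive profile comparison could look like under that framework. The ten ability names come from the paper; the scoring scheme, the normalization, and the numbers are invented for illustration.

```python
# Hypothetical sketch of a "cognitive profile" comparison. The ten
# ability names come from DeepMind's paper; the scores and the
# normalization scheme are invented for illustration.

ABILITIES = [
    "perception", "generation", "attention", "learning", "memory",
    "reasoning", "metacognition", "executive_functions",
    "problem_solving", "social_cognition",
]

def cognitive_profile(model_scores: dict, human_scores: dict) -> dict:
    """Express each ability as a ratio to the human baseline on the
    same benchmark, so strengths and gaps are directly comparable."""
    return {a: round(model_scores[a] / human_scores[a], 2) for a in ABILITIES}

# Toy numbers: a model that reasons above the human baseline but lags
# badly on social cognition.
model = {a: 80.0 for a in ABILITIES}
model.update(reasoning=95.0, social_cognition=40.0)
human = {a: 75.0 for a in ABILITIES}
print(cognitive_profile(model, human))
# ... 'reasoning': 1.27 ... 'social_cognition': 0.53 ...
```

The value of a shared yardstick like this is that "approaching AGI" stops being a press-release claim and becomes a profile you can inspect, gap by gap.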
The inclusion of social cognition is interesting. That's not what most people think of when they hear AGI.
It's a broader definition than the industry typically uses. Most companies define AGI as "beats humans at reasoning and coding." DeepMind is saying that's not enough. True general intelligence includes understanding social dynamics, emotional context, and theory of mind. Whether you agree with that definition or not, whoever sets the benchmarks shapes the race. And if this framework gains adoption, it could fundamentally change how we measure and compare AI systems.
Some Hacker News commenters found it ironic that Google's approach to evaluating AGI is to crowdsource the work to a Kaggle competition.
Fair point. But there's a pragmatic logic to it. These evaluation gaps exist precisely because the research community hasn't figured out how to test them. Paying two hundred thousand dollars for the crowd's best ideas is arguably more honest than claiming you've solved the measurement problem in-house.
Apple news. The Gemini-powered Siri overhaul is targeted for release this month with iOS 26.4. Marcus, this has been a long time coming.
Years. And the big strategic story here isn't the features, it's the partnership. Apple is paying roughly one billion dollars annually for access to Google's Gemini model, a one-point-two trillion parameter system running on Apple's Private Cloud Compute servers. That's Apple acknowledging it can't build a frontier model in-house and choosing to license one from a competitor rather than ship an inferior product.
The new Siri will have on-screen context awareness. Reading what's on your display and acting on it.
Making restaurant reservations from Safari, adding flights from email confirmations, that kind of thing. It's what Siri should have been years ago. But reports from 9to5Mac say some features are already slipping to iOS 26.5 in May and iOS 27 in September. Which continues Apple's pattern of announcing AI features and then delivering them on a delayed timeline.
So Apple chose Google over building its own model. That validates Google's AI capabilities even as Apple admits its own gap.
It's a fascinating strategic triangle. Apple gets a capable AI assistant without the multi-billion dollar R&D investment. Google gets a billion dollars a year and validation from the world's most valuable company. And users, hopefully, finally get a Siri that doesn't embarrass itself when you ask it something more complex than setting a timer.
Quick hit. Google engineers open-sourced Sashiko, an AI system that reviews Linux kernel patches. The headline number: it caught fifty-three percent of bugs when tested against a thousand recent upstream issues. And every single one of those bugs had already passed human code review.
That's a meaningful result for a project that underpins virtually all servers, cloud infrastructure, and Android devices. Sashiko monitors public mailing lists, ingests patches, and generates detailed reviews covering architecture, security, concurrency, and resource management. It's designed for Gemini Pro but works with Claude and other models. Google is moving it to the Linux Foundation.
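The pipeline reduces to a simple loop, sketched here for the show notes. The function names and the prompt are hypothetical, not Sashiko's actual interface, and the real system layers much more on top.

```python
# Hypothetical sketch of the monitor-ingest-review loop described above.
# Function names and the prompt are invented, not Sashiko's actual
# interface; the real pipeline is considerably more involved.

from typing import Callable

REVIEW_ASPECTS = ("architecture", "security", "concurrency",
                  "resource management")

def review_patch(patch_text: str, complete: Callable[[str], str]) -> str:
    """Ask a language model for a structured review of one kernel patch.
    `complete` is any prompt-in, text-out callable, which is why the
    same loop can target Gemini Pro, Claude, or another model."""
    prompt = (
        "Review the following Linux kernel patch. For each aspect "
        f"({', '.join(REVIEW_ASPECTS)}), list concrete issues with "
        "file and line references, or state that you found none.\n\n"
        + patch_text
    )
    return complete(prompt)
```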
The false positive rate is the obvious question.
It is. As one commenter put it, you could build a system that flags everything as a bug and claim a hundred percent detection rate. But even with a moderate false positive rate, catching half the bugs that slip past experienced kernel developers is a genuine contribution to infrastructure security.
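Here's that point as arithmetic, for the show notes. The bug counts below are illustrative, not from Google's evaluation.

```python
# The commenter's point, as arithmetic. All counts below are
# illustrative, not from Google's evaluation.

def precision_recall(true_pos: int, false_pos: int, false_neg: int):
    precision = true_pos / (true_pos + false_pos)  # flagged items that are real bugs
    recall = true_pos / (true_pos + false_neg)     # real bugs that get flagged
    return precision, recall

# A "flag everything" reviewer on 1,000 patches with 100 real bugs:
# all 100 caught, but 900 clean patches flagged too.
print(precision_recall(100, 900, 0))   # ~(0.10, 1.00): perfect recall, useless precision

# A reviewer that catches 53 of the 100 bugs with, say, 50 false flags:
print(precision_recall(53, 50, 47))    # ~(0.51, 0.53): far more usable
```

Recall alone is a headline number. Precision is what decides whether kernel maintainers actually keep the tool in the loop.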
And finally, a viral essay titled "AI Coding Is Gambling" hit the top of Hacker News with over three hundred points. The argument is that prompting LLMs to write code mirrors gambling psychology. Variable rewards, addictive feedback loops, a false sense of control.
The dopamine hit of getting working code on the first try keeps you pulling the lever rather than deeply understanding what you're building. It connects directly to the Carnegie Mellon study we covered Tuesday, which found that AI coding tools deliver a temporary speed boost but a lasting forty-two percent degradation in code quality. And one commenter adapted Kenny Rogers for the AI age: "You got to know when to ship it, know when to re-prompt, know when to clear the context, and know when to RLHF."
That might be the best comment of the week.
What's worth noting is the scale of the industry being questioned here. Claude Code at two and a half billion ARR, Cursor growing rapidly, Codex expanding. These tools are generating enormous revenue. The question of whether they create gambling-like behavioral patterns isn't just philosophical. It affects code quality, developer wellbeing, and how we train the next generation of engineers.
Thursday big picture. OpenAI is chasing engagement metrics toward an IPO. Prompt injection is escaping sandboxes. DeepMind wants to standardize how we measure intelligence. And developers are asking whether AI coding tools are genuinely productive or psychologically addictive. Marcus, what ties it all together?
Measurement and honesty. OpenAI measuring success by engagement rather than productivity. Snowflake measuring security by sandbox labels rather than actual isolation. The industry measuring AI progress by cherry-picked benchmarks rather than standardized cognitive profiles. DeepMind's framework is an attempt to inject rigor into a conversation drowning in hype. And the gambling essay asks whether developers are measuring their own productivity honestly or chasing the dopamine of quick results. The technology keeps getting more powerful. The question is whether we're getting more honest about what it actually does.
Measure twice, prompt once.
I'd settle for measure once. Right now we're not even doing that.
That's your AI in 15 for Thursday, March 19, 2026. See you tomorrow.