
AI in 15 — April 17, 2026

April 17, 2026 · 15m 20s
Kate

Anthropic just released its most powerful commercial model ever, and then immediately told you there's an even better one they won't let you touch. Welcome to the AI confidence game.

Kate

Welcome to AI in 15 for Friday, April 17, 2026. I'm Kate, your host.

Marcus

And I'm Marcus, your co-host.

Kate

Happy Friday, Marcus. We've got a packed show to close out the week. Anthropic drops Claude Opus 4.7 with some impressive gains and some surprising regressions. OpenAI turns Codex into an entire operating system for work. Alibaba's Qwen team releases a model that runs on your laptop. OpenAI unveils its first purpose-built AI for drug discovery. A developer gets hit with a fifty-four thousand euro bill overnight from an exposed API key. And the SDL game library bans AI-generated code entirely. Let's get into it.

Kate

Claude Opus 4.7 ships with a coding boost but a long-context problem nobody expected.

Kate

OpenAI wants Codex to control every app on your computer.

Kate

And a single exposed Firebase key costs a developer fifty-four thousand euros in thirteen hours.

Kate

Marcus, Anthropic released Claude Opus 4.7 on Wednesday. The headline numbers on coding are strong. What stands out to you?

Marcus

Thirteen percent improvement on a ninety-three-task coding benchmark over Opus 4.6, and three times more production tasks resolved. For developers, that's meaningful. Vision capabilities tripled, now accepting images up to three-point-seven-five megapixels. They added a new tokenizer, a new "xhigh" effort level for deeper reasoning, and a public beta feature called task budgets that lets developers guide token allocation. Pricing stays the same at five and twenty-five dollars per million tokens.
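To put those unchanged rates in concrete terms, here is a quick back-of-the-envelope cost sketch. The five-and-twenty-five-dollars-per-million-token figures come from the episode; the token counts in the example are purely illustrative:

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_rate: float = 5.0, out_rate: float = 25.0) -> float:
    """Cost of one request, given per-million-token rates in USD."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Illustrative: a 40k-token prompt producing 4k tokens of output costs
# 40_000/1e6 * 5 + 4_000/1e6 * 25 = 0.20 + 0.10 = 0.30 USD.
print(request_cost_usd(40_000, 4_000))
```

At those rates, output tokens dominate the bill five to one, which is why a feature like task budgets, guiding how many tokens the model spends, matters for cost control.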

Kate

That all sounds great. But there's a catch.

Marcus

A significant one. Long-context retrieval accuracy dropped from ninety-one-point-nine percent on Opus 4.6 to fifty-nine-point-two percent on 4.7. That is a massive regression, Kate. If you're building applications that depend on the model finding information across long documents, this is a real problem. Anthropic says they made a deliberate trade-off, sacrificing retrieval for better coding and math performance.

Kate

And the system card revealed something else. A training bug.

Marcus

A chain-of-thought supervision error affected seven-point-eight percent of training episodes. The same bug that plagued Mythos Preview. Anthropic is being transparent about it, which I appreciate, but it raises questions about quality control in training pipelines at the frontier.

Kate

The elephant in the room is Mythos. As we covered Tuesday, Anthropic is withholding that model on security grounds. Now they're openly saying Opus 4.7 is less broadly capable than Mythos.

Marcus

It's an unusual position. Release a model and simultaneously tell the world it's not your best. Anthropic frames it as responsible deployment. They're using Opus 4.7 to test new cybersecurity safeguards before potentially releasing Mythos. Hacker News was split. Some see marginal improvement over 4.6 with higher token consumption. Others appreciate the honesty about trade-offs. But the competitive picture is clear. This is Anthropic trying to reclaim the frontier crown from GPT-5.4 and Gemini while keeping its most powerful weapon locked away.

Kate

So developers need to actually test whether the long-context regression affects their use case before upgrading.

Marcus

Absolutely. Don't just read the headline benchmarks. If your application relies on retrieving specific information from long documents, stick with 4.6 until Anthropic addresses this. For pure coding and software engineering workflows, 4.7 is a genuine step forward.

Kate

Speaking of steps forward, OpenAI made a massive move with Codex this week. They're calling it Codex for almost everything. Marcus, what changed?

Marcus

Everything about the product's scope. Codex was a developer coding tool. Now it's a general-purpose work platform. The headline feature is Computer Use. Codex can access and operate other apps on your Mac, pull information, take actions, and critically, it works while you continue using the computer for other things. It also gets a built-in browser and image generation through gpt-image-1.5.

Kate

A hundred and eleven new plugins too.

Marcus

Skills, app integrations, and Model Context Protocol connections. Plus a memory system so Codex can recall context from previous tasks. OpenAI's chief product officer called it a fundamental rearchitecting of what AI can do at work. The timing is notable. This dropped the same day as Opus 4.7. Multiple Hacker News commenters suspected OpenAI had this ready to ship specifically as a counter-move.

Kate

So OpenAI and Anthropic are now competing head-to-head for the same user.

Marcus

Directly. Codex versus Claude Desktop and Cowork. The feature sets are converging rapidly. OpenAI's bet is on ecosystem breadth with those hundred and eleven plugins. Anthropic's bet is on model quality. As we covered Wednesday, this is the AI operating system layer. Whoever wins developer loyalty here could own the next decade of productivity software.

Kate

From proprietary battles to open source. Alibaba's Qwen team dropped Qwen 3.6-35B-A3B on Hugging Face. Marcus, the architecture here is interesting.

Marcus

It's a sparse Mixture-of-Experts model. Thirty-five billion total parameters but only three billion active during inference. That makes it dramatically efficient. Apache 2.0 license, so anyone can use it commercially. Two hundred and sixty-two thousand token native context window, extendable to about a million. It handles text, images, and spatial intelligence natively.

Kate

Simon Willison posted that it drew a better pelican riding a bicycle than Opus 4.7. That went viral.

Marcus

Three hundred and sixty-three points on Hacker News. Fun benchmark, but let's be honest about what this model is and isn't. On serious engineering tasks, it solved eleven out of ninety-eight problems versus Opus 4.6's ninety-five out of ninety-eight. Very different leagues. But that's not the point. The point is three billion active parameters means this runs on a laptop. For enterprises in banking, healthcare, defense, anywhere you can't send data to external APIs, that matters enormously.
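Marcus's "runs on a laptop" point follows from some rough memory arithmetic. The parameter counts are from the episode; the quantization byte-widths are standard, and the numbers are approximations that ignore KV cache and runtime overhead:

```python
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_gib(params_billion: float, dtype: str) -> float:
    """Approximate weight memory in GiB (weights only, no KV cache)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 2**30

TOTAL_B, ACTIVE_B = 35, 3  # Qwen 3.6-35B-A3B: 35B total, 3B active per token

# All experts must stay resident in memory, so storage tracks the 35B figure...
print(round(weight_gib(TOTAL_B, "fp16"), 1))   # ~65 GiB: too big for most laptops
print(round(weight_gib(TOTAL_B, "int4"), 1))   # ~16 GiB: fits in 32 GB of RAM
# ...but each token's forward pass only touches ~3B parameters' worth of
# weights, which is why inference stays fast despite the 35B total.
print(round(weight_gib(ACTIVE_B, "fp16"), 1))  # ~5.6 GiB of compute-touched weights
```

The sparse-MoE trade, in other words: you pay for the full model in storage but only for the active experts in compute per token.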

Kate

And the Apache 2.0 license removes all commercial barriers.

Marcus

No restrictions whatsoever. And I'll note, some Hacker News commenters expressed relief that Qwen is still publishing open weights despite reported management upheaval at Alibaba. The open-source AI ecosystem needs these contributions, wherever they come from. Though I'd always recommend thorough security audits before deploying any model in production, especially one you're running on your own infrastructure.

Kate

OpenAI also unveiled something completely new this week. GPT-Rosalind, their first AI model built specifically for life sciences. Named after Rosalind Franklin, the chemist behind the DNA crystallography work.

Marcus

It's a frontier reasoning system fine-tuned for genomics, protein engineering, and chemistry. It can synthesize scientific evidence, generate biological hypotheses, plan experiments, query specialized databases, and parse recent literature. Launch partners include Amgen, Moderna, the Allen Institute, and Thermo Fisher Scientific.

Kate

But this isn't open access.

Marcus

Restricted research preview. US enterprise customers only. Organizations must undergo qualification and safety review proving legitimate research with clear public benefit. This positions GPT-Rosalind directly against Google DeepMind's AlphaFold and Isomorphic Labs work. One life sciences professional on Hacker News was cautious, saying it'll be a long time before anyone trusts a generative model to do actual science when mathematically provable models are as good as they are.

Kate

Fair skepticism. But if it even accelerates the hypothesis generation phase, that's valuable.

Marcus

Drug discovery timelines measured in years could compress to months for certain stages. And the restricted-access model could become a template for deploying frontier AI in high-stakes scientific domains. Not everyone gets the keys. You have to prove you're doing real work.

Kate

Now for a story that every developer building with AI APIs needs to hear. A developer posted on Google's forum that they were hit with a fifty-four thousand euro bill in thirteen hours. Marcus, what happened?

Marcus

They enabled Firebase AI Logic on an existing Firebase project. The project had an unrestricted browser API key originally created for Firebase Authentication over a year ago. Overnight, unauthorized actors found that key and made automated Gemini API calls. Budget alerts were set at eighty euros but triggered hours late. By the time the team noticed, costs were at twenty-eight thousand euros. Final bill, over fifty-four thousand.

Kate

And Google's response?

Marcus

Google Cloud support classified the charges as valid usage because the traffic came from the developer's own project. Billing adjustment denied. Logan Kilpatrick from Google eventually stepped in with guidance. Tier one users now have a two-hundred-fifty-dollar monthly cap. Google is rolling out project-level spend caps and plans to disable unrestricted API keys for Gemini by default. But that doesn't help this developer.

Kate

Three hundred and eighty-two points on Hacker News. This clearly struck a nerve.

Marcus

Developers shared similar horror stories. Someone mentioned a separate billing bug seven months ago that left developers facing seventy-thousand-dollar charges. The fundamental issue is that Firebase keys were never treated as secrets. They're designed to be embedded in client-side code. But connecting them to expensive AI inference endpoints changes everything. A key that cost you nothing when it only accessed authentication now becomes a direct pipeline to your credit card through Gemini. Every developer needs to audit their API key restrictions today. Not tomorrow. Today.
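Marcus's "audit your keys today" advice translates into a few concrete checks. One possible sketch using the Google Cloud CLI: the `KEY_ID` and project name are placeholders, and gcloud flags evolve, so treat this as a starting point rather than a recipe:

```shell
# List the API keys in the project so you can review each one's restrictions.
gcloud services api-keys list --project=my-project

# Inspect what a specific key is currently allowed to call.
gcloud services api-keys describe KEY_ID --project=my-project

# Restrict the key so it can ONLY reach the service it was created for
# (here: Firebase Authentication, backed by the Identity Toolkit API),
# instead of every API enabled in the project.
gcloud services api-keys update KEY_ID \
    --project=my-project \
    --api-target=service=identitytoolkit.googleapis.com
```

An unrestricted key like the one in this story would have shown no API targets in the `describe` output; scoping it to its original service is exactly what would have kept it from reaching the Gemini endpoints.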

Kate

Last story. SDL, the Simple DirectMedia Layer library used by thousands of games and part of the Steam Runtime, has formally banned AI-generated code contributions. Marcus, what's the reasoning?

Marcus

Three concerns. First, AI-generated code may be based on sources with unknown licensing, potentially introducing conflicts with SDL's permissive Zlib license. Second, AI tools frequently hallucinate issues that don't exist. Third, AI-written patches shift review burden onto maintainers without proportional value. The policy is nuanced though. AI can still be used to identify issues, but the actual fixes must be human-authored.

Kate

The enforceability question is obvious.

Marcus

Completely. As AI-assisted coding becomes universal, the line between AI-written and human-written with AI help gets impossibly blurry. Hacker News was polarized. Some called it incredibly dumb and unenforceable. Others supported it, citing the flood of low-quality AI-generated pull requests drowning open-source maintainers. One person suggested an Organic Software seal of approval.

Kate

The licensing concern feels most legitimate to me.

Marcus

Agreed. If training data included GPL code, it could contaminate permissively-licensed projects. That's not theoretical. It's a real legal risk. And as agentic coding tools get more capable, more projects will face this question. SDL won't be the last high-profile project to draw a line.

Kate

Friday big picture, Marcus. Anthropic releases a model but says its better one is too dangerous. OpenAI wants to be your AI operating system. Open-source models run on laptops. Drug discovery gets its own AI. Developers face five-figure bills from exposed keys. And a foundational game library says no to AI code entirely. What's the thread?

Marcus

Trust and control, Kate. Every story this week is about who controls what. Anthropic controls which models you can access. OpenAI wants to control your entire workflow. Alibaba gives control back to developers with open weights. Google's billing systems failed to give developers control over their own spending. And SDL is asserting control over what code enters their project. The AI industry is past the phase where capability alone matters. Now it's about governance, guardrails, and who holds the keys.

Kate

And the answers are wildly different depending on who you ask.

Marcus

That's what makes this moment so interesting. There is no consensus. Open versus closed. Restricted versus democratized. Human-only versus AI-assisted. Every major player is making different bets. And the stakes keep rising. Have a good weekend, Kate.

Kate

You too, Marcus.

Kate

That's your AI in 15 for Friday, April 17, 2026. Have a great weekend, everyone. We'll see you Monday.