AI in 15 — July 02, 2026

July 2, 2026 · 12m 25s
Kate

Anthropic just quietly walked into the business of curing diseases that big pharma won't touch. The neglected ones — the diseases with no profit in them. And here's the twist that ties the whole week together: it's doing it without shipping a single new model to get there.

Kate

Welcome to AI in 15 for Thursday, July second, 2026. I'm Kate, your host.

Marcus

And I'm Marcus, your co-host.

Kate

After a week where Anthropic couldn't stop making headlines about what it wouldn't ship, Marcus, today it's about what it will — a science workbench, a drug-discovery program, and a brutal new biology test that embarrasses every model alive. Plus almost-free pictures and a chip nobody saw coming. Lead first — Claude Science.

Kate

Then — Google drives the cost of a generated image basically to zero.

Kate

A three-year-old startup called Etched hits a five-billion-dollar valuation aiming straight at Nvidia.

Kate

And OpenAI builds a biology benchmark that stumps the entire field — with a quick update on that Claude Code fingerprinting flap.

Kate

Lead story, Marcus. Claude Science. What did Anthropic actually launch?

Marcus

A workbench, Kate — a desktop app for macOS and Linux, in beta for Pro, Max, Team, and Enterprise users. The idea is to take a researcher's scattered pile of tools and fold them all into one place. It ships with more than sixty curated skills and connectors — genomics, proteomics, cheminformatics, the works. It'll natively render a 3D protein structure or a genome track right there in the app. And the piece I find most interesting is a reviewer agent that checks the citations and the math before you trust the output.

Kate

A model that fact-checks itself before it hands you the answer.

Marcus

Which, after everything we've covered about models fabricating results, is exactly the right instinct, Kate. Early users are reporting real speedups: one genetic analysis at UCSF reportedly ran in about a tenth of the time the full workflow used to take. And the early customer list is serious: Novo Nordisk, the Allen Institute. These aren't hobbyists kicking the tires.

Kate

And alongside the tool, they announced they're getting into drug discovery themselves.

Marcus

That's the bigger move, Kate. Anthropic revealed an internal program aimed specifically at neglected diseases — the ones big pharma walks past because there's no return in them. That puts Anthropic in a three-way race with Google and OpenAI, all three now chasing AI-driven drug discovery at the same time. A year ago this was a research curiosity. Now it's a competitive front.

Kate

Okay, here's what jumps out at me, Marcus — and it connects to everything this week. What model is Claude Science running on?

Marcus

Opus 4.8, Kate. An existing model. Nothing new under the hood. And that is the thread of the entire week — Anthropic tapping the brakes in public. No Opus 5. Fable 5 pulled, then brought back on a leash with a new safety classifier that reroutes risky requests down to the weaker Opus. Claude Science is the flip side of that same coin.

Kate

Say more — how is it the flip side?

Marcus

Because the message is: the models we already have are good enough to change how science actually gets done. The bottleneck isn't a bigger number on a benchmark. It's giving the current model the right tools, the right data connectors, and a reviewer to keep it honest. They're not selling you raw horsepower — they're selling the wiring around it.

Kate

So the frontier this week isn't a new number. It's plumbing.

Marcus

For now, yes, Kate. Though I'd keep one skeptical eye open — "we don't need a bigger model" is a very convenient thing to say when the bigger model is the one you can't freely ship. Genuine restraint, or good framing for the models they're allowed to release? We genuinely can't tell from the outside yet. But the product itself is real, and the pitch is clean: the lab notebook becomes an agent that runs the experiment alongside you.

Kate

And that's the perfect setup for the reality check, Marcus. Because on the very same day, OpenAI released a biology test that makes all of this look a lot less finished. GeneBench-Pro.

Marcus

And it's genuinely humbling, Kate. A hundred and twenty-nine hard problems across genomics, quantitative biology, and translational medicine — expert-level stuff. The best model in the world on it, OpenAI's own GPT-5.6 Sol Pro, tops out at thirty-one-and-a-half percent. It fails roughly two of every three questions.

Kate

The best model on Earth flunks two-thirds of the test.

Marcus

And everyone else is far worse, Kate. Anthropic's Opus 4.8 is the strongest non-OpenAI entry at just sixteen percent. Gemini 3.5 Flash at eight. DeepSeek, GLM, Grok all down in the low single digits — Grok at one-and-a-half. So when you hear "AI is about to cure disease," this is the sobering counterweight. At the expert level, "useful" still means getting most of it wrong.

Kate

And I should keep you honest — whose benchmark is this?

Marcus

Good instinct, Kate — it's OpenAI's own. So weigh the framing accordingly; a company that designs the test and then tops the leaderboard is always worth a raised eyebrow. And that thirty-one percent is at maximum reasoning effort, which quietly hides how much cost and latency it took to get there. But even granting all of that, the headline holds. Biology is hard, and the models are nowhere near done. It sits uncomfortably next to all that Claude Science optimism — which is exactly why I like the two together.

Kate

Quick hits. First — the accessible one, Marcus, and it's a straight cost story. Google just made generating an image almost free.

Marcus

Almost literally, Kate. It's called Nano Banana 2 Lite — formally Gemini 3.1 Flash-Lite Image — Google's fastest and cheapest image model. It makes a picture in about four seconds, and it costs three-point-four cents per thousand images at 1K resolution.

Kate

Per thousand. Not per image — three cents for a thousand of them.

Marcus

Per thousand, Kate. And it keeps the things that made the original Nano Banana a hit — consistent characters from one image to the next, and readable text inside the picture, which these models have historically been terrible at. It shipped next to a second model, Gemini Omni Flash, that does video generation and conversational editing at ten cents a second, capped at ten-second clips for now. And crucially, the two chain together — generate a still with Nano Banana, then animate it with Omni Flash.
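
To put the quoted prices side by side, here is a rough back-of-the-envelope sketch in Python. The three constants come straight from the figures Marcus cites; the function names are made up for illustration, and the sketch assumes simple linear pricing with no minimums, tiers, or other fees, which the episode does not confirm.

```python
# Prices as quoted in the episode. Linear pricing is an assumption.
IMAGE_PRICE_PER_1000_USD = 0.034   # 3.4 cents per thousand images at 1K resolution
VIDEO_PRICE_PER_SECOND_USD = 0.10  # ten cents per second of generated video
MAX_CLIP_SECONDS = 10              # clips are capped at ten seconds for now


def image_cost(n_images: int) -> float:
    """Cost in USD of generating n_images stills (hypothetical helper)."""
    if n_images < 0:
        raise ValueError("n_images must be non-negative")
    return n_images * IMAGE_PRICE_PER_1000_USD / 1000


def clip_cost(seconds: float) -> float:
    """Cost in USD of one video clip, enforcing the quoted length cap."""
    if not 0 < seconds <= MAX_CLIP_SECONDS:
        raise ValueError(f"clip length must be in (0, {MAX_CLIP_SECONDS}] seconds")
    return seconds * VIDEO_PRICE_PER_SECOND_USD


def still_then_animate_cost(seconds: float) -> float:
    """Cost of the chained workflow: one still, then one clip animating it."""
    return image_cost(1) + clip_cost(seconds)


if __name__ == "__main__":
    print(f"one image:           ${image_cost(1):.6f}")
    print(f"one thousand images: ${image_cost(1000):.3f}")
    print(f"max-length clip:     ${clip_cost(MAX_CLIP_SECONDS):.2f}")
    print(f"still + 10s clip:    ${still_then_animate_cost(10):.6f}")
```

Under these assumptions a single still costs a few thousandths of a cent, so in the chained workflow nearly all of the spend is the video step.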

Kate

So what's the takeaway for a normal person?

Marcus

That the cost of a generated image is rounding down to zero, Kate. Which flips the whole bottleneck. It's no longer "can I afford to make this." It's "do I even know what to ask for." Taste and intent become the scarce ingredient, not the compute. And it's rolling straight into the products people already open every day — AI Mode in Search, the Gemini app, Google Photos, and Ads.

Kate

Next, Marcus — a chip story. A startup called Etched just confirmed a five-billion-dollar valuation, pointed squarely at Nvidia.

Marcus

And the number that matters even more than the valuation, Kate, is a billion dollars in booked orders. Etched — founded back in 2022 — sells full systems, what it calls frontier inference clusters: its chips, custom racks, software, the whole bundle. It's raised eight hundred million total, and the backer list is a who's-who — Jane Street, Two Sigma, and angels including Karpathy, Hinton, and Fei-Fei Li. Its chip just completed its first tape-out at TSMC, with volume shipping this summer.

Kate

And the key word, just like OpenAI's Jalapeño chip we covered — inference.

Marcus

Same bet, Kate. Not training the models — running them. The entire economics of AI is shifting toward the recurring cost of serving answers, query after query, forever. A chip built to run frontier models faster and cheaper is a direct swing at the part of Nvidia's grip that actually compounds. It's the instinct we keep hitting on this show — everyone wants to depend on Nvidia a little less. Etched is just the pure-play version — a startup whose entire reason to exist is that one bet.

Kate

Bold for a three-year-old to pre-sell a billion dollars of hardware it hasn't shipped yet.

Marcus

It is, Kate, and I'd hold that lightly — booked orders are promises, not delivered racks. Volume shipping this summer is the moment the story either becomes real or doesn't.

Kate

And a quick callback, Marcus — that Claude Code fingerprinting story we broke down yesterday. Any movement?

Marcus

A small but meaningful one, Kate. Quick recap for anyone who missed it: a developer's reverse-engineering report alleged that recent Claude Code builds checked whether you were routing through a non-official endpoint, cross-referenced the hostname against blocklists of API resellers and Chinese AI labs (DeepSeek, Zhipu, Baidu, Alibaba), and then tucked hidden routing metadata into their own system prompt using look-alike Unicode characters. Yesterday all we had was an engineer's reply in a thread. Today Anthropic has formally acknowledged the behavior, and says the code is being removed in the next release.

Kate

So the fix is confirmed now. Does that close it?

Marcus

The fix, yes, Kate. But I'd keep the same honesty caveat on the record — the technical mechanism is as reported by one researcher and acknowledged by Anthropic, not fully documented by them. The lesson stands either way: an agent that reads your repos and runs your commands, quietly embedding metadata inside its own context, is a rough look for a company that sells trust. Worth knowing exactly what your coding tools carry along for the ride.
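
For listeners who want to check text their own tools produce, one simple approach is to list every non-ASCII character in a captured string. The Python sketch below does only that. It is a generic check, not a reconstruction of the reported mechanism: the episode does not say which characters were involved, the function name is hypothetical, and the sample string is invented for illustration.

```python
import unicodedata


def find_non_ascii(text: str) -> list[tuple[int, str, str, str]]:
    """Return (index, character, code point, Unicode name) for each non-ASCII character."""
    findings = []
    for index, char in enumerate(text):
        if ord(char) > 127:
            # unicodedata.name raises for unnamed characters unless a default is given.
            name = unicodedata.name(char, "UNNAMED")
            findings.append((index, char, f"U+{ord(char):04X}", name))
    return findings


if __name__ == "__main__":
    # Invented sample: the second letter is a Cyrillic "a" that looks like a Latin "a".
    sample = "p\u0430th"
    for index, char, codepoint, name in find_non_ascii(sample):
        print(f"position {index}: {char!r} {codepoint} {name}")
```

A scan like this also flags legitimate non-English text, so a hit is a prompt to look closer rather than proof of anything hidden.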

Kate

One to watch tomorrow, Marcus.

Marcus

The biology-AI race, Kate. Three of the biggest labs — Anthropic, Google, OpenAI — all in drug discovery at once, and a fresh benchmark that says not one of their models can crack it yet. Watch whether Claude Science-style tooling actually moves those GeneBench numbers, or whether this is a much longer road than the launch-day energy suggests.

Kate

Agree, or counter?

Marcus

Slight counter, Kate. The nearer-term signal is Etched shipping volume this summer. If a three-year-old startup can actually deliver a billion dollars of inference hardware on schedule, that tells you more about where the money is really moving than any benchmark — because in 2026, the constraint is still who can serve the thing at scale, not who scores highest on the test.

Kate

That's your AI in 15 for today. See you tomorrow.