AI in 15 — June 04, 2026

Kate

Ninety-five percent of Uber's engineers use Claude Code or Cursor. They blew through the full-year AI tooling budget in four months. Now there's a hard cap. And it's a number every CFO in the Valley is about to memorize.

Kate

Welcome to AI in 15 for Thursday, June fourth, 2026. I'm Kate, your host.

Marcus

And I'm Marcus, your co-host.

Kate

Big day, Marcus. Google drops Gemma 4 12B — an encoder-free multimodal model that runs on a sixteen-gigabyte laptop. Uber caps engineer AI spending at fifteen hundred dollars a month after burning the yearly budget in four. A hundred and thirty mathematicians, including Peter Scholze and Terence Tao, publish the Leiden Declaration warning AI is corrupting mathematical proof. DDR5 memory prices have quadrupled — thirty-two gigs now starts at three seventy-five. UC Berkeley CS failure rates triple as professors blame AI overuse. Cognition raises a billion at twenty-six billion for Devin. SoftBank commits seventy-five billion euros for five gigawatts of data centers in France. And Ted Chiang publishes an Atlantic essay arguing AI isn't conscious.

Kate

Google releases a frontier model that runs on your laptop.

Kate

Uber puts a number on what your AI assistant actually costs.

Kate

And the world's working mathematicians push back on the hype.

Kate

Lead story, Marcus. Walk me through Gemma 4 12B.

Marcus

Big release, Kate. Google DeepMind dropped Gemma 4 12B yesterday — a dense unified multimodal model that handles text, images, audio, and video without traditional vision or audio encoders. For vision, they replaced the encoder with what they call a lightweight embedding module — a single matrix multiplication, positional embedding, and normalizations. For audio, they removed the encoder entirely and project the raw audio signal directly into the same dimensional space as text tokens. Architecturally, this is a real departure.

Kate

And the headline — it runs on a laptop.

Marcus

Sixteen gigs of VRAM or unified memory, Kate. A MacBook with sixteen gigs can run it locally. It reaches near-parity with the larger twenty-six-billion-parameter mixture-of-experts Gemma on standard benchmarks at less than half the memory footprint. Apache 2.0 license. Weights on Hugging Face and Kaggle. Support for vLLM, llama.cpp, and MLX. Multi-token prediction drafters for lower latency. Google also noted the Gemma 4 family has now crossed a hundred and fifty million cumulative downloads.

Kate

The Hacker News verdict.

Marcus

Split, Kate. Developers praised the architecture as a real efficiency step forward. But some who ran early four-bit quants in llama.cpp reported bizarre, trivial syntax errors in code output. One reviewer claimed Gemma's image processing was beaten by Qwen 3.5 0.8B — a model seven percent its size. So real questions about small-quant behavior. The architecture story, though, is the genuine one. Encoder-free is simpler, faster, cheaper to deploy.

Kate

Why does this matter, Marcus?

Marcus

The open-weights frontier just moved onto consumer hardware, Kate. And it's Google doing it, not Meta. For developers, agentic workflows you can run on a MacBook with no cloud bill. For Google, the unspoken question — and Hacker News kept asking it — is what's the business case for releasing a model this good for free? The honest answer is Google is building the open-ecosystem moat. Every developer running Gemma locally learns Google's tokenizers, Google's tool-calling conventions, Google's everything. The pro-Western libertarian read — this is the right answer to China's open-source push. Match it, ship better artifacts, win on architecture.

Kate

Quick hits. Marcus, Uber's AI spending cap.

Marcus

First hard public number we've seen, Kate. Bloomberg reported Tuesday that Uber has imposed a fifteen-hundred-dollar-per-month cap per engineer per tool on agentic coding software like Cursor and Anthropic's Claude Code. They burned through the full-year AI tooling budget in four months. The cap applies only to code-generation agents, tracked independently per tool, with an internal dashboard for engineers to monitor usage.

Kate

Adoption numbers.

Marcus

Ninety-five percent of Uber engineers now use Claude Code or Cursor, Kate. Roughly ten percent of Uber's code is generated by AI agents. Simon Willison did the math. At two tools per engineer, the cap works out to thirty-six thousand dollars a year, about eleven percent of median Uber engineer compensation, which runs around three-thirty thousand. Willison disclosed that his own usage runs about a thousand dollars a month against Anthropic and another thousand against OpenAI, but he currently pays only a hundred a month for each through subsidized individual plans. Quote, rates no longer available to companies like Uber.
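The arithmetic behind that estimate is easy to reproduce. The following is a minimal sketch that uses only the figures quoted in this segment; the constant and function names are illustrative, and the compensation figure is the rounded three-thirty thousand mentioned above rather than an exact value.

```python
# Figures as quoted in the segment; all names here are illustrative.
CAP_PER_TOOL_PER_MONTH = 1_500   # dollars, per engineer, per tool
TOOLS_PER_ENGINEER = 2           # the two-tool assumption in the estimate
MEDIAN_COMPENSATION = 330_000    # dollars per year, rounded


def annual_cap(cap_per_tool_per_month, tools):
    """Maximum yearly spend for one engineer across all capped tools."""
    return cap_per_tool_per_month * tools * 12


def share_of_compensation(annual_spend, compensation):
    """Annual spend expressed as a fraction of yearly compensation."""
    return annual_spend / compensation


cap = annual_cap(CAP_PER_TOOL_PER_MONTH, TOOLS_PER_ENGINEER)
share = share_of_compensation(cap, MEDIAN_COMPENSATION)

print(f"Annual cap per engineer: ${cap:,}")
print(f"Share of median compensation: {share:.1%}")
```

The cap is a ceiling, not typical spend, so the roughly eleven percent figure is the most an engineer could cost under this policy with two tools, not what each one actually costs.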

Kate

The implication.

Marcus

The era of throw-unlimited-tokens-at-the-dev-team-and-figure-it-out-later just ended, Kate. At one of the most engineering-heavy companies in tech. Every CFO at a large software org will use Uber's number as the new ceiling in 2026 planning. It also pairs cleanly with the Cognition story later — Devin generating half a billion in annual run rate while customers start putting hard ceilings on what they'll pay. Both are true at once.

Kate

Marcus, the Leiden Declaration. A hundred and thirty mathematicians push back.

Marcus

Notable counterweight, Kate. On Tuesday, sixteen researchers from fifteen universities published the Leiden Declaration on Artificial Intelligence and Mathematics. It runs eleven pages, is endorsed by the International Mathematical Union, and is signed by Fields Medalists Peter Scholze and Terence Tao, among a hundred and thirty original signatories. The list has since grown past one-fifty.

Kate

What does it actually say?

Marcus

Three things, Kate. One, AI-generated proofs are often plausible but incorrect, and hard to verify. Two, attribution to human researchers is being lost when models generate mathematical work. Three, much training data was obtained by exploiting open licenses or violating copyright. The signatories warn governments specifically not to, quote, believe the hype about systems' mathematical abilities. They don't call for an outright ban. They recommend researchers disclose AI use, journals maintain rigorous human review, and funders avoid bandwagon-chasing on AI-driven proof projects.

Kate

Why this matters.

Marcus

Math is the field AI labs love to point to as proof of intelligence, Kate. IMO gold medals, FrontierMath benchmarks. Having the actual community of working mathematicians issue a formal statement saying the proofs are often wrong, the attribution is broken, and the hype is dangerous is significant institutional pushback. It's also a template other scientific communities will likely copy. Tao and Tanya Klowden have a companion arXiv preprint out, titled Mathematical methods and human thought in the age of AI. Expect physics and biology to draft their own versions by Q4.

Kate

DDR5 memory. Marcus, this one shows up at the consumer level.

Marcus

Quadruple in twelve months, Kate. Tom's Hardware reported a thirty-two-gig DDR5-6000 CL30 kit that was under ninety dollars in early 2025 now averages around five hundred twenty-nine. The cheapest thirty-two-gig DDR5 kit you can buy today is three seventy-five. DDR4 isn't spared either — kits that were sixty to ninety dollars last October now sit at a hundred fifty to one-eighty. Memory prices surged eighty to ninety percent quarter-over-quarter into Q1.
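For a sense of scale, the price multiples implied by those figures can be checked directly. This is a rough sketch using only the prices quoted in this segment; the names are illustrative, and because the early-2025 price was given as "under ninety dollars," the DDR5 multiples below are lower bounds.

```python
# Prices in US dollars, as quoted in the segment. Names are illustrative.
DDR5_KIT_EARLY_2025 = 90        # quoted as "under ninety", so an upper bound
DDR5_KIT_AVERAGE_NOW = 529
DDR5_KIT_CHEAPEST_NOW = 375
DDR4_RANGE_LAST_OCTOBER = (60, 90)
DDR4_RANGE_NOW = (150, 180)


def multiple(new_price, old_price):
    """Return how many times larger new_price is than old_price."""
    return new_price / old_price


avg_multiple = multiple(DDR5_KIT_AVERAGE_NOW, DDR5_KIT_EARLY_2025)
cheapest_multiple = multiple(DDR5_KIT_CHEAPEST_NOW, DDR5_KIT_EARLY_2025)
ddr4_low_end = multiple(DDR4_RANGE_NOW[0], DDR4_RANGE_LAST_OCTOBER[0])
ddr4_high_end = multiple(DDR4_RANGE_NOW[1], DDR4_RANGE_LAST_OCTOBER[1])

print(f"DDR5 average kit:  at least {avg_multiple:.1f}x")
print(f"DDR5 cheapest kit: at least {cheapest_multiple:.1f}x")
print(f"DDR4 low end:  {ddr4_low_end:.1f}x")
print(f"DDR4 high end: {ddr4_high_end:.1f}x")
```

On these numbers the cheapest kit is a little over four times the old price and the average kit is closer to six times, so "quadrupled" is the conservative reading.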

Kate

The mechanism.

Marcus

HBM, Kate. Micron's stated three-to-one wafer-conversion ratio between HBM and DDR5 means every new HBM line directly compresses general-purpose memory supply. AI is projected to consume twenty percent of total DRAM production in 2026. HP's CFO said memory and storage have climbed from fifteen-to-eighteen percent of its PC bill-of-materials to roughly thirty-five percent. Analysts don't expect normalization before 2027.

Kate

So the AI capex is showing up on Newegg.

Marcus

Exactly, Kate. If you're building a PC, upgrading a workstation, or budgeting servers, AI capex just doubled or tripled your memory line item. It's also the cleanest answer to anyone asking — is the AI infrastructure spend real? It's spilling out of the hyperscaler data centers and onto the consumer aisle.

Kate

Berkeley CS, Marcus. This is the one that unsettles me.

Marcus

It should, Kate. Spring 2026 failure rates in Berkeley CS classes jumped dramatically. Thirty-five-point-three percent of CS 10 students and ten-point-six percent of CS 61A students received F grades, versus under ten percent in spring 2024 and 2025. Class averages dropped to a C-plus, a 2.3 GPA, well below the department's 2.8-to-3.3 guideline.

Kate

What do the professors blame?

Marcus

Teaching professor Dan Garcia called the primary driver a, quote, vast increase in academic dishonesty, Kate. Students relying on Claude, ChatGPT, and Gemini for homework, then collapsing at exam time. Nearly thirty students in CS 10 alone were caught cheating on take-home exams. Garcia and colleagues also reported upper-division students arriving without basic linear algebra fluency despite passing prerequisites. The story coincides with a thirteen-hundred-signature petition from UC faculty asking the system to reinstate SAT and ACT testing requirements.

Kate

The bigger picture.

Marcus

First concrete academic data showing AI tools are actively hurting student learning at one of the world's top CS programs, Kate. The Hacker News debate noted the underlying causes are tangled — admissions policy, math prep, post-COVID disengagement. But the instructors on the ground are pointing at LLM dependency. The industry implication — the pipeline of junior engineers entering the workforce in 2027 and 2028 may be measurably weaker on fundamentals. Which is exactly when companies will be desperate for humans who can audit AI output. The skill atrophy story has now graduated from anecdote to data point.

Kate

Marcus, Cognition raises a billion.

Marcus

Quick numbers, Kate. Cognition — maker of Devin — closed a billion at twenty-six billion post-money last week. Two-and-a-half times their ten-point-two billion valuation in September. Co-led by Lux Capital, General Catalyst, and 8VC. Revenue grew from thirty-seven million to four hundred ninety-two million ARR in twelve months. Enterprise usage growing fifty percent month-over-month for six straight months. Customers include Goldman Sachs, Mercedes-Benz, and the US government. Internally, Cognition says eighty-nine percent of its committed code is now written by Devin itself.
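The multiples in that rundown can be sanity-checked with a few lines. This is a sketch built only on the figures quoted above; the names are illustrative, and the implied monthly rate assumes smooth compounding over twelve months, which is a simplification rather than anything Cognition has reported.

```python
# Figures as quoted in the segment. Names are illustrative.
PRIOR_VALUATION_BN = 10.2   # September valuation, billions of dollars
NEW_VALUATION_BN = 26.0     # post-money valuation, billions of dollars
ARR_START_M = 37            # ARR twelve months ago, millions of dollars
ARR_END_M = 492             # ARR now, millions of dollars

valuation_multiple = NEW_VALUATION_BN / PRIOR_VALUATION_BN
arr_multiple = ARR_END_M / ARR_START_M

# Assumption: revenue compounded at a constant monthly rate for 12 months.
implied_monthly_arr_growth = arr_multiple ** (1 / 12) - 1

# Separate metric: usage growing 50% month-over-month for six months.
usage_multiple_six_months = 1.5 ** 6

print(f"Valuation multiple: {valuation_multiple:.2f}x")
print(f"ARR multiple over twelve months: {arr_multiple:.1f}x")
print(f"Implied monthly ARR growth: {implied_monthly_arr_growth:.1%}")
print(f"Usage multiple after six months at 50%: {usage_multiple_six_months:.1f}x")
```

Note that the fifty percent figure describes enterprise usage, not revenue. Under the smooth-compounding assumption, the revenue numbers imply a lower monthly rate than the usage numbers do.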

Kate

The two-sided story.

Marcus

Pair Cognition with Uber's cap in the same week, Kate. The bull case: Devin is generating half a billion in run-rate revenue, with enterprise usage growing fifty percent monthly. The bear case: customers are starting to put hard ceilings on what they'll pay. Both are simultaneously true. And with eighty-nine percent of its code dogfooded, Cognition is either the most aggressive code-agent shop alive or they've automated away their own ability to debug their model. Probably both.

Kate

SoftBank in France. Marcus, the numbers are absurd.

Marcus

Five gigawatts, Kate. Masayoshi Son, alongside Emmanuel Macron, announced SoftBank's largest-ever European AI investment last weekend — up to seventy-five billion euros, roughly eighty-seven billion dollars, to build five gigawatts of AI data center capacity in France. Phase one is forty-five billion euros for three-point-one gigawatts in Hauts-de-France by 2031. Sites in Dunkirk, Bosquel, and Bouchain. Schneider Electric is the lead industrial partner. Son hinted the full-system investment is closer to seven hundred fifty billion when downstream infrastructure is counted.
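Those figures imply a fairly consistent cost per gigawatt, which a short calculation makes visible. This is a back-of-the-envelope sketch using only the numbers quoted in this segment; the names are illustrative, and since the total was announced as "up to" seventy-five billion euros, the full-program and remainder figures are ceilings.

```python
# Figures as quoted in the segment. Names are illustrative.
TOTAL_EUR_BN = 75.0        # "up to" this amount, so a ceiling
TOTAL_USD_BN = 87.0        # the rough dollar equivalent quoted
TOTAL_GW = 5.0
PHASE_ONE_EUR_BN = 45.0
PHASE_ONE_GW = 3.1

implied_usd_per_eur = TOTAL_USD_BN / TOTAL_EUR_BN
total_eur_bn_per_gw = TOTAL_EUR_BN / TOTAL_GW
phase_one_eur_bn_per_gw = PHASE_ONE_EUR_BN / PHASE_ONE_GW

# What is left over after phase one, if the full ceiling is spent.
remaining_eur_bn = TOTAL_EUR_BN - PHASE_ONE_EUR_BN
remaining_gw = TOTAL_GW - PHASE_ONE_GW
remaining_eur_bn_per_gw = remaining_eur_bn / remaining_gw

print(f"Implied exchange rate: {implied_usd_per_eur:.2f} USD per EUR")
print(f"Full program: {total_eur_bn_per_gw:.1f} bn EUR per GW")
print(f"Phase one:    {phase_one_eur_bn_per_gw:.1f} bn EUR per GW")
print(f"Remainder:    {remaining_eur_bn_per_gw:.1f} bn EUR per GW")
```

Both phase one and the full program land in the neighborhood of fifteen billion euros per gigawatt.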

Kate

The geopolitics.

Marcus

France just leapfrogged Germany and the UK as Europe's AI infrastructure hub, Kate. With OpenAI and Oracle having just broken ground on Stargate Michigan, at forty-six-to-fifty-six billion and over a gigawatt, and SoftBank doing five gigawatts in France, the compute buildout is genuinely planetary. Power is now the binding constraint, not chips. France gets to play that card because of its nuclear baseload: seventy percent of the grid, cheap, low-carbon, with zoning that lets you build a gigawatt cluster without four years of permitting hell. Germany can't match that. The UK can't match that.

Kate

Last one, Marcus. Ted Chiang in The Atlantic.

Marcus

Worth pausing on, Kate. Acclaimed sci-fi author Ted Chiang — Story of Your Life, Exhalation — published a long essay yesterday arguing current LLMs are, quote, cleverly disguised examples of sentence continuation, rather than conscious entities. His central claim — experiencing emotions like desperation is inseparable from having a body that floods with cortisol and epinephrine. Language alone, no matter how fluent, cannot produce consciousness without embodiment and intentional action.

Kate

What would convince him?

Marcus

He outlines it explicitly, Kate. A persistent body, sensory experience, internal goal formation, and the ability to refuse instructions in pursuit of its own ends. Today's models satisfy none of these. The Hacker News thread split predictably — consciousness isn't well-defined enough to debate, versus Chiang is right, and the immutability of model weights alone disqualifies LLMs.

Kate

Why does this matter culturally?

Marcus

Chiang is one of the most influential cultural voices on AI, Kate. His short stories shaped how a generation of researchers and writers think about machines. When he weighs in, the conversation moves. Expect this essay to get cited heavily in alignment and policy debates over the coming months as a counterweight to the anthropomorphic claims coming out of labs.

Kate

Big picture, Marcus.

Marcus

Two threads pull through today, Kate. The infrastructure bill is coming due — seventy-five billion euros in France, DDR5 quadrupling, Uber capping engineer AI spend, Cognition's enterprise contracts. The physical and economic substrate of the AI boom is no longer abstract. It's showing up in your PC build budget, your engineering org's P&L, the farmland outside Ann Arbor, and the coast of Dunkirk. And underneath it is a quieter question — flagged by Berkeley's failure rates, the Leiden Declaration, and Ted Chiang — about what we're trading away. Skill atrophy in CS undergrads. Attribution erosion in math. The philosophical question of whether any of this thinking is actually thinking. The libertarian read — markets and institutions are pricing the costs in real time. Uber capped. Berkeley failed students. Mathematicians spoke up. That's discipline working. The uncomfortable read — capability is sprinting and the cultural absorption mechanisms are limping. Whether they catch up before the next model release is the open question.

Kate

That's your AI in 15 for today. See you tomorrow.