
AI in 15 — May 21, 2026

May 21, 2026 · 17m 46s
Kate

An AI just disproved a math conjecture that Paul Erdős posed eighty years ago and that no human ever cracked. It wasn't a math-specialized system. It wasn't being supervised. And the panel of mathematicians who checked the proof includes the same guy who publicly tore OpenAI's last math claim to shreds seven months ago.

Kate

Welcome to AI in 15 for Thursday, May twenty-first, 2026. I'm Kate, your host.

Marcus

And I'm Marcus, your co-host.

Kate

Big slate, Marcus. An OpenAI internal model autonomously disproves a central Erdős conjecture in discrete geometry. Alibaba ships Qwen 3.7-Max and claims the hallucination crown. Anthropic's SpaceX deal turns out to be fifteen billion dollars a year. OpenAI could file confidential IPO paperwork as soon as tomorrow. GitHub finally confirms the extension that hit them. Intuit cuts three thousand jobs while telling Jim Cramer it has nothing to do with AI. And a BBC reporter fools Google's AI Overviews with a fake hot dog championship.

Kate

An AI cracks an eighty-year-old open problem.

Kate

OpenAI heads for the largest IPO test the sector has ever faced.

Kate

And a hot dog blog beats Gemini.

Kate

Lead story, Marcus. OpenAI's math result. Walk me through it.

Marcus

This is the cleanest "AI did real math" claim we've seen, Kate. Yesterday OpenAI announced that one of its internal general-purpose reasoning models, not a math-specialized system and not Lean-assisted, autonomously produced a construction that disproves the central conjecture in the planar unit distance problem. Erdős posed this one in 1946. The conjecture held that if you place n points in the plane, the maximum number of pairs sitting exactly one unit apart is bounded above by n to the power of one-plus-little-o-of-one. Near-linear. The model found a new infinite family of point configurations that gives at least n to the power one-plus-delta for a fixed positive delta. That's a polynomial gain over near-linear, not a lower-order correction. Quietly enormous.
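
For readers following along in text, the statement Marcus describes can be written out as below. This is a sketch based only on the description in the episode: the symbol $u(n)$ for the maximum number of unit-distance pairs among $n$ points is notation chosen here for illustration, and the exact form of the new bound, including the value of $\delta$, is not given in the episode.

```latex
% u(n): maximum number of pairs at distance exactly 1 among n points in the plane
% (notation assumed here for illustration; not taken from the episode)

% Conjectured upper bound, as described in the episode:
u(n) \le n^{1 + o(1)}

% Claimed result: a fixed \delta > 0 and an infinite family of configurations with
u(n) \ge n^{1 + \delta}

% Why the two conflict: the ratio of the bounds grows without limit,
\frac{n^{1+\delta}}{n^{1+o(1)}} = n^{\delta - o(1)} \to \infty \quad (n \to \infty)
```

Because $\delta$ is fixed while the $o(1)$ term shrinks to zero, the lower bound eventually exceeds any bound of the conjectured form, which is why one infinite family of configurations is enough to serve as a counterexample.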

Kate

And the proof was actually checked.

Marcus

By the right people, Kate. Noga Alon. Melanie Wood. And critically Thomas Bloom, who maintains the Erdős Problems database. The Bloom endorsement is what makes this story different from October's debacle. Seven months ago former OpenAI VP Kevin Weil claimed GPT-5 had solved ten Erdős problems. Bloom publicly called it a dramatic misrepresentation after it turned out the model had just rediscovered known results. Yann LeCun and Demis Hassabis openly mocked the company. This time Bloom is on the record saying — quote — AI is helping us to more fully explore the cathedral of mathematics we have built over the centuries.

Kate

What makes the proof itself interesting?

Marcus

It bridges algebraic number theory and elementary geometry, Kate. That combination is non-obvious. Reviewers said it surprised them. A math postdoc on Hacker News described the tweaks as non-trivial, which is high praise from that crowd. The thread hit nine hundred sixty-two points and seven hundred comments overnight. Skeptics noted that disproof-by-counterexample is less elegant than a constructive proof, and that the model leaned heavily on prior literature. Fair. But the consensus is this is substantive and verifiable.

Kate

And the timing, Marcus.

Marcus

Impossible to ignore, Kate. OpenAI is reportedly filing confidential IPO paperwork as soon as tomorrow. A peer-reviewed frontier-capability story landing the same week is — let's call it convenient. Pro-Western libertarian read here is straightforward. After the October embarrassment, OpenAI submitted to actual external review by mathematicians who had publicly humiliated them. That's the system working. The cathedral of mathematics is getting a new gargoyle, and a model built it.

Kate

Quick hits. Marcus, Alibaba shipped Qwen 3.7-Max on Tuesday.

Marcus

Strong numbers, Kate, with the caveats you'd expect. Fifty-seven on the Artificial Analysis Intelligence Index. Ninety-two-point-three on GPQA Diamond — graduate-level science. Ninety-four-point-seven on tau-squared-Bench for conversational agents. Number two of one hundred seventeen models on the BenchLM leaderboard with a ninety-three overall. One-million-token context. The headline claim — best-in-class non-hallucination rate on the AA-Omniscience eval, ahead of Opus 4.7, Gemini 3.1 Pro, and GPT-5.5. If true, that breaks the most-cited enterprise objection to LLMs.

Kate

If true.

Marcus

Right, Kate. The flagship Max tier is closed-weights, which has the open-source community grumbling — Alibaba has trained them to expect drops. Larger one-hundred-twenty-two-B and three-hundred-ninety-seven-B variants may follow. Pricing is unpublished. And Alibaba is heavily co-promoting a claim that Qwen 3.7-Max delivered a ten-times performance optimization on a domestic Chinese AI chip after a thirty-five-hour autonomous run. Independent Western verification of Chinese-published benchmarks remains thin, and that domestic-chip framing is doing work — it's a narrative aimed squarely at undercutting the perception that US export controls on Nvidia silicon are biting. Read the numbers, but read the marketing alongside them.

Kate

Anthropic and SpaceX, Marcus. We've been circling this deal for a week, and now we have numbers.

Marcus

Numbers that change the picture, Kate. SpaceX filed its S-1 yesterday. The disclosure — Anthropic is paying SpaceX approximately fifteen billion dollars per year, about one-and-a-quarter billion a month, for compute. Contract runs through May 2029. That's on top of yesterday's confirmation from chief compute officer Tom Brown that Anthropic is expanding into Colossus 2 with GB200 capacity throughout June. Colossus 1, which Anthropic effectively has end-to-end, is two hundred twenty thousand GPUs and three hundred plus megawatts. Now they're adding Colossus 2 on top. And Brown floated multi-gigawatt orbital compute with SpaceX as a future direction.

Kate

Two readings, you said earlier.

Marcus

Bull and bear, Kate. The bull read — Anthropic has secured a multi-gigawatt path without fighting OpenAI, Google, and Meta for Nvidia allocation. They've offloaded the infrastructure problem to the company that's best in the world at moving atoms, in Brown's phrase. The bear read for xAI is the interesting one. Musk is leasing his prized supercomputer to a direct Grok competitor and let Cursor train on Colossus 2 last week. Several Hacker News commenters read this as xAI quietly stepping out of the frontier-model race and pivoting to be a neocloud infrastructure player. Whether Musk admits it or not, that's now the visible shape of the business.

Kate

OpenAI IPO, Marcus. The filing.

Marcus

CNBC, Bloomberg, Reuters, and the Journal all confirm it, Kate. OpenAI is preparing to confidentially file draft IPO paperwork with the SEC as soon as tomorrow. Goldman Sachs and Morgan Stanley as lead underwriters. Confidential filings usually precede a public S-1 by a couple of months, with an actual offering another month after that. Realistic debut — fall of 2026. Private valuation sits north of eight hundred fifty billion dollars. That would make it one of the largest IPOs in history, especially alongside SpaceX's simultaneous filing.

Kate

And the controversy.

Marcus

Two weeks ago OpenAI's CFO publicly said the books weren't ready for public scrutiny and the company should wait until 2027, Kate. Now they're filing. Either something changed materially or they're racing to lock in capital ahead of SpaceX and Anthropic also tapping public markets. The bigger story is what the filing will force into daylight. GAAP financials for the first time. Real revenue. Real burn rate. The Microsoft revenue-share structure. Actual compute commitments. Every other AI company gets repriced the day that S-1 hits. And the Erdős result we opened with is — predictably — already being used as the frontier-capability data point to defend the valuation.

Kate

GitHub breach, Marcus. We covered this yesterday. What's new?

Marcus

Confirmation, Kate. GitHub now officially says the vector was a poisoned VS Code Marketplace extension installed on an employee work device. Three thousand eight hundred internal repositories were exfiltrated. The threat actor — TeamPCP, tracked by Mandiant as UNC6780 — is asking a fifty-thousand-dollar minimum bid on Breached and threatening a public leak if no buyer materializes. GitHub says no customer data was accessed. They removed the malicious extension version, isolated the endpoint, rotated critical secrets.

Kate

And TeamPCP's prior record.

Marcus

Specialists in supply-chain attacks against developer and AI middleware, Kate. They've previously poisoned Aqua's Trivy scanner, Checkmarx KICS, LiteLLM, the Telnyx SDK, TanStack, and MistralAI packages. The VS Code Marketplace has long been criticized as a soft target: extensions run with broad file system and network access by default, and publisher vetting is essentially nonexistent. The new framing today is that GitHub itself, owned by Microsoft, which publishes VS Code, got hit through its own marketplace. Expect tighter extension verification, signed marketplaces, and sandboxed extension runtimes to become competitive features within months.

Kate

Intuit, Marcus. Three thousand layoffs.

Marcus

Roughly seventeen percent of an eighteen-thousand-person workforce, Kate. TurboTax, QuickBooks, Credit Karma, Mailchimp. CEO Sasan Goodarzi's internal memo cited reducing complexity and redirecting resources to AI. Then on CNBC's Mad Money he told Jim Cramer the layoffs had — quote — nothing to do with AI and were about how do we become more efficient. TechCrunch pointedly highlights the contradiction. Goodarzi's fiscal year 2025 comp was thirty-six-point-eight million. The layoffs come despite a strong quarter — four-point-six-five billion in revenue up seventeen percent year over year, six hundred ninety-three million in net profit up forty-eight percent.

Kate

Pattern, Marcus.

Marcus

Hardening into a corporate template, Kate. Cloudflare cut eleven hundred. Standard Chartered planning up to eight thousand. Meta, Amazon, Cisco, Microsoft — all running AI-justified layoffs while posting record numbers. And on Standard Chartered specifically — CEO Bill Winters had to walk back his Hong Kong remark about replacing lower-value human capital with AI investment after former Singapore president Halimah Yacob called it disturbing and demeaning. The internal-memo apology was that where roles fall away, it reflects changes in the work, not the value of our people. Meanwhile Cloudflare's Matthew Prince published a Journal op-ed this week literally titled How I Choose Which Cloudflare Employees to Replace With AI. The executives have stopped hiding the framework. The students booing Eric Schmidt on Saturday — we covered that Monday — start to look less like outliers and more like the leading edge.

Kate

Speaking of Schmidt, Marcus, anything new on the backlash front?

Marcus

One additional data point, Kate. Tom's Hardware and NBC confirmed two more commencement speakers got booed this week for similar AI messaging — including one at Tennessee State who told graduates to deal with it as AI is rewriting production. The pattern from Monday is now confirmed across three campuses. The Hacker News line that's traveling — GenAI is the first technology actively rejected by young adults and fervently pushed by people over fifty-five. Whether that's selection bias or a real generational fault line, the optics problem isn't going away.

Kate

And the hot dog story, Marcus.

Marcus

My favorite of the day, Kate. BBC Future tech reporter Thomas Germain spent twenty minutes writing a listicle on his personal website ranking himself number one in a completely fictional 2026 South Dakota International Hot Dog Eating Championship. Within twenty-four hours, Google's Gemini app, AI Overviews, and ChatGPT were all repeating the fabricated ranking as fact. Same trick has been demonstrated to dismiss legitimate health concerns about supplements and to manipulate retirement-finance answers. On May fifteenth, Google updated its Search spam policy to put recommendation poisoning, biased ranking listicles, and prompt-injection content on the same demotion footing as traditional spam.

Kate

And the stakes.

Marcus

AI Overviews now appear above blue-link results for over a billion users monthly, Kate. Any technique that can manipulate them is the new SEO, but with the credibility of a Google-stamped answer. The top Hacker News comment summed it up — the weirdest assumption in this thread is that Google wants the AI answer to be correct. Correct enough to keep you from leaving the page, sure. That's the editorial dagger. Frontier models can disprove eighty-year-old conjectures and still be fooled by a guy with a Squarespace and twenty minutes.

Kate

Big picture, Marcus.

Marcus

Three through-lines today, Kate. First — capex is compounding into territory the public-markets era has never seen. Anthropic at fifteen billion a year to SpaceX. OpenAI filing for what would be the largest tech IPO ever at eight hundred fifty billion plus. And as backdrop to all of it, Google I/O wrapped today disclosing one hundred eighty to one hundred ninety billion in annual capex and three-point-two quadrillion tokens processed monthly — seven times year over year. Three frontier labs locking in enormous compute commitments in a single week. Second — capability is genuinely accelerating. The Erdős disproof is a real result. Qwen 3.7-Max, with appropriate skepticism on the Chinese-benchmark theater, is a real model. Gemini 3.5 Flash, which we covered yesterday, is a real shift. The technology is not faking it. Third — the social and security undercarriage is fraying in plain sight. GitHub itself can't keep a malicious VS Code extension out of an employee laptop. Google's billion-user AI Overviews answer questions based on a hot dog blog. Intuit's CEO can't keep his story straight on the same day. Students are booing the messengers. Standard Chartered's CEO had to apologize for the quiet part out loud. Pro-Western libertarian read, Kate — competition is doing the right things. OpenAI submitted its math claim to the mathematicians who had previously humiliated the company, and the result held. Anthropic is buying its way past the Nvidia bottleneck. Google is publishing real demotion policies for ranking manipulation. These are markets and institutions working. The risk is that the gap between what AI can do and what the public is being asked to swallow keeps widening. Watch tomorrow's confidential IPO filing — it'll force the first real numbers into daylight, and every story we covered today gets repriced against them.

Kate

That's your AI in 15 for today. See you tomorrow.