AI in 15 — March 24, 2026
An AI just solved a math problem that no human had ever cracked. Not a textbook exercise. An open question in Ramsey theory that mathematicians had been stuck on for years.
Welcome to AI in 15 for Tuesday, March 24, 2026. I'm Kate, your host.
And I'm Marcus, your co-host.
Marcus, Tuesday is delivering. GPT-5.4 Pro just became the first AI to solve a genuinely unsolved research-level math problem. GitHub is having a reliability crisis tied to its Azure migration. LocalStack just archived its open-source repo and the community is furious. Someone got a four-hundred-billion-parameter model running on an iPhone. Larry Fink is warning that AI will widen the wealth gap. And ChatGPT has a bizarre new bug where a single German word sends the system into an infinite loop. Let's go.
An AI cracks an open problem in Ramsey theory for the first time ever.
GitHub outages surge fifty-eight percent as Azure migration stumbles.
And ChatGPT melts down over the German word geschniegelt.
Marcus, let's start with the math breakthrough because this feels like a genuine milestone. GPT-5.4 Pro solved an open Ramsey theory problem. Not a benchmark. Not a competition problem with a known answer. An actual unsolved question. What happened?
This is significant and I want to be precise about why. Ramsey theory is a branch of combinatorics that deals with conditions under which order must appear in large enough structures. The specific problem involved finding a construction that mathematicians had conjectured existed but couldn't produce. GPT-5.4 Pro generated a valid solution that was then verified by the mathematician who originally posed the problem. That verification step is critical. This isn't an AI claiming it solved something. A domain expert confirmed the result.
And other models were able to solve it afterward, too?
Once the problem gained attention, researchers tested it on Claude Opus 4.6 and Gemini 3.1 Pro. Both were also able to produce valid solutions. Which is interesting because it suggests we may have crossed a capability threshold across multiple frontier models, not just one. These models can now engage with genuinely open mathematical questions and produce novel, verifiable results.
OK but I want to push on this. Is this really the AI doing creative mathematics, or is it pattern matching on an enormous corpus of related proofs and getting lucky?
That's the right question and honestly, the answer might be both. The construction the AI produced was novel. It wasn't copied from existing literature. But the techniques it combined were known. So think of it less like a flash of genius and more like an extremely well-read mathematician who found a combination of existing ideas that nobody had tried. Whether you call that creativity or sophisticated recombination is partly philosophical. What's not philosophical is the result. A correct solution to an open problem, verified by experts.
The Asimov Press actually published an essay this week asking exactly this question. Can AI drive real paradigm shifts in science or does it just accelerate incremental work?
And this result lands right in the uncomfortable middle. It's not incremental. An open problem is an open problem. But it's also not a paradigm shift. It didn't introduce new mathematical concepts or frameworks. It found a construction within existing theory. My take is that AI is going to produce a lot of results like this. Genuinely valuable, genuinely novel in the narrow sense, but building within existing paradigms rather than overturning them. And honestly, that's still enormously useful. Most of science progresses through exactly this kind of work.
From mathematical breakthroughs to infrastructure breakdowns. GitHub has seen a fifty-eight percent increase in outages and the culprit appears to be their ongoing migration to Azure. Marcus, how bad is it?
Bad enough that it's affecting developer trust in the platform. GitHub has been migrating infrastructure to Azure, which makes sense on paper, since Microsoft owns GitHub and Azure is Microsoft's cloud. But the transition has introduced instability across core services. Actions, pull requests, code search, all experiencing elevated error rates and downtime. For a platform that millions of developers depend on for their daily workflow, reliability isn't a feature. It's the product.
And there's an interesting subplot here. OpenAI is reportedly building an internal alternative to GitHub.
Which tells you something about how serious the reliability concerns are. When one of your most prominent enterprise customers starts building their own replacement rather than waiting for you to fix the problems, that's a red alert. OpenAI has the engineering talent and the motivation. They're deeply integrated with coding workflows through ChatGPT and their API. Building their own code hosting platform isn't just about avoiding GitHub outages. It's potentially a strategic play for the AI-native development toolchain.
Competition for GitHub. That would have been unthinkable a few years ago.
GitHub's moat has always been network effects. Every developer is there because every other developer is there. But if reliability degrades enough, those network effects become a liability instead of an asset. Developers don't leave platforms because of features. They leave because the platform stops working when they need it most.
Speaking of open source friction, LocalStack just archived its open-source repository and now requires an account and auth token to use. The community response has been, let's say, vocal. Marcus?
LocalStack has been one of the most popular tools for emulating AWS services locally. Developers use it to test cloud applications without running up AWS bills. Archiving the open-source repo and requiring authentication is a significant shift. It means you can no longer just clone the repo and run it. You need to create an account, agree to terms, and use an auth token.
Why would they do this?
The economics of open-source infrastructure tooling are brutal. You build something millions of developers depend on, but converting that usage into revenue is incredibly difficult. LocalStack presumably decided that the open-core model wasn't generating enough commercial revenue and they needed to force users into a relationship with the company. It's a rational business decision that feels like a betrayal to the community that helped build the tool's adoption.
This keeps happening. Open source projects that communities invest in suddenly changing the terms.
It's the fundamental tension in modern open source. The value creation is distributed but the costs are concentrated. Someone has to pay the engineers maintaining the project. When venture capital runs out or revenue targets aren't met, the community's expectations and the company's economics collide. And developers always lose that collision.
Yesterday we covered Flash-MoE running a four-hundred-billion-parameter model on a MacBook. Well, someone has now done it on an iPhone 17 Pro. A four-hundred-billion-parameter large language model running on a phone. But, Marcus, there's a catch.
A significant catch. Zero point six tokens per second. To put that in perspective, you'd wait about two minutes for a single paragraph of text. The technique uses SSD streaming, essentially paging model weights from the phone's storage into memory as needed. It's a brilliant proof of concept that demonstrates the iPhone 17 Pro's hardware is technically capable of inference on massive models. But it's not remotely practical for actual use.
So it's more of a "look what's possible" moment?
Exactly. And that's still valuable. Yesterday's Flash-MoE ran at five and a half tokens per second on a MacBook, which is slow but usable. The iPhone at zero point six is a technology demonstration, not a product. But the trajectory matters. If you can do it at all today, you can probably do it usefully in two or three hardware generations. Mobile inference on frontier-scale models isn't a question of if anymore. It's a question of when.
Larry Fink, CEO of BlackRock, the world's largest asset manager, released his annual letter and it has a stark warning about AI. He's saying AI will widen the wealth divide unless something changes. And he's proposing a one and a half trillion dollar retirement fund as part of the answer.
Fink's annual letters move markets because BlackRock manages over ten trillion in assets. His argument is straightforward. AI will generate enormous productivity gains and wealth, but that wealth will concentrate among those who own the infrastructure, the data centers, the chips, the models. Workers displaced by AI won't automatically share in those gains. The one and a half trillion dollar retirement fund proposal is essentially a mechanism to give ordinary Americans equity exposure to AI infrastructure.
A capitalist solution to a capitalist problem.
And that's what makes it interesting. This isn't a call for regulation or redistribution. It's a call for broader ownership. Give people a stake in the AI buildout so that when it generates returns, those returns flow more broadly. Whether you think that's sufficient is a political question, but the diagnosis is hard to argue with. The people building and owning AI infrastructure are getting spectacularly wealthy. The people being displaced by it are not.
As we reported yesterday, OpenAI is at twenty-five billion in annualized revenue and eyeing a trillion-dollar IPO. That's the kind of wealth concentration Fink is talking about.
Precisely. And Fink is positioning BlackRock to manage that retirement fund, so there's obvious self-interest here. But the underlying argument about AI wealth concentration is one of the most important economic questions of the next decade.
OK, this last one is just delightful. ChatGPT version 5.2 has a bug where the German word geschniegelt, which means something like neat or dapper, sends the model into an infinite death loop. Marcus, what is happening?
Nobody is entirely sure, and that's what makes it fascinating. When users input this word, ChatGPT enters a recursive loop where it keeps generating tokens without producing coherent output. It just spirals. The working theory is that the word's tokenization creates some kind of edge case in the model's decoding logic, possibly interacting with the German language's compound word structure in a way that wasn't caught during testing.
One word crashes the system. That's both hilarious and slightly terrifying.
It's a reminder that these models, for all their sophistication, are still software with bugs. And the bugs can be wonderfully weird. A single word in a language with roughly a hundred and thirty million speakers shouldn't be able to destabilize a system used by hundreds of millions of people. But here we are. OpenAI will patch it, but the broader lesson is that frontier AI systems have failure modes we can't predict because the systems are too complex to fully test. You can't try every word in every language.
Geschniegelt. I'm going to use that at dinner parties now.
Just don't say it to ChatGPT.
Tuesday big picture. An AI solves a math problem humans couldn't. GitHub's infrastructure wobbles as a competitor emerges. Open source projects keep pulling up the drawbridge. And a single German word can crash a system used by hundreds of millions. Marcus, what connects these?
The gap between what AI can do and what we can rely on it to do. GPT-5.4 Pro can solve open mathematical problems that stumped human researchers. That's extraordinary capability. But the same family of models can be crashed by one German word. GitHub is the backbone of modern software development and it can't keep the lights on during a migration. LocalStack was critical infrastructure for thousands of teams and it vanished behind an auth wall overnight. The capability frontier keeps advancing. The reliability frontier is not keeping pace. And for AI to deliver on its promises, from solving math problems to running on your phone, reliability has to catch up to capability. Because nobody cares how smart your AI is if it crashes when someone says geschniegelt.
Reliability over brilliance. I like that.
The brilliant demo is easy. The boring, reliable, day-after-day operation is the hard part. Always has been.
That's your AI in 15 for Tuesday, March 24, 2026. See you tomorrow.