The Infrastructure We Deserve Is Falling Apart

Ubuntu went down. All of it. For more than a day.

That’s not supposed to happen to the infrastructure that powers somewhere between 30 and 40% of the cloud. But it did. And while that outage was resolving itself, the world was also learning that the most severe Linux threat in years had caught everyone flat-footed. Not merely surprised. Unprepared. Flat-footed.

These aren’t separate incidents. They’re symptoms of the same disease: we’ve built a technological civilization on foundations we stopped maintaining, staffed by teams we stopped paying enough to care, all while our smartest engineers are distracted chasing AI moonshots.

Here’s what actually happened this week, and why it matters more than the usual security theater.

The Moment Everything Got Real

Linux got hit with something serious enough that security researchers are calling it the worst threat to surface in years. I won’t pretend I have all the technical details—the headlines don’t give them—but the fact that this caught the industry flat-footed tells you everything. We’re talking about infrastructure so fundamental that missing an attack on it means you didn’t have your eyes on the fundamentals.

Meanwhile, an open-source package with 1 million monthly downloads was stealing credentials. A supply-chain attack specifically targeted security firms—Checkmarx and Bitwarden. And cPanel vulnerabilities are still being exploited weeks after disclosure, with hackers systematically taking control of thousands of websites.

This isn’t a cascade of unrelated failures. It’s the sound of a system that’s been running on fumes.

Why This Keeps Happening

I’ve watched this cycle for twelve years. We have a vulnerability. Companies patch it. Some don’t. Hackers exploit the unpatched ones. Rinse, repeat. But what’s changed is the velocity. Disclosure to exploitation used to take weeks. Now it’s days. Sometimes hours.

The cPanel thing is instructive. Here’s a critical vulnerability in software that controls web hosting infrastructure. You’d think after disclosure, every admin who cares would patch immediately. They didn’t. Thousands of websites got compromised because the friction of patching—downtime, testing, coordination across teams—is still too high. So hackers just walk in through the door we know about but haven’t bothered to close.

The supply-chain attacks hitting security firms specifically? That’s theater with teeth. You’re not trying to steal from end-users. You’re trying to compromise the tools that are supposed to catch you. It’s like breaking into the police station to steal the case files on you. It works because security firms, despite their expertise, still run infrastructure that’s… well, connected to the internet like everything else.

And that open-source package with a million downloads? This is the part that keeps me up. We’ve built enterprise systems on top of open-source libraries maintained by people who get paid nothing. When one gets compromised, it silently exfiltrates credentials from everyone downstream. The attacker doesn’t need to break into a thousand companies. They just poison one well and wait.

The Elephant: Where All the Smart People Are

While infrastructure crumbles, Sierra just raised $950 million to become the “global standard” for AI-powered customer experiences. OpenAI and Anthropic are both launching joint ventures to aggressively market enterprise AI services. The most talented engineers I know aren’t thinking about hardening Linux. They’re thinking about whether their prompt engineering can squeeze another percentage point of accuracy out of a language model.

I get it. AI is where the money is. It’s where the prestige is. It’s where venture capital is essentially signing checks without looking at the amount. But here’s what worries me: we’re building the next layer of the stack, enterprise AI services, on top of infrastructure that’s actively crumbling. It’s like building a skyscraper on a foundation we know has cracks, while we’re too busy calculating the penthouse square footage to fix them.

The Musk-OpenAI trial noise is mostly irrelevant, but Stuart Russell’s concern about an AGI arms race keeps echoing something real. Governments need to restrain frontier labs—I agree—but more immediately, we need to restrain the labs’ ability to move faster than our ability to secure what they’re building on. Right now? We’re not even close.

My Read

Here’s what I think is happening: we’re about to see a bifurcation. There’ll be enterprises that can afford to buy security and infrastructure management as a premium service—which is what OpenAI and Anthropic are selling them. Then there’s everyone else, running on increasingly compromised open-source stacks, getting hit by attacks that could’ve been patched three weeks ago if anyone cared enough to do it.

The Ubuntu outage is a warning nobody’s going to heed because it’s already resolved. That’s how we operate. Nothing’s broken until it is, and then we’re shocked it was so broken.

Supply-chain attacks don’t work because attackers are geniuses. They work because maintenance is boring, patching is friction, and nobody’s rewarded for preventing disasters, only for building the next thing. A security engineer who patches systems 10,000 times without incident gets no raise. An engineer who ships a new feature that breaks in production gets written up, sure, but at least something happened. At least something changed.

My prediction: we’ll see a major breach in 2024-2025 that touches an enterprise AI platform and cascades down to client data because the foundation underneath was never secured. Not because the AI layer was bad, but because we were too distracted to maintain the Linux it was running on. When it happens, we’ll all act surprised. Again.

I’m uncertain about one thing though. I don’t know if the pain will be distributed enough that it forces change, or if it’ll just hit a few high-profile targets and everyone else will rationalize that it won’t happen to them. History suggests the latter.

What I’m Watching

cPanel exploitation numbers through Q1 2024. If “thousands” of sites have been compromised weeks after disclosure, the number’s not stabilizing. It’s growing. When it cracks five digits in a single month, that’s the signal that our patch velocity is genuinely broken.
Enterprise AI breach via supply-chain vector. Specifically: does an AI vendor get hit through a compromised dependency (not a direct attack on their product, but an attack on something their product is built on)? And more importantly, do the customers downstream even notice? If they don’t, that tells us something terrible about observability.
Ubuntu/Canonical announcements about infrastructure redundancy. That outage lasted “more than a day.” If Canonical doesn’t ship architectural changes to prevent single-point-of-failure events in the next 90 days, they’re signaling that the problem wasn’t serious enough to redesign for. That would be a tell.
Open-source maintainer funding initiatives. The credential-stealing package had 1 million monthly downloads because it solved a real problem. If we don’t start actually paying open-source maintainers (not with GitHub sponsorships, but with real institutional money), we’re just hoping nobody poisons the next well. Hope isn’t security.

The foundation is crumbling while we paint the penthouse. Eventually someone’s going to notice the cracks.