-
The Shim Tax: The Hidden Engineering Costs of Hybrid Cloud
I recently audited a client’s AWS bill that had spiraled out of control. They hadn’t spun up massive new GPU clusters. They hadn’t doubled their user base. What they had done was connect a legacy on-prem reporting tool to an S3 bucket, assuming “Hybrid Cloud” meant the best of both worlds. Instead, they were hit…
-
The Multi-Cloud AI Stack: Why I’m Done Looking for a “Swiss Army Cloud”
For the first decade of my career, I chased the same goal every architect did: one provider, one control plane, one security model. It looked clean on a slide deck. It even worked—for a while. Then 2025 happened. We watched key AWS teams hollow out, turning incident response into 75-minute archaeology digs. We saw model…
-
The K8s Exit Strategy: Why GCP and Azure are Winning the GenAI Arms Race
I hate writing Kubernetes manifests. But for the last three years, if you wanted to serve a custom GenAI model, you had to build a cluster. AWS Lambda was useless for this. You can’t fit a modern PyTorch model in the zip limit, the cold starts are 10 seconds, and there is no GPU access….
-
The Hangover After the Boom: Why AI Is Forcing an On-Prem Infrastructure Reckoning
For a decade, “Cloud First” wasn’t just a strategy; it was dogma. If you weren’t aiming for 100% public cloud, you were viewed as “legacy.” Buying servers felt retro. Then came the Generative AI boom, and with it, a harsh physical and economic reality check. As we settle into 2026, enterprises are facing an “AI…
-
Broadcom Year Two: The “Stay or Go” Architecture Guide (2026 Edition)
The Year Two Decision: Architecting for expensive stability or painful modernization. The shock is over. The tweets have faded. The “Broadcom killed VMware” headlines are yesterday’s news. Now, you have a quote on your desk. Welcome to Year Two. If Year One was about denial and anger, Year Two is about the cold, hard math…
-
Why Serverless Isn’t Dead for GenAI — It’s Just Misunderstood
Debunking the myth that AWS Lambda can’t power real GenAI workloads requires redefining one boundary. Not technology, but anatomy: the difference between the “Brain” and the “Nerves.” I recently ignited a firestorm on Reddit…
-
Regulating Generative AI: Lessons from Indonesia’s Grok Ban and What Comes Next
The Grok Ban: What Happened and Why It Matters. Indonesia’s Communications and Digital Affairs Ministry temporarily blocked the AI chatbot Grok, developed by xAI and integrated into X, citing the AI’s ability to generate non-consensual sexual deepfake images, including disturbing depictions involving minors. This isn’t a “social media quirk.” It’s a regulatory first: a…
-
Which Workloads Should Never Leave The Cloud
(Even When Repatriation Looks Tempting) After publishing my piece on cloud repatriation, my inbox filled up fast. Not with disagreement—but with a different question: “Okay, fine. Some workloads should come home. But which ones absolutely should not?” That’s the right question. Cloud workload placement — deciding what stays versus what moves — is where repatriation…
-
The Logic of Repatriation: When (and Why) To Move Workloads From Public Cloud Back To On-Prem
Cloud repatriation is no longer a fringe conversation — it is the inflection point where public cloud stops being an accelerator and starts being a tax. For the last decade, “Cloud First” wasn’t just a strategy; it was a religion. If you suggested buying a server, you were treated like a heretic clinging to a…
