
Hi all, I’m Joe Searcy, Member of Technical Staff on the Platform Engineering team here at T-Mobile and I’m joined by Torin Sandall, Software Engineer at Styra and Co-founder/Core Contributor for the Open Policy Agent (OPA) project, to help tell the story behind one of T-Mobile’s latest open source projects, MagTape. Here at T-Mobile we’re ALL-IN on containers and container platforms like Kubernetes.

We, like most companies building Kubernetes environments, came to a quick realization that empowering your developers with platforms like Kubernetes can exponentially increase the volume and velocity of applications and releases. That’s usually the goal, but you rarely get to scale your operations teams to keep up with the exponential growth in platform consumers. We have dozens of Kubernetes clusters, thousands of developers, and hundreds of applications that run everything from coverage maps, to iPhone launches, to point of sale in our retail stores.

When we first started onboarding developers to our Kubernetes platform almost 3 years ago, we took a very hands-on approach to developer relations. We sat down with each team and talked through their current application architecture and how it could be adapted to run on Kubernetes. We even got our hands dirty and helped with code changes to ensure they would be successful on this new crazy container platform. While we built a great relationship with these teams, it was obvious this wouldn’t scale as we moved from dozens of teams to hundreds. We also wrote some YAML…lots and lots of YAML! We scaled up our consumer base, we shortened, and in some cases did away with, those pesky meetings to talk about application onboarding, and we had multiple applications running in production successfully. It didn’t take long for us to reach critical mass, as Kubernetes has proven to be a huge success and our developers love it!

The Problem

We had our first application outage on our Kubernetes platform. Luckily we have great engineers and developers who jumped in, and we easily identified the problem. We did an upgrade on a Kubernetes cluster, in a rolling fashion, and it finished successfully. The lifecycle management tooling we used to upgrade our clusters worked a little too efficiently!

What we didn’t plan for was our development teams not having PodDisruptionBudgets deployed for their applications. Our platform upgrade ran through all nodes, finished promptly, and along the way took down every pod for a very high-profile application before new pods had a chance to come up and take traffic.
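
To make the PodDisruptionBudget gap described in this post a little more concrete, here is a minimal Python sketch. It is an illustration only, not our upgrade tooling: the helper names (`allowed_evictions`, `pdb_manifest`), the app label `checkout`, and the replica counts are all hypothetical. It models the basic rule that a voluntary eviction, such as a node drain during a rolling upgrade, is only permitted while enough healthy pods remain, and it builds the kind of `policy/v1` manifest a team could have deployed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodDisruptionBudget:
    # Simplified model: only the minAvailable form of a budget is covered here.
    min_available: int


def allowed_evictions(ready_pods: int, pdb: Optional[PodDisruptionBudget]) -> int:
    """Return how many ready pods may be voluntarily evicted right now."""
    if pdb is None:
        # No budget: a drain is free to evict every pod at once.
        return ready_pods
    # With a budget, evictions stop once only min_available pods remain.
    return max(0, ready_pods - pdb.min_available)


def pdb_manifest(name: str, app_label: str, min_available: int) -> dict:
    """Build a PodDisruptionBudget manifest as a plain dict (hypothetical names)."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


if __name__ == "__main__":
    # Hypothetical app with 3 ready replicas.
    print(allowed_evictions(3, None))
    print(allowed_evictions(3, PodDisruptionBudget(min_available=2)))
    print(pdb_manifest("checkout-pdb", "checkout", 2))
```

Without a budget, all three replicas in this example can be evicted in the same pass, which is the situation we hit. With `minAvailable: 2`, only one pod may be disrupted at a time, so a drain has to wait for replacements to become ready before it can continue.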
