Back in 2016, we set out on our journey to microservices; in 2017, we started our journey in Agile; and in 2018, we began our journey with Kubernetes. During that time, we learned a lot about the difference between hype and reality, and wanted to share our experience.
Kubernetes is considered an industry standard in some circles, but for most companies, it is still just a buzzword surrounded by hype and magical transformations. We want to share our imperfect Kubernetes journey, which often looked more like a raft than a fleet of oil tankers.
If you watch talks about Kubernetes and look at most guides or stories, you tend to see a few common patterns.
As we grew on our own journey, we often thought that we must be doing something very wrong, because none of the assumptions seemed to apply to us.
When we decided to break up our monolith, we were guided by a key point from Netflix’s experience: moving to microservices was a journey of about seven years. The time and difficulty involved in breaking up a monolith cannot be overstated. Approaching the breakup slowly, with an eye toward a multi-year process, saved us from many of the horror stories you read online about switching to microservices. Implied in that point is that moving to microservices should be a journey you take when your organizational scale requires it. If you are a startup with fewer than 10 developers, don’t do microservices; it’s not worth it. However, if you are a large or growing organization, and changes to your codebase are happening slower than you would like, or you’re spending too much money on scaling vertically, then we hope our adventure in breaking the monolith can be useful to you as well.
Originally, we planned to write a long series of articles about our journey, but it might be better to tell our story the way a TDD practitioner tells a joke: by starting with the punchline and working on the setup later.
There are many great articles online with tips and tricks for using Kubernetes, and some use case stories, but we hope to give you the tools to make informed decisions by learning from our failures.
Each of these points will become a blog post of its own in the future, but until then, let’s start with the first point.
When we started on our Kubernetes journey, our main goal was to better scale our application. We had hit the limits and were seeing diminishing returns from simply deploying our monolith on more VMs. We had a good run with the three microservices we deployed, but quickly saw that orchestrating and managing those microservices ourselves would be a pain.
At the time, we were deciding between Docker Swarm, Mesos and Kubernetes. When we noticed that Docker Enterprise was using Kubernetes instead of Docker Swarm, our decision was made. We read a few books on Kubernetes, scoured the web for blogs and tutorials, and were convinced that Kubernetes could solve our problems. Most promising for us was the idea of “autoscaling”.
This was exactly the sort of magic we needed to improve performance and reduce costs!
Our simplistic understanding of this, however, was all hype! In theory, Kubernetes can autoscale and reduce your costs, but in reality, autoscaling is not something you will want to do often, and it is not something you want to rely on for managing your costs.
Our first cause of confusion was the important yet subtle differences between the HorizontalPodAutoscaler (HPA), the VerticalPodAutoscaler (VPA), and “cluster autoscaling”. Let’s compare the reality to the hype for each of these.
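As a rough illustration of what the HPA actually does, the Kubernetes documentation describes its core rule as a ratio: desired replicas = ceil(current replicas × current metric / target metric), with no change when that ratio is within a tolerance (0.1 by default). The sketch below is not our production code and not the controller's implementation. The function name `desired_replicas` and the example numbers are hypothetical, and it ignores details such as stabilization windows and pod readiness.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA scaling rule for a single metric (illustrative only)."""
    ratio = current_metric / target_metric

    # Within the tolerance band the replica count is left alone.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # The result is always bounded by minReplicas and maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical numbers: 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(4, current_metric=90, target_metric=60))
```

Note that this only changes the number of pods. Whether the cluster has nodes to run those pods on is a separate question, and that is the job of cluster autoscaling.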
This is a simple example where hype and reality don’t perfectly align. The hype isn’t false, and it wasn’t even misleading. But it didn’t fit our use case, and it was over-emphasized. Even when well-meaning developers showcase the ideal benefits of the technologies they use, you can’t fully rely on their claims from reading alone. You must validate the claims in your own unique situation. Not because they might be lying or wrong, but because nobody else knows your situation, and no two companies have the exact same requirements, goals, or limitations.
The next step is to look deeper and verify your own situation and your realistic requirements.
We will cover that in part two of the series.
Avi started as a Flash Animator in 1999, moving into programming as ActionScript evolved. He’s worked as a Software Architect at OnceHub since September 2019, where he focuses on Cloud Native Automated workflows and improving team practices. In his free time, he enjoys playing games, Dungeons & Dragons, Brazilian jiu-jitsu, and historical European sword fighting.