
If you ever find yourself looking for advice on how to install and manage your own Kubernetes service, the best advice you will get is: don’t do it. But sometimes that advice falls short, and so does Kubernetes-as-a-Service.

Don’t get me wrong, using a managed Kubernetes service from the cloud provider of your choice is often the best way to get started. Production-ready Kubernetes clusters are just a few clicks, or preferably a few lines of code, away, and they come with an abundance of features, including:

  • Managed control plane  
  • Managed node pools 
  • Network and load balancers 
  • Ready-to-use storage 
  • Cluster auto-scaling 
  • Built-in security features 
  • Easy upgrades 
  • Good integration with the rest of the cloud provider offerings 

Installing and managing your own Kubernetes clusters means dealing with all of this complexity yourself, and that can be an ordeal, one you don’t necessarily need to go through. That said, there are still cases when a generic managed offering will not fit the ecosystem of a given company.

If you need to build it, make it a perfect fit

Managed services are excellent, but they also come at the cost of limited flexibility. Sometimes you will simply not be able to bend them enough to fit a given context or ecosystem. Moreover, companies with strict security policies, operating in highly regulated industries, might not use cloud providers at all, or might use only a subset of their services.

Imagine a company using multiple cloud providers, in addition to an extensive on-premises footprint worldwide. Now add to that equation 100 software engineering teams that might need the managed Kubernetes service. You have just imagined my workplace.

Using multiple cloud providers requires at least a basic understanding of each provider’s specifics, both in how the providers differ from one another and in how they differ from an on-premises environment. It’s not realistic to expect hundreds of our software engineers to know these nuances, so the underlying platform had to be abstracted as much as possible.

This is how we managed to do that.

A managed service is, in a nutshell, a service maintained by someone else. From a developer’s perspective, it is managed by the DevOps/SRE team that handles the clusters. From the DevOps/SRE team’s perspective, it is managed by a cloud provider. The cloud provider reduces complexity for the DevOps team, and the DevOps team reduces complexity for the end user: the developer.

At my company, developers prefer as little ops as possible, so dealing with the complexity of a Kubernetes cluster is not something they should have to do just to run their services.

With that in mind, we decided to build an internally managed offering that has some key features of managed Kubernetes services, but also some that cater to our specific needs:

  • Self-service cluster installation
  • Managed control plane
  • Managed node pools
  • Easy upgrades 
  • Auto-scaling
  • Security patches
  • Standard cluster add-ons installed
  • Ready-to-use storage

These features combined should hide the complexity of the platform from the end user, be available on request (self-service), and be scalable, completely automated, and, of course, reliable.

We had to keep in mind that if manual actions were needed to maintain the clusters, we would also need to hire more people for the DevOps team as the number of managed clusters grew. Twenty clusters today can become one hundred clusters in three months. Since it’s not realistic to hire people at the rate the service consumption might grow, we underlined reliable automation as a key ingredient for the success of the managed offering.

Of course, we didn’t want to build everything ourselves, and we were aware of the various open-source tools available. The search had begun.

Choosing the Stack

In pursuit of the best stack for our internally managed offering, we evaluated some of the most popular Kubernetes management tools. VMware Tanzu and Rancher have their qualities, but in the end, we decided to use Kubernetes Cluster API.

Compared to Tanzu and Rancher, Cluster API (CAPI) supports the most infrastructure providers and allows you to be quite flexible in how you use it. Rancher, unlike CAPI, offers a UI for cluster management, which is an excellent feature, but we decided it would be hard to use for self-service: it requires a lot of infrastructure-related details, and subnets, VLANs, or datastore IDs are not something a developer needs to know.

The fact that VMware Tanzu is built upon CAPI gave us confidence in CAPI’s quality and future.

With CAPI and its “cluster is described as a Kubernetes manifest” approach, we immediately knew that Git would be the place to store those manifests and GitOps the way to propagate changes.

The GitOps approach to propagating changes was something we were already familiar with, and it made a lot of sense to treat workload cluster definitions just like any other Kubernetes resource: store them in Git and use Argo CD to reconcile the Git repository with the cluster.

The Winning Stack: Open source with some custom-built controllers

Kubernetes Cluster API  

As stated by the Kubernetes Cluster API documentation:

“Cluster API is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters.

Started by the Kubernetes Special Interest Group (SIG) Cluster Lifecycle, the Cluster API project uses Kubernetes-style APIs and patterns to automate cluster lifecycle management for platform operators. The supporting infrastructure, like virtual machines, networks, load balancers, and VPCs, as well as the Kubernetes cluster configuration are all defined in the same way that application developers operate deploying and managing their workloads. This enables consistent and repeatable cluster deployments across a wide variety of infrastructure environments.”

Simplified: Cluster API uses Kubernetes principles to manage the lifecycle of Kubernetes clusters.

Kubernetes Cluster API (CAPI) consists of providers installed on a Kubernetes cluster, called the “management cluster”. Clusters created by the management cluster are called “workload clusters”. The infrastructure provider deploys the infrastructure (virtual machines, networks), while the bootstrap provider initializes the workload cluster control plane and joins nodes to the cluster. The desired workload cluster is described in a series of Custom Resource Definitions (CRDs).
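
For a concrete picture, a workload cluster definition stored on the management cluster looks roughly like this. This is a minimal sketch: the names, the namespace, and the choice of the vSphere infrastructure provider are illustrative, not a description of our actual setup.

```yaml
# Sketch of a CAPI workload cluster definition (names are placeholders).
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: team-a-prod
  namespace: team-a
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  controlPlaneRef:                      # the managed control plane
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: team-a-prod-control-plane
  infrastructureRef:                    # provider-specific infrastructure
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereCluster
    name: team-a-prod
```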

These custom resources are used to customize workload clusters. For example, the number of machines in a specific node pool is defined in a “MachineDeployment” resource.

CAPI includes multiple controllers that constantly reconcile workload clusters. For example, if you increase the number of replicas in a node pool, the CAPI controllers will ensure that a new virtual machine is provisioned and joined to the desired workload cluster.
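
Node pools follow the same pattern. A sketch of a MachineDeployment for the cluster above might look like the following; again, every name and version here is a placeholder, and scaling the pool is just a matter of changing spec.replicas in Git:

```yaml
# Sketch of a node pool; CAPI reconciles machines to match spec.replicas.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: team-a-prod-workers
  namespace: team-a
spec:
  clusterName: team-a-prod
  replicas: 3                           # bump this number to grow the node pool
  template:
    spec:
      clusterName: team-a-prod
      version: v1.27.3
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: team-a-prod-workers
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: VSphereMachineTemplate
        name: team-a-prod-workers
```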

Out of the eight features we needed, Cluster API solved four and a half for us. Cluster installation, the managed control plane, auto-scaling, and easy cluster upgrades are fully in the domain of Cluster API.

In our case, node pool management was only partially solved by CAPI, since machines are immutable in CAPI. This means that if you want to change something on a machine, CAPI will create a new machine and replace the old one. That’s perfectly fine until we need to do emergency security patching of 100 Kubernetes clusters and finish in less than 48 hours. As we want to be as fast as possible here, we decided not to follow the immutability path but to patch the cluster nodes via our internal patch management system.

For that, we need to connect every cluster node to the patch management system.

Custom-built controllers

We wrote a couple of controllers that follow CAPI principles to integrate with other systems in the company. One is a controller that gets a free IP address from our IPAM (IP address management) system and assigns it to a node; it also removes the reservation when the node is deleted.

Other examples are a controller that adds nodes to, and removes them from, the patch management system, and a controller that pushes the initial set of secrets to the workload cluster.

All of them work on the same principle: they listen for a specific event (e.g., a new workload cluster is created) and perform an action when it happens (e.g., adding the cluster to Argo CD).
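
As an illustration of the kind of action such a controller performs, registering a new workload cluster with Argo CD comes down to creating a declarative Argo CD cluster secret on the management cluster, roughly like the sketch below. The names, server URL, and credential placeholders are invented for the example:

```yaml
# Sketch of an Argo CD cluster secret a controller might create for a new workload cluster.
apiVersion: v1
kind: Secret
metadata:
  name: team-a-prod-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster   # marks this secret as a cluster registration
type: Opaque
stringData:
  name: team-a-prod
  server: https://team-a-prod.k8s.example.internal:6443
  config: |
    {
      "tlsClientConfig": {
        "caData": "<base64-encoded CA certificate>",
        "certData": "<base64-encoded client certificate>",
        "keyData": "<base64-encoded client key>"
      }
    }
```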

Using this approach, we have successfully integrated our managed offering with our ecosystem, context, and existing processes. There is nothing special anymore about patching virtual machines that are part of a Kubernetes cluster; they are patched regularly by the same mechanism as any other virtual machine.

GitOps (Argo CD)

As CAPI uses custom resources for workload cluster definitions, and those are just manifests, Git is the ideal place to store them. We use Argo CD to reconcile the workload cluster manifests to the management cluster, while standard cluster add-ons are reconciled directly to the workload cluster. Preconfigured storage classes are part of the standard add-ons, providing ready-to-use storage for end users.
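
For illustration, an Argo CD Application tying a cluster’s manifests in Git to the management cluster could look roughly like this sketch; the repository URL, path, and names are placeholders rather than our actual configuration:

```yaml
# Sketch of an Argo CD Application that reconciles CAPI manifests from Git.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: team-a-prod-cluster
  namespace: argocd
spec:
  project: kubernetes-clusters
  source:
    repoURL: https://git.example.internal/platform/clusters.git
    targetRevision: main
    path: clusters/team-a-prod               # CAPI manifests for this workload cluster
  destination:
    server: https://kubernetes.default.svc   # the management cluster itself
    namespace: team-a
  syncPolicy:
    automated:
      prune: true       # remove resources that disappear from Git
      selfHeal: true    # revert manual drift
```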

We are just starting

This stack has allowed us to cover all the requirements, but note that this is still a work in progress. We are still on this journey of creating and offering managed services to our internal customers. After our on-premises environment, both of our cloud providers will follow.

We firmly believe in this approach. Combining excellent open-source tools with custom integrations where necessary is the path we will follow to provide managed services to our internal customers: the developers. If you have a specific set of requirements like we did, it just might be the right path for you as well.