Kubernetes is an open source platform designed to schedule and manage containerized workloads across many hosts. Google announced the project in 2014, drawing on its internal Borg system, and released version 1.0 in 2015. It is now maintained by the Cloud Native Computing Foundation (CNCF). Because its design draws on Borg, which Google has said launches over 2 billion containers per week, the architecture was built to scale and should be able to run your whole data center.
It’s important to realize that Kubernetes is very opinionated and won’t run just any old app. To run well, apps or workloads must be containerized and architected following most of the Cloud Native tenets. The definition of Cloud Native differs somewhat depending on who you ask, but I think The New Stack gives a pretty good summary.
You need to take into account how different the environment inside a container is from a very reliable server or VM. For example, containers will (and do) die and get rescheduled on an entirely different host. Data written inside a container is lost when the container goes away unless you take additional steps, such as mounting a persistent volume. And of course, connecting external users to services works very differently than it does with legacy infrastructure.
Kubernetes brings many benefits, like efficient use of resources, consistent delivery, app portability and automation. But by far, the primary reason to adopt it is to give developers faster velocity. Manual steps are the biggest thing that slows down getting features to production, and Kubernetes is a platform for containerized apps that can be built, deployed and scaled to meet dynamic load, all without manual steps.
Similar to other resource schedulers, at the highest level the architecture consists of a control plane and a pool of capacity to run applications. The control plane is a set of services, “Master Components”, that run on hosts that are often called “Master Nodes”. They are responsible for knowing the total capacity of the pool, tracking how much has been allocated, and scheduling containers into the pool. This process is often called resource scheduling or orchestration.
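To make the idea of tracking capacity and placing containers concrete, here is a minimal sketch of a toy scheduler. It is not how kube-scheduler actually works. The `Node` class, the `schedule` function, the node names and the "most free CPU wins" rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical worker host with a fixed amount of CPU to hand out."""
    name: str
    cpu_capacity: int        # total CPU the node offers, in millicores
    cpu_allocated: int = 0   # CPU already promised to containers

    @property
    def cpu_free(self) -> int:
        return self.cpu_capacity - self.cpu_allocated


def schedule(nodes: List[Node], cpu_request: int) -> Optional[Node]:
    """Place one container on the node with the most free CPU.

    Returns the chosen node, or None when no node has enough room.
    """
    candidates = [n for n in nodes if n.cpu_free >= cpu_request]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda n: n.cpu_free)
    chosen.cpu_allocated += cpu_request  # record the allocation
    return chosen


# A pool of two made-up workers and three container requests.
pool = [Node("worker-1", 2000), Node("worker-2", 4000)]
first = schedule(pool, 1500)
second = schedule(pool, 1500)
third = schedule(pool, 3000)  # no node has 3000 millicores left

for label, node in (("first", first), ("second", second), ("third", third)):
    print(label, "->", node.name if node else "unschedulable")
```

The real scheduler weighs many more factors than free CPU, but the shape is the same: filter out hosts that cannot fit the workload, pick one of the rest, and record the allocation.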
The primary Master Components include:
etcd - a key-value store that maintains the state of the cluster
kube-apiserver - exposes the Kubernetes API that tools and automation outside the cluster use to talk to it
kube-scheduler - scheduling engine that tracks capacity and determines where to run containers
kube-controller-manager - runs the collection of controllers that manage the Kubernetes environment; one example is the ReplicationController, which manages instances of containers
The control plane talks to a set of services, “Node Components”. These services run on hosts, often called “Worker Nodes”, that provide capacity to run the app containers or workloads. Node Components are responsible for executing the allocation and container operations as directed by the control plane.
The main Node Components include:
kubelet - the control plane talks to this service to request state changes on its host
kube-proxy - facilitates network rules and connection forwarding on the host
At the highest level, the architecture looks like this
Beyond scheduling, Kubernetes also provides a wealth of building blocks to help you deploy and run workloads. Some examples are
Kubernetes brings it all together in a standardized definition language based on YAML. It’s a declarative specification for the cluster state you want. The control plane takes the specification and makes changes to arrive at the desired state.
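As a rough illustration of what such a specification contains, the sketch below builds a small Deployment definition as a Python dictionary and prints it as JSON. Manifests are normally written in YAML, but Kubernetes accepts JSON too, and JSON keeps this example free of third-party libraries. The app name `hello-web` and the image `example.com/hello-web:1.0` are hypothetical placeholders.

```python
import json

# Desired state: three replicas of one container image.
# The names and the image below are made up for illustration.
desired_state = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "hello-web"},
    "spec": {
        "replicas": 3,
        # The selector tells the Deployment which pods belong to it.
        "selector": {"matchLabels": {"app": "hello-web"}},
        "template": {
            # Labels here must match the selector above.
            "metadata": {"labels": {"app": "hello-web"}},
            "spec": {
                "containers": [
                    {"name": "web", "image": "example.com/hello-web:1.0"}
                ]
            },
        },
    },
}

manifest = json.dumps(desired_state, indent=2)
print(manifest)
```

Notice that nothing in the definition says how to start containers or which host to use. It only states the outcome you want.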
The control plane also continues to monitor the cluster state and adjust, which is great when an app unexpectedly crashes. The control plane sees the state is missing an instance and automatically runs a new one to replace it.
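The monitor-and-adjust behavior can be sketched as a small reconciliation function. This is a simplified model under my own assumptions, not the actual controller code. The `reconcile` function, the `web-N` instance names and the simulated crash are invented for the example.

```python
import itertools
from typing import Iterator, List


def reconcile(desired_replicas: int, running: List[str],
              new_names: Iterator[str]) -> List[str]:
    """Return the instance list after one reconciliation pass.

    Starts replacements when instances are missing and removes
    extras when there are too many.
    """
    running = list(running)  # work on a copy of the observed state
    while len(running) < desired_replicas:
        running.append(next(new_names))  # "start" a new instance
    while len(running) > desired_replicas:
        running.pop()                    # "stop" a surplus instance
    return running


# Endless supply of hypothetical instance names: web-1, web-2, ...
names = (f"web-{i}" for i in itertools.count(1))

instances = reconcile(3, [], names)      # initial rollout
print("running:", instances)

instances.remove("web-2")                # simulate an unexpected crash
print("after crash:", instances)

healed = reconcile(3, instances, names)  # the next pass restores the count
print("after reconcile:", healed)
```

The key design choice is that the function compares desired state with observed state on every pass, so it does not need to know why an instance disappeared in order to replace it.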
In summary, Kubernetes is a platform to run container workloads that comes with a long list of benefits. When used properly, it enables developers to achieve a faster release velocity and makes apps more resilient. In future posts, I’ll go deeper into the components and how to get the most out of the platform.