Welcome back! I am going to assume that all of you understand what Google Anthos is and the problems it is trying to solve. If you don't, you should go back and quickly read through the first post in this series, where we talk about Anthos… Done? Let's go then.

This is the second blog in the Anthos series, and here we will talk about Google Kubernetes Engine (GKE) On-Prem. GKE On-Prem is the cornerstone of Anthos: it is what delivers the whole hybrid cloud story for Google Cloud. GKE On-Prem enables users to run Google Kubernetes Engine (GKE) clusters inside their own datacenters. It lets users manage and maintain Kubernetes clusters, whether they run in Google Cloud or inside their own datacenters, from the Google Cloud Console. It also lets users monitor those clusters using Stackdriver and deploy Marketplace applications on the Kubernetes clusters running on-prem.
This is kind of a big deal if you ask me. Microsoft also has a version of Azure Kubernetes Service running on top of Azure Stack, but all it does is help you deploy multi-master, multi-node Kubernetes clusters on top of your Azure Stack stamp using Azure Resource Manager (ARM) templates. You don't actually get all the benefits of AKS inside your own datacenter.
Now, let’s look at the GKE On-Prem Architecture:
As you can see in the diagram above, GKE On-Prem needs three things in your datacenter in order to function as expected:
- A VMware vSphere cluster running vSphere 6.5 (the only supported version with GKE On-Prem 1.0)
- An F5 BIG-IP LTM to provide support for Services of type LoadBalancer defined in your Kubernetes clusters.
- An admin workstation: a bundle of all the different utilities needed to deploy GKE On-Prem clusters, delivered as an OVA file. It can be deployed on your laptop (running VirtualBox) or on a separate management cluster; the only caveat is that it needs to be able to talk to your vCenter instance.
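To make the F5 BIG-IP piece concrete: inside the cluster you simply declare a standard Kubernetes Service of type LoadBalancer, and the BIG-IP integration assigns it a virtual IP on the load balancer. A minimal sketch (the name, labels, and ports here are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend          # illustrative name
spec:
  type: LoadBalancer      # fulfilled by the F5 BIG-IP LTM in GKE On-Prem
  selector:
    app: frontend         # pods this Service fronts
  ports:
    - port: 80            # VIP port exposed on the BIG-IP
      targetPort: 8080    # container port
```

From the application developer's point of view, nothing is vSphere- or F5-specific here; that is the point of the integration.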
The dependencies on VMware vSphere and F5 BIG-IP LTM were design decisions made not only to cater to Google's early-access customers, but also because of the vSphere-Kubernetes integrations that GKE On-Prem could directly leverage. VMware vSphere provides two major components that make the on-prem experience better for users: the vSphere Cluster API, which gives users (or Google) the ability to seamlessly deploy virtual machines to serve as nodes in the admin and user Kubernetes clusters, and the vSphere Cloud Provider plugin in Kubernetes, which provides persistent storage to the application pods and containers running on top of your Kubernetes clusters.
The vSphere Cluster API adds the following constructs to enable easier deployments for the underlying virtual machines:
- Machine: Machines are nothing but VMs running on top of your vSphere infrastructure; Machines are to VMs what Pods are to containers in Kubernetes. You define all the virtual-machine-level details, such as CPU, memory, storage, and base OS requirements, in the Machine definition.
- Machine Set: Machine Sets are similar to ReplicaSets: you define the desired number of Machines needed to support your Kubernetes cluster, and the Machine Set controller runs a reconciliation loop to ensure that the current state matches the desired state.
- Machine Deployment: Machine Deployments are similar to Deployments in Kubernetes. You can perform rolling updates to the underlying base operating system of your Machines (VMs) using a Machine Deployment.
- Machine Class: This is similar to a StorageClass definition. It lets you specify all the environment-specific configuration parameters in one place, rather than having to repeat them in every Machine definition.
Using the vSphere Cluster API, you can deploy, manage, and scale your virtual infrastructure the same way Kubernetes deploys, manages, and scales your containerized applications.
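As a rough sketch of what these constructs look like on the wire, here is a MachineSet in the v1alpha1 Cluster API style of that era. The `providerSpec` fields are illustrative, since the exact vSphere provider schema is version-specific:

```yaml
apiVersion: cluster.k8s.io/v1alpha1
kind: MachineSet
metadata:
  name: user-cluster-workers       # illustrative name
spec:
  replicas: 3                      # desired number of worker VMs
  selector:
    matchLabels:
      cluster: user-cluster-1
  template:
    metadata:
      labels:
        cluster: user-cluster-1
    spec:
      providerSpec:
        value:
          # vSphere-specific sizing (illustrative field names and values)
          numCPUs: 4
          memoryMiB: 8192
          diskGiB: 100
          template: ubuntu-1804-template   # base OS VM template in vCenter
```

Just as with a ReplicaSet, changing `replicas` and reapplying causes the controller to reconcile by creating or deleting VMs in vSphere.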
Now, let's talk about the vSphere Cloud Provider. The vSphere Cloud Provider plugin is built into upstream Kubernetes, so you don't have to deploy additional CSI plugins to provide persistent storage to your containers. It enables users to consume storage from any datastore mounted in vSphere (NFS, VMFS, or vSAN). When using a vSAN datastore, you can extend all the Storage Policy Based Management (SPBM) capabilities into the StorageClass definition inside Kubernetes. (That's one of the reasons why Lenovo has decided to deliver GKE On-Prem on top of our ThinkAgile VX appliances.) You can create custom storage policies using the following parameters:
- cacheReservation: Flash capacity reserved as read cache for the container object.
- diskStripes: Minimum number of capacity devices across which each replica object is striped.
- hostFailuresToTolerate: Number of host and device failures that a VM object can tolerate.
- iopsLimit: Defines the IOPS limit for an object.
- objectSpaceReservation: Percentage of the logical size of the VMDK that must be reserved, or thick provisioned, when deploying VMs.
- forceProvisioning: Object is provisioned even if the other requirements are not met.
All the persistent volumes created by Kubernetes are stored as VMDK files in the vSAN datastore and then mounted to the VMs that are running your pods.
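Putting those SPBM parameters together, a StorageClass for the in-tree vSphere provisioner looks roughly like this (the name and values are illustrative; the parameter names mirror the list above):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsan-gold                            # illustrative name
provisioner: kubernetes.io/vsphere-volume    # in-tree vSphere Cloud Provider
parameters:
  diskformat: thin
  hostFailuresToTolerate: "2"     # survive two host/device failures
  diskStripes: "1"                # stripe each replica across one device
  cacheReservation: "20"          # reserve 20% of flash as read cache
  objectSpaceReservation: "30"    # thick-provision 30% of the VMDK
```

A PersistentVolumeClaim that references this StorageClass is then backed by a VMDK carved out of the vSAN datastore with that policy applied.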
One more thing before we wrap up this blog and talk about what's next: let's talk about the admin and user clusters inside of GKE On-Prem. When you deploy your initial GKE On-Prem environment using the admin workstation, you deploy an admin cluster and a single user cluster. The admin cluster is responsible for running all the control-plane components for GKE On-Prem. When you want to deploy additional user clusters, you use gkectl (a command-line tool built specifically for GKE On-Prem) to talk to the admin cluster, which in turn calls the vCenter APIs to deploy additional virtual machines for the new user cluster. The user cluster is where you deploy your applications. Note that the master (control-plane) nodes for each user cluster actually run inside the admin cluster, which is why this architecture is sometimes referred to as "Kubeception." When you buy Anthos licenses, you pay only for the vCPUs consumed by the VMs in the user clusters.
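From the admin workstation, the user-cluster workflow looks roughly like this. Treat the flags as an illustrative sketch rather than gospel, since gkectl syntax has changed across versions:

```shell
# From the admin workstation (illustrative flags; check your version's docs)

# 1. Validate the user cluster configuration file
gkectl check-config --config my-user-cluster.yaml

# 2. Ask the admin cluster to create the new user cluster;
#    it calls the vCenter APIs to provision the node VMs
gkectl create cluster --config my-user-cluster.yaml
```

Once created, the user cluster shows up in the Google Cloud Console alongside your cloud-hosted GKE clusters.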
Hopefully, this post gave you a good starting point for understanding GKE On-Prem and its high-level architecture. In the next blog, we will talk about how networking and security work inside of GKE On-Prem, and how these clusters connect back to Google Cloud.