After covering GKE On-Prem’s architecture and its compute and storage integrations with your datacenter in the last blog post, let’s talk about the networking, security and Cloud Connect features in this one.
Let’s start with networking. This covers all east-west and north-south traffic for the GKE On-Prem user cluster. By east-west I mean the pod-to-pod or service-to-service communication that happens inside the cluster. By north-south I mean traffic that crosses the cluster boundary: pods accessing resources outside the cluster, and external clients accessing the applications running on your cluster. Let’s look at the different subnets and IPs required for the user cluster.
- Node IP Addresses: These are the IP addresses that you assign to the VMs that are part of your user cluster. Depending on how you set your cluster up, these can be assigned either statically or through DHCP.
- Pod CIDR block: These are the IP addresses used by the pods running on the user cluster for pod-to-pod communication. Google’s best-practice recommendation is to have a /24 subnet available for each node (VM) in your cluster.
- Services CIDR block: These are the IP addresses used by the services that you create in your Kubernetes cluster. You can use a non-routable IP range, as long as it doesn’t overlap with any other block defined in your cluster. Also, keep in mind that this range shouldn’t overlap with any public IP address ranges. I spent hours troubleshooting an installation where I used a non-RFC 1918 range for my services CIDR block.
- Services VIP: These are routable IPs that are automatically configured on the F5 load balancer for any services that you expose by using type LoadBalancer. You don’t need to do any manual configuration; you just have to make sure that the IPs are available.
- Control VIP: Routable IP address for the Kubernetes API Server that is configured on the F5 load balancer.
- Ingress VIP: Routable IPs that are configured on the F5 load balancer for L7 ingress to work in conjunction with the Envoy proxies running on each node.
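The rules in the list above (no overlapping blocks, an RFC 1918 services range, a /24 of pod addresses per node) are easy to sanity-check before you start an installation. Here is a minimal sketch using only Python’s standard `ipaddress` module. The function names and the example ranges are my own assumptions for illustration, not something the GKE On-Prem installer provides.

```python
import ipaddress
from itertools import combinations

# The three private ranges defined by RFC 1918.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(cidr):
    """True if the whole block sits inside one of the RFC 1918 ranges."""
    net = ipaddress.ip_network(cidr)
    return any(net.subnet_of(private) for private in RFC1918)


def max_nodes(pod_cidr, per_node_prefix=24):
    """How many nodes fit if each node gets one /24 out of the pod CIDR."""
    net = ipaddress.ip_network(pod_cidr)
    if per_node_prefix < net.prefixlen:
        return 0  # the pod block is smaller than a single per-node subnet
    return 2 ** (per_node_prefix - net.prefixlen)


def check_plan(node_subnet, pod_cidr, service_cidr):
    """Return a list of human-readable problems (empty list means OK)."""
    blocks = {"node subnet": node_subnet,
              "pod CIDR": pod_cidr,
              "services CIDR": service_cidr}
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in blocks.items()}
    problems = []
    # No two blocks may overlap.
    for (name_a, net_a), (name_b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")
    # The services block must not collide with public address space.
    if not is_rfc1918(service_cidr):
        problems.append("services CIDR is not an RFC 1918 range")
    return problems


# Hypothetical example values.
print(max_nodes("192.168.0.0/16"))
print(check_plan("10.0.0.0/24", "192.168.0.0/16", "10.96.232.0/24"))
print(check_plan("10.0.0.0/24", "192.168.0.0/16", "203.0.113.0/24"))
```

With a /16 pod CIDR and a /24 per node you get room for 256 nodes, and the last call flags the kind of non-RFC 1918 services range that cost me those hours of troubleshooting.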
For east-west communication, GKE On-Prem relies on Calico to provide BGP support, so that pods running on different nodes in the cluster can talk to each other. Since you can run these clusters in something called “Island Mode”, the pod and service ranges don’t have to be routable, and you can even reuse the same subnets for different user clusters deployed on-prem. For all northbound traffic, the pods use NAT and leave the cluster with the IP of the node (VM) they run on. For all southbound traffic, you can either use the Ingress VIP with different port numbers, or you can use dedicated service VIPs that load balance the traffic across the pods running on the cluster.
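To make the NAT behaviour concrete, here is a small toy model of which source IP a destination sees for a packet sent by a pod. This is only a sketch of the idea described above, not how Calico or the node actually implements it, and the function name and addresses are hypothetical.

```python
import ipaddress


def egress_source_ip(pod_ip, node_ip, dst_ip, pod_cidr, service_cidr):
    """Return the source IP the destination sees for traffic sent by a pod.

    Simplified model: traffic that stays inside the cluster (pod or service
    ranges) keeps the pod IP; anything else is NATed to the node's IP.
    """
    dst = ipaddress.ip_address(dst_ip)
    stays_in_cluster = (dst in ipaddress.ip_network(pod_cidr)
                        or dst in ipaddress.ip_network(service_cidr))
    return pod_ip if stays_in_cluster else node_ip


# Hypothetical addresses: a pod on node 10.0.0.21.
POD, NODE = "192.168.1.5", "10.0.0.21"
PODS, SERVICES = "192.168.0.0/16", "10.96.0.0/20"

print(egress_source_ip(POD, NODE, "192.168.2.9", PODS, SERVICES))  # east-west
print(egress_source_ip(POD, NODE, "8.8.8.8", PODS, SERVICES))      # northbound
```

This is also why Island Mode works: nothing outside the cluster ever sees a pod IP, so the pod range never has to be routable in your datacenter.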
Next, let’s talk about how you can manage all your on-prem clusters from the Google Cloud Console. For this, Google introduced something called Connect. It connects your on-prem clusters back to Google Cloud, allowing you to manage your clusters, deploy applications and workloads, and apply autoscaling policies to the pods running on-prem. The Connect Agent runs as a Kubernetes Deployment inside your cluster and needs a service account key that has the correct permissions to connect to Google Cloud. Once connected, it shares details about the cluster, its workloads and any metadata with the control plane running in Google Cloud. You don’t need additional VPN or Cloud Interconnect connections for GKE On-Prem, although you might need them if you want to access additional GCP services.
Now, let’s talk about security. I like to split this discussion into two topics: end-to-end OS-to-app security, and secure communication (which also includes authentication and authorization).
For the first topic, Google uses Ubuntu for the virtual machines that form your Kubernetes clusters. As part of the installation process, you download a custom image from Google’s repository, so you don’t have to worry about hardening the base OS image for your VMs. Google also controls the container runtime and the version of Kubernetes that you run inside your cluster. To upgrade your clusters, you just select from a list of tested and verified versions in the Google Cloud Console.
For the second topic, secure communication, all traffic between pods and services, and also between your API server and your etcd store, goes over TLS channels. Google also offers integration with OpenID Connect (OIDC) for authentication, and you can use native Kubernetes Role-Based Access Control (RBAC) policies to define which users and services have access to which resources and operations. I am not a security expert, so knowing all these details is good enough for me. But if you want to learn more, you can go to https://cloud.google.com/gke-on-prem/docs/concepts/security.
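To give a feel for what an RBAC policy looks like, here is a minimal sketch that builds a standard Kubernetes Role and RoleBinding as plain Python dictionaries and prints them as JSON. The namespace `demo`, the role name `pod-reader` and the user `jane@example.com` are made-up values, and the exact user name string you need depends on how your OIDC claims are mapped, so treat those as assumptions.

```python
import json

RBAC_API = "rbac.authorization.k8s.io"


def pod_reader_role(namespace):
    """A Role that allows read-only access to pods in one namespace."""
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "Role",
        "metadata": {"namespace": namespace, "name": "pod-reader"},
        "rules": [{
            "apiGroups": [""],          # "" is the core API group
            "resources": ["pods"],
            "verbs": ["get", "list", "watch"],
        }],
    }


def bind_role_to_user(role, user):
    """A RoleBinding that grants the given Role to a single user."""
    meta = role["metadata"]
    return {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "RoleBinding",
        "metadata": {"namespace": meta["namespace"],
                     "name": f"{meta['name']}-binding"},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_API}],
        "roleRef": {"kind": "Role", "name": meta["name"], "apiGroup": RBAC_API},
    }


# Hypothetical namespace and user (e.g. an identity coming from OIDC).
role = pod_reader_role("demo")
binding = bind_role_to_user(role, "jane@example.com")
manifest = {"apiVersion": "v1", "kind": "List", "items": [role, binding]}
print(json.dumps(manifest, indent=2))
```

You could save the printed JSON to a file and apply it with `kubectl apply -f`. After that, the user can list and watch pods in the `demo` namespace and nothing else.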
That’s it for now. I hope that these couple of posts helped you understand how GKE On-Prem functions when deployed inside your datacenter. In the next blog, we will talk about how you can install an admin and a user cluster inside your own datacenter running vSphere 6.5.