Implementing Tanzu in vSphere 7U1 Part 2: Enabling Workload Management

In this post, we will walk through enabling and configuring vSphere Workload Management for Tanzu Kubernetes Grid (TKG).

In part 1, we covered the prerequisites and deployed the HAProxy appliance needed to implement TKG using a distributed vSwitch.
Now we will go through the Workload Management configuration and implementation.

In vCenter, we go to Menu then Workload Management.

As we do not have this deployed yet, we are presented with some introductory information. Click Get Started to begin the wizard.

As we do not have NSX-T, the only networking stack option is vCenter Server Network. There’s also a disclaimer at the top stating we must use HAProxy since we are not using NSX-T. If you have an Enhanced Linked Mode (ELM) vCenter deployment, select the appropriate vCenter, then Next.

If you have multiple compute clusters, select the appropriate cluster then Next.

Select the control plane size. These are the resources provided to the supervisor cluster. There will always be three VMs for the supervisor cluster, and the sizing is per VM.
I have been unable to find any sizing guidance.
As this is just my homelab, I selected Tiny.

Now select a storage policy that includes the storage you want these VMs deployed on. As I created a Tanzu Storage Policy for this purpose, that is the policy I selected.

Now we need to provide information about the HAProxy appliance we deployed. Provide the name of the appliance, the type (HAProxy), and the Data Plane API address(es), which would be the management IP(s) and port.
Provide the local admin account credentials.
The IP address ranges for virtual servers is the frontend IP range we configured in part 1 (“Load Balancer IP Ranges”), provided in CIDR notation. In my case it was 10.0.12.128/25, which works out to 10.0.12.129-10.0.12.254.
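If you want to double-check which host addresses a CIDR block covers before entering it, a quick one-liner can confirm the range. This is just a convenience sketch and assumes python3 is available on your workstation:

python3 -c "import ipaddress; h = list(ipaddress.ip_network('10.0.12.128/25').hosts()); print(h[0], '-', h[-1])"
10.0.12.129 - 10.0.12.254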
We also need to provide the certificate from the HAProxy. SSH to your HAProxy, then:

cat /etc/haproxy/ca.crt
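If you would rather not copy the output out of an SSH session, the same file can be pulled down with scp instead. The account and management IP below are placeholders for your own environment:

scp root@<haproxy-management-ip>:/etc/haproxy/ca.crt .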

Now provide the info for the management network. Provide the starting IP address to assign to the three control plane VMs, along with the other relevant IP configuration.
Note: I’ve heard that some folks have issues getting a successful deployment if the DNS Search Domain is left blank. I have not attempted a deployment without it, so I have not experienced issues related to that.
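Since DNS problems are a common cause of failed deployments, it may also be worth confirming that the DNS server you enter here resolves your vCenter and HAProxy names before kicking things off. The hostnames and server IP in this quick check are placeholders:

nslookup vcenter.example.local <dns-server-ip>
nslookup haproxy.example.local <dns-server-ip>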

Now to configure the workload network. The IP address range for Services should be left at the default unless it conflicts with your existing environment. It does not need to be a routed network, as it’s an internal network for TKG, but it does require a DNS server.
Click “Add” just under Workload Network.

Provide a name, select your workload portgroup, and enter the gateway, subnet mask, and the address range to be allocated. Save.

We should now see this. Next.

Now we need to add the previously created content library with the cluster VM templates.
Click Add.

Select the subscribed content library, then OK.

Then Next.

Done with the wizard, click Finish.

We should see some activity: a resource pool and folder being created, agent VMs being provisioned, and the hosts downloading VM templates from the content library.

You can review the configuration status in the top pane. If there’s a number in parentheses under Config Status, you can click on it to bring up a bit more detail.

We can see it’s just informational, stating that the Master VM is still being provisioned and configured. The deployment and configuration can take anywhere from 30 minutes to a couple of hours.

Just an interesting side note: this was the umpteenth time I deployed this, as I initially made a few mistakes and ran into some strange behavior.
This time through, despite my domain account being a vCenter administrator, there were a number of tasks that were not displayed.
The tasks below were visible when logged in with the SSO administrator account; they show task names such as Deploy OVF template and Reconfigure virtual machine, along with the VM names. These are not displayed when logged in with my domain account.

This is from my domain account.

At the same time, I took this screenshot from the session with the SSO administrator account. Notice the additional tasks listed that aren’t displayed in the domain account session.

Looking at the config status again, it shows an error configuring the cluster NIC on a Master VM. Whatever caused the error eventually resolved itself, and the deployment did complete. Again, this can take upwards of a couple of hours, so be patient.

It has finished. We can see the control plane node IP address on the frontend network, which should be accessible over HTTPS in a browser. Config Status is Running, so Workload Management should now be enabled.

Validate that you can access the page at that IP. We should get links to download the CLI tools.
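If you prefer to check from the command line, something like the following should confirm the endpoint is responding. I’m using my frontend IP here, and -k is needed because of the self-signed certificate:

curl -k -I https://10.0.12.129

A connection error or timeout at this point would suggest a problem with the frontend network or the HAProxy configuration.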

To help validate this, we can use the CLI tools to connect to our supervisor cluster and get the status of the supervisor cluster nodes.

C:\kubectl>kubectl-vsphere.exe login --vsphere-username [email protected] --server=https://10.0.12.129 --insecure-skip-tls-verify
 Password:
 Logged in successfully.
 You have access to the following contexts:
    10.0.12.129
    10.0.13.129
    tkg-k8s-prod
 If the context you wish to use is not in this list, you may need to try
 logging in again later, or contact your cluster administrator.
 To change context, use kubectl config use-context <workload name>
 C:\kubectl>kubectl get nodes
 NAME                               STATUS   ROLES    AGE   VERSION
 42069924bae46dc843987c4258871a08   Ready    master   33h   v1.18.2-6+38ac483e736488
 4206a9ee43a052d8038c4ef1bc1d61aa   Ready    master   32h   v1.18.2-6+38ac483e736488
 4206b21ac3cd049645f2af3aaee7f4b4   Ready    master   32h   v1.18.2-6+38ac483e736488
 C:\kubectl>kubectl get ns
 NAME                                        STATUS   AGE
 default                                     Active   33h
 kube-node-lease                             Active   33h
 kube-public                                 Active   33h
 kube-system                                 Active   33h
 svc-tmc-c7                                  Active   32h
 tkg-k8s-prod                                Active   25h
 vmware-system-appplatform-operator-system   Active   33h
 vmware-system-capw                          Active   33h
 vmware-system-cert-manager                  Active   33h
 vmware-system-csi                           Active   32h
 vmware-system-kubeimage                     Active   33h
 vmware-system-lbapi                         Active   33h
 vmware-system-license-operator              Active   32h
 vmware-system-logging                       Active   33h
 vmware-system-netop                         Active   33h
 vmware-system-registry                      Active   33h
 vmware-system-tkg                           Active   33h
 vmware-system-ucs                           Active   33h
 vmware-system-vmop                          Active   33h
 C:\kubectl>

We use kubectl get nodes to get the status of the individual supervisor cluster nodes, and kubectl get ns to list all of the namespaces.
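As the login output notes, you can switch between the listed contexts with kubectl config use-context. For example, to target the tkg-k8s-prod namespace shown above and confirm the switch:

kubectl config use-context tkg-k8s-prod
kubectl config current-context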

We need to use --insecure-skip-tls-verify as this is not (yet) using a trusted certificate; it’s still using the default self-signed certificate.
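If you’re curious what certificate the supervisor is actually presenting (and why verification would otherwise fail), you can inspect it from a Linux or macOS shell with openssl. The IP is my frontend address:

openssl s_client -connect 10.0.12.129:443 </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates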

Now that we have Workload Management deployed, the next step will be deploying the TKG cluster and an actual workload. We will get to that in part 3.

Part 1: https://www.frozenak.com/2021/01/21/implementing-tanzu-in-vsphere-7u1-part-1-pre-requisites-and-haproxy/
