Last week I had to replace all the drives in a VMware vSAN cluster in my lab, which is built on Lenovo ThinkAgile VX appliances, to increase the cluster's capacity and to use Intel NVMe drives as the cache tier. However, I had a number of virtual machines that I was actively using for other projects, and I didn't want to lose them by reinstalling everything. This led me to look for a way to replace the drives non-disruptively, one node at a time. I found a few VMware KB articles that helped along the way, so I decided to write down the exact steps, with screenshots, to help others in the community, or me in the future when I have to do this again.
Below are the steps on how you can take down one node at a time from a vSAN cluster, replace the drives and then join it back to the vSAN cluster.
- We will start by evacuating all the VMs from the host. This is a simple migrate operation that moves the VMs to another host in the cluster.
- Once the migration is complete, you can right-click and put the host into maintenance mode.
- In the maintenance mode dialog, select full data migration. This makes sure that all the vSAN objects are moved off the host onto other hosts in the cluster, and that all your virtual machine storage policies remain compliant.
Depending on the amount of data you have on your host, it might take anywhere from a few minutes to a few hours to move those vSAN objects to other hosts.
- Once your host is in maintenance mode, you can go to the cluster –> Configure –> vSAN –> Disk Management and then remove all the disk groups that exist on the host.
- After you have removed the disk groups, you can move the host out of the vSAN cluster.
- Once the host is out of the cluster, then you can gracefully shut down the host, perform the disk swap and then power on the host.
- Once the host is back online, move it back into the vSAN cluster. Keep in mind that it is still in maintenance mode.
- Once the host is back in the cluster, you can navigate to the host –> Configure –> Storage Devices and check that all the new drives show up. If you are reusing older drives, you should remove any existing partitions from them before you create vSAN disk groups.
- Once all the drives are ready, you can navigate to the cluster –> Configure –> vSAN –> Disk Management –> Claim unused drives for vSAN.
Select the appropriate capacity tier and cache tier drives. Once you click OK, it will create disk groups for your host.
- After that, you can remove your host from maintenance mode.
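Before starting the steps above on a host, it can help to sanity-check two things: whether the remaining hosts have room to absorb the data that full data migration will move, and roughly how long that move might take. The following is only a back-of-the-envelope sketch. The `Host` class, the `headroom` fraction, the host names, the capacity figures and the throughput value are all hypothetical assumptions for illustration. It does not query vSAN, and it ignores storage policy and fault domain constraints.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Host:
    name: str
    used_tb: float  # vSAN data currently stored on this host (assumed figure)
    free_tb: float  # unused vSAN capacity on this host (assumed figure)


def can_evacuate(target: Host, others: list[Host], headroom: float = 0.25) -> bool:
    """Return True if the other hosts can absorb the target's data while
    leaving `headroom` (a fraction of their combined free space) unused."""
    usable_free = sum(h.free_tb for h in others) * (1.0 - headroom)
    return target.used_tb <= usable_free


def estimate_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Rough evacuation time for `data_tb` at a sustained copy rate.

    The throughput is a guess you supply; real rates depend on the network,
    the drives and the other load on the cluster.
    """
    seconds = (data_tb * 1_000_000) / throughput_mb_s  # 1 TB = 1,000,000 MB
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical four-node lab cluster.
    hosts = [
        Host("esx01", used_tb=2.0, free_tb=1.5),
        Host("esx02", used_tb=1.8, free_tb=1.7),
        Host("esx03", used_tb=2.1, free_tb=1.4),
        Host("esx04", used_tb=1.9, free_tb=1.6),
    ]
    target, others = hosts[0], hosts[1:]
    print("enough room to evacuate:", can_evacuate(target, others))
    print("estimated hours:", round(estimate_hours(target.used_tb, 400), 2))
```

If the check comes back negative, full data migration is unlikely to complete, so it is worth freeing up capacity before putting the host into maintenance mode.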
You have now replaced drives on a vSAN node non-disruptively. You can repeat the same process on all the other nodes in the cluster if you have to.
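To keep track of where you are when repeating this across several nodes, the procedure can be written down as an ordered checklist that is walked one host at a time. This is a minimal sketch of that idea only: the step names paraphrase the list above, the host names are made up, and the code does not talk to vCenter or ESXi.

```python
from __future__ import annotations

from typing import Iterator

# The per-host steps, in the order described in this post.
STEPS = (
    "migrate VMs to other hosts",
    "enter maintenance mode (full data migration)",
    "remove disk groups",
    "move host out of the vSAN cluster",
    "shut down, swap drives, power on",
    "move host back into the vSAN cluster",
    "verify new drives and clear old partitions",
    "claim unused drives for vSAN",
    "exit maintenance mode",
)


def rolling_plan(hosts: list[str]) -> Iterator[tuple[str, int, str]]:
    """Yield (host, step number, step) so that every step for one host
    is finished before the next host is touched."""
    for host in hosts:
        for number, step in enumerate(STEPS, start=1):
            yield host, number, step


if __name__ == "__main__":
    # Hypothetical host names.
    for host, number, step in rolling_plan(["esx01", "esx02"]):
        print(f"{host}: step {number}/{len(STEPS)}: {step}")
```

The point of the nested loop is that only one host is ever out of service. Before starting the next host, it is worth confirming in the vSphere Client that the cluster is healthy and no objects are still resyncing.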
Hopefully, this blog helps you perform non-disruptive upgrades and FRU replacements on your VMware vSAN cluster.