VMware vSAN Best Practices

This blog post summarizes the best practices and design considerations to keep in mind when deploying a vSAN cluster in your own datacenter. If you want to read the detailed guide published by VMware, you can follow this link, but if you are looking for a list of the top 25 best practices that you can print and keep at your desk, then read on.

  1. Always make sure that the hardware you are using is listed in the VMware Compatibility Guide. You do not want to be in a scenario where you don’t get support because you aren’t running on the right hardware.
  2. In addition to hardware, also make sure that you run the supported software, driver and firmware versions in your cluster. You can again refer to the VMware Compatibility Guide to find out what those versions are.
  3. Keep your environment updated. This is not only a vSAN best practice but a general rule of thumb. You should always keep your environment patched.
  4. Similarly configured and sized ESXi hosts should be used to form a vSAN cluster. This should be done to ensure that there is an even balance of virtual machine storage components across all the hosts and their disks.
  5. Design your cluster for growth. This is very important. Although scaling up and scaling out a vSAN cluster isn’t difficult, you should always plan for growth. For example, if you know that you will be adding 3 additional capacity drives to your hosts in the next year, then make sure you select a cache drive that will be able to serve your eventual capacity tier.
  6. Having a minimum of 4 nodes in your vSAN cluster gives you higher availability than running a 3-node cluster (which is still a supported configuration).
  7. When you size your vSAN cluster to support the number of VMs that you want to deploy, also make sure to size the ESXi hosts appropriately. You don’t want to end up in a scenario where you have TBs of space available, but not enough memory to support additional VMs.
  8. Always enable vSphere HA. Keep in mind that to enable vSAN you will have to disable HA first, so remember to turn it back on afterwards.
  9. To avoid inconsistent performance, make sure not to use hybrid and all-flash disk groups as part of the same cluster.
  10. Ensure that there are enough hosts in the cluster to accommodate the desired number of failures to tolerate. Use the formula 2n + 1, where n is the number of failures to tolerate. So if you want to tolerate 2 failures, you should have at least 5 hosts in your cluster. The maximum number of failures that vSAN can tolerate is 3, so you will need a minimum of 7 hosts in that scenario.
  11. For Hybrid configurations, you can use either 1G or 10G NICs on your hosts. If you use 1G NICs, vSAN needs a dedicated NIC.
  12. For All-Flash configurations, vSAN needs at least 10Gb of network connectivity between hosts.
  13. If you use NIC-Teaming, then vSAN will only use it for high availability, and not for the aggregated bandwidth.
  14. vSAN works with or without Jumbo Frames, without a considerable performance impact. So follow your standard practices for Jumbo Frames implementation.
  15. Multiple smaller disk groups are recommended over a single larger disk group, as multiple smaller groups allow for a higher cache to capacity ratio, which improves virtual machine performance.
  16. Consider the cost parameter when buying a PCIe device for the cache tier. Check if you can get the required performance using SSD devices in multiple smaller disk groups.
  17. The Cache tier should be sized to be at least 10% of the capacity consumed by virtual machines.
  18. Allow for 30% slack space when designing capacity. vSAN starts automatic rebalancing when a disk reaches the 80% threshold, and this generates rebuild traffic on the cluster.
  19. If virtual machine snapshots are going to be used heavily in hybrid configurations, then increase the minimum cache to capacity ratio from 10% to 15%.
  20. Multiple storage I/O controllers per host can help eliminate single points of failure and also improve performance.
  21. Choose storage I/O controllers that have as large a queue depth as possible. Having larger queue depths will improve virtual machine I/O performance.
  22. Remember to account for vSAN filesystem overhead when sizing the capacity tier.
  23. All VMs deployed on vSAN will be thin provisioned, so keep an eye on the capacity tier.
  24. vSAN adds around 10% overhead on host CPU, so keep that in mind when sizing your ESXi hosts.
  25. When deploying large vSAN clusters, use fault domains as a way to avoid single rack failures impacting all replicas belonging to a virtual machine.
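
If you would like to sanity-check a design against the numeric rules above (items 10, 17, 18 and 19), here is a minimal Python sketch. It is not a VMware tool: the function names are made up for illustration, and it assumes that "30% slack" means 30% of the raw capacity tier is kept free. Treat the results as rough estimates and confirm them against VMware's official sizing guidance.

```python
def min_hosts_for_ftt(failures_to_tolerate: int) -> int:
    """Minimum host count from the 2n + 1 rule (item 10)."""
    if not 0 <= failures_to_tolerate <= 3:
        # Item 10 states that vSAN tolerates at most 3 failures.
        raise ValueError("failures to tolerate must be between 0 and 3")
    return 2 * failures_to_tolerate + 1


def min_cache_gb(consumed_gb: float, heavy_snapshots_hybrid: bool = False) -> float:
    """Minimum cache tier size: 10% of consumed capacity (item 17),
    or 15% for snapshot-heavy hybrid configurations (item 19)."""
    percent = 15 if heavy_snapshots_hybrid else 10
    return consumed_gb * percent / 100


def raw_capacity_with_slack_gb(usable_gb: float, slack_percent: int = 30) -> float:
    """Raw capacity needed so that slack_percent of it stays free (item 18).
    Assumption: slack is measured as a share of the raw capacity."""
    if not 0 <= slack_percent < 100:
        raise ValueError("slack_percent must be in the range 0-99")
    return usable_gb * 100 / (100 - slack_percent)


if __name__ == "__main__":
    # Example: tolerate 2 failures, with 10 TB consumed by virtual machines.
    print("Minimum hosts:", min_hosts_for_ftt(2))
    print("Minimum cache (GB):", min_cache_gb(10_000))
    print("Raw capacity incl. slack (GB):", raw_capacity_with_slack_gb(10_000))
```

The sketch leaves out vSAN filesystem overhead (item 22) and the roughly 10% host CPU overhead (item 24), so add those on top when you size real hosts.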

 
