VMware ESX Performance

Requirements

Host:

The host must run a fully patched and licensed ESX(i) version 4.1 or later and must have been rebooted after patching.

Guest:

The latest VMware Tools must be installed and the guest rebooted afterwards.

For full support the guest OS must be Windows Vista/Server 2003 or later, or SLES 11 / RHEL 6 / Linux kernel 2.6.30 or later.

Older Linux and Windows operating systems may be supported if the first virtual disk and the boot disk are excluded. These are usually the same disk (mapped to C:, "/" or rootfs) and typically appear as "Disk0" or "Disk1".

Most common causes of snapshot failure

When a snapshot is requested by the hypervisor, or a VSS (open file) backup is requested by an agent, the Microsoft VSS service is notified to "save all in-memory data to disk". The VSS service then contacts all registered writers (Exchange, MSSQL, Registry, Hyper-V, etc.) one at a time. VSS requires this flush to complete within 20 seconds, otherwise it rolls back and cancels the transaction(s).
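If a snapshot fails with VSS errors, it can help to look at the state of the individual VSS writers inside the guest. For example, from an elevated command prompt in the Windows guest:

vssadmin list writers

Each writer should normally report State: [1] Stable and Last error: No error; a writer stuck in another state or reporting an error is usually the component that missed the timeout.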

Not enough space in the datastore to physically store the snapshot(s) (a quick check for this is shown below)

Storage is I/O bound or too many snapshots are in flight
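To rule out the space problem before a backup window, you can check the free space on the datastores directly on the host. On ESXi 5.x and later, for example:

esxcli storage filesystem list

The output lists each datastore with its size and free space; as a rule of thumb there should be enough free space to hold the expected change rate of all VMs whose snapshots will be open at the same time.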

Purpose

Quadruple VM performance and reduce host load to achieve roughly 95% of native performance. This also reduces the time it takes for all VSS-enabled Windows services to flush their data to disk (inside the guest), so we consistently stay inside the 20-second timeout window of the Microsoft VSS service and achieve complete snapshot consistency.

Host BIOS Configuration

If the architecture is pre-Sandy Bridge, disable HT/logical core support in the BIOS so the hypervisor scheduler sees only real physical cores.
We do this on 100% of our own hypervisor and Linux systems and see higher reliability, roughly 30% better I/O performance, and better virtual clock consistency in guests that do not use time synchronization.
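To verify from the hypervisor side whether hyper-threading is still active (on ESXi 5.x and later), you can run:

esxcli hardware cpu global get

If a field similar to "Hyperthreading Active" still reports true after disabling HT in the BIOS, the host has not picked up the BIOS change (usually a missing reboot).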

Configure NUMA support

On modern multi-socket hardware a portion of the RAM (half, on a two-socket system) is wired to each socket, so data does not have to cross the northbridge/interconnect unless a VM/process has to move to the other socket. In practice the BIOS is often configured to interleave memory because OS support may be missing. If NUMA is properly supported by the motherboard, the BIOS, and the hypervisor or OS, significant performance can be gained.

Make sure the memory configuration is NOT in "interleave" mode so NUMA is enabled

You can check with this command on the ESX host: "esxcli hardware memory get | grep NUMA"
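On ESXi 5.x and later the output contains a line similar to the following (the node count depends on the hardware):

esxcli hardware memory get | grep NUMA
   NUMA Node Count: 2

A node count greater than 1 means the host sees the individual NUMA nodes; a count of 1 on a multi-socket system usually means the BIOS is still interleaving memory.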

Properly Commit Resources against _physical_ processors

If the system has 12 physical cores, a RAID array, one 8/10 Gb FC HBA and one 10 Gb NIC, we can only run 8 virtual CPUs at the same time! We budget one core for each 10 Gb-class component, one core for each set of four 1 Gb components, and one core for the hypervisor OS. The formula is: MAX_RUNNABLE_VCPUS <= (number of physical cores) - (number of 8 or 10 Gb components) - (number of 1 Gb components / 4) - (1 for the hypervisor OS)
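As a worked example, under the assumption that the RAID controller counts as a 10 Gb-class component (which is how the 8 vCPU figure above is reached): 12 physical cores - 3 ten-gigabit-class components (RAID controller, FC HBA, 10 Gb NIC) - 0 one-gigabit components - 1 core for the hypervisor OS = 8 runnable vCPUs.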

Guests should almost never have more than two virtual CPUs assigned

Guests should almost always have two virtual CPUs assigned. The exceptions are heavy SQL, web application, terminal server or spam filter workloads, which are entirely CPU bound.

Windows VMs must be created, and Windows installed, with 2 or more vCPUs; a reduction to 1 is possible after the OS installation is complete.

Background Information

By default VMware ESX(i) emulates a real SCSI controller and a real network card, so you can use standard drivers and any "unmodified" OS. This emulation requires roughly 8 copies of each "packet", since data travels physical interface -> host RAM -> host emulated card -> guest emulated card -> guest RAM and back.

VMware has developed special "para-virtualization" drivers that open a shared memory area on the host, which eliminates three quarters of the copies. PV data flows from the physical interface -> host RAM = guest RAM and back. Using PV drivers means we are running a "modified" OS.

There are some security concerns about PV drivers because they allow more direct access to host RAM. All other major hypervisors use PV drivers for disk and network by default once the guest tools are installed. We are not concerned in private cloud deployments, but public/hybrid cloud VM administrators should take notice: in the worst case one guest can inspect RAM/IO/PCI in other guests and/or the host. This is why the US Government uses a private cloud within Amazon.

With a hypervisor backup we are only concerned with PV support for disk/SCSI

With an agent-level backup we are concerned with PV support for both disk/SCSI and network

Configuration

Confirm all requirements above are met

Realize that we only provide support for hypervisor/vm configuration issues on a consulting basis

Realize that we are not responsible for any problems you experience due to these recommendations

Realize that if you enable PV disk support for the first virtual disk on an unsupported OS or with an old version of the guest tools, it will corrupt the filesystem!

Set up PV network support for all supported virtual machines (one at a time).

Linux specific warnings

Set a 1-hour maintenance window per VM in case you have problems booting due to service start timeouts.

Linux uses "udev" to assign discovered interfaces to static interface names (eth[0-99]) during boot so eth0 will always map to the same physical card on later boots.

The interface names are assigned by the Linux kernel at boot in an unpredictable order, so udev renames them before network startup according to the configuration saved in /etc/udev/rules.d/70-persistent-net.rules. The rule file which generates this configuration has a very similar name, usually containing the word "generator" (typically 75-persistent-net-generator.rules), and usually lives in /lib/udev/rules.d.

The Linux network configuration may select the card based on either the MAC address or the interface name.

Usually it is best to back up the persistent net rules file and remove all entries before shutting down, so the new NIC will be named eth0 and stands the best chance of matching the existing network configuration.
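A minimal sketch of that backup-and-clear step, assuming the usual file name 70-persistent-net.rules and run as root inside the guest just before shutting it down:

cp /etc/udev/rules.d/70-persistent-net.rules /root/70-persistent-net.rules.bak
: > /etc/udev/rules.d/70-persistent-net.rules

On the next boot udev regenerates the file and the new VMXNET3 card should come up as eth0.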

Sometimes the system will not boot properly and you have to append " s" or " single" to the kernel line at the boot prompt to enter single-user mode and configure the new card. In this case CTRL+ALT+DELETE will usually restart the system safely.

The network configuration files are in /etc/sysconfig/network (SLES), /etc/sysconfig/network-scripts (RHEL), or /etc/network/interfaces (Debian/Ubuntu).

Often this reconfiguration inserts bad entries into the /etc/hosts file. The file should start with an entry for "127.0.0.1 localhost", immediately followed by an entry for "ip.ip.ip.ip hostname.domain.tld hostname". Here "hostname" should match the single lower-case word returned by the hostname command.
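A minimal /etc/hosts example with placeholder values (the IP address, host name and domain are only illustrations):

127.0.0.1   localhost
192.0.2.10  guest01.example.com guest01

Here "guest01" is the word the hostname command should return.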

(Remove all entries from /etc/udev/rules.d/70-persistent-net.rules)

Stop VM

Edit VM settings in vcenter/vic

Remove the network card(s) after recording the VLAN, MAC and type configuration

Add network card(s) of type "VMXNET3" and apply the VLAN configuration

Start VM

Configure IP/DNS
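Once the VM is back online you can confirm from inside a Linux guest that the new card really uses the paravirtual driver, for example:

ethtool -i eth0

The driver field should show vmxnet3; if it still shows e1000 or pcnet32, the old emulated card is still in place.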

Set up PV disk support for all virtual disks except the first one on all supported virtual machines (one at a time)

Stop VM

Edit VM settings in vcenter/vic

Change the SCSI IDs of all secondary disks so they sit on their own controller, e.g. 0:1, 0:2 -> 1:1, 1:2

Change the type of the new virtual SCSI controller to paravirtual

Start VM
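After the VM is up again you can verify inside a Linux guest that the paravirtual SCSI controller is actually in use, for example:

lsmod | grep vmw_pvscsi

If the module is loaded and the secondary disks are visible (e.g. via lsblk or fdisk -l), the controller change was successful.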

Set up PV disk support for the boot disk of Windows 2003/2008 and SLES 11+, and triple-check the VMware website before attempting this with any other OS.

It is important that Windows has activated the PV disk driver at least once before it can be used for the boot/C: drive. On systems with a single virtual disk it is best to temporarily add a small disk on ID 1:0 and set the second controller to paravirtual to get the driver activated. If you can see the disk in Disk Management, you can shut down, remove the temporary disk/controller, switch the main controller to PV mode and reboot.
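One way to confirm the activation before switching the boot controller is to check, inside the Windows guest, that the PV driver service is present and set to start at boot. For example, from an elevated command prompt (pvscsi is the service name typically used by the VMware PVSCSI driver):

sc query pvscsi
reg query HKLM\SYSTEM\CurrentControlSet\Services\pvscsi /v Start

A Start value of 0x0 (boot start) indicates the driver can own the boot disk.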

Stop VM

Edit VM settings in vcenter/vic

Change the type of the virtual SCSI controller to paravirtual

Start VM
