VMware I/O Analyzer is a tool, available from the VMware Flings website, for launching orchestrated tests against a storage solution. It can be used as a single appliance that runs both the worker process and the analytics. Additional appliances can be deployed to act as Worker VMs. The Analyzer VM launches IOmeter tests on the Worker VMs and collects the data once the tests complete. All configuration is done from a web interface on the Analyzer VM.
This post describes how I deployed VMware I/O Analyzer and how I arrived at a test that maximized IOs. The first tests were conducted by launching IOmeter from within a single virtual machine on the vSAN datastore and showed roughly 300 IOs being generated. In the end, 18 Worker VMs with 8 disks each on a 6-host vSAN cluster generated 340K+ IOPS. The purpose was to create a baseline for the maximum IOPS of a vSAN datastore.
6 hosts
1 disk group
1 x 800 GB SSD drive
5 x 1.2 TB 10K SAS drives
vSphere 5.5 U3
The VM OS disks should not be put on the vSAN datastore you want to test; otherwise the IOPS they generate will be part of your report. To keep the Analyzer VM IOPS out of the performance graphs, put it on a different datastore.
Deploy one Analyzer VM. Deploy one Worker VM per ESXi host. You should end up with as many Worker VMs as you have hosts in your cluster.
I changed the IP of all VMs to static as there was no DHCP server available in the subnet. This means that no DNS entries were required.
Preferably, change the Analyzer VM to a static IP, as you will manage the solution from a web browser. You can leave the Worker VMs on DHCP if a DHCP server is available, but then you will need DNS entries and will have to adapt the configuration used here.
To make things easier, set the Worker VMs to static IPs or create DNS aliases, as you will be doing a lot of work on the Worker VMs. I prefer static IPs because they avoid the added complexity of name resolution.
vi /etc/sysconfig/network/routes (the file will be created if it doesn't exist)
Add / Change the following line:
default <GW> - - (the word default, a space, the gateway IP, a space, a hyphen, a space, a hyphen)
Save and close the file (:wq)
Restart the network service:
service network restart
Check if the VM is reachable.
Now shut down the VM.
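The default-route edit above can also be scripted. The following is a minimal sketch, assuming the SUSE-style routes file named above; the function name set_default_route and the gateway address in the usage comment are hypothetical.

```shell
#!/bin/sh
# Write (or replace) the default route in a SUSE-style routes file.
# Usage: set_default_route <gateway-ip> [routes-file]
# Example (hypothetical gateway): set_default_route 192.168.1.1
set_default_route() {
    gw="$1"
    file="${2:-/etc/sysconfig/network/routes}"
    # Create the file if it does not exist yet.
    touch "$file"
    # Keep every line except an existing default route.
    grep -v '^default[[:space:]]' "$file" > "$file.tmp" || true
    # Append the new default route: default <GW> - -
    echo "default $gw - -" >> "$file.tmp"
    mv "$file.tmp" "$file"
}
```

After calling the function, restart the network service as described above for the change to take effect.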
Deploying the Worker VM:
Clone the Analyzer VM.
Add a Hard Disk of 1GB.
Choose advanced and put the 1GB disk on the VSAN datastore.
I needed to configure static IPs on the Worker VMs, so I had to start each VM and change its IP address. After changing the network settings, shut down the VM and create the next clone. If you do not change the IPs, you will end up with duplicate IP addresses.
Ease of access configuration
Two ease of access configurations were applied. The first makes it easy to copy files from the Analyzer VM to the Worker VMs. The second is automatic logon, because a user needs to be logged on to every appliance for the VMware I/O Analyzer solution to work. All commands are executed on the Analyzer VM and then copied to the Worker VMs.
Setup ssh keyless authentication
Generate a key pair
ssh-keygen (with an empty passphrase)
ssh-copy-id will copy your public key to the target machine
The root account password of the destination will need to be supplied for each of the above lines.
BE AWARE: This has a security downside. If the root account is compromised on the Analyzer VM, all Worker VMs should be considered compromised too.
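The key distribution described above can be sketched as a small loop, one ssh-copy-id call per Worker VM. The IP addresses and the function name distribute_key are hypothetical placeholders. By default the function only prints the commands, so the list can be checked before anything is executed.

```shell
#!/bin/sh
# Generate the key pair once on the Analyzer VM, with an empty passphrase:
#   ssh-keygen -t rsa -N ''

# Hypothetical Worker VM addresses; replace them with your own.
WORKERS="192.168.1.11 192.168.1.12 192.168.1.13"

# Print (default) or execute the ssh-copy-id call for every worker.
# Usage: distribute_key [run]
distribute_key() {
    for ip in $WORKERS; do
        if [ "$1" = "run" ]; then
            # Prompts for the root password of each destination.
            ssh-copy-id "root@$ip"
        else
            echo "ssh-copy-id root@$ip"
        fi
    done
}

# Dry run: show what would be executed.
distribute_key
```

Calling distribute_key run performs the actual copy and asks for the root password of each Worker VM in turn.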
Change autologon="" to autologon="root" in the displaymanager (/etc/sysconfig/displaymanager) file with the following command:
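A minimal sketch of such a change is shown here. It assumes the setting is stored under the key DISPLAYMANAGER_AUTOLOGIN, as is usual on SUSE-based systems; check the actual key name in your file and pass it as the third argument if it differs. The function name set_autologin is hypothetical.

```shell
#!/bin/sh
# Set the automatic logon user in a sysconfig-style file.
# Usage: set_autologin <user> [file] [key]
set_autologin() {
    user="$1"
    file="${2:-/etc/sysconfig/displaymanager}"
    # Assumed key name; verify it against your own file.
    key="${3:-DISPLAYMANAGER_AUTOLOGIN}"
    # Replace whatever value is currently between the quotes.
    sed "s/^${key}=\"[^\"]*\"/${key}=\"${user}\"/" "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&
        rm -f "$file.tmp"
}

# Example: set_autologin root
```

Writing through a temporary file and copying the result back keeps the permissions of the original file intact.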
TIP: Create affinity rules in vCenter to keep the Worker VMs on dedicated hosts; otherwise the configuration on the VMware I/O Analyzer dashboard will soon be outdated. The consequence is that certain Worker VMs will not launch their IOmeter profiles and the reports will therefore be incorrect.
Enable the SSH service on the ESXi hosts via the vSphere (Web) Client or through PowerShell.
The PowerShell way (remember to filter your hosts if needed). There is a dedicated post about starting and stopping ESXi services through PowerShell here.
I found that watching the console of the Worker VMs is useful for troubleshooting. You can see the IOmeter tests being launched. This was very useful while creating the IOmeter profile: you don't need to wait until the test is finished to see that it has failed. Stopping IOmeter tests from the console gives you the opportunity to look at, edit and save the launched profile.
In a vSAN project, the VMware Compatibility Guide listed a different driver version for the RAID controller than the one that was installed, so I tried to install a driver update for the RAID controller through the CLI. This did not work as expected because the /altbootbank was in a corrupted state. There were two ways forward: reinstall from scratch, or try to rebuild the /altbootbank from the /bootbank contents. This was not a production server, so I had the freedom to take a more experimental approach, and I therefore chose the unsupported, not recommended route of rebuilding the /altbootbank from the /bootbank contents.
I ran the following command to install the driver:
The VMware KB walks through the steps to solve this, but in this case they did not help. The better solution is to repair or reinstall, but that is a time-consuming task.
The steps in the KB didn’t solve it, so I tried to delete it with:
The ghost file/directory would not delete. The first command returned 'This is not a file', the second 'This is not a directory'. I repeated the same commands after a reboot, with the same results. As the server was still booting fine, I knew the /bootbank was still OK. I wanted to replace the /altbootbank with the contents of the /bootbank partition.
THE FOLLOWING IS NEITHER RECOMMENDED NOR SUPPORTED! DO NOT EXECUTE IN A PRODUCTION ENVIRONMENT!
Identify the naa ID and partition number of the /altbootbank:
Wipe the partition by recreating the file system:
Remove the /altbootbank folder:
Create a symlink to the newly created vFat volume with /altbootbank:
Copy all the contents from /bootbank to /altbootbank:
Set bootstate=3 in /altbootbank/boot.cfg
Run the /sbin/auto-backup.sh script to save the changes
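The boot.cfg edit in the steps above can be done with sed instead of an editor. This is a minimal sketch under the same unsupported, non-production caveat; the function name set_bootstate is hypothetical, and it assumes boot.cfg holds a single line starting with bootstate=.

```shell
#!/bin/sh
# Set the bootstate value in a boot.cfg-style file.
# Usage: set_bootstate <value> <path-to-boot.cfg>
set_bootstate() {
    value="$1"
    cfg="$2"
    # Rewrite only the bootstate line; all other lines stay untouched.
    sed "s/^bootstate=.*/bootstate=${value}/" "$cfg" > "$cfg.tmp" &&
        cat "$cfg.tmp" > "$cfg" &&
        rm -f "$cfg.tmp"
}

# Example for the step above: set_bootstate 3 /altbootbank/boot.cfg
```

It is worth checking the result with grep bootstate /altbootbank/boot.cfg before continuing.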
Reconfigure diagnostic partition with PowerCLI using Get-EsxCli
The following Get-EsxCli command will unconfigure your diagnostic partition and reconfigure it with smart selection. This was needed because the install partition UUID had changed due to an option in the NetApp system during system testing.
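For reference, the same unconfigure and smart reconfigure sequence can be run directly in the ESXi shell with esxcli rather than through PowerCLI's Get-EsxCli. This sketch is a swapped-in equivalent, not the PowerCLI command itself. The wrapper functions run and reconfigure_coredump_partition are hypothetical, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 on an ESXi host to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reconfigure_coredump_partition() {
    # Drop the current diagnostic partition configuration.
    run esxcli system coredump partition set --unconfigure
    # Let ESXi pick a suitable partition and enable it.
    run esxcli system coredump partition set --enable true --smart
    # Show the resulting configuration.
    run esxcli system coredump partition list
}

reconfigure_coredump_partition
```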