Use iPerf to test NIC speed between two ESXi hosts

Sometimes you want or need to use iPerf to test the NIC speed between two ESXi hosts. I did, because I was seeing a NIC with low throughput in my lab.

How can we test raw speeds between the two hosts? iPerf comes to the rescue. I was looking for a way to do this on an ESXi host, and it doesn't come as a surprise that I found the solution at William Lam's virtuallyghetto.com. Apparently iperf has been included in ESXi since 6.5 U2. You used to have to copy the iperf binary to iperf.copy before you could run it; in ESXi 7.0 that has been done for you, although you will need to look for /usr/lib/vmware/vsan/bin/iperf3.copy

ESXi host 1 (iperf server)

Disable the firewall:
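This is the same esxcli command used at the end of this post to re-enable the firewall, with the flag flipped:

esxcli network firewall set --enabled false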

Change to the directory containing the iperf binary
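As mentioned above, in ESXi 7.0 the binary lives in the vSAN bin directory:

cd /usr/lib/vmware/vsan/bin/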

Execute iPerf as server
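A minimal sketch using the parameters explained below; 192.168.100.1 stands in for the vmkernel IP of host 1:

./iperf3.copy -s -B 192.168.100.1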

Overview of the used parameters:

  • -s will start iperf as server
  • -B defines the IP the iperf server will listen on

ESXi host 2 (iperf client)

Disable the firewall:
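The same command as on host 1:

esxcli network firewall set --enabled false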

Change to the directory containing the iperf binary
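cd /usr/lib/vmware/vsan/bin/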

Execute iPerf as client
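A sketch using the parameters explained below; the interval and duration are example values, and 192.168.100.1 is the example server IP from above:

./iperf3.copy -i 1 -t 10 -c 192.168.100.1 -fm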

Overview of the used parameters:

  • -i determines the interval at which iperf reports back
  • -t sets the time iperf will be running
  • -c runs iperf as a client connecting to the given server IP; using the right IP forces the usage of the correct vmkernel interface
  • -f sets the reporting format, which defaults to kbit/s; adding m will use mbit/s

Don’t forget to re-enable the firewall on both systems.

esxcli network firewall set --enabled true

vShield Endpoint SVM status vCenter alarm

vCenter is showing an alarm on the TrendMicro Deep Security Virtual Appliance (DSVA): ‘vShield Endpoint SVM status’

vShield Endpoint SVM status alarm

Checking vShield for errors:

The DSVA VA console window shows: (whereas it should show a red/grey screen)

Let's do some log file analysis.

To get a login prompt: Alt + F2

Login with user dsva and password dsva (this is the default)

The log file we are going to check is the messages log file at /var/log/messages
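Open it with less:

less /var/log/messages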

(why less is more: you get almost all the vi commands)

To go to the last line: press G (Shift+g)

For some reason the OVF environment is not what the appliance expects: it is unable to apply some of the OVF settings, in this case the network interfaces.

To exit the log file display mode: press q

To gain root privileges:
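Since the next step asks for the dsva user's password, this is presumably done through sudo; a minimal sketch:

sudo -s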

Enter the dsva user password

Navigate to the /var/opt/ds_agent/slowpath directory
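cd /var/opt/ds_agent/slowpath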

Create the dsva-ovf.env file (if the file exists, delete the existing file first):
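A sketch, assuming an empty file is sufficient:

rm dsva-ovf.env
touch dsva-ovf.env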

Reboot the appliance. Once rebooted, give it 5 minutes and the alarm should clear automatically:
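reboot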

Start or stop ESXi services using PowerCLI

Start the ssh service on all hosts:
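This is essentially the snippet from the link below; TSM-SSH is the SSH service key (see the service list further down):

Get-VMHost | Foreach { Start-VMHostService -HostService ($_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH" }) }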

Thanks to Alan Renouf at virtu-al.net, where I found this snippet: https://www.virtu-al.net/2010/11/23/enabling-esx-ssh-via-powercli/

If you want to start the ssh service on a single host, change ESXiHostName to your ESXi FQDN:
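Get-VMHost -Name ESXiHostName | Foreach { Start-VMHostService -HostService ($_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH" }) }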

If you want to stop the ssh service on all hosts:
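Same pattern with Stop-VMHostService; -Confirm:$false suppresses the per-host confirmation prompt:

Get-VMHost | Foreach { Stop-VMHostService -HostService ($_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH" }) -Confirm:$false }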

If you have multiple clusters in vCenter, or are connected to multiple vCenters, be sure to run the command only against the necessary hosts (see the example after this list):

  • Get-Cluster -Name ClusterName will filter to the specified Cluster
  • Get-VMHost -Name ESXiHostName will filter to the specified ESXi
  • Get-VMHost -Server vCenterServerName will filter to the specified vCenter server
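For example, combining a filter with the start snippet (ClusterName is a placeholder):

Get-Cluster -Name ClusterName | Get-VMHost | Foreach { Start-VMHostService -HostService ($_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH" }) }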

These are other services I frequently use:

  • DCUI (Direct Console UI)
  • lwsmd (Active Directory Service)
  • ntpd (NTP Daemon)
  • sfcbd-watchdog (CIM Server)
  • snmpd (SNMP Server)
  • TSM (ESXi Shell)
  • TSM-SSH (SSH)
  • vmsyslogd (Syslog Server)
  • vmware-fdm (vSphere High Availability Agent)
  • vpxa (VMware vCenter Agent)
  • xorg (X.Org Server)

There are other services available but I have never used them in this context (yet):

  • lbtd (Load-Based Teaming Daemon)
  • pcscd (PC/SC Smart Card Daemon)
  • vprobed (VProbe Daemon)

Change the startup policy for a service:
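A sketch using Set-VMHostService on the SSH service; the valid -Policy values are listed below:

Get-VMHost -Name ESXiHostName | Get-VMHostService | Where { $_.Key -eq "TSM-SSH" } | Set-VMHostService -Policy "on"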

  • Automatic: Start automatically if any ports are open, and stop when all ports are closed
  • On: Start and stop with host
  • Off: Start and stop manually

Failed to clear bootbank content /altbootbank: [Errno 9] Bad file descriptor: ‘/altbootbank/state.xxxxxxx’

In a vSAN project the VMware Compatibility Guide mentioned a different driver version for the RAID controller than the one that was installed, so I tried to install a driver update for the RAID controller through the CLI. This did not work out as expected because /altbootbank was in a corrupted state. There were two ways to go ahead: either reinstall from scratch or try to rebuild /altbootbank from the /bootbank contents. This was not a production server, so I had the freedom to apply a more experimental approach, and therefore I chose the unsupported, not recommended route of rebuilding /altbootbank from the /bootbank contents.

I ran the following command to install the driver:
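Something along these lines; the offline bundle path is hypothetical, as the original command is not shown here:

esxcli software vib install -d /vmfs/volumes/datastore1/driver-offline-bundle.zip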

I got the following error message:
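Failed to clear bootbank content /altbootbank: [Errno 9] Bad file descriptor: '/altbootbank/state.xxxxxxx'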

I found the following two links describing the issue:

https://communities.vmware.com/thread/413441?start=0&tstart=0

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2033564

The VMware KB goes through the steps to solve this, which in this case didn't work. The better solution is to repair or reinstall, but that is a time-consuming task.

The steps in the KB didn’t solve it, so I tried to delete it with:
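First as a file, then as a directory:

rm /altbootbank/state.xxxxxxx
rmdir /altbootbank/state.xxxxxxx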

The ghost file/directory would not delete. The first command returned ‘This is not a file’, the second ‘This is not a directory’.
I repeated the same commands after a reboot with the same results. As the server was still booting well I knew the /bootbank was still ok. I wanted to replace the /altbootbank with the contents of the /bootbank partition.

THE FOLLOWING IS NOT RECOMMENDED NOR SUPPORTED! DO NOT EXECUTE ON A PRODUCTION ENVIRONMENT!

Identify the naaID and partition number of the /altbootbank:
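One way to get these; vmkfstools -P reports the device backing the volume:

vmkfstools -P /altbootbank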

Scratch the partition by recreating the file system:

Remove the /altbootbank folder:
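Since /altbootbank is a symlink into /vmfs/volumes (recreated in the next step), something like:

rm -rf /altbootbank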

Create a symlink to the newly created vFat volume with /altbootbank:
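The volume UUID is a placeholder:

ln -s /vmfs/volumes/<UUID of the new vFat volume> /altbootbank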

Copy all the contents from /bootbank to /altbootbank:
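cp /bootbank/* /altbootbank/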

Change the bootstate to 3 in /altbootbank/boot.cfg:
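Either edit the file with vi, or, assuming the copied boot.cfg carries bootstate=0, with sed:

sed -i 's/bootstate=0/bootstate=3/' /altbootbank/boot.cfg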

Run the /sbin/auto-backup.sh script to save the changes

Reconfigure diagnostic partition

Reconfigure diagnostic partition with PowerCLI using Get-EsxCli

The following Get-EsxCli command will unconfigure your diagnostic partition and reconfigure it with smart selection. This was needed because the install partition UUID had changed due to an option in the NetApp system while doing system testing.
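A sketch using the V2 interface of Get-EsxCli (ESXiHostName is a placeholder); it maps to 'esxcli system coredump partition set':

$esxcli = Get-EsxCli -VMHost ESXiHostName -V2
$esxcli.system.coredump.partition.set.Invoke(@{unconfigure = $true})
$esxcli.system.coredump.partition.set.Invoke(@{enable = $true; smart = $true})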

Many thanks to http://www.virten.net/2014/02/howto-use-esxcli-in-powercli/