Upgrade Methodology – Upgrade the homelab

This post is not as a end-to-end upgrade guide but a methodology guide. Everything is more or less straight forward if one uses the correct methodology.

As this is a home lab I have chosen not to add complexity by adding additional nodes to the various components for High Availability but depend on vSphere HA. This eases and shortens my upgrade path. In a production environment every application should be evaluated and talked through with the stakeholders whether it can rely on vSphere HA or some form of application HA should be introduced.

The release of vSphere 7 introduced a starting point for an update/upgrade round in the lab. I have not always used the methodology when upgrading components as I should have in the lab. When I used this methodology in the current upgrade round it has come to light that some components were not interoperable with each other. Why is this important? If you have a problem with your environment and you call VMware support, they will go through the logs and verify the environment is on par with the documentation.

So where do I start?

Reading the docs first takes time but it will save a lot of time in the long run. Can’t stress it enough RTFM !!

I always create my own high level upgrade guide. This will include the research that I have done and it also includes the upgrade path to follow.

Phase 1: Information gathering

Determine the components and versions

Be thorough in determining the components and versions (BOM or Bill Of Materials).

  • VMware vSphere 6.7U3
  • vRealize Log Insight 4.5
  • vRealize Operations 7.5
  • VMware Horizon 7.11

You will see below that I had forgotten to include VMware Horizon in my component set. This could have been catastrophic as some components might stop to work when you don’t follow the correct upgrade procedure.

Gather the documentation

After you determine which components are used in the environment you can go and search for the necessary documentation. Find the release notes, upgrade guides and other relevant documentation

I used the following documentation set:

Release notes:

Upgrade documentation:



In the ‘Compatibility Considerations’ in Important information before upgrading to vSphere 7.0 vRealize Operations is mentioned as not compatible with vSphere 7.0. This will be overruled by the VMware Product Interoperability Matrix between the two products (VMware Product Interoperability Matrix overview section),

VMware Hardware Compatibility List

What I learned from these documents was that I was not sure the NVMe drives in my hosts would be compatible. After all they are consumer grade NVMe drives and are not on the VMware HCL.
I recently installed two NVMe drives, a CT500P1SSD8 and a CT1000P1SSD8 and at the time of install the Crucial CT500P1SSD8 was not recognized. A quick googling showed me a blog post from William Lam that replaced the vSphere 6.7 U3 nvme driver with one from a vSphere 6.5U2 install.
I will discuss how I determined if there would be issues around this in ESXi 7.0 in a later phase.

All components should be checked with the vendor and the VMware HCL. Be aware that the vendor and VMware might not always agree and that the VMware HCL might not always be in sync with the the documentation of the hardware vendor. You should always follow the VMware HCL but be aware of the following KB. If vSAN hardware is involved it is advised to use extreme caution as this has a specific section of the VMware HCL for vSAN Ready Nodes and for Build Your Own

VMware Product Interoperability Matrix overview

I did a research on which versions would be compatible with vSphere 6.7 and vSphere 7.0 as I was not yet sure if would be able to upgrade to vSphere 7.0

One product I forgot about was VMware Horizon. I’m currently on 7.10 and the VMware Product Interoperability Matrix show that at least 7.12 is needed to be supported. As I currently use Horizon with a full clone this should not pose much problems and I am planning to upgrade this as well. If I would be using Linked Clones or Instant Clones this could have been worse.

Update sequence for vSphere 7.0 and its compatible VMware products

Now that we checked all the info on whether everything would be compatible and supported after the upgrade, it is time to check the knowledge base article on the update sequence. This article shows what must be upgraded before another component can be upgraded to keep a supported environment.

Phase 2: The upgrade

vRealize Log Insight upgrade

vRealize Log Insight was running version 4.6. Digging back through the release notes showed me that I had to upgrade from 4.6 > 4.7(.1) > 4.8 > 8.1. The VMware Product Interoperability Matrix showed that vRealize Log Insight 8.1 was compatible with either vSphere 6.7U3 and vSphere 7.0

The upgrade process was painless. It just took a lot of time. The process itself is straight forward. Go to Administration > Management > Cluster, upload the pak file and follow the screens. In my case again and again because I did not upgrade Log Insight a long time.

vRealize Operations Manager upgrade

Upgrading the vRealize Operations Manager node is a breeze too, mainly because it is a simple setup with only one master node. vRealize Operations Manager was running version 7.5. I missed the 8.1 release so I upgraded to 8.0 first.

There are a couple of things that need some attention.

  1. Always run the vRealize Operations Manager Pre-Upgrade Readiness Assessment Tool (APUAT pak)
  2. Make sure to upgrade the OS through the OS pak files first, then the vROps pak file
  3. As I upgraded to 8.0 I had to switch files to execute the 8.1 upgrade

The Pre-Upgrade Readiness Assessment Tool showed me warnings for two items:

  1. Validating product version Make sure to run vRealize Operations Manager – 6.6.1, 6.7, 7.0 and 7.5 Virtual Appliance upgrade, as product version is 7.5.0 Ensure product and upgrade versions meet the requirements.
  1. Checking /dev/sda partition size. The size of the partition is less than 20GB. Increase the size of the partition to be greater or equal to 20 GB (https://kb.vmware.com/s/article/75298).

Both are easily addressed. The first one gives a warning to use the correct pak file when upgrading. The second one refers to a KB article that has only a couple of steps:

  • take vRealize Operations offline
  • shut down guest OS
  • increase hard disk in vCenter
  • boot Virtual Machine
  • take vRealize Operations online

After addressing both warnings I was able to upgrade.

vCenter upgrade

Simple and easy when you are already on the VCSA with an embedded Platform Services Controller (PSC).

Run the installer and choose upgrade. Supply the source vCenter information and destination vCenter information and click Next – Next – Finish. Grab a drink and wait. It is a two part process. The first part will deploy the new machine with the chosen information and the second part will migrate the data from the old vCenter to the new vCenter.

ESXi upgrade

Before the actual upgrade could take place I needed to be sure that everything would work after the upgrade. Within the vExpert vCommunity I had seen a nice and easy way to do this. I am sorry that I can’t give credits to the person that I got the idea from.

  • Create a bootable USB ESXi installer or use the iDRAC or equivalent technology to boot your server from the ESXi installer
  • Find an empty USB flash drive
  • Put your server in Maintenance Mode
  • Shutdown and boot your server from the USB installer or the iDRAC or equivalent technology
  • Install to the empty USB drive – BE CAREFUL not to install to the wrong location

Upgrade check workaround

ESXi will create a VMFS volume from the remaining local space where ESXi is installed by default. After installing I tried to add the ESXi host to vCenter but failed because it had detected the local VMFS volume from the original install and that was conflicting with the one that was still present in vCenter but disconnected. I rebooted the ESXi host, booted into the original drive, verified nothing was on the local drive in the original install and deleted the datastore. Rebooted again into the USB drive and now could add the USB installed ESXi 7.0 to vCenter. Now I was able to get a glimpse of how everything was seen from an ESXi 7.0 install. The NVMe drives I was worried about were showing all fine.

Again this is a home lab and not all components are on the VMware HCL so this adds some extra steps like checking from an actual install. This would not be necessary in a production environment where everything has green checks on the VMware HCL.

The actual upgrade

Upgrading ESXi hosts is done easiest through VMware vSphere – Lifecycle Manager (VMware Update Manager has been rebranded)

I imported the Dell customized ISO, created a baseline and did a Host Compliance Check. The Host Compliance Check was Incompatible and led me to the following two knowledge base articles:

I had to remove everything based on the qedi and qedf drivers

VMware Tools / VM Hardware Compatibility

Upgrading the VMware Tools and the VM Hardware Compatibility is the last part in the process. Determine the viability of each VM to upgrade the VMware Tools and the VM Hardware Compatibility. For most VMs this won’t pose a problem. Nevertheless there are some vendor appliances that will need to run a specific version.

vSAN upgrade

Although vSAN is not really a separate component to upgrade, you will need to upgrade the on-disk format. This is an online upgrade that will not impact the running VMs.


Good preparation of an upgrade is key !!

Upgrade Methodology checks:

  1. Determine your BOM (Bill Of Materials)
  2. Check the documentation first
    1. Read the Release Notes
    2. Read the Upgrade guides for each component
  3. Check the HCL
  4. Check the Interoperability Guide
  5. Determine the update sequence
  6. Upgrade according your plan

Use iPerf to test NIC speed between two ESXi hosts

Sometimes you want/need use iPerf to test the nic speed between two ESXi hosts. I did because I was seeing a NIC with low throughput in my lab.

How can we test raw speeds between the two hosts? iPerf comes to the rescue. I was looking on how to do this on an ESXi host. I doesn’t come as a surprise that I found the solution here at William Lams’ virtuallyghetto.com. Apparently iperf has been added to ESXi since 6.5 U2. You used to have to copy iperf to iperf.copy. In ESXi 7.0 that has been done for you, although you will need to look for /usr/lib/vmware/vsan/bin/iperf3.copy

ESXi host 1 (iperf server)

Disable the firewall:

Change to the directory containing the iperf binary

Execute iPerf as server

Overview of the used parameters:

-swill start iperf as server
-Bdefines the IP the iperf server will listen to

Disable the firewall

ESXi host 2 (iperf client)

Change to the directory containing the iperf binary

Execute iPerf as client

Overview of the used parameters:

-iwill determine the interval of reporting back
-ttime iperf will be running
-cclient ip, will force the usage of the correct vmkernel interface
-fmdefaults to kbit/s, adding m will use mbit/s

Don’t forget to re-enable the firewall on both systems.

esxcli network firewall set --enabled true

vShield Endpoint SVM status vCenter alarm

vCenter is showing an alarm on the TrendMicro Deep Security Virtual Appliance (DSVA): ‘vShield Endpoint SVM status’

vShield Endpoint SVM status alarm

Checking vShield for errors:

The DSVA VA console window shows: (as to where it should show a red/grey screen)

Let’s go for some log file analysis

To get a login prompt: Alt + F2

Login with user dsva and password dsva (this is the default)

The log file we are going to check is the messages log file at /var/log/messages

(why less is more: you get almost all the vi commands)

To go to the last line:

For some reason the ovf file is not like it is expected. The appliance is not able to set some ovf settings, in this case the network interfaces.

To exit the log file display mode:

To gain root privileges:

Enter the dsva user password

Navigate to the /var/opt/ds_agent/slowpath directory

Create the dsva-ovf.env file (if the file exists, delete the existing file first):

Reboot the appliance, once rebooted give it 5 minutes and the alarm should clear automatically:

Start or stop ESXi services using PowerCLI

Start the ssh service on all hosts:

Thanks to Alan Renouf at virtu-al.net, where I found this snippet: https://www.virtu-al.net/2010/11/23/enabling-esx-ssh-via-powercli/

If you want to start the ssh service on a single host, change ESXiHostName to your ESXi FQDN:

If you want to stop the ssh service on all hosts:

If you have multiple cluster in vCenter, are connected to multiple vCenters, be sure to launch the command only to the necessary hosts:

  • Get-Cluster -Name ClusterName will filter to the specified Cluster
  • Get-VMHost -Name ESXiHostName will filter to the specified ESXi
  • Get-VMHost -Server vCenterServerName will filter to the specified vCenter server

These are other services I frequently use:

  • DCUI (Direct Console UI)
  • lwsmd (Active Directory Service)
  • ntpd (NTP Daemon)
  • sfcbd-watchdog (CIM Server)
  • snmpd (SNMP Server)
  • TSM (ESXi Shell)
  • vmsyslogd (Syslog Server)
  • vmware-fdm (vSphere High Availability Agent)
  • vpxa (VMware vCenter Agent)
  • xorg (X.Org Server)

There are other services available but I have never used them in this context (yet):

  • lbtd (Load-Based Teaming Daemon)
  • pcscd (PC/SC Smart Card Daemon)
  • vprobed (VProbe Daemon)

Change the startup policy for a service:

  • Automatic: Start automatically if any ports are open, and stop when all ports are closed
  • On: Start and stop with host
  • Off: Start and stop manually

From IOmeter to VMware I/O Analyzer fling

VMware I/O Analyzer is a tool to launch orchestrated tests against a storage solution available from the VMware flings website. It can be used as a single appliance where the worker process and the analytics is done within. Additional appliances can be deployed to act as Worker VMs. The Analyzer VM launches IOmeter tests (on the Worker VMs) and after test completion it collects the data. All configuration is done from a web interface on the Analyzer VM.

This post is describing how I deployed VMware I/O Analyzer and how I got to a test with maximized IOs. The first tests were conducted launching a IOmeter from within a virtual machine on the vSAN datastore and showed more or less 300 IOs being generated. In the end 18 Worker VMs with 8 disks each on a 6 host vSAN cluster were used generating 340K+ IOPS. The purpose was to create a baseline for a VSAN datastore maximum IOPs.

Hardware used

6 hosts
1 disk group
1 800GB SSD drive5 1,2 TB 10K SAS
vSphere 5.5 U3


The VM OS disks should not be put on the vSAN datastore you want to test, if not the generated IOPs will be part of your report. To keep the Analyser VM IOPS out of the performance graphs, put it on a different datastore.

Deploy one Analyser VM. Deploy a Worker VM per ESXi host. You should end up with as much Worker VMs as you have hosts in your cluster.

I changed the IP of all VMs to static as there was no DHCP server available in the subnet. This means that no DNS entries were required.

Preferably you will want to change the Analyser VM to a static IP as you will manage the solution from a web browser. The Worker VMs you can leave as is if there is DHCP server available. You will need dns entries and change the configuration used here.

To work easily set the Worker VMs on static IPs or create dns aliases as you will be doing a lot of work on the Worker VMs. I prefer static IPs because they add no complexity due to name resolving, etc…


Download ova from: https://labs.vmware.com/flings/i-o-analyzer


Deploying the Analyser VM:

Deploy ovf template. Choose your settings in regards to the recommendations above.

Delete the 100MB disk (second disk) from the virtual machine.

Start the Analyser VM via vSphere client and the open console

Login with root – vmware

A terminal window will be opened upon login

To configure static IP:

Change /etc/sysconfig/network/ifcfg-eth0 with your preferred text editor.

Assuming the subnet you’re deploying the vm is

Change the following lines highlighted to your needs:

Leave the other lines as is.

Save and close the file (:wq)

Now we will configure the default gateway

Assuming your default gateway is

Add / Change the following line:

Save and close the file (:wq)

Restart the network service:

Check if the VM is reachable.

Now shutdown the VM.

Deploying the Worker VM:

Clone the Analyser VM.

Add a Hard Disk of 1GB.

Choose advanced and put the 1GB disk on the VSAN datastore.

I needed to configure static IPs on the Worker VMs, so I had to start each VM and change the IP address. After changing the network settings, shut down the VM and create a new clone. Not changing the IPs will give duplicate IPs.

Ease of access configuration

Two ease of access configurations were applied. The first is configured for easy copying from the Analyzer VM to the Worker VMs. The second because all appliances need to be logged onto for the VMware IO Analyzer solution to work. All commands are executed on the Analyzer VM and then copied to the Worker VMs.

Setup ssh keyless authentication

Generate a key pair

ssh-copy-id will copy your public key to the target machine

The root account password of the destination will need to be supplied for each of the above lines.

BE AWARE: This has the following security downside. If the root account is compromised on the Analyzer vm all worker vms should be considered compromised too.


Change autologon=”” to autologon=”root” in the displaymanager (/etc/sysconfig/displaymanager) file with the following command:

This will force the machine to login with root after boot.

Copy the file to all workers:

Affinity rules

TIP: Create affinity rules in vCenter to keep the Worker VMs on dedicated hosts, otherwise the configuration on the VMware I/O Analyzer dashboard will be outdated soon. The consequence is that certain Worker VMs will not be launching their IOmeter profiles and therefor the reports will not be correct.



Enable the SSH service on the ESXi hosts via the vSphere (Web) Client or through Powershell.

The powershell way: (be aware to filter your hosts if needed). There is a dedicated post about starting and stopping ESXi services through powershell here.


Add the hosts to the host list.

Search for the Worker VMs in the list and add preferred IO test.

There are a lot of standard tests included in the appliance. The one that should be generating the most IOPs is 4k, 100% read and 0% random.

Optimized setup

To reach an optimized setup, three Worker VMs per host were deployed and 7 additional disks were added.

Adding the extra disks via PowerCLI:

The following specification was created on the Analyzer VM…

… and copied over to the Worker VMs


I found that looking at the console of the Worker VMs is interesting for troubleshooting. You can see the IOmeter tests being launched. This was very usefull in the process of creating the IOmeter profile. You don’t need to wait untill the test is finished to see it has failed. Stopping IOmeter tests from the console gives the opportunity to look at, edit and save the launched profile.