NSX Host Transport Nodes upgrade fails


While going through the latest lab upgrade round, I ran into an error when upgrading NSX. The NSX Edge Transport Nodes (ETN) upgraded successfully; however, the NSX Host Transport Nodes (HTN) portion failed.


Not that the solution is so special, but it had me running around a bit, so I wanted to share it.

The upgrade returns the following error:

A general system error occurred: Image is not valid. Component NSX LCP Bundle(NSX LCP Bundle(4.1.0.2.0-8.0.21761693)) has unmet dependency nsx-python-greenlet-esxio because providing component(s) NSX LCP Bundle(NSX LCP Bundle(4.1.0.2.0-8.0.21761693)) are obsoleted.

At the same time the same error is listed on vCenter:

NSX LCP upgrade error on vCenter

When analysing the vLCM configuration, nothing indicated that the NSX LCP Bundle was causing an issue. However, a Reddit post led me to the following Broadcom KB: NSX Host Cluster Upgrade on SDDC Manager fails with “Set desired state operation failed. Image is not valid”

While it does not describe this exact problem, it got me thinking that the cause could be similar. The procedure instructs you to download the JSON export of the vLCM configuration and work with that file.

I removed the nsx-lcp-bundle line from the export, saved the file, and imported the JSON into vLCM again. After that I retried the upgrade in NSX, and this time it progressed!
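To locate the entry in the exported file before editing it by hand, something like the following could help. It is only a sketch: the function name and the file name in the usage note are hypothetical, and it only searches for the nsx-lcp-bundle string without assuming anything about the layout of the export.

```shell
# Print every line (with its line number) of a vLCM JSON export that
# mentions the NSX LCP bundle, so it is easy to find in an editor.
# The function name is made up for this example.
find_nsx_lcp_lines() {
    export_file="$1"
    if [ ! -f "$export_file" ]; then
        echo "file not found: $export_file" >&2
        return 2
    fi
    # -n prefixes each match with its line number.
    grep -n 'nsx-lcp-bundle' "$export_file"
}
```

Usage would be along the lines of `find_nsx_lcp_lines ./vlcm-export.json`. Keep in mind that the file has to stay valid JSON after the edit: if the removed line was the last entry in its object, the trailing comma on the line before it has to go as well.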

QUICK UPDATE: NSX ALB documentation


To be honest, I have been complaining for the last year or so about the NSX Advanced Load Balancer documentation. Mostly that it was hard to find, and that you had to fall back on the avinetworks.com site, which was not great either.

On docs.vmware.com the navigation links did not exist. However, if you knew the page titles, you could find the pages through a search engine. That showed that many of those documentation pages were in fact there, just not reachable through any navigation link.

However, for a couple of weeks now, a banner on the avinetworks.com site has stated that 22.1.4 is the last release documented on avinetworks.com.

NSX ALB documentation deprecation on avinetworks.com

This means that the single source of truth will be on the NSX Advanced Load Balancer page on docs.vmware.com (the link does redirect you to that location 😀).

Quick tip: if you want to search within a single site from a browser such as Chrome, add the site: operator to your search query, for example: site:docs.vmware.com NSX Advanced Load Balancer

Quick tip: Setting up TrueSSO?


Are you setting up TrueSSO? Are you looking to use signed certificates to secure the communication between the Connection Server and the Enrollment Server?

Try to find the documentation on using signed certificates to secure that communication. I challenge you, you will not find it easily.

What and why?

You are allowing access to the Unified Access Gateway from the internet. You will want those services to have signed certificates to secure the communication, which will turn that icon in the Horizon client green. To enable end-to-end signed communication, you will need to make sure that you have certs all the way. In the end you are creating tunnels to backend services.

On top of that, you want to add TrueSSO to the equation because you want a seamless sign-on experience. This means more certificates. You follow the guides (and all the blog posts that are built on this information), so you are almost there.

However, one step is exporting the ‘vdm.ec’ certificate from the Connection Server and importing it on the Enrollment Server. That is exactly where the information is missing, or at least hard to find. None of the guides actually talk about CA signed certificates for this. You are making this kind of effort to get all those components (Microsoft) CA signed. Don’t you think that you should use signed certificates here as well? I think so!

Where can I find the documentation

Here is the documentation on the VMware websites on setting up TrueSSO:

… and also some great blog articles out there:

Search no more: you can find it here on the docs.vmware.com site. It is just in another section and a bit hard to find.

esxtop output is not displaying as it should


You connect to your ESXi host and launch esxtop, but the output is not displaying as it should. Instead, it looks like the screenshot below:

esxtop displaying incorrect

Your esxtop output will be displayed correctly if your terminal emulator sets the TERM environment variable to xterm. Some terminal emulators use another value by default, e.g. xterm-256color. ESXi does not map xterm-256color to one of the values it knows, so it does not know how to display the output.

There is a KB article that explains how to resolve this:

Output of esxtop defaults to non-interactive CSV with unknown TermInfo (2001448)

The value of the environment variable TERM is used by the server to control how input is recognized by the system, and what capabilities exist for output.

Let us first check what the TERM variable is set to in my case:
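A quick way to check this from the SSH session on the ESXi host is to print the variable. This is a minimal sketch; the `TERM=` prefix in the output is just for readability.

```shell
# Print the terminal type that the SSH client handed to the ESXi shell.
# An empty value after "TERM=" means the variable is not set at all.
printf 'TERM=%s\n' "${TERM:-}"
```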

I am receiving the following output:

echo TERM output

My terminal emulator tries to connect to the endpoint (ESXi) with xterm-256color. Now let’s take a look at what values this endpoint does support:

terminfo_values

So any of the above values can be assigned to TERM. The value my terminal emulator uses is not among the supported terminfo types, so the ESXi host cannot map it to a known type and does not know how to display the esxtop output correctly.
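The same check can be done in a small script. This is a sketch under assumptions: I assume the terminfo entries live under /usr/share/terminfo in subdirectories named after the first letter of the entry, and the function name is made up for this example.

```shell
# Assumed location of the terminfo database; override it if your host differs.
TERMINFO_DIR="${TERMINFO_DIR:-/usr/share/terminfo}"

# Return 0 when a terminfo entry exists for the given terminal type.
# Entries are assumed to be stored as <dir>/<first letter>/<name>.
term_is_supported() {
    term_name="$1"
    [ -n "$term_name" ] || return 1
    first_char=$(printf '%s' "$term_name" | cut -c1)
    [ -f "$TERMINFO_DIR/$first_char/$term_name" ]
}

if term_is_supported "${TERM:-}"; then
    echo "TERM=${TERM:-} has a terminfo entry"
else
    echo "TERM=${TERM:-} has no terminfo entry under $TERMINFO_DIR"
fi
```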

When we update the TERM environment variable to xterm and try to run esxtop again, the output will show nicely formatted.
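Setting the variable for the current session could look like this. It only affects the shell you are logged in to, so a new SSH session will again start with whatever your terminal emulator sends.

```shell
# Override the terminal type for the current shell session only.
export TERM=xterm
echo "TERM is now $TERM"
```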

Let’s check esxtop again to make sure the outcome is as expected:

esxtop displaying correct

Use iPerf to test NIC speed between two ESXi hosts

Sometimes you want or need to use iPerf to test the NIC speed between two ESXi hosts. I did because I was seeing a NIC with low throughput in my lab.

How can we test raw speeds between the two hosts? iPerf comes to the rescue. I was looking into how to do this on an ESXi host. It doesn’t come as a surprise that I found the solution at William Lam’s virtuallyghetto.com. Apparently iperf has been included in ESXi since 6.5 U2. You used to have to copy iperf to iperf.copy. In ESXi 7.0 that has been done for you, although you will need to look for /usr/lib/vmware/vsan/bin/iperf3.copy

ESXi host 1 (iperf server)

Disable the firewall:

Change to the directory containing the iperf binary

Execute iPerf as server

Overview of the used parameters:

-s will start iperf as server
-B defines the IP the iperf server will listen on
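The server-side steps above can be sketched as one small function. Treat it as a sketch under assumptions: the function name is made up, and the IP address in the usage note is a placeholder for the vmkernel IP of your own host. The directory is the ESXi 7.0 location mentioned earlier.

```shell
# Directory that holds iperf3.copy on ESXi 7.0 (taken from the text above).
IPERF_DIR="/usr/lib/vmware/vsan/bin"

# Disable the firewall and start iperf as server on the given IP.
# Hypothetical helper name; run it in the ESXi shell on host 1.
start_iperf_server() {
    listen_ip="$1"   # vmkernel IP to listen on (placeholder, use your own)
    if [ -z "$listen_ip" ]; then
        echo "usage: start_iperf_server <listen-ip>" >&2
        return 2
    fi

    # Temporarily disable the ESXi firewall so the iperf traffic is not blocked.
    esxcli network firewall set --enabled false || return 1

    # Change to the iperf directory and start the server (-s) bound (-B) to the IP.
    # The subshell keeps the calling shell in its current directory.
    ( cd "$IPERF_DIR" && ./iperf3.copy -s -B "$listen_ip" )
}
```

A call would look like `start_iperf_server 192.168.1.10`. The server keeps running in the foreground until you stop it with Ctrl+C.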

ESXi host 2 (iperf client)

Disable the firewall:

Change to the directory containing the iperf binary

Execute iPerf as client

Overview of the used parameters:

-i sets the reporting interval in seconds
-t sets how long iperf will run, in seconds
-c runs iperf as client and takes the IP of the iperf server to connect to; this forces the usage of the correct vmkernel interface
-fm sets the report format; m reports in Mbit/s
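The client-side steps can be sketched in the same way. Again a sketch under assumptions: the function name is made up, the server IP in the usage note is a placeholder, and the defaults of 10 seconds run time and 1 second reporting interval are my own choice for the example.

```shell
# Directory that holds iperf3.copy on ESXi 7.0 (taken from the text above).
IPERF_DIR="/usr/lib/vmware/vsan/bin"

# Disable the firewall and run iperf as client against the given server IP.
# Hypothetical helper name; run it in the ESXi shell on host 2.
run_iperf_client() {
    server_ip="$1"        # IP the iperf server on host 1 listens on (placeholder)
    duration="${2:-10}"   # -t: run time in seconds (assumed default)
    interval="${3:-1}"    # -i: reporting interval in seconds (assumed default)
    if [ -z "$server_ip" ]; then
        echo "usage: run_iperf_client <server-ip> [seconds] [interval]" >&2
        return 2
    fi

    # Temporarily disable the ESXi firewall so the iperf traffic is not blocked.
    esxcli network firewall set --enabled false || return 1

    # Change to the iperf directory and run the test, reporting in Mbit/s (-fm).
    ( cd "$IPERF_DIR" && ./iperf3.copy -i "$interval" -t "$duration" -c "$server_ip" -fm )
}
```

A call would look like `run_iperf_client 192.168.1.10`, or `run_iperf_client 192.168.1.10 30 5` for a 30 second run with a report every 5 seconds.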

Don’t forget to re-enable the firewall on both systems.

esxcli network firewall set --enabled true