VMware SRM 5.8.1 Embedded Database refuses to uninstall

The VMware SRM 5.8.1 Embedded Database refused to uninstall. Clicking Uninstall in ‘Control Panel – Programs and Features’ showed a progress bar that moved forward and then rolled back. Afterwards, ‘Control Panel – Programs and Features’ still listed the embedded PostgreSQL database as installed.

So I ran the uninstall from the command line to generate a log file:

msiexec /l*a c:\temp\uninst.log /x VMware-SRM-Postgres.msi

The log file ends with error code 1603, but msiexec error 1603 is a very generic failure code that gives no indication of where to look.

Microsoft msiexec error codes:

ERROR_INSTALL_FAILURE (1603): A fatal error occurred during installation.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa376931(v=vs.85).aspx

Further up in the log file, about halfway through, there is a message that the file “C:\ProgramData\VMware\VMware vCenter Site Recovery Manager Embedded Database\data\postgresql.conf” cannot be found.
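Windows Installer logs commonly mark a failing action with the text "Return value 3", so searching for that marker and reading the lines just above it is a quick way to get past the generic 1603. Below is a minimal Python sketch of that idea. It assumes the log has already been read into a string (msiexec logs are often UTF-16 encoded); the function name and the sample log lines are made up for illustration and are not taken from the real uninstall log.

```python
def lines_before_failure(log_text, marker="Return value 3", context=3):
    """Return the lines leading up to (and including) the first failure marker."""
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if marker in line:
            start = max(0, index - context)
            return lines[start:index + 1]
    return []  # marker not found


# Hypothetical log excerpt, not copied from the real uninstall log.
sample_log = "\n".join([
    "Action start 10:00:00: InstallInitialize.",
    "Action start 10:00:01: RemoveDatabase.",
    "Error: postgresql.conf cannot be found",
    "Action ended 10:00:02: RemoveDatabase. Return value 3.",
    "Action ended 10:00:03: INSTALL. Return value 3.",
])

for line in lines_before_failure(sample_log, context=1):
    print(line)
```

Only the first marker is reported, because later "Return value 3" lines usually just repeat the failure for the enclosing actions.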

I created the postgresql.conf file in the “C:\ProgramData\VMware\VMware vCenter Site Recovery Manager Embedded Database\data” folder and ran the msiexec uninstall again. This time the uninstall succeeded.

UCS Manager errors due to Firmware Packages removal

UCS Manager is showing 309 warnings because Firmware Packages have been deleted while references to them still exist in the Host Firmware Packages (HFP).

[Screenshot: UCS warnings]

All of the errors show a cause of ‘image-deleted’. The ‘Affected object’ field shows the path where the error originates. For the first error it shows ‘org-root/fw-host-pack-HFP-2.2.7/pack-image-Cisco Systems|R200-1120402W|blade-controller’. The first part, ‘org-root/fw-host-pack-HFP-2.2.7’, is important because it is the path to the Host Firmware Package. The second part, ‘pack-image-Cisco Systems|R200-1120402W|blade-controller’, is the component image that is missing.
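With a few hundred faults it helps to split the ‘Affected object’ string mechanically instead of reading each one. The sketch below is plain string handling in Python, not a UCS Manager API call. The function name is hypothetical, and reading the three pipe-separated fields as vendor, model and component type is my assumption based on the example above.

```python
def split_affected_object(dn):
    """Split a fault's 'Affected object' into the policy path and the image parts."""
    policy_path, _, image = dn.rpartition("/")
    prefix = "pack-image-"
    if image.startswith(prefix):
        image = image[len(prefix):]
    # Assumption: the image name always has exactly three pipe-separated fields.
    vendor, model, component = image.split("|")
    return {
        "policy_path": policy_path,
        "vendor": vendor,
        "model": model,
        "component": component,
    }


dn = "org-root/fw-host-pack-HFP-2.2.7/pack-image-Cisco Systems|R200-1120402W|blade-controller"
parts = split_affected_object(dn)
print(parts["policy_path"])
print(parts["component"])
```

Grouping the faults by `policy_path` then shows how many Host Firmware Packages actually need attention.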

An HFP resides in the ‘Servers’ tab. The referenced one can be found under ‘Servers – Policies – root – Host Firmware Packages – HFP-2.2.7’.

[Screenshot: UCS warnings details]

In the referenced ‘Host Firmware Package’, some of the components have a presence status of ‘Missing’.

[Screenshot: Host Firmware Package details]

Below is a screenshot of the existing ‘Firmware Packages’. You can see that ‘Firmware Package’ 2.2.6f exists for the ‘B Series’ and for the ‘Infrastructure’ but not for the ‘C Series’.

It is important to notice that ‘Rack Package’ 2.2.7b is not present for the ‘C Series’, as you can see in the next screenshot.

[Screenshot: Firmware Management > Packages overview]

On the general page of the Host Firmware Package, the assigned versions show that ‘Rack Package’ 2.2(7b)C is assigned. The screenshot above showed that this package is no longer present in UCS Manager.

[Screenshot: Server Assigned Host Firmware Package]

The rack package field is empty. It was set to ‘Rack Package’ 2.2(7b)C, but because that Firmware Package was removed from UCS Manager it now shows blank.

[Screenshot: Empty Firmware Package]

Use ‘Show Policy Usage’ to check whether the Host Firmware Package is used anywhere.

[Screenshot: Verify Policy Usage]

The Host Firmware Package is used in the Service Profile Template ‘HP_FW_TEST_Cisco_Support_Case’.

[Screenshot: Policy in use]

Navigate to the Service Profile Template.

[Screenshot: Service Profile Template overview]

Verify the policy usage of the Service Profile Template.

[Screenshot: Verify Policy Usage]

It is not in use, so it is safe to delete the Service Profile Template.

[Screenshot: Policy not in use]

Delete the Service Profile Template.

[Screenshot: Delete Service Profile Template]

Go back to the Host Firmware Package and look at the policy usage again.

[Screenshot: Verify Policy Usage]

It is no longer in use.

[Screenshot: Policy not in use]

Delete the Host Firmware Package.

[Screenshot: Delete Host Firmware Package]

The previous actions brought UCS Manager down to 193 errors. The next ones concern the Host Firmware Package ‘default’. I don’t want to delete this Host Firmware Package, so I will adapt it so that it no longer throws errors.

The following screenshot is not entirely accurate, as the package had already been changed to the correct one (‘2.2(5d)C’ was ‘2.2(6f)C’, as you will see in a later screenshot), but I still wanted to show that I checked the Policy Usage first:

[Screenshot: Show Policy Usage]

The Host Firmware Package ‘default’ is not used in a Service Profile Template, so it is safe to change the assigned ‘Rack Package’.

[Screenshot: Policy not in use]

Modify the package version of the Rack Package that was deleted.

[Screenshot: Modify Package Version]

Set it to one that still exists.

[Screenshot: Select Package Version]

Going back to the errors led me to the next Host Firmware Package, which was at a sub-organisation level. Looking at the components in the Host Firmware Package, I again saw a presence status of ‘Missing’.

[Screenshot: Missing Host Firmware Package components]

At first I was going to modify the package version, but I checked whether it was in use first.

[Screenshot: Verify Policy Usage]

It was in use by a Service Profile Template, so I went to see whether the Service Profile Template was in use.

[Screenshot: Verify Service Profile Template usage]

It was not, so I deleted the Service Profile Template and went back up the chain. The Host Firmware Package was no longer in use, so I deleted the Host Firmware Package as well.

[Screenshot: Service Profile Template not in use]
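The check I kept repeating (is the Host Firmware Package used by a template, and is that template itself used?) can be written down as a small decision rule. The Python sketch below works on a plain dictionary, not on the UCS Manager API. The `used_by` mapping and the function name are hypothetical, and it assumes the only consumers of a Host Firmware Package are Service Profile Templates.

```python
def deletion_plan(hfp, used_by):
    """Return the objects to delete, consumers first, or None if something is still in use.

    used_by maps an object name to the names of the objects that reference it.
    A template that has any consumer of its own counts as "in use".
    """
    templates = used_by.get(hfp, [])
    for template in templates:
        if used_by.get(template):
            return None  # a template still has consumers: stop and handle it by hand
    return list(templates) + [hfp]


# Hypothetical inventory modelled on the first Host Firmware Package in this post.
used_by = {
    "HFP-2.2.7": ["HP_FW_TEST_Cisco_Support_Case"],
    "HP_FW_TEST_Cisco_Support_Case": [],
}
print(deletion_plan("HFP-2.2.7", used_by))
```

Returning `None` instead of a partial plan keeps the rule conservative: anything still referenced is left alone, which is the same reason I modified the ‘default’ package instead of deleting it.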

All references to Firmware Packages were corrected.

As a result, all faults are cleared:

[Screenshot: Clean Fault Summary]

Failed to clear bootbank content /altbootbank: [Errno 9] Bad file descriptor: ‘/altbootbank/state.xxxxxxx’

In a VSAN project, the VMware Compatibility Guide listed a different driver version for the RAID controller than the one that was installed, so I tried to install a driver update for the RAID controller through the CLI. This did not work as expected because /altbootbank was in a corrupted state. There were two ways forward: reinstall from scratch, or try to rebuild /altbootbank from the /bootbank contents. This was not a production server, so I had the freedom to take a more experimental approach, and I therefore chose the unsupported, not recommended route of rebuilding /altbootbank from the /bootbank contents.

I ran the following command to install the driver:

esxcli software vib install -d /vmfs/volumes/datastore/patch.zip

I got the following error message:

[InstallationError]
Failed to clear bootbank content /altbootbank: [Errno 9] Bad file descriptor: '/altbootbank/state.xxxxxxx'
Please refer to the log file for more details.

I found the following two links describing the issue:

The VMware KB walks through steps to solve this, but in this case they did not work. The better solution is to repair or reinstall, but that is a time-consuming task.

The steps in the KB didn’t solve it, so I tried to delete the stale state entry with:

rm /altbootbank/state.5824665/
rm -rf /altbootbank/state.5824665/

The ghost file/directory would not delete. The first command returned ‘This is not a file’, the second ‘This is not a directory’.
I repeated the same commands after a reboot with the same results. As the server was still booting fine, I knew /bootbank was still OK, so I decided to replace /altbootbank with the contents of the /bootbank partition.

Identify the naaID and partition number of the /altbootbank:

vmkfstools -Ph /altbootbank
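The device and partition number are listed under the "Partitions spanned" heading of that output. If you want to pick them out with a script instead of by eye, something like the Python sketch below would do. The sample text is an assumed layout with a made-up device ID, so real output may differ per ESXi build, and the function name is hypothetical.

```python
def parse_spanned_partition(output):
    """Return (device, partition_number) from text shaped like 'vmkfstools -Ph' output."""
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if line.strip().startswith("Partitions spanned"):
            # Assumption: the device entry is the first non-empty line after the heading.
            for candidate in lines[index + 1:]:
                entry = candidate.strip()
                if entry:
                    # Split on the last colon only, since device IDs may contain colons.
                    device, _, partition = entry.rpartition(":")
                    return device, int(partition)
    raise ValueError("no 'Partitions spanned' section found")


# Assumed output layout with a made-up device ID.
sample = """vfat-0.04 file system spanning 1 partitions.
Partitions spanned (on "disks"):
\tnaa.600508b1001c0000:5
"""
device, partition = parse_spanned_partition(sample)
print(f"/dev/disks/{device}:{partition}")
```

The printed path has the same naaID:partitionNumber shape that the next step expects.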

Wipe the partition by recreating the file system:

vmkfstools -C vfat /dev/disks/naaID:partitionNumber

Remove the /altbootbank folder:

rm -rf /altbootbank

Create the /altbootbank symlink pointing to the newly created VFAT volume:

ln -s /vmfs/volumes/volumeGUID /altbootbank

Copy all the contents from /bootbank to /altbootbank:

cp /bootbank/* /altbootbank

Set bootstate=3 in /altbootbank/boot.cfg:

vi /altbootbank/boot.cfg
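boot.cfg is a plain list of key=value lines, so the same change can also be made without opening an editor. The Python sketch below only rewrites a string and does not touch any file on the host. The function name and the shortened sample content are made up for illustration.

```python
def set_bootstate(cfg_text, value=3):
    """Return boot.cfg text with the bootstate line set to the given value."""
    updated = []
    found = False
    for line in cfg_text.splitlines():
        if line.startswith("bootstate="):
            line = f"bootstate={value}"
            found = True
        updated.append(line)
    if not found:
        raise ValueError("no bootstate= line in boot.cfg text")
    return "\n".join(updated) + "\n"


# Shortened, made-up boot.cfg content for illustration.
sample_cfg = "bootstate=0\ntitle=Loading VMware ESXi\ntimeout=5\n"
print(set_bootstate(sample_cfg), end="")
```

Raising an error when no bootstate line exists avoids silently writing back a file that was not what you expected.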

Run the /sbin/autobackup.sh script to apply the changes:

/sbin/autobackup.sh

Useful vi commands

Some useful commands for working with vi:

i               insert text (enter insert mode)
r               replace the character under the cursor
R               enter replace mode (overwrite characters until Esc)
9yl             copy nine characters from the cursor position, e.g. from 192.168.1.0 the part 192.168.1 is copied
p               paste what you have copied
x               delete the character under the cursor
:wq             write and quit (save and close)
/search_string  find search_string
n               go to the next occurrence of search_string
N               go to the previous occurrence of search_string
dd              delete the current line
10dd            delete 10 lines, starting at the current line
yy              copy the current line
10yy            copy 10 lines, starting at the current line

PowerShell tail

Anyone who knows a bit of Linux has come across tail. For those who don’t know it, tail is a tool that monitors text files, e.g. log files, for changes and displays newly added content in the terminal window. This comes in handy when troubleshooting and you want to watch what is being logged. I was wondering how to do this in Windows.

PowerShell has a similar function:

Get-Content -Wait file_name

Add -tail and a number to start from the end of the file. The following shows the last 100 lines and keeps the file open to output anything that is appended:

Get-Content -Wait -tail 100 file_name
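For comparison, the same two behaviours (show the last lines, then keep following the file) take only a few lines of Python. This is a rough sketch of the idea, not how Get-Content works internally. The function names are my own, and the polling interval is an arbitrary choice.

```python
import os
import tempfile
import time
from collections import deque


def last_lines(path, count=100):
    """Return the last `count` lines of a text file, similar to -tail."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return list(deque(handle, maxlen=count))


def follow(path, poll_seconds=1.0):
    """Yield lines appended to the file from now on, similar to -Wait."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        handle.seek(0, os.SEEK_END)  # start at the current end of the file
        while True:
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)  # nothing new yet, wait and retry


# Small demo on a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.log")
with open(demo_path, "w", encoding="utf-8") as demo:
    demo.write("one\ntwo\nthree\n")
print(last_lines(demo_path, 2))
```

To keep watching a file you would loop over `follow("file_name")` and print each line, then stop it with Ctrl+C, just like the PowerShell command.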