UCS Manager errors due to Firmware Packages removal

UCS Manager is showing 309 errors because Firmware Packages have been deleted while the Host Firmware Packages (HFP) still reference them.

ucs_errors

All of the errors show a cause of ‘image-deleted’. The ‘Affected object’ field shows the path where the error originates. For the first error it shows ‘org-root/fw-host-pack-HFP-2.2.7/pack-image-Cisco Systems|R200-1120402W|blade-controller’. The first portion, ‘org-root/fw-host-pack-HFP-2.2.7’, is important because it is the path to the Host Firmware Package. The second part, ‘pack-image-Cisco Systems|R200-1120402W|blade-controller’, is the component image that is missing.
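With 309 faults it can help to split each ‘Affected object’ string programmatically and group the faults by Host Firmware Package. The following is a minimal sketch, not a UCS Manager API: the function name `split_affected_object` is hypothetical, and it assumes that the image part always consists of three pipe-separated fields, as in the example above.

```python
def split_affected_object(dn: str) -> dict:
    """Split a fault's 'Affected object' string into the HFP path
    and the missing component image.

    Assumption: the image part has exactly three '|'-separated fields
    (vendor, model, image type), as in the example from the fault list.
    """
    marker = "/pack-image-"
    hfp_path, sep, image = dn.partition(marker)
    if not sep:
        raise ValueError(f"no pack-image component in {dn!r}")

    fields = image.split("|")
    if len(fields) != 3:
        raise ValueError(f"unexpected image format: {image!r}")
    vendor, model, image_type = fields

    # The last path element is 'fw-host-pack-<HFP name>'.
    last = hfp_path.rsplit("/", 1)[-1]
    prefix = "fw-host-pack-"
    hfp_name = last[len(prefix):] if last.startswith(prefix) else last

    return {
        "hfp_path": hfp_path,
        "hfp_name": hfp_name,
        "vendor": vendor,
        "model": model,
        "image_type": image_type,
    }


example = ("org-root/fw-host-pack-HFP-2.2.7/"
           "pack-image-Cisco Systems|R200-1120402W|blade-controller")
parts = split_affected_object(example)
print(parts["hfp_path"], "->", parts["model"], parts["image_type"])
```

Grouping the results by `hfp_path` gives a short list of Host Firmware Packages to inspect instead of hundreds of individual faults.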

An HFP resides in the ‘Servers’ tab. The referenced one can be found in ‘Servers – Policies – root – Host Firmware Packages – HFP-2.2.7’.

ucs_faults_summary

In the referenced ‘Host Firmware Package’, some of the components have a presence status of ‘Missing’.

hfp_227_detail

Below is a screenshot of the existing ‘Firmware Packages’. You can see that the ‘Firmware Package’ 2.2.6f exists for the ‘B Series’ and for the ‘Infrastructure’ but not for the ‘C Series’.

It is important to notice that ‘Rack Package’ 2.2(7b)C is not present for the ‘C Series’, as you can see in the next screenshot.

fp_overview

On the general page of the Host Firmware Package, the assigned versions show that ‘Rack Package’ 2.2(7b)C is assigned. The screenshot above showed that this package is no longer present in UCS Manager.

hfp_227_selected

The Rack Package field is empty. It was set to ‘Rack Package’ 2.2(7b)C, but because that Firmware Package was removed from UCS Manager it now shows blank.

hfp_227_modify_package_versions

Use ‘Show Policy Usage’ to check whether the Host Firmware Package is used anywhere.

hfp_227_show_policy_usage

The Host Firmware Package is used in Service Profile Template ‘HP_FW_TEST_Cisco_Support_Case’

hfp_227_policy_usage_detail_in_use

Navigate to the Service Profile Template

spt_overview

Verify the Policy Usage

spt_hp_fw_test_cisco_support_case_show_policy_usage

It is not in use, so it is safe to delete the Service Profile Template

spt_hp_fw_test_cisco_support_case_policy_usage_detail

Delete the Service Profile Template

spt_delete

Go back to the Host Firmware Package and check the policy usage again

hfp_227_show_policy_usage

 

It is not in use anymore

hfp_227_policy_usage_detail_empty_detail

Delete the Host Firmware Package

hfp_227_delete

The previous actions brought UCS Manager down to 193 errors. The next ones concern the Host Firmware Package ‘default’. I don’t want to delete ‘default’, so I will modify it so that it no longer raises errors.

The following screenshot is not entirely accurate, because the package had already been changed to the correct one when I took it (the ‘2.2(5d)C’ shown was ‘2.2(6f)C’ before, as you will see in a later screenshot). I still wanted to show that I checked the Policy Usage first:

hfp_default_show_policy_usage

The Host Firmware Package ‘default’ is not used in a Service Profile Template, so it is safe to change the assigned ‘Rack Package’.

hfp_default_policy_usage

Modify the package version of the Rack Package that was deleted

hfp_default_modify_package_versions

Set it to one that still exists

hfp_default_modify_package_versions_detail

Going back to the errors led me to the next Host Firmware Package, which was defined at a sub-organisation level. Looking at the components in this Host Firmware Package, I again saw a presence status of ‘Missing’.

hfp_default_modify_package_versions

At first I was going to modify the package version, but I checked whether the package was in use first.

hfp_dca_hyp_227d_show_policy_usage_in_use

It was in use by a Service Profile Template, so I checked whether that Service Profile Template was in use.

spt_hp_update_ssd_to_fw_dm0t_show_policy_usage

It was not, so I deleted the Service Profile Template and went back up the chain. The Host Firmware Package was then no longer in use, so I deleted it as well.

hfp_dca_hyp_227d_show_policy_usage_empty
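The check repeated above (look at the policy usage, follow it up to the Service Profile Template, and only delete when nothing at the end of the chain is in use) can be written down as a small dependency walk. This is a sketch of the reasoning only, assuming the usage information has been collected by hand from ‘Show Policy Usage’. The function name `deletion_plan` and the service profile name `SP-live` are hypothetical, and nothing here talks to UCS Manager.

```python
from typing import Dict, List, Optional, Set


def deletion_plan(target: str,
                  used_by: Dict[str, List[str]],
                  in_service: Set[str]) -> Optional[List[str]]:
    """Return the order in which objects can be deleted (consumers first),
    or None if the chain ends at an object that is still in service.

    used_by maps an object to the objects that reference it, i.e. what
    'Show Policy Usage' lists for that object.
    """
    plan: List[str] = []
    seen: Set[str] = set()

    def visit(obj: str) -> bool:
        if obj in in_service:
            return False          # something live depends on the target
        if obj in seen:
            return True
        seen.add(obj)
        for consumer in used_by.get(obj, []):
            if not visit(consumer):
                return False
        plan.append(obj)          # consumers were appended before obj
        return True

    return plan if visit(target) else None


# The situation from this post: the HFP is referenced by an unused template.
usage = {"HFP-2.2.7": ["HP_FW_TEST_Cisco_Support_Case"]}
print(deletion_plan("HFP-2.2.7", usage, in_service=set()))

# Hypothetical counter-example: the template is bound to a live service profile.
usage_live = {"HFP-x": ["SPT-x"], "SPT-x": ["SP-live"]}
print(deletion_plan("HFP-x", usage_live, in_service={"SP-live"}))
```

When the function returns `None`, deleting is not an option and changing the package version to one that still exists, as done for ‘default’, is the alternative.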

All references to Firmware Packages were corrected

As a result all faults are cleared:

ucs_errors_empty

Failed to clear bootbank content /altbootbank: [Errno 9] Bad file descriptor: ‘/altbootbank/state.xxxxxxx’

In a VSAN project, the VMware Compatibility Guide listed a different driver version for the RAID controller than the one that was installed, so I tried to install a driver update for the RAID controller through the CLI. This did not work out as expected because the /altbootbank was in a corrupted state. There were two ways forward: reinstall from scratch, or try to rebuild the /altbootbank from the /bootbank contents. This was not a production server, so I had the freedom to take a more experimental approach. I therefore chose the unsupported, not recommended option of rebuilding the /altbootbank from the /bootbank contents.

I ran the following command to install the driver:

I got the following error message:

I found the following two links describing the issue:

https://communities.vmware.com/thread/413441?start=0&tstart=0

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2033564

The VMware KB walks through the steps that should solve this. The better solution is to repair or reinstall, but that is a time-consuming task.

The steps in the KB didn’t solve it, so I tried to delete the ghost file with:

The ghost file/directory would not delete. The first command returned ‘This is not a file’, the second ‘This is not a directory’.
I repeated the same commands after a reboot, with the same results. As the server was still booting fine, I knew the /bootbank was still OK. I wanted to replace the /altbootbank with the contents of the /bootbank partition.

THE FOLLOWING IS NEITHER RECOMMENDED NOR SUPPORTED! DO NOT EXECUTE IN A PRODUCTION ENVIRONMENT!

Identify the NAA ID and partition number of the /altbootbank:

Wipe the partition by recreating the file system:

Remove the /altbootbank folder:

Create a symlink named /altbootbank that points to the newly created VFAT volume:

Copy all the contents from /bootbank to /altbootbank:

Set bootstate=3 in /altbootbank/boot.cfg

Run the /sbin/auto-backup.sh script to save the changes
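The bootstate change in the steps above is a one-line edit in a plain key=value file, so it can be done in vi or scripted. The snippet below is a minimal sketch of the text transformation only. The function name `set_bootstate` and the sample file content are assumptions for illustration, and it deliberately works on a string instead of writing to /altbootbank/boot.cfg.

```python
def set_bootstate(cfg_text: str, value: int) -> str:
    """Return boot.cfg content with the bootstate key set to `value`.

    All other lines are kept as they are. If no bootstate line exists,
    one is appended.
    """
    lines = cfg_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        key = line.split("=", 1)[0].strip()
        if key == "bootstate":
            lines[i] = f"bootstate={value}"
            replaced = True
    if not replaced:
        lines.append(f"bootstate={value}")
    return "\n".join(lines) + "\n"


# Hypothetical, shortened boot.cfg content for illustration.
sample = "bootstate=0\ntitle=Loading VMware ESXi\nupdated=1\n"
print(set_bootstate(sample, 3))
```

To apply it to a real file, read the file into a string, pass it through the function, and write the result back, ideally after keeping a copy of the original boot.cfg.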

Useful vi commands

Some useful commands to work with vi

i               insert
r               replace the current character at the cursor
R               enter replace mode (overwrite characters until Esc)
9yl             copy nine characters from the cursor position, e.g. on 192.168.1.0 this copies 192.168.1
p               paste what you have copied
x               delete the character under the cursor
:wq             write and quit (save and close)
/search_string  find search_string
n               go to next occurrence of search_string
N               go to previous occurrence of search_string
dd              delete line
10dd            delete 10 lines, starting at the current line
yy              copy line
10yy            copy 10 lines, starting at the current line

List registered SPNs in Active Directory: pimped

This script queries all registered SPNs in Active Directory and writes them to a file. It accepts Debug, Log_Dir and Log_FileName as parameters.

Source: http://social.technet.microsoft.com/wiki/contents/articles/18996.list-all-spns-used-in-your-active-directory.aspx