The following PowerShell snippet unconfigures the diagnostic coredump partition using the esxcli version 2 cmdlet. The second part reconfigures the diagnostic partition with the 'smart' option so that an accessible partition is chosen.
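A minimal PowerCLI sketch of that snippet, assuming an existing vCenter connection; the host name is a placeholder:

$esxcli = Get-EsxCli -VMHost "esx01.lab.local" -V2
# Part 1: unconfigure the current diagnostic coredump partition
$arguments = $esxcli.system.coredump.partition.set.CreateArgs()
$arguments.unconfigure = $true
$esxcli.system.coredump.partition.set.Invoke($arguments)
# Part 2: reconfigure with the smart option and enable it
$arguments = $esxcli.system.coredump.partition.set.CreateArgs()
$arguments.smart = $true
$arguments.enable = $true
$esxcli.system.coredump.partition.set.Invoke($arguments)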
There are a couple of steps that need to be taken to configure the Tesla M60 cards with NVIDIA GRID vGPU in a vSphere / Horizon environment. I have listed them here, quick and dirty; they are an extract of the NVIDIA Virtual GPU Software User Guide.
Run the nvidia-smi command to verify correct communication with the device
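nvidia-smi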
Configuring Suspend and Resume for VMware vSphere
esxcli system module parameters set -m nvidia -p "NVreg_RegistryDwords=RMEnableVgpuMigration=1"
Reboot the host
Confirm that suspend and resume is configured
dmesg | grep NVRM
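If the feature is active, the output should include an NVRM line indicating that vGPU migration is enabled (the exact wording varies by driver release).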
Check that the default graphics type is set to Shared Direct
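This can also be verified from the host CLI (assuming an ESXi version that includes the graphics namespace):

esxcli graphics host get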
If the graphics type was not set to Shared Direct, execute the following commands to stop and start the Xorg service and the nv-hostengine daemon (-t terminates the host engine, -d starts it again as a daemon)
/etc/init.d/xorg stop
nv-hostengine -t
nv-hostengine -d
/etc/init.d/xorg start
On the VM / Parent VM:
Configure the VM. Be aware that once the vGPU is configured, the console of the VM will no longer be visible/accessible through the vSphere Client; an alternative access method should already be in place.
Edit the VM configuration to add a Shared PCI Device and verify that NVIDIA GRID vGPU is selected as the device type
vCenter is showing an alarm on the TrendMicro Deep Security Virtual Appliance (DSVA): 'vShield Endpoint SVM status'
Checking vShield for errors: The DSVA VA console window shows: (whereas it should show a red/grey screen)
Let's go for some log file analysis. To get a login prompt:
Alt + F2
Login with user dsva and password dsva (this is the default)
less /var/log/messages (why less is more: you get almost all the vi commands)
G to go to the last line
For some reason the OVF file is not as expected: the appliance is not able to set some OVF settings, in this case the network interfaces.
q (to exit the log file display)
sudo -s (to gain root privileges)
enter the dsva user password
test
(to create the dsva-ovf.env file, if necessary delete the file first)
reboot (to reboot the appliance; once rebooted, give it 5 minutes and the alarm should clear automatically)
In a VSAN project the VMware Compatibility Guide mentioned a different driver version for the RAID controller than the one that was installed, so I tried to install a driver update for the RAID controller through the CLI. This did not work out as expected because the /altbootbank was in a corrupted state. There were two ways to go ahead: either reinstall from scratch or try to rebuild the /altbootbank from the /bootbank contents. This was not a production server, so I had the freedom to apply a more experimental approach, and therefore I chose the unsupported, not recommended route of rebuilding the /altbootbank from the /bootbank contents.
I ran the following command to install the driver:
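The exact bundle is not preserved in this post; the invocation took the usual offline-bundle form (the path below is illustrative):

esxcli software vib update -d /vmfs/volumes/datastore1/driver-offline-bundle.zip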
[InstallationError]
Failed to clear bootbank content /altbootbank: [Errno 9] Bad file descriptor: '/altbootbank/state.xxxxxxx'
Please refer to the log file for more details.
I found the following two links describing the issue.
The VMware KB goes through the steps to solve this, which in this case didn't work. The better solution is to repair or reinstall, but this is a time-consuming task.
The steps in the KB didn't solve it, so I tried to delete the state directory with:
rm /altbootbank/state.5824665/
rm -rf /altbootbank/state.5824665/
The ghost file/directory would not delete. The first command returned ‘This is not a file’, the second ‘This is not a directory’.
I repeated the same commands after a reboot with the same results. As the server was still booting well I knew the /bootbank was still ok. I wanted to replace the /altbootbank with the contents of the /bootbank partition.
THE FOLLOWING IS NOT RECOMMENDED NOR SUPPORTED! DO NOT EXECUTE ON A PRODUCTION ENVIRONMENT !
Identify the naaID and partition number of the /altbootbank:
vmkfstools -Ph /altbootbank
Scratch the partition by recreating the file system:
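A sketch of that command, assuming the naaID and partition number found in the previous step (the ID below is a placeholder):

vmkfstools -C vfat /dev/disks/naa.xxxxxxxxxxxxxxxx:5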