Setting up the lab in Ravello – Part 1 : the jumphost

This entry is part 1 of 1 in the series Ravello Cloud Lab 1.0

In this series we will create a lab with multiple components: a jumphost, VCSA, ESXi, a vSAN-enabled cluster, NSX and maybe more. The aim of the series is to learn about deploying all components onto the Ravello cloud.

Part 1: Creating the Jumphost

Part one of the series is about creating the jumphost. I chose a Linux system because we do not need a license to run it and an image is already available in Ravello.

Creating the Ravello Application

The first step is to create an application. We will create a 0.1 version of the LAB:

Creating the Jumphost VM in the Application

Drag a ‘Xubuntu Desktop 14.04.1 with qemu-kvm pre-installed’ onto the Canvas. Once the VM has been dragged onto the Canvas, there will be an error: ‘Key pair must be supplied’

You can see that the error has its source on the General tab. To correct this a Key Pair must be created.

On the General tab – Cloud Init Configuration – Key Pair

Select the Option: Create a Key Pair

In the following screenshot you can see that I already created a Key Pair

Once created, the private key will be available for download. To use the private key in an SSH session from PuTTY, you will need to convert key.pem to key.ppk: open PuTTYgen, load the key.pem file and save it as key.ppk.

Now that we have created our key pair we can save the VM and the error should disappear.

On the System tab, change the # CPU to 2 and the memory to 3 GB.

On the Disks and NICs tab we leave everything as is.

On the Services tab, Add Supplied Service. We will use this Service to connect to the VM via RDP.

A second service will be added. I changed the name to RDP and chose protocol RDP which sets the Port to 3389.

We are ready to publish the application:

Change the ‘Schedule application to stop in:’ countdown timer to ‘04:00hr’. This will give us the time to update and change the VM to our needs.

Publish will power on the VM. When Powered on we will have access to the Console. Powering on the VM takes a couple of minutes.

Customizing the Jumphost VM

Upgrades

The Console will open in a new tab. The initial password for this VM is ‘ravelloCloud’.

The first thing we will do is upgrade the VM to the latest release available. Open the ‘Byobu Terminal’.

Run the command ‘sudo apt-get update && sudo apt-get upgrade’ and confirm you want to upgrade all proposed packages. I tried do-release-upgrade first, which failed because of an apt dependency.

sudo apt-get update && sudo apt-get upgrade

Now we are ready to upgrade to the latest release. Accept all new configuration file versions from the package maintainer. At the end, remove all obsolete packages and reboot when finished.

Run the command ‘sudo apt-get dist-upgrade’ and confirm you want to upgrade all proposed packages. Now your system will be fully up-to-date.

XRDP 0.9.x

Install xrdp 0.9.x so that we can connect via RDP. This will be a more pleasant way of working.

We will add a PPA (Personal Package Archive), which registers the package source location with APT (in a file under /etc/apt/sources.list.d/). This will enable updates through the apt update process. We will install the latest version of xrdp from this location. At the time of writing, the version in the Ubuntu sources is 0.6.x. The latest stable version has quite a few enhancements, such as shared clipboard support.

sudo add-apt-repository ppa:hermlnx/xrdp
sudo apt-get update
sudo apt-get install xrdp
xrdp -v

The version installed at the time of writing is 0.9.4

Create an .xsession file with the contents xfce4-session. The latest xrdp version should detect the desktop environment by default, but in my case it didn't and would not work without the following .xsession file.

cd $HOME
echo xfce4-session > ~/.xsession

Generate new certificate and key

openssl req -x509 -newkey rsa:2048 -nodes -keyout key.pem -out cert.pem -days 365
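The command above prompts for the certificate subject interactively. As a minimal sketch, the same request can be made non-interactive with -subj and checked afterwards. The function names and the common name jumphost.lab.local are assumptions for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch: create a self-signed certificate and key without prompts.
# make_selfsigned DIR COMMON_NAME  (both arguments are hypothetical examples)
make_selfsigned() {
    out_dir=$1
    common_name=$2
    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "$out_dir/key.pem" -out "$out_dir/cert.pem" \
        -days 365 -subj "/CN=$common_name" 2>/dev/null
}

# Print the subject and expiry date of a certificate as a quick check.
show_cert() {
    openssl x509 -in "$1" -noout -subject -enddate
}
```

For example, `make_selfsigned "$HOME" jumphost.lab.local` followed by `show_cert "$HOME/cert.pem"` should list the chosen common name and an expiry date roughly a year ahead.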

Update XRDP to use the new certificates

cd /etc/xrdp
sudo vi xrdp.ini

Change the following lines to use the certificate and key generated

certificate=/home/ubuntu/cert.pem
key_file=/home/ubuntu/key.pem

Allow any user to start the X server

cd /etc/X11/
sudo vi Xwrapper.config

Change the following line

allowed_users=anybody
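The key=value edits above can also be scripted instead of made in vi. The following is only a sketch under assumptions: the helper name set_ini_value is made up, it only rewrites a key that already exists at the start of a line, and it ignores section headers. Try it on a copy of the file first.

```shell
#!/bin/sh
# Sketch: set KEY=VALUE in a simple key=value file.
# set_ini_value FILE KEY VALUE  (hypothetical helper, not an xrdp tool)
# Returns non-zero if the key is not present, so nothing is appended
# to the wrong section by accident.
set_ini_value() {
    file=$1
    key=$2
    value=$3
    grep -q "^${key}=" "$file" || return 1
    # '|' is used as the sed delimiter because the values contain '/'.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

Usage would look like `sudo sh -c '. ./set_ini_value.sh; set_ini_value /etc/xrdp/xrdp.ini certificate /home/ubuntu/cert.pem'`, where set_ini_value.sh is an assumed file name for the snippet above.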

Reboot the VM. Now you can access the VM through RDP. You will need to confirm the self-signed certificate, as it has not been signed by a trusted root CA.

PowerShell Core

Import the public repository GPG keys

curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -

Register the Microsoft Ubuntu repository

curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list | sudo tee /etc/apt/sources.list.d/microsoft.list

Update the list of products

sudo apt-get update

Install PowerShell

sudo apt-get install -y powershell

Start PowerShell

pwsh

PowerCLI 10

Install the PowerCLI module from the PowerShell Gallery

Install-Module -Name VMware.PowerCLI -scope CurrentUser

Verify PowerCLI version

Get-PowerCLIVersion

OPTIONAL: Opt-out from the Customer Experience Improvement Program (CEIP)

Set-PowerCLIConfiguration -scope user -ParticipateCeip $false

OPTIONAL: Do not display the warning about using self-signed certificates

Set-PowerCLIConfiguration -InvalidCertificateAction Ignore

OPTIONAL: Visual Studio Code

Installing Microsoft Visual Studio Code can be useful for creating scripts that will or could be used within the environment.

curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg
sudo mv microsoft.gpg /etc/apt/trusted.gpg.d/microsoft.gpg
sudo sh -c 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/vscode stable main" > /etc/apt/sources.list.d/vscode.list'
sudo apt-get update
sudo apt-get install code # or code-insiders

The next part will be setting up the ESXi machines and VCSA.

boot failure: systemctl status systemd-fsck-root.service

 

I had downtime in my lab due to a power failure, which resulted in a boot failure of my VCSA 6.5 appliance. The console showed the message “[FAILED] Failed to start File System Check on /dev/dis…uuid/uuid. See ‘systemctl status systemd-fsck-root.service’ for details.” The appliance therefore booted into ‘Emergency Shell’ or ‘Emergency mode’.

I ran the command ‘systemctl status systemd-fsck-root’ manually. This showed me that the ‘/dev/sda3’ partition was having issues.

UPDATE: It also states “RUN fsck MANUALLY”. I did not notice this the first time

systemctl status systemd-fsck-root

I tried to run fsck with no options to see if the command was known to the CLI. I then ran the command with the partition as a parameter ‘fsck /dev/sda3’. I answered ‘y(es)’ to all ‘Fix<y>?’ questions.

fsck /dev/sda3

In the end I received the message ‘FILE SYSTEM WAS MODIFIED’ and tried to reboot. The reboot command gave me an error, so I reset the virtual machine from the ESXi host. Afterwards I was able to log in again.

FILE SYSTEM WAS MODIFIED

vShield Endpoint SVM status vCenter alarm

vCenter is showing an alarm on the TrendMicro Deep Security Virtual Appliance (DSVA): ‘vShield Endpoint SVM status’

Checking vShield for errors: The DSVA console window shows: (whereas it should show a red/grey screen)

Let’s go for some log file analysis:

To get a login prompt: Alt + F2
Log in with user dsva and password dsva (this is the default)
less /var/log/messages (why less is more: you get almost all the vi commands)
G (to go to the last line)

For some reason the OVF file is not what the appliance expects. The appliance is not able to apply some OVF settings, in this case the network interfaces.

q (to exit the log file display)
sudo -s (to gain root privileges)
enter the dsva user password
test (to create the dsva-ovf.env file; if necessary, delete the file first)
reboot (to reboot the appliance; once rebooted, give it 5 minutes and the alarm should clear automatically)

Start or stop ESXi services using PowerCLI

Start the ssh service on all hosts:

Get-VMHost | Foreach {
   Start-VMHostService -HostService ($_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH"} )
}

Thanks to Alan Renouf at virtu-al.net, where I found this snippet: http://www.virtu-al.net/2010/11/23/enabling-esx-ssh-via-powercli/

If you want to start the ssh service on a single host, change ESXiHostName to your ESXi FQDN:

Get-VMHost -Name ESXiHostName | Foreach {
   Start-VMHostService -HostService ( $_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH" } )
}

If you want to stop the ssh service on all hosts:

Get-VMHost | Foreach {
   Stop-VMHostService -HostService ($_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH"} )
}

If you have multiple clusters in vCenter, or are connected to multiple vCenters, be sure to run the command only against the necessary hosts:

  • Get-Cluster -Name ClusterName will filter to the specified Cluster
  • Get-VMHost -Name ESXiHostName will filter to the specified ESXi
  • Get-VMHost -Server vCenterServerName will filter to the specified vCenter server

Get-Cluster -Name ClusterName | Get-VMHost -Name ESXiHostName -Server vCenterServerName | Foreach {
   Stop-VMHostService -HostService ($_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH"} )
}

These are other services I frequently use:

  • DCUI (Direct Console UI)
  • lwsmd (Active Directory Service)
  • ntpd (NTP Daemon)
  • sfcbd-watchdog (CIM Server)
  • snmpd (SNMP Server)
  • TSM (ESXi Shell)
  • TSM-SSH (SSH)
  • vmsyslogd (Syslog Server)
  • vmware-fdm (vSphere High Availability Agent)
  • vpxa (VMware vCenter Agent)
  • xorg (X.Org Server)

There are other services available but I have never used them in this context (yet):

  • lbtd (Load-Based Teaming Daemon)
  • pcscd (PC/SC Smart Card Daemon)
  • vprobed (VProbe Daemon)

Change the startup policy for a service:

  • Automatic: Start automatically if any ports are open, and stop when all ports are closed
  • On: Start and stop with host
  • Off: Start and stop manually

Get-VMHost | Foreach {Set-VMHostService -HostService ($_ | Get-VMHostService | Where {$_.Key -eq "TSM-SSH"}) -Policy On}

 

From IOmeter to VMware I/O Analyzer fling

VMware I/O Analyzer is a tool, available from the VMware flings website, for launching orchestrated tests against a storage solution. It can be used as a single appliance that runs both the worker process and the analytics. Additional appliances can be deployed to act as Worker VMs. The Analyzer VM launches IOmeter tests (on the Worker VMs) and collects the data after the tests complete. All configuration is done from a web interface on the Analyzer VM.

This post describes how I deployed VMware I/O Analyzer and how I arrived at a test with maximized IOPS. The first tests were conducted by launching IOmeter from within a virtual machine on the vSAN datastore and showed more or less 300 IOs being generated. In the end, 18 Worker VMs with 8 disks each on a 6-host vSAN cluster generated 340K+ IOPS. The purpose was to create a baseline for the maximum IOPS of a vSAN datastore.

Hardware used

6 hosts
1 disk group
1 800 GB SSD drive
5 1.2 TB 10K SAS drives
vSphere 5.5 U3

General

The VM OS disks should not be put on the vSAN datastore you want to test; otherwise the IOPS they generate will be part of your report. To keep the Analyzer VM IOPS out of the performance graphs, put it on a different datastore.

Deploy one Analyzer VM. Deploy a Worker VM per ESXi host. You should end up with as many Worker VMs as you have hosts in your cluster.

I changed the IP of all VMs to static as there was no DHCP server available in the subnet. This means that no DNS entries were required.

Preferably, change the Analyzer VM to a static IP, as you will manage the solution from a web browser. You can leave the Worker VMs on DHCP if a DHCP server is available; in that case you will need DNS entries and will have to adapt the configuration used here.

To make work easier, set the Worker VMs to static IPs or create DNS aliases, as you will be doing a lot of work on the Worker VMs. I prefer static IPs because they add no complexity from name resolution.

Prerequisites

Download the OVA from: https://labs.vmware.com/flings/i-o-analyzer

Deploy

Deploying the Analyzer VM:

Deploy the OVF template. Choose your settings according to the recommendations above.

Delete the 100MB disk (second disk) from the virtual machine.

Start the Analyzer VM via the vSphere client and open the console

Log in as root with password vmware

A terminal window will be opened upon login

To configure static IP:

Change /etc/sysconfig/network/ifcfg-eth0 with your preferred text editor.

vi /etc/sysconfig/network/ifcfg-eth0

Assuming the subnet you’re deploying the VM in is 192.168.1.0/24

Change the following lines to your needs; BOOTPROTO, BROADCAST, IPADDR, NETMASK and NETWORK are the values that depend on your subnet:

BOOTPROTO='static'
BROADCAST='192.168.1.255'
ETHTOOL_OPTIONS=''
IPADDR='192.168.1.20'
MTU='1500'
NAME='82545EM Gigabit Ethernet Controller (Copper)'
NETMASK='255.255.255.0'
NETWORK='192.168.1.0'
REMOTE_IPADDR=''
STARTMODE='auto'
USERCONTROL='no'

Leave the other lines as is.

Save and close the file (:wq)
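The NETWORK and BROADCAST values follow from the IP address and the netmask. If your subnet differs from the example, the sketch below derives both with plain shell arithmetic. The function names are made up for illustration and the code only handles dotted IPv4 addresses.

```shell
#!/bin/sh
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Convert a 32-bit integer back to dotted notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# network_of IP NETMASK: address AND mask.
network_of() {
    int_to_ip $(( $(ip_to_int "$1") & $(ip_to_int "$2") ))
}

# broadcast_of IP NETMASK: network address with all host bits set.
broadcast_of() {
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    int_to_ip $(( (ip & mask) | (~mask & 4294967295) ))
}

network_of 192.168.1.20 255.255.255.0     # prints 192.168.1.0
broadcast_of 192.168.1.20 255.255.255.0   # prints 192.168.1.255
```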

Now we will configure the default gateway

Assuming your default gateway is 192.168.1.1

vi /etc/sysconfig/network/routes (The file will be created if it doesn’t exist)

Add / Change the following line:

default 192.168.1.1 - - (default space GW space minus space minus)

Save and close the file (:wq)

Restart the network service:

service network restart

Check if the VM is reachable.

Now shut down the VM.

Deploying the Worker VM:

Clone the Analyzer VM.

Add a Hard Disk of 1GB.

Choose advanced and put the 1GB disk on the vSAN datastore.

I needed to configure static IPs on the Worker VMs, so I had to start each VM and change the IP address. After changing the network settings, shut down the VM and create a new clone. Not changing the IPs will give duplicate IPs.

Ease of access configuration

Two ease of access configurations were applied. The first makes it easy to copy files from the Analyzer VM to the Worker VMs. The second is needed because every appliance must have a user logged on for the VMware I/O Analyzer solution to work. All commands are executed on the Analyzer VM and then copied to the Worker VMs.

Set up passwordless SSH authentication

Generate a key pair

ssh-keygen (with an empty passphrase)

ssh-copy-id will copy your public key to the target machine

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.21
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.22
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.23
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.24
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.25
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.26

The root account password of the destination will need to be supplied for each of the above lines.
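With more Worker VMs the per-host lines become tedious, so a loop over the address range may help. This is a sketch only: worker_ips is a made-up helper, the range 21 to 26 matches the example addresses used here, and the loop prints the commands as a dry run. Remove the echo to actually run them.

```shell
#!/bin/sh
# worker_ips PREFIX FIRST LAST: print one address per line.
# (hypothetical helper; assumes the workers use consecutive addresses)
worker_ips() {
    i=$2
    while [ "$i" -le "$3" ]; do
        echo "$1.$i"
        i=$((i + 1))
    done
}

# Dry run: show the command for every Worker VM.
for ip in $(worker_ips 192.168.1 21 26); do
    echo ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$ip"
done
```

The same loop body can be swapped for any of the scp commands used later in this post.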

BE AWARE: This has a security downside. If the root account on the Analyzer VM is compromised, all Worker VMs should be considered compromised too.

Autologon

Change AUTOLOGIN="" to AUTOLOGIN="root" in the displaymanager (/etc/sysconfig/displaymanager) file with the following command:

sed -i 's/AUTOLOGIN=""/AUTOLOGIN="root"/g' /etc/sysconfig/displaymanager

This will make the machine log in as root automatically after boot.

Copy the file to all workers:

scp /etc/sysconfig/displaymanager root@192.168.1.21:/etc/sysconfig/
scp /etc/sysconfig/displaymanager root@192.168.1.22:/etc/sysconfig/
scp /etc/sysconfig/displaymanager root@192.168.1.23:/etc/sysconfig/
scp /etc/sysconfig/displaymanager root@192.168.1.24:/etc/sysconfig/
scp /etc/sysconfig/displaymanager root@192.168.1.25:/etc/sysconfig/
scp /etc/sysconfig/displaymanager root@192.168.1.26:/etc/sysconfig/

Affinity rules

TIP: Create affinity rules in vCenter to keep the Worker VMs on dedicated hosts; otherwise the configuration on the VMware I/O Analyzer dashboard will soon be outdated. The consequence is that certain Worker VMs will not launch their IOmeter profiles and therefore the reports will not be correct.

Configuration

Prerequisites

Enable the SSH service on the ESXi hosts via the vSphere (Web) Client or through PowerCLI.

The PowerCLI way (filter your hosts if needed):

Get-VMHost | Foreach {
   Start-VMHostService -HostService ($_ | Get-VMHostService | Where { $_.Key -eq "TSM-SSH"} )
}

Dashboard

Add the hosts to the host list.

Search for the Worker VMs in the list and add preferred IO test.

There are a lot of standard tests included in the appliance. The one that should generate the most IOPS is 4k, 100% read and 0% random.

Optimized setup

To reach an optimized setup, three Worker VMs per host were deployed and 7 additional disks were added to each Worker VM.

Adding the extra disks via PowerCLI:

$VMs = Get-VM -Name "*IOW*"

ForEach ($vm in $VMs) {ForEach ($num in 1..7) { New-HardDisk -CapacityGB 1 -datastore vsan* -VM $vm.name}}
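As a quick sanity check of the scale of this setup, the numbers from this post can be put through some integer arithmetic. This is only a rough sketch: it treats "340K+" as exactly 340000 and assumes the load is spread evenly, which the post does not claim. The 16 outstanding IOs per disk come from the IOmeter specification used for the test.

```shell
#!/bin/sh
hosts=6
workers_per_host=3
disks_per_worker=8          # 1 disk from the clone + 7 added disks
outstanding_per_disk=16     # from the IOmeter specification
total_iops=340000           # lower bound, "340K+"

workers=$((hosts * workers_per_host))
disks=$((workers * disks_per_worker))
outstanding_total=$((disks * outstanding_per_disk))
iops_per_host=$((total_iops / hosts))
iops_per_disk=$((total_iops / disks))

echo "Worker VMs:            $workers"
echo "Test disks:            $disks"
echo "Outstanding IOs total: $outstanding_total"
echo "IOPS per host (avg):   $iops_per_host"
echo "IOPS per disk (avg):   $iops_per_disk"
```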

The following specification was created on the Analyzer VM…

'TEST SETUP ====================================================================
'Test Description
	4k_100Read_0Rand_cust
'Run Time
' hours minutes seconds
	0 1 0
'Ramp Up Time (s)
	0
'Default Disk Workers to Spawn
	NUMBER_OF_CPUS
'Default Network Workers to Spawn
	0
'Record Results
	ALL
'Worker Cycling
' start step step type
	1 1 LINEAR
'Disk Cycling
' start step step type
	1 1 LINEAR
'Queue Depth Cycling
' start end step step type
	1 32 2 EXPONENTIAL
'Test Type
	NORMAL
'END test setup
'ACCESS SPECIFICATIONS =========================================================
'Access specification name,default assignment
	4k; 100% Read; 0% Random, NONE
'size,% of size,% reads,% random,delay,burst,align,reply
	4096,100,100,0,0,1,4096,0
'END access specifications
'MANAGER LIST ==================================================================
'Manager ID, manager name
	1,IOA-manager
'Manager network address
	127.0.0.1
'Worker
	IOA-worker
'Worker type
	DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
	16,DISABLED,1
'Disk maximum size,starting sector
	0,0
'End default target settings for worker
'Assigned access specs
	4k; 100% Read; 0% Random
'End assigned access specs
'Target assignments
'Target
	sdb
'Target type
	DISK
'End target
'End target assignments
'End worker
'Worker
	IOA-worker
'Worker type
	DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
	16,DISABLED,1
'Disk maximum size,starting sector
	0,0
'End default target settings for worker
'Assigned access specs
	4k; 100% Read; 0% Random
'End assigned access specs
'Target assignments
'Target
	sdc
'Target type
	DISK
'End target
'End target assignments
'End worker
'Worker
	IOA-worker
'Worker type
	DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
	16,DISABLED,1
'Disk maximum size,starting sector
	0,0
'End default target settings for worker
'Assigned access specs
	4k; 100% Read; 0% Random
'End assigned access specs
'Target assignments
'Target
	sdd
'Target type
	DISK
'End target
'End target assignments
'End worker
'Worker
	IOA-worker
'Worker type
	DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
	16,DISABLED,1
'Disk maximum size,starting sector
	0,0
'End default target settings for worker
'Assigned access specs
	4k; 100% Read; 0% Random
'End assigned access specs
'Target assignments
'Target
	sde
'Target type
	DISK
'End target
'End target assignments
'End worker
'Worker
	IOA-worker
'Worker type
	DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
	16,DISABLED,1
'Disk maximum size,starting sector
	0,0
'End default target settings for worker
'Assigned access specs
	4k; 100% Read; 0% Random
'End assigned access specs
'Target assignments
'Target
	sdf
'Target type
	DISK
'End target
'End target assignments
'End worker
'Worker
	IOA-worker
'Worker type
	DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
	16,DISABLED,1
'Disk maximum size,starting sector
	0,0
'End default target settings for worker
'Assigned access specs
	4k; 100% Read; 0% Random
'End assigned access specs
'Target assignments
'Target
	sdg
'Target type
	DISK
'End target
'End target assignments
'End worker
'Worker
	IOA-worker
'Worker type
	DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
	16,DISABLED,1
'Disk maximum size,starting sector
	0,0
'End default target settings for worker
'Assigned access specs
	4k; 100% Read; 0% Random
'End assigned access specs
'Target assignments
'Target
	sdh
'Target type
	DISK
'End target
'End target assignments
'End worker
'Worker
	IOA-worker
'Worker type
	DISK
'Default target settings for worker
'Number of outstanding IOs,test connection rate,transactions per connection
	16,DISABLED,1
'Disk maximum size,starting sector
	0,0
'End default target settings for worker
'Assigned access specs
	4k; 100% Read; 0% Random
'End assigned access specs
'Target assignments
'Target
	sdi
'Target type
	DISK
'End target
'End target assignments
'End worker
'End manager
'END manager list

… and copied over to the Worker VMs

scp ./VSAN_4k_100read_0rand.icf root@192.168.1.21:/var/www/configs/
scp ./VSAN_4k_100read_0rand.icf root@192.168.1.22:/var/www/configs/
scp ./VSAN_4k_100read_0rand.icf root@192.168.1.23:/var/www/configs/
scp ./VSAN_4k_100read_0rand.icf root@192.168.1.24:/var/www/configs/
scp ./VSAN_4k_100read_0rand.icf root@192.168.1.25:/var/www/configs/
scp ./VSAN_4k_100read_0rand.icf root@192.168.1.26:/var/www/configs/
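The eight Worker sections in the specification differ only in the target disk (sdb to sdi), so they can be generated rather than typed by hand. The sketch below prints just those sections. The helper names are made up, and the test setup, access specification and manager header and footer still have to be added around the output.

```shell
#!/bin/sh
# label prints a comment line (leading quote), value prints a tab-indented value.
label() { printf "'%s\n" "$1"; }
value() { printf '\t%s\n' "$1"; }

# print_worker DISK: one Worker section for the given target disk.
print_worker() {
    label "Worker";      value "IOA-worker"
    label "Worker type"; value "DISK"
    label "Default target settings for worker"
    label "Number of outstanding IOs,test connection rate,transactions per connection"
    value "16,DISABLED,1"
    label "Disk maximum size,starting sector"
    value "0,0"
    label "End default target settings for worker"
    label "Assigned access specs"
    value "4k; 100% Read; 0% Random"
    label "End assigned access specs"
    label "Target assignments"
    label "Target";      value "$1"
    label "Target type"; value "DISK"
    label "End target"
    label "End target assignments"
    label "End worker"
}

# One section per test disk, in the same order as the specification above.
print_all_workers() {
    for disk in sdb sdc sdd sde sdf sdg sdh sdi; do
        print_worker "$disk"
    done
}
```

Running `print_all_workers > workers.txt` (workers.txt being an arbitrary file name) gives a block that can be pasted between the manager header and the 'End manager line.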

Troubleshooting

I found that looking at the console of the Worker VMs is interesting for troubleshooting. You can see the IOmeter tests being launched. This was very useful in the process of creating the IOmeter profile. You don’t need to wait until the test is finished to see that it has failed. Stopping IOmeter tests from the console gives you the opportunity to look at, edit and save the launched profile.