Cloning and running a duplicate vCenter instance on the same network – process and gotchas

Recently I needed to clone a vSphere environment (vCenter 5.0.0) for testing purposes. This environment needed to be cloned to have an exact replica of the vCenter server and SQL database server for various tests/upgrades to be performed on it. As for ESXi hosts, a few were being split off the original environment and added to the duplicate vSphere environment.   All the Windows configuration and SQL server configuration needed to be retained, so my high-level plan was as follows:

  • Deploy a new Windows domain (it had to be the same domain functional level and the DC needed to run the same OS as the original)
  • Hot cloned the existing production vCenter and SQL servers
  • Split off the few ESXi physical hosts that were going to be added to the newly cloned environment later on (removed from prod clusters)
  • The new machines needed to run on the same VLAN and IP ranges as the originals too, which made things even more complex, so I made sure to keep the vNICs in the SQL and vCenter cloned VMs disconnected, and disconnected on start up too.
  • Re-IP the cloned VMs after powering them up on one of the split off ESXi hosts (logged in to vSphere client using root credentials) I also re-named their host names and removed from the old domain in, rejoining to the new domain at the same time.
  • Created new service accounts on the newly deployed domain, and reconfigured vCenter services to use these on the cloned machines
  • On the vCenter server there were some changes needed in vCenter config files. This post details most of what needed to be changed:
  • The main change for me though, was I couldn’t see (or didn’t know) how an existing vCenter SQL database would re-act when starting the cloned vCenter on the same VLAN and IP range! There was a strong possibility that this cloned instance could start interfering with the production vCenter and performing operations cross environment. Therefore I decided to create a new SQL vCenter DB. I logged into the cloned vCenter and deleted the old SQL System DSN pointing to the production SQL DB, and created a new SQL database on the cloned SQL box. I then created a new DSN pointing to this, and made sure I searched around for all configuration files on the vCenter server that pointed to the old DSN/SQL server. (I recall there being some references in registry and possibly the vpxd.conf file).
  • Re-creating the right SQL database structure was a bit of a task though. I needed to re-create the structure of the DB without doing an install of course, as I was using the cloned SQL and vCenter machines – with an existing installation on. I followed this KB article, but found a couple of errors/typos in the SQL queries: https://pubs.vmware.com/vsphere-50/index.jsp#com.vmware.vsphere.install.doc_50/GUID-F953497E-2170-4168-806F-6FF0EA6497A7.html by looking at the errors returned in SQL Management studio, you can start to determine where any issues come up and fix the typos. Unfortunately I did not document them myself as I sorted through!
  • Once I was finished with the renaming and IP changes for the cloned machines, I re-connected their vNICs to the relevant networks – happy that they were sufficiently changed!
  • My last issue I came to was that the schema I deployed was one version higher than the vCenter Server build version I was using. I found this out by looking at the vpxd.log file when vCenter failed to start up after deploying the new database schema and content. I fixed this by fudging the version in the database – essentially hard coding the version it wanted into the newly created schema – my thought was that the schema wouldn’t have changed anything old, but rather added new features, so it should be OK. I was right – vCenter started up just fine after this, but I wasn’t happy, so I stopped the services again, deleted the DB, and started again, this time using the right scripts (the scripts referenced in the article link above are located on the vSphere vCenter ISO / media). I had used an ISO with a build increment one higher than the vCenter build I was working with on the cloned VC!) So make sure you use the correct media for your vCenter install here. (I had a vCenter 5.0.0 install, but had deployed schema for 5.0.1!)

 

Hopefully that gives some ideas as to the tasks required when attempting to clone / duplicate an existing vCenter installation, whilst keeping

Setting up vCenter 5.0.x Inventory Service wrapper.log rotation

I noticed a trend recently on one of our vCenter servers where the disk space on the C:\ drive kept running low. The OS being run is Windows Server 2008 R2. Eventually if the disk fills up, the vCenter service will fail and stop.

 

 

I used the free utility “TreeSize” to inspect the drive and see where all the disk space was being used up – the culprit was the C:\ProgramData\VMware\Infrastructure\Inventory Service\Logs\wrapper.log file – sitting at a massive 16GB in my case.

By default this log file seems to be configured to have no limits in size. I am not sure what caused the log file to grow so rapidly over the course of a week or so, but I’m sure closer inspection of the contents of the log file will reveal more detail.

For now, I will just point anyone having a similiar issue to the steps required to enable log rotation and file size limit on this wrapper.log file, which will prevent your vCenter server from running out of disk space. These steps are taken from a VMware KB article, which I will link to below.

 

High level process is as follows:

 

  • Stop vCenter service
  • rename the wrapper.log file to something else (e.g. wrapper.log.bak) – after everything is up and a new log file is in place, this could be compressed and stored away, or deleted
  • open up the configuration file at C:\Program Files\VMware\Infrastructure\Inventory Service\conf and locate the wrapper.conf
  • edit the following two lines to change to:
    • wrapper.logfile.maxsize=100m (original value is 0)
    • wrapper.logfile.maxfiles=2 (original value is 0)
  • Save the file and close it
  • Start the vCenter service again

The actual VMware KB article details the above process.

 

Adding vCenter Server to Active Directory domain and disconnecting ESXi hosts issue

The other day I came across this issue, it was quite late at night so it took me a little longer than I would have liked to realise what the issue actually was.

I had a vCenter 5.0 server which had not been joined to the local Active Directory domain. My goal was to get this added to the rest of the AD domain. After adding the vCenter server to the domain, rebooting, and checking that all the VMware services had started up correctly afterwards, I connected the vSphere client and saw that all the ESXi hosts were in a disconnected state.

At this point I tried right-clicking a host and manually connecting it – this worked, but only 60 seconds or so, and then it disconnected again. Whilst it was connected it was manageable, and of course all the VMs on each host were still fine. I tried restarting management agents on a host and retrying the procedure, but this didn’t help either. My next step was to reboot an ESXi host that didn’t have anything critical running. Still nothing at this point.

So I decided to consult the VMware vpxd log files on the vCenter server. Consult this VMware KB article to see where to find these logs.

Before opening the latest vpxd.log file, I tried the reconnect on a host again using the vSphere client, and watched for the disconnect. At the exact time I noticed the host appear disconnected again, I noted down the time on the system clock, then opened the vpxd logs to navigate to this time and take a look. Here is what I found:

2013-01-04T00:00:22.121Z [02504 warning 'Default'] [VpxdInvtHostSyncHostLRO] Connection not alive for host host-28
2013-01-04T00:00:22.121Z [02504 warning 'Default'] [VpxdInvtHost::FixNotRespondingHost] Returning false since host is already fixed!
2013-01-04T00:00:22.121Z [02504 warning 'Default'] [VpxdInvtHostSyncHostLRO] Failed to fix not responding host host-28
2013-01-04T00:00:22.121Z [02504 warning 'Default'] [VpxdInvtHostSyncHostLRO] Connection not alive for host host-28
2013-01-04T00:00:22.121Z [02504 error 'Default'] [VpxdInvtHostSyncHostLRO] FixNotRespondingHost failed for host host-28, marking host as notResponding
2013-01-04T00:00:22.126Z [02504 warning 'Default'] [VpxdMoHost] host connection state changed to [NO_RESPONSE] for host-28

This clearly shows the issue and points to it being a connectivity issue of some sort. Looking up these specific errors led me over to this VMware KB article, and it was at this point that it suddenly dawned on me – with the late night I had carelessly overlooked the Windows Firewall. Of course, Windows Firewall has settings for Windows Domains too, and of course this server had just joined the domain, so existing Firewall policies in place for vCenter that were previously on “public” settings, were now not enabled for “Domain”.

Timing the issue also revealed that it was 60 seconds before hosts disconnected again. So the issue here was that port 902 used for the host heartbeat between vCenter and the ESXi hosts was being blocked on the vCenter firewall. Unblocking this by simply enabling the rule for “Domain” fixed the issue and as soon as that was applied, all hosts reconnected by themselves. Of course I also took the time to ensure other vCenter firewall exceptions were correctly configured.

 

 

To fix, I just enabled the Domain profile that the firewall rule applies to.

 

Lastly, when examining VMware log files and settings, you may come across references to VMs, Hosts, or other VMware “objects” named as “host-28” or “vm-07” for example. These are VMware’s way of keeping reference of objects by what is called a MoRef, or “Managed object reference”. You may know host-28 as esxi03.yourdomain.local for example, so I thought I would include a handy tip for working out the Managed Object Reference name of an ESXi host to help with those vpxd.log diagnostics. Let’s say you find an interesting error mentioning moref “host-28”. You don’t know which host this is, so you can use PowerCLI to work out the morefs of hosts in a cluster and then match up the reference to the actual host name. Use this bit of script to achieve this:

 

Get-VMHost | Sort Name | Select Name,@{Name="MoRef";Expression={$_.ExtensionData.MoRef}}
Working out the MoRef of hosts using PowerCLI

 

 

Getting up and running with the vSphere 5.1 Web Client

Getting up and running with the vSphere 5.1 Web Client and vCenter 5.1 is now easier than before. The steps to follow are listed below, along with the steps you should use if you also have vCenter 5.0 instances to manage with the 5.1 Web Client.

 

  • If you have a vCenter 5.1 Server instance, you’ll just need to install the Web Client using standard installer from the vCenter autorun.
  • Don’t forget to install the latest Adobe Flash too.
  • With vSphere 5.1 you now have integration with the vCenter Single Sign on (SSO) service. If your vCenter server uses the same vCenter Single Sign On server as that which the Web Client uses, then you do not need to manually register vCenter 5.1 instances with the Web Client Server. Instead, just install the Web Client server as normal, and then sign in to it from the local machine at https://localhostl:9443/vsphere-client or remotely from another management machine at https://remotemachine:9443/vsphere-client. The vSphere Web Client can now locate these vCenter Server 5.1 systems by using the VMware Lookup Service.
  • If you run into any errors when you try to access the web client via the URL (local or remote), give it a few more minutes if you have just finished the installation. I found that it took my system up to 3 minutes before I could login. This must be due to automatic registration with the Lookup Service taking place in the background.

This definitely makes life a bit easier when setting up a vCenter 5.1 and the Web client, and makes complete sense as VMware have announced that the standard vSphere Client 5.1 (Windows application) is their final release of the vSphere Client software. From then on, everything will be managed via the Web Client!

Also remember that when you are setting up the vSphere web client, you are asked for the IP or FQDN of your vCenter server. If it uses IPv6 and you want to enter the IP address instead of using the FQDN, you must enter it in IPv6 format (ie. enclosing this address in square brackets).

 

If you are still using vCenter 5.0 or have vCenter 5.0 instances, you are still required to use the machine that the Web Client was installed on, and browse to https://localhost:9443/admin-app and then register these vCenter 5.0 instances as per the screenshots below. You do of course also have a couple of options depending on which vCenter Server 5.0 type you are using (Windows or the Appliance).

 

For vSphere vCenter 5.0 Windows instances you’ll still need to register these with the Web Client, login to the Web client on the machine it was installed on using the localhost address:

Register your vCenter Server 5.0 instance by using the IP or FQDN and correct credentials.

Accept and install the security certificate if applicable.

 

If you are using the vCenter 5.0 appliance, then you’ll need to register these instances using the command-line on the appliance. Use the following script to register your vCenter instance:

/usr/lib/vmware-vsphere-client/scripts/admin-cmd.sh register https://[IP or FQDN of the Web Client]:[HTTPS Port Number]/vsphere-client [VC IP Address] [VC Admin username] [VC Admin password]

If you have any special characters in your password, don’t forget to enclose this in single quote marks ( ‘ ).

 

Get vCenter User Sessions and Idle times with PowerCLI

Today I was looking into a small “nice to have” notification system for users that had left their vSphere clients open and logged into vCenter. I found this great bit of script to list currently logged in users over at blog.vmpros.nl and thought I would expand on this in my own way to generate a handy list of logged in users and their current idle time – similar to the way the “Sessions” tab in the vSphere client displays user session information. When I got home this evening I expanded on the original script from vmpros.nl to create the following:

 

$Now = Get-Date
$Report = @()
$svcRef = new-object VMware.Vim.ManagedObjectReference
$svcRef.Type = "ServiceInstance"
$svcRef.Value = "ServiceInstance"
$serviceInstance = get-view $svcRef
$sessMgr = get-view $serviceInstance.Content.sessionManager
foreach ($sess in $sessMgr.SessionList){
   $time = $Now - $sess.LastActiveTime
   # Our time calculation returns a TimeSpan object instead of DateTime, therefore formatting needs to be done as follows:
   $SessionIdleTime = '{0:00}:{1:00}:{2:00}' -f $time.Hours, $time.Minutes, $time.Seconds
   $row = New-Object -Type PSObject -Property @{
   		Name = $sess.UserName
		LoginTime = $sess.LoginTime
		IdleTime = $SessionIdleTime
	} ## end New-Object
	$Report += $row
}
$Report

 

[download id=”6″]

 

Here is an example of the output of the script:

 

Using this bit of PowerCLI script, it should be easy for you to create your own notification system based on user session idle time, or some functionality that would disconnect idle users. Let me know if you do improve on the above, or if you have any other suggestions.