Changing DNS on Azure IaaS VM’s NIC forces RDP / network disconnect

I just noticed this happen to a VM I was connected to this evening.

All I did was change the NIC's primary DNS setting from automatically assigned to manual, entering a primary DNS server IP and a secondary backup IP, and my RDP session was instantly dropped. Other HTTPS traffic to the box stopped too.

I had to restart the VM in Azure to get connectivity back. This VM was deployed using the classic portal, but I've seen reports of it happening on newer ARM-deployed VMs too. Here's a thread with others who have found the same issue.
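For reference, the restart can also be done from PowerShell rather than the portal. This is just a minimal sketch, assuming the classic (Service Management) Azure PowerShell module since this VM was classic-deployed; the cloud service and VM names are placeholders:

# Placeholder names - substitute your own cloud service and VM name
Add-AzureAccount
Restart-AzureVM -ServiceName "my-cloud-service" -Name "my-vm"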

Hopefully Microsoft will resolve this soon.

Scaling Web API 2 and back-end SQL databases in Azure

I recently created a small Web API 2 project with a back-end SQL database (Entity Framework code first), and deployed it to an Azure web app along with an Azure SQL database.

Naturally, I started it off using the free web app tier and one of the cheapest possible Azure SQL tiers (S0 – 10 DTUs).

After I finished working on the API, I wanted to see what sort of performance I could get out of it, by using Azure’s various scaling options.

To test I used Loader.io. This is a really nice, easy-to-use load testing service by SendGrid Labs. The free edition allows me to set up various API endpoint tests and run many concurrent connections for up to 1 minute at a time.

All my tests below were done using the same GET request test. To keep things consistent, the request always returned a collection of five objects from the /Animals endpoint.
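As a quick sanity check before load testing, you can hit the endpoint with a one-liner like the following. The URL here is a made-up placeholder for wherever the web app is deployed:

# Hypothetical URL - should return the collection of five animals as JSON
Invoke-RestMethod -Uri "https://myapi.azurewebsites.net/api/Animals"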

My initial test was against the F1 free app tier for the Web app, with the SQL database running on S0 (10 DTUs). Here are the results of sending 500 requests per second for 1 minute.

[Screenshot: S0 (10 DTU) load test results]

The API struggled to complete the full 60k requests over the minute, finishing only about 8k of them with an average response time of 4638ms. Terrible, but then again we are running on very low-performance, cheap tiers. I had a look at the database performance stats and noticed that the DTUs were maxed out at 100% for the duration of the load test. At this point it definitely seems to be database performance holding things back.

Scaling the database up to the S1 tier (20 DTUs) gives a definite improvement in response times and in the number of requests served within the minute. If we look at the database performance stats in the portal, however, we can see that the DTUs are still maxing out at 100%.

[Screenshot: S1 (20 DTU) load test results]

[Screenshot: 20 DTUs maxed out at 100% during the test]
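As an aside, these database scaling steps can be scripted instead of clicked through the portal. This is just a rough sketch using the newer Az PowerShell module (which post-dates this post); the resource group, server, and database names are placeholders:

# Placeholder names - scale the database up to the S1 (20 DTU) tier
Set-AzSqlDatabase -ResourceGroupName "my-rg" -ServerName "my-sqlserver" `
    -DatabaseName "my-db" -Edition "Standard" -RequestedServiceObjectiveName "S1"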

At this point I decided I would increase database performance again, but also throw more requests per second at the API (from 500/second up to 1000/second).

Scaling the database up to S2 (50 DTUs) and throwing more requests per second at the API, the total number of completed requests is higher now – up by about an extra 5k. Taking a look at the DTU performance stats, we can see the DTUs now max out at around 60%. At this point it is pretty clear that the database is no longer the bottleneck.

[Screenshot: 50 DTUs maxing out at around 60%, even with the requests per second doubled from 500 to 1000]

Now I scaled the web app tier up from Free to the B1 (Basic) tier, which gives you 1 core, 1.75GB RAM, and up to 3 x instances scaled manually. I started with just the default 1 instance and ran the 1000 req/second for 1 minute test again.

[Screenshot: test failed - error rate higher than 50%, due to timeouts]

The results were pretty dismal compared to the free tier now. In fact the test failed due to an error rate of greater than 50% (all caused by timeouts). It is important to remember that we have not yet scaled out from the default 1 instance though.

Scaling up to 2 x instances on the B1 tier helped quite a bit. The test now completes, with a much smaller timeout error rate. Many more responses were served, though the response rate was still quite slow. Looking at the distribution of CPU time over the two instances, we can also see that the traffic is indeed being split between the two instances we've scaled out to.

[Screenshot: scaling B1 Basic from 1 to 2 instances]

[Screenshot: test finished with a much smaller error rate]

[Screenshot: processor time spread over the two instances during the load test]
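The web app tier and instance count changes can be scripted in the same hedged fashion with the Az module, assuming the app runs on its own App Service plan; again, the names below are placeholders:

# Placeholder names - move the plan to B1 (Basic) and scale out to 2 instances
Set-AzAppServicePlan -ResourceGroupName "my-rg" -Name "my-plan" `
    -Tier "Basic" -WorkerSize "Small" -NumberofWorkers 2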

Taking this one step further to 3 x instances and re-running the test nets us the best result so far: no timeout errors, and a response time averaging around 3000ms. Much better, but still quite a high response time, and not all 60k requests are being served.

I scaled up to the B2 tier for the following run, where each instance has 2 x cores and 3.5GB RAM. Even starting at just 1 x instance, these higher-specification web instances seem to handle things a lot better.

Little to no timeout errors, with about 5000ms avg response time, but using only 1 x instance this time!

Pushing things right up to 3 x instances (2 cores and 3.5GB RAM each) nets us the best result yet. The average response time is down to 1700ms and there are no timeout errors at all. The API was able to handle 49000 requests in the 1 minute test, which is the highest number of requests it has been able to handle so far.

[Screenshot: B2 Basic test with 3 x instances - good result]

I scaled up to the B3 tier from here, and tried another few runs using 3 x instances (at 4 x cores and 7GB RAM each). This didn’t help things much, netting around 200ms better response time, for a much pricier tier. It therefore looks like the sweet spot for this kind of work is to scale out with medium sized instances (2 x cores each), rather than scaling up too much.

I changed the web app tier to S2 (2 x cores and 3.5GB RAM each, but allowing up to 10 x instances scaled out) and this time, running the test gave very similar results to the B2 3 x instance run. Clearly, the web instances were now no longer the bottleneck. Looking back at the database performance, I saw that the DTUs were maxing out at around 90%, so there must have been some throttling happening there again.

I changed the database DTUs to 100 using the S3 tier, and re-ran the test once more.

[Screenshot: all 60k requests served]

Bingo! We’re now managing to serve the test’s 1000 requests a second, and over the 1 minute test, we get all 60k requests served successfully, and have a reasonable average response time of roughly 300-400ms.

I made a quick change to the GET method on this endpoint to fetch items from the database asynchronously. Running the same test again now gets us all the way down to an average response time of just 100ms over the 60k requests in one minute. Excellent!

[Screenshot: 100ms average response time result]

As you can see, by running load tests like this and trying out different scaling options for the front end and the back end, scaling whichever layer shows a bottleneck in the test results or performance metrics, you can over time determine the best specification for your database and web apps.


Setting up DNS SRV records for an Office 365 migration (on 123-reg)

I needed to set up some DNS records for an Office 365 migration earlier, and was initially slightly confused translating the settings Microsoft supplied us into the input that 123-reg's Advanced DNS configuration expects. The MX, TXT and CNAME records were simple enough, but the SRV records needed a bit of fiddling to get right.

As an example on the SRV records, MS give you something like this:

Type  Service            Protocol  Port  Weight  Priority  TTL     Name             Target
SRV   _sip               _tls      443   1       100       1 Hour  thedomain.co.uk  sipdir.online.lync.com
SRV   _sipfederationtls  _tcp      5061  1       100       1 Hour  thedomain.co.uk  sipfed.online.lync.com

123-reg give you this interface to enter SRV records yourself:

[Screenshot: 123-reg Advanced DNS interface for entering SRV records]


By looking at the examples you can start to understand how to translate the Service, Protocol, and Weight items that MS give you into the 123-reg input boxes (which do not have individual fields for Service, Protocol and Weight).

In the first SRV record example:

Hostname therefore becomes: _sip._tls (the Service plus the Protocol, joined with a dot (.))

TTL of course becomes: 3600 (1 hour)

Priority is 100

Destination (the most confusing one) becomes: 1 443 sipdir.online.lync.com. (note that it starts with the Weight (1), then a space, then the Port (443), then the Target (sipdir.online.lync.com), followed by a trailing dot (.))

That forms your complete SRV record. By entering these along with the other records you require, you should have a fully functional Office 365 setup on your custom domain name.
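Once the records have propagated, you can verify them from a Windows machine with Resolve-DnsName (available on Windows 8 / Server 2012 and later); substitute your own domain for thedomain.co.uk:

# Each query should return the priority, weight, port and target you entered
Resolve-DnsName -Name "_sip._tls.thedomain.co.uk" -Type SRV
Resolve-DnsName -Name "_sipfederationtls._tcp.thedomain.co.uk" -Type SRV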

Issue using the Count method to count objects with PowerShell 2.0


I came across a small issue with a little helper script I wrote to count vSphere objects using PowerCLI this morning. It's been a couple of weeks since I last did a blog post – things have been very busy, so I have not been able to commit much time to blogging lately. As such, I thought I would do a quick post about this small issue I came across earlier. It has more than likely been covered elsewhere, but it will be a good reference point to come back to if I ever forget!


So, to the issue I saw. Essentially, if an object count is 1 or less, then the object is returned as the object type itself. For example, where only one Distributed Virtual Switch exists in a vSphere environment and we use the cmdlet Get-VirtualSwitch -Distributed, a single object is returned with a BaseType of "VMware.VimAutomation.ViCore.Impl.V1.VIObjectImpl".


However, if we had more than one dvSwitch, then we would get a BaseType of “System.Array” returned.


We are able to use .Count on an array with PowerShell version 2.0, but we are not able to use .Count on a single object (the property simply does not exist on it). The workaround I found (when using PowerShell 2.0) is to force the result into an array.

So in the case of our dvSwitch example above, originally we would have done:

$dvSwitchCount = (Get-VirtualSwitch -Distributed).Count


To force this into an array, and therefore get an accurate count of the objects whether there are no members, one member, or more, we would use:

$dvSwitchCount = @(Get-VirtualSwitch -Distributed).Count


Note the addition of the "@(...)" around the command – the array subexpression operator, which guarantees the result is an array.
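Here is a quick hypothetical demonstration of the difference, using Get-Item simply as an example of a cmdlet that returns a single object:

$item = Get-Item C:\Windows
$item.Count      # PowerShell 2.0: returns nothing - a single object has no Count
@($item).Count   # 1 - @() forces the result into an array
@().Count        # 0 - the empty case is handled correctly too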

Jonathan Medd also kindly pointed out that this is fixed with PowerShell 3.0 – have a read of the new features to see the addition that allows .Count to be used on any type of object over here.

Get Virtual Machine Inventory from a Hyper-V Failover Cluster using PowerShell

A colleague was asking around for a PowerShell script that would fetch some inventory data for VMs on a Hyper-V cluster the other day. Not knowing too much about Hyper-V and having only ever briefly looked at what was out there in terms of PowerShell cmdlets for managing Hyper-V, I decided to dive in tonight after I got home.


Here is a function that will fetch Inventory data for all VMs in a specified Failover Cluster. This is what it fetches:

  • VM Name
  • VM CPU Count
  • VM CPU Socket Count
  • VM Memory configuration
  • VM State (Up or Down)
  • Cluster Name the VM resides on
  • Hyper-V Host name the VM resides on
  • Network Virtual Switch Name
  • NIC MAC Address
  • Total VHD file size in MB
  • Total VHD Count


Being a function, you can pipe in the name of the cluster you want, for example Get-Cluster | Get-HyperVInventory. Or you could do Get-HyperVInventory -ClusterName “ExampleClusterName”. You could also send it to an HTML Report by piping it to “ConvertTo-HTML | Out-File example.html”

You can copy the function out of the script block below:

# Requires: Imported HyperV PowerShell module (http://pshyperv.codeplex.com/releases/view/62842)
# Requires: Import-Module FailoverClusters
# Requires: Running PowerShell as Administrator in order to properly import the above modules

function Get-HyperVInventory {
<#
.SYNOPSIS
Fetches Hyper-V VM Inventory from a specified Hyper-V Failover cluster

.DESCRIPTION
Fetches Hyper-V VM Inventory from a specified Hyper-V Failover cluster

.PARAMETER ClusterName
The Name of the Hyper-V Failover Cluster to inspect

.EXAMPLE
PS F:\> Get-HyperVInventory -ClusterName "dev-cluster1"

.EXAMPLE
PS F:\> Get-Cluster | Get-HyperVInventory

.LINK
http://www.shogan.co.uk

.NOTES
Created by: Sean Duffy
Date: 09/07/2012
#>

    [CmdletBinding()]
    param(
        [Parameter(Position=0,Mandatory=$true,HelpMessage="Name of the Cluster to fetch inventory from",
        ValueFromPipeline=$true,ValueFromPipelineByPropertyName=$true)]
        [System.String]
        $ClusterName
    )

    process {
        $Report = @()

        $Cluster = Get-Cluster -Name $ClusterName
        $HVHosts = $Cluster | Get-ClusterNode

        foreach ($HVHost in $HVHosts) {
            $VMs = Get-VM -Server $HVHost
            foreach ($VM in $VMs) {
                [long]$TotalVHDSize = 0
                $VHDCount = 0
                $VMName = $VM.VMElementName
                $VMMemory = $VM | Get-VMMemory
                $CPUCount = $VM | Get-VMCPUCount
                # A single Get-VMNIC call gives us both the switch name and the MAC address
                $VMNIC = $VM | Get-VMNIC
                # VM Disk Info - total up the file size of each hard disk image
                $VHDDisks = $VM | Get-VMDisk | Where { $_.DiskName -like "Hard Disk Image" }
                foreach ($disk in $VHDDisks) {
                    $VHDInfo = Get-VHDInfo -VHDPaths $disk.DiskImage
                    $TotalVHDSize = $TotalVHDSize + $VHDInfo.FileSize
                    $VHDCount += 1
                }
                $TotalVHDSize = $TotalVHDSize/1024/1024 # convert bytes to MB
                $row = New-Object -Type PSObject -Property @{
                    Cluster = $Cluster.Name
                    VMName = $VMName
                    VMMemory = $VMMemory.VirtualQuantity
                    CPUCount = $CPUCount.VirtualQuantity
                    CPUSocketCount = $CPUCount.SocketCount
                    NetSwitch = $VMNIC.SwitchName
                    NetMACAdd = $VMNIC.Address
                    HostName = $HVHost.Name
                    VMState = $HVHost.State # state of the owning cluster node (Up/Down)
                    TotalVMDiskSizeMB = $TotalVHDSize
                    TotalVMDiskCount = $VHDCount
                } ## end New-Object
                $Report += $row
            }
        }
        return $Report
    }
}


Example use cases – load the function into your PowerShell session, or place it in your $profile for easy access in future, and run the following:

# Example 1
Get-HyperVInventory -ClusterName "mycluster1"
# Example 2
Get-Cluster | Get-HyperVInventory
# Example 3
Get-HyperVInventory -ClusterName "mycluster1" | ConvertTo-HTML | Out-File C:\Report.html


The function includes help text and examples, so you can also issue the usual "Get-Help Get-HyperVInventory" or "Get-Help Get-HyperVInventory -Examples". It is by no means perfect and could do with some improvements; for example, if a VM has more than one virtual switch network associated with it, each would be listed in the same row multiple times. Feel free to suggest any improvements or changes in the comments.
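As a rough, untested sketch of one such improvement (using the same cmdlets from the HyperV module), the NIC details could be joined into single strings so that multi-NIC VMs still produce one tidy row each:

# Gather all NICs once, then join their properties into single strings
$VMNICs = $VM | Get-VMNIC
$SwitchNames = ($VMNICs | ForEach-Object { $_.SwitchName }) -join ", "
$MacAddresses = ($VMNICs | ForEach-Object { $_.Address }) -join ", "
# ...then use $SwitchNames and $MacAddresses for the NetSwitch and NetMACAdd row properties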