Editing a webapp or site’s HTTP headers with Lambda@Edge and CloudFront

Putting CloudFront in front of a static website hosted in an S3 bucket is an excellent way of serving your content and keeping it performant for users anywhere in the world, by leveraging caching at CloudFront’s geographically distributed edge locations.

The setup goes a little something like this:

  • Place your static site files in an S3 bucket that is set up for static web hosting (see the CLI sketch after this list)
  • Create a CloudFront distribution that uses the S3 bucket content as the origin
  • Add a cache behaviour to the distribution
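
For the S3 side of that setup, a minimal CLI sketch might look something like this (the bucket name and local folder are placeholders; the CloudFront distribution and cache behaviour are easier to create in the console):

# Create the bucket, enable static website hosting on it, and upload the site content
aws s3 mb s3://my-static-site-bucket
aws s3 website s3://my-static-site-bucket/ --index-document index.html --error-document error.html
aws s3 sync ./site s3://my-static-site-bucket/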

This is an excellent way of hosting a website or webapp that can be delivered anywhere in the world with ultra low latency, and you don’t even have to worry about running your own webserver to host the content. Your content simply sits in an S3 bucket and is delivered by CloudFront (and can be cached too).

But what happens if you want to get a little more technical and serve up custom responses for any HTTP requests for your website content? Traditionally you’d need a custom webserver that you could use to modify the HTTP request/response lifecycle (such as Varnish / Nginx).

That was the case until Lambda@Edge was announced.

I was inspired to play around with Lambda@Edge after reading Julia Evans’ blog post about Cloudflare Workers, where she set up something similar to add a missing Content-Type header to responses from her blog’s underlying web host. I wanted to see how easy it was to do the same in an AWS setup with S3-hosted content and CloudFront.

So here is a quick guide on how to modify your site / webapp’s HTTP responses when you have CloudFront sitting in front of it.

Note: you can run Lambda@Edge functions on all of these CloudFront events (not just the response event used in the example below):

  • After CloudFront receives a request from a viewer (viewer request)
  • Before CloudFront forwards the request to the origin (origin request)
  • After CloudFront receives the response from the origin (origin response)
  • Before CloudFront forwards the response to the viewer (viewer response)

You can also return a custom response from Lambda@Edge without even sending a request to the CloudFront origin at all.

Of course, the only ones that are guaranteed to always run are the viewer type events. This is because origin request and origin response events only happen when the requested object is not already cached in an edge location. In that case CloudFront forwards a request to the origin and receives a response back from it (hopefully!), and those are the events you can act upon.

How to edit HTTP responses with Lambda@Edge

Create a new Lambda function and make sure it is placed in the us-east-1 region. (AWS requires that Lambda@Edge functions are created in the US East / N. Virginia region.) When the function is associated with your CloudFront distribution, it is replicated to regions across the world, each running its own replica of the Lambda@Edge function.

Fun fact: your CloudWatch logs for Lambda@Edge will appear in the region where your content is requested from – i.e. the region closest to the edge location that ends up serving your content.
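
So when hunting for logs you need to look in the right regional CloudWatch. As far as I have seen, the log groups are named /aws/lambda/us-east-1.<function-name>, so a quick hedged example with the AWS CLI (eu-west-1 is just an example edge region here) would be:

# List Lambda@Edge log groups in the region that served the request
aws logs describe-log-groups --region eu-west-1 --log-group-name-prefix "/aws/lambda/us-east-1."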

You’ll need to create a new IAM Role for the function to leverage, so use the Lambda@Edge role template.

Select the Node 6.10 runtime for the function. In the code editor, set up the following Node.js handler function, which will do the actual header manipulation work:

exports.handler = (event, context, callback) => {
    // Grab the response object from the CloudFront event
    const response = event.Records[0].cf.response;
    const headers = response.headers;

    // Add a custom header: the key is the lowercase header name, and the value
    // is a list of key/value pairs
    headers['x-sean-example'] = [{key: 'X-Sean-Example', value: 'Lambda @ Edge was here!'}];

    // Return the (modified) response to CloudFront
    callback(null, response);
};


The function receives an event for every request passing through. From that event you retrieve the CloudFront response object (event.Records[0].cf.response) and set your required header(s) by referencing the lowercase header name as the key and supplying a list containing the header key and value.

Make sure you publish a version of the Lambda function, as you’ll need to attach it to your CloudFront behavior by an ARN that includes the version number. (You can’t use $LATEST, so make sure you use a numerical version number that you have published.)
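
If you’re doing this from the CLI, publishing a version and grabbing the version-qualified ARN looks roughly like this (the function name and account ID below are placeholders):

# Publish a numbered version of the function
aws lambda publish-version --function-name add-custom-headers --region us-east-1
# The ARN you attach to the CloudFront behavior then ends with the version number, e.g.
# arn:aws:lambda:us-east-1:123456789012:function:add-custom-headers:1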

Now if you make a new request to your content, you should see the new header being added by Lambda@Edge!

Lambda@Edge is a great way to easily modify the HTTP request/response lifecycle of a CloudFront distribution. You can keep response times super low as the Lambda functions are executed at the edge location closest to your users. It also helps you keep your infrastructure as simple as possible by avoiding complicated / custom web servers that would otherwise just add unnecessary operational overhead.

Running an S3 API compatible object storage server (Minio) on the Raspberry Pi

I’ve recently become interested in hosting my own local S3 API compatible object storage server at home.

So tonight I set about setting up Minio.


Minio is an object storage server that is S3 API compatible. This means I’ll be able to use my working knowledge of the Amazon S3 API and tools, but to interact with my own, locally hosted storage service running on a Raspberry Pi.

I had heard about Zenko before (another S3 API compatible object storage server) but was searching around for something really lightweight that I could easily run on ARM architecture – i.e. the Raspberry Pi Model 3 I have sitting on my desk right now. Minio was the first one I found that could easily be compiled to run on the Raspberry Pi.

The goal right now is to have a local object storage service that is compatible with S3 APIs that I can use for home use. This has a bunch of cool use cases, and the ones I am specifically interested in right now are:

  • Being able to write scripts that interact with S3, but test them locally with Minio before even having to think about deploying them to the cloud. A local object storage API is going to be free and fast. Plus it’s great knowing that you’re fully in control of your own data.
  • Setting up a publicly exposable object storage service that I can target with serverless functions that I plan to run on demand in the cloud to do processing and then output artifacts to my home object storage service.

The second use case above is the one I intend to use for ffmpeg-processed video. Basically I want to be able to process video from online services using something like AWS Lambda (probably with ffmpeg bundled into the function) and output the resulting files to my home storage system.

The object storage service will receive these output files from Lambda and I’ll have a cronjob or rsync setup to then sync the objects placed into my storage bucket(s) to my home Plex media share.

This means I’ll be able to remotely queue up stuff to watch via a simple interface I’ll expose (or a message queue of some sort) to be processed by Lambda, and by the time I’m home everything will be ready to watch in Plex.
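
As a rough sketch, that sync step could be as simple as a cron entry pointing an S3-compatible CLI at the Minio endpoint (the endpoint, bucket name and paths here are placeholders, and this assumes the AWS CLI is installed and configured with the Minio access/secret keys):

# Every 10 minutes, pull any new objects from the Minio bucket down to the Plex media share
*/10 * * * * aws s3 sync s3://processed-video /mnt/plex/incoming --endpoint-url http://minio.local:9000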

Normally I would be more interested in running the Docker image for Minio, but at home I want something that is really cheap to run, so compiling Minio for the Raspberry Pi makes total sense here: this device is super cheap to leave powered on 24/7, as opposed to running something beefier as a Docker host or a lightweight Kubernetes home cluster.

Here’s the quick start guide to get it running on a Raspberry Pi

You’ll basically download Go, extract it, set it up on your path, then use it to compile Minio’s source code into an ARM compatible binary that you can run on your Pi.

# Download Go (ARM build), install it and add it to your path
wget https://dl.google.com/go/go1.10.3.linux-armv6l.tar.gz
sudo tar -C /usr/local -xzf go1.10.3.linux-armv6l.tar.gz
export PATH=$PATH:/usr/local/go/bin # put into ~/.profile
source .profile
# Fetch and compile Minio (the binary is placed in ~/go/bin)
go get -u github.com/minio/minio
mkdir ~/minio-data
# Run the Minio server against the new data directory
cd go/bin
./minio server ~/minio-data/

And you’re up and running! It’s that simple to get going quickly.

Running it interactively, you’ll get a default access key and secret key in the terminal, so head on over to the web UI to check things out: http://your-raspberry-pi-ip-or-hostname:9000/minio/

Enter your credentials to login.

Of course at this stage you can also start using your S3 API compatible command line tools to start working with your new object storage server too.
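
For example, with the AWS CLI you can point commands at the Pi using the --endpoint-url option (the bucket name and file below are just placeholders, and the keys are the ones Minio printed in the terminal):

# Use the access/secret keys shown in the Minio terminal output
export AWS_ACCESS_KEY_ID=<access key from the Minio output>
export AWS_SECRET_ACCESS_KEY=<secret key from the Minio output>
# Create a bucket, upload a file, and list the bucket contents
aws s3 mb s3://test-bucket --endpoint-url http://your-raspberry-pi-ip-or-hostname:9000
aws s3 cp ./hello.txt s3://test-bucket/ --endpoint-url http://your-raspberry-pi-ip-or-hostname:9000
aws s3 ls s3://test-bucket/ --endpoint-url http://your-raspberry-pi-ip-or-hostname:9000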

Nice!

vCenter Server Appliance VM fails to boot with fsck failed message


I had a storage outage to deal with recently, and after the datastores on this storage were taken down, a vCenter Server Appliance VM on the storage got some corrupted files and would not boot. Upon start up, I was greeted with this message:


The error message reads:

fsck failed.  Please repair manually and reboot.  The root
file system is currently mounted read-only.  To remount it
read-write do

After trying the mount command using the maintenance mode bash shell, I restarted and found that the appliance still did not boot properly. I found a thread on the VMware community forums where someone had the same issue and was able to run e2fsck to fix the disk issues. I tried this and found it fixed a whole heap of disk errors on the /dev/sda3 mount, but on restart I noticed more issues on the /dev/sdb2 mount, so I ran the e2fsck command again for this path, and was able to finally reboot the appliance successfully. The commands I ran to resolve were essentially:

  • mount -n -o remount,rw /
  • e2fsck -y /dev/sda3
  • e2fsck -y /dev/sdb2
  • CTRL-D to reboot after fixing the errors using e2fsck


VMware T10 compliant VAAI integration and HP P2000 G3 FC storage


I’ve recently been updating firmware on some development and testing storage, and found that HP P2000 storage array firmware update TS251R004 and above enables T10 compliance for the HP P2000 G3 FC hardware.


To quote VMware’s documentation on their VAAI implementation specific to T10 compliance:

The second required component can be referred to as a VAAI plug-in specific to the VAAI filter. It implements vendor-specific VAAI functions such as ATS, XCOPY and WRITE_SAME. There were different implementations of the VAAI block primitives in vSphere 4.1, but all of the primitives in vSphere 5.0 have been ratified by T10, so any array that is T10 compliant should be able to use VAAI.


This means that you no longer need to be running the HP P2000 VAAI plugin software directly on ESXi hosts. In fact, HP recommend you uninstall and remove the plugin software before you upgrade the firmware on these arrays, otherwise you could suffer from performance degradation and possible loss of access to datastores.

My process was to first of all log in to all hosts and check for the presence of the VAAI plugin.

  • SSH into host as root, run find / -name hp_vaaip_p2000
  • Ensure that nothing comes up with the find command. If it does (you see something like this output: /usr/lib/vmware/vmkmod/hp_vaaip_p2000), then you should use this HP document to ensure it is removed correctly: https://h20566.www2.hpe.com/hpsc/doc/public/display?sp4ts.oid=4118559&docId=mmr_kc-0123414&docLocale=en_US – this will involve some setting changes and removing claim rules, as well as removal of the HP P2000 VAAI VIB itself (see the example checks after this list).
  • After verifying nothing came up, check other hosts, and once happy all hosts are clear of the plugin, upgrade the firmware for the P2000 system.
  • Ideally reboot ESXi hosts after the firmware update and ensure access to datastores is still there. Check the hardware acceleration status of datastores – they should show up as “Supported”.
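
For reference, these are the sorts of checks I run from that same SSH session – a hedged sketch using standard esxcli namespaces, so verify the options against your own ESXi build:

# Check whether the HP VAAI plugin VIB is still installed
esxcli software vib list | grep -i vaai
# List any VAAI-class claim rules left behind by the plugin
esxcli storage core claimrule list --claimrule-class=VAAI
# After the firmware update, confirm the hardware acceleration (VAAI) status of devices
esxcli storage core device vaai status get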

Octopus Deploy Endpoint auto configuration on Azure VM deployment

I’ve been working on a very cool project that involves the use of Microsoft Azure, TeamCity and Octopus Deploy.

I have created an Azure PowerShell script that deploys VMs into an Azure Subscription (Web machines that run IIS) as a part of a single Azure Cloud Service with load balancing enabled. As such, the endpoint ports that I create for Octopus tentacle communication need to differ for each machine on the public interface.

I wanted to fully automate things from end-to-end, so I wrote a very small console application that uses the Octopus Client library NuGet package in order to be able to communicate with your Octopus Deploy server via the HTTP API.

Octopus Endpoint Configurator on GitHub

The OctopusConfigurator console application should be run in your Azure VM once it is deployed, with four parameters specified at run time.

It will then establish communication with your Octopus Deploy server, and register a new Tentacle endpoint using the details you pass it. The standard port number that gets assigned (10933) will then be replaced if necessary with the correct endpoint port number for that particular VM instance in your cloud service. For example, I usually start the first VM in my cloud service off on 10933, then increment the port number by 1 for every extra VM in the cloud service. As the deployments happen, the console application registers each new machine’s tentacle using the incremented port number back with the Octopus master server.

Once the Azure VM deployment is complete, I tell the VMs in the cloud service to restart with a bit of Azure PowerShell, and once this is done, your Octopus environment page should show all newly deployed tentacles as online for your environment. Here is an example of an Invoke-Command script block that I execute remotely on my Azure VMs as soon as they have completed initial deployment. The VM deployment script waits for Windows to boot; once ready, the WinRM details for the VM are fetched using the Get-AzureWinRMUri cmdlet, which allows me to use Invoke-Command to run the script below inside the guest VM.


Invoke-Command -ConnectionUri $connectionString -Credential $creds -ArgumentList $vmname,$externalDNSName,$creds,$InstallTentacleFunction,$OctopusExternalPort,$OctopusEnvironmentName -ScriptBlock {
	
	$webServerName = $args[0]
    $DNSPassthrough = $args[1]
    $passedCredentials = $args[2]
    $scriptFunction = $args[3]
    $OctoPort = $args[4]
    $OctopusEnvironmentName = $args[5]
		
	function DownloadFileUrl($url, $destinationPath, $fileNameToSave)
	{
	    $fullPath = "$destinationPath\$fileNameToSave"

	    if (Test-Path -Path $destinationPath)
	    {
	        Invoke-WebRequest $url -OutFile $fullPath
	    }
	    else
	    {
	        mkdir $destinationPath | Out-Null # suppress the DirectoryInfo output so only the path string is returned
	        Invoke-WebRequest $url -OutFile $fullPath
	    }

	    Write-Host "Full path is: $fullPath"
	    return [string]$fullPath
	}
	
	# Download the Octopus Endpoint Configurator to C:\Temp
	[string]$ConfiguratorPath = DownloadFileUrl "https://dl.dropboxusercontent.com/u/xxxxxxx/Apps/OctopusConfigurator.zip" "C:\Temp" "OctopusConfigurator.zip"
	
	Write-Host "Unzipping OctopusConfigurator.zip" -ForegroundColor Green
    cd C:\Temp
    $shell_app=new-object -com shell.application
    $filename = "OctopusConfigurator.zip"
    $zip_file = $shell_app.namespace((Get-Location).Path + "\$filename")
    $destination = $shell_app.namespace((Get-Location).Path)
    $destination.Copyhere($zip_file.items())
	
    cd C:\Temp

    if (Test-Path -Path .\OctopusConfigurator.exe)
    {
        & .\OctopusConfigurator.exe http://theoctopusurl.domain API-XXXXXXXXXXXXXXXXXXXXXX $webServerName $OctoPort
        Write-Host "Reconfigured Octopus Machine URI to correct port number" -ForegroundColor Green
    }
    else
    {
        Write-Host "OctopusConfigurator not found!" -ForegroundColor Red
        Exit
    }
}

Cloning and running a duplicate vCenter instance on the same network – process and gotchas

Recently I needed to clone a vSphere environment (vCenter 5.0.0) for testing purposes. This environment needed to be cloned as an exact replica of the vCenter server and SQL database server so that various tests/upgrades could be performed on it. As for ESXi hosts, a few were being split off the original environment and added to the duplicate vSphere environment. All the Windows configuration and SQL server configuration needed to be retained, so my high-level plan was as follows:

  • Deployed a new Windows domain (it had to be at the same domain functional level, and the DC needed to run the same OS as the original)
  • Hot cloned the existing production vCenter and SQL servers
  • Split off the few ESXi physical hosts that were going to be added to the newly cloned environment later on (removed from prod clusters)
  • The new machines needed to run on the same VLAN and IP ranges as the originals too, which made things even more complex, so I made sure to keep the vNICs on the cloned SQL and vCenter VMs disconnected, and set to stay disconnected at power on too.
  • Re-IP’d the cloned VMs after powering them up on one of the split-off ESXi hosts (logged in to the vSphere Client using root credentials). I also changed their hostnames and removed them from the old domain, joining them to the new domain at the same time.
  • Created new service accounts on the newly deployed domain, and reconfigured vCenter services to use these on the cloned machines
  • On the vCenter server there were some changes needed in vCenter config files – there is a separate post that details most of what needed to be changed.
  • The main change for me, though, was that I couldn’t see (or didn’t know) how an existing vCenter SQL database would react when starting the cloned vCenter on the same VLAN and IP range! There was a strong possibility that this cloned instance could start interfering with the production vCenter and performing operations across environments. Therefore I decided to create a new vCenter SQL DB. I logged into the cloned vCenter, deleted the old SQL System DSN pointing to the production SQL DB, created a new SQL database on the cloned SQL box, and then created a new DSN pointing to this. I also made sure I searched for all configuration files on the vCenter server that pointed to the old DSN/SQL server (I recall there being some references in the registry and possibly the vpxd.conf file).
  • Re-creating the right SQL database structure was a bit of a task though. I needed to re-create the structure of the DB without doing an install, of course, as I was using the cloned SQL and vCenter machines with an existing installation on them. I followed this KB article, but found a couple of errors/typos in the SQL queries: https://pubs.vmware.com/vsphere-50/index.jsp#com.vmware.vsphere.install.doc_50/GUID-F953497E-2170-4168-806F-6FF0EA6497A7.html – by looking at the errors returned in SQL Management Studio, you can determine where the issues come up and fix the typos. Unfortunately I did not document them myself as I worked through them!
  • Once I was finished with the renaming and IP changes for the cloned machines, I re-connected their vNICs to the relevant networks – happy that they were sufficiently changed!
  • The last issue I came across was that the schema I deployed was one version higher than the vCenter Server build I was using. I found this out by looking at the vpxd.log file when vCenter failed to start up after deploying the new database schema and content. I initially fixed this by fudging the version in the database – essentially hard coding the version it wanted into the newly created schema. My thinking was that the newer schema wouldn’t have changed anything old, but rather added new features, so it should be OK. I was right – vCenter started up just fine after this – but I wasn’t happy, so I stopped the services again, deleted the DB, and started again, this time using the right scripts (the scripts referenced in the article linked above are located on the vSphere vCenter ISO / media – I had used an ISO with a build increment one higher than the vCenter build I was working with on the cloned VC!). So make sure you use the correct media for your vCenter install here. (I had a vCenter 5.0.0 install, but had deployed the schema for 5.0.1!)


Hopefully that gives some ideas as to the tasks required when attempting to clone / duplicate an existing vCenter installation, whilst keeping it running on the same network as the original.