Ingest CloudWatch Logs to a Splunk HEC with Lambda and Serverless

I recently came across a scenario requiring CloudWatch log ingestion to a private Splunk HEC (HTTP Event Collector).

The first and preferred method of ingesting CloudWatch Logs into Splunk is AWS Firehose. The problem, though, is that Firehose only seems to support HEC endpoints that are open to the public internet.

This is a problem if you have a Splunk HEC that is only available inside of a VPC and there is no option to proxy public connections back to it.

The next thing I looked at was the Splunk AWS Lambda function template for ingesting CloudWatch logs from Log Group events. I had a quick look and it seems pretty out of date, relying on synchronous functions and older libraries.

So, I decided to put together a small AWS Lambda Serverless project to improve on what is currently out there.

You can find the code over on GitHub.

The new version has:

  • async / await, with promise wrappers around synchronous libraries such as zlib.
  • A module that identifies Log Group names based on a custom regex pattern. Events from Log Groups that don’t match the naming convention are rejected. The idea is that you can write another small function that auto-subscribes Log Groups.
  • Secrets Manager integration for loading the Splunk HEC token (or fall back to a simple environment variable if you prefer); see the example secret creation command just after this list.
  • Serverless Framework wrapper. Pass in your Security Group ID, Subnet IDs and tags, and let the serverless CLI deploy the function for you.
  • Lambda VPC support by default. You should deploy this Lambda function in a VPC, since most enterprises run their internal Splunk inside their corporate / VPC network. If you do happen to have a public-facing Splunk, remove the VPC section in serverless.yml.
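
For example, the HEC token secret referenced by the --secretManagerItemName deploy option can be created up front with the AWS CLI. This is a minimal sketch: the secret name is just the placeholder used in the deploy command below, and the plain-string secret format is an assumption, so check the repository for the exact shape the function expects.

# Create the Splunk HEC token secret the function will load at runtime.
# The plain-string format here is an assumption; check the repo for the expected shape.
aws secretsmanager create-secret \
  --name your/secretmanager/entry/here \
  --secret-string 'YOUR-SPLUNK-HEC-TOKEN'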

You deploy it using Serverless framework, passing in your VPC details and a few other options for customisation.

serverless deploy --stage test \
  --iamRole arn:aws:iam::123456789012:role/lambda-vpc-execution-role \
  --securityGroupId sg-12345 \
  --privateSubnetA subnet-123 \
  --privateSubnetB subnet-456 \
  --privateSubnetC subnet-789 \
  --splunkHecUrl https://your-splunk-hec:8088/services/collector \
  --secretManagerItemName your/secretmanager/entry/here

Once configured, it’ll pick up any log events coming in from Log Groups you’ve ‘subscribed’ it to (Lambda CloudWatch Logs Triggers).

Add your Lambda CloudWatch Logs triggers and enable them for automatic ingestion of these events to Splunk.
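
If you prefer the CLI to the console for wiring up a Log Group, the subscription amounts to two AWS CLI calls: one to let CloudWatch Logs invoke the function, and one to add the subscription filter. This is a minimal sketch; the function name, account number, region and Log Group name are all placeholders.

# Allow CloudWatch Logs to invoke the function (names and ARNs are placeholders).
aws lambda add-permission \
  --function-name cloudwatch-to-splunk-hec \
  --statement-id cloudwatch-logs-invoke \
  --principal logs.amazonaws.com \
  --action lambda:InvokeFunction \
  --source-arn "arn:aws:logs:us-east-1:123456789012:log-group:/my/app/log-group:*"

# Subscribe the Log Group to the function with an empty filter pattern (all events).
aws logs put-subscription-filter \
  --log-group-name /my/app/log-group \
  --filter-name splunk-hec-ingest \
  --filter-pattern "" \
  --destination-arn arn:aws:lambda:us-east-1:123456789012:function:cloudwatch-to-splunk-hec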

These events get enriched with extra metadata defined in the function. The metadata is derived by default from the naming convention used in the CloudWatch Log Groups. Take a close look at the included Regex pattern to ensure you name your Log Groups appropriately. Finally, they’re sent to your Splunk HEC for ingestion.

For an automated Log Group ingestion story, write another small helper function that does the following (see the sketch just after this list):

  • Looks for Log Groups that are not yet subscribed as CloudWatch Logs Triggers.
  • Adds them to your CloudWatch to Splunk HEC function as a trigger and enables it.
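
Here is a rough shell sketch of that helper, reusing the placeholder function ARN from above. A real version would probably live in Lambda itself and use the SDK rather than the CLI, and this sketch ignores pagination of the describe-log-groups results.

# Subscribe any Log Group that has no subscription filter yet (ARNs and names are placeholders).
# Each Log Group also needs a matching 'aws lambda add-permission' call, as shown earlier.
FUNCTION_ARN="arn:aws:lambda:us-east-1:123456789012:function:cloudwatch-to-splunk-hec"
for lg in $(aws logs describe-log-groups --query 'logGroups[].logGroupName' --output text); do
  existing=$(aws logs describe-subscription-filters --log-group-name "$lg" \
    --query 'subscriptionFilters[].filterName' --output text)
  if [ -z "$existing" ]; then
    aws logs put-subscription-filter \
      --log-group-name "$lg" \
      --filter-name splunk-hec-ingest \
      --filter-pattern "" \
      --destination-arn "$FUNCTION_ARN"
  fi
done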

In the future I might add this ‘automatic trigger adding function’ to the Github repository, so stay tuned!

Synchronizing tmux Panes in a Window

Splitting a tmux session window into multiple panes can do wonders for productivity. Here you’ll see how easy synchronizing tmux panes is.

But, what if you would like to use this feature to automate a workflow across many machines?

You’ll be glad to know it is possible to synchronize panes in a tmux window.
This allows you to execute a series of commands or particular workflow across many machines.

First of all, if you haven’t already, split your window into the panes you need. The commands for this, of course (assuming CTRL + B is the prefix for tmux commands), are:

Split vertical: CTRL + B, %
Split horizontal: CTRL + B, "

You may wish at this point to connect each pane to the relevant machine you want in each.

Synchronize the panes by entering tmux command mode with:

CTRL + B, :

Now type:

setw synchronize-panes on

Hit enter and you’ll immediately notice all the panes are now synchronized. At this point you can go wild and execute whichever workflow you want to automate across the subjects.

You can toggle the synchronization off again by entering command mode once more and typing:

:setw synchronize-panes off

You can also use the same command to toggle synchronization by omitting the on/off parameter at the end; synchronization is simply flipped each time you run it:

:setw synchronize-panes
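
If you want to drive this from a script rather than typing into command mode, the same option can be set with the tmux CLI. A minimal sketch, assuming a session named work with your panes in window 0:

# Turn synchronization on for window 0 of the "work" session (session/window names are assumptions).
tmux set-window-option -t work:0 synchronize-panes on

# Every pane in the window now receives the same keystrokes.
tmux send-keys -t work:0 'uptime' Enter

# Switch synchronization back off when done.
tmux set-window-option -t work:0 synchronize-panes off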

Fast Batch S3 Bucket object deletion from the shell

This is a quick post showing a nice and fast batch S3 bucket object deletion technique.

I recently had an S3 bucket that needed cleaning up. It had a few million objects in it. With path separating forward slashes this means there were around 5 million or so keys to iterate.

The goal was to delete every object that did not have a .zip file extension. Effectively I wanted to leave only the .zip file objects behind (of which there were only a few thousand), but get rid of all the other millions of objects.

My first attempt was straightforward and naive: iterate every single key, check whether it has a .zip extension, and delete it if it doesn’t. However, every one of these iterations ended up being an HTTP request, and this turned out to be a very slow process. Definitely not fast batch S3 bucket object deletion…

I fired up about 20 shells all iterating over objects and deleting like this but it still would have taken days.

I then stumbled upon a really cool technique on serverfault that you can use in two stages.

  1. Iterate the bucket objects and stash all the keys in a file.
  2. Iterate the lines in the file in batches of 1000 and call delete-objects on each batch – effectively deleting the objects 1000 at a time (the maximum for a single delete request).

In between stage 1 and stage 2 I just had to clean up the large text file of object keys to remove any lines that referenced .zip objects. For this I used Sublime Text and a simple regex search and replace (replacing matches with an empty string to remove those lines).

So here is the process I used to delete everything in the bucket except the .zip objects. This took around 1-2 hours for the object key path collection and then the delete run.

Get all the object key paths

Note you will need to have Pipe Viewer (pv) installed first. Pipe Viewer is a great little utility that you can place into any normal pipeline between two processes. It gives you a handy progress indicator right in the shell.

aws s3api list-objects --output text --bucket the-bucket-name-here --query 'Contents[].[Key]' | pv -l > all-the-stuff.keys

 

Remove any object key paths you don’t want to delete

Open your all-the-stuff.keys file in Sublime or any other text editor with regex find and replace functionality.

The regex search for Sublime Text (with the dot escaped so that only keys ending in .zip match):

^.*\.zip\n

Find and replace all .zip object paths with the above regex string, replacing results with an empty string. Save the file when done. Make sure you use the correctly edited file for the following deletion phase!
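
If you’d rather stay in the shell than open a text editor, the same clean-up can be done with grep, writing the result to a new file so the original key list stays intact:

# Keep only the keys that do NOT end in .zip, i.e. the ones we want to delete.
grep -v '\.zip$' all-the-stuff.keys > keys-to-delete.keys

If you go this route, feed keys-to-delete.keys into the deletion one-liner below instead of the original file.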

Iterate all the object keys in batches and call delete

tail -n+0 all-the-stuff.keys | pv -l | grep -v -e "'" | tr '\n' '\0' | xargs -0 -P1 -n1000 bash -c 'aws s3api delete-objects --bucket the-bucket-name-here --delete "Objects=[$(printf "{Key=%q}," "$@")],Quiet=false"' _

This one-liner effectively:

  • tails the large text file of object keys (mine was around 250MB)
  • passes this into Pipe Viewer for progress indication
  • filters out (grep -v) any keys containing single quotes, which would otherwise break the quoting in the delete command
  • translates (tr) every newline character (effectively every line ending) into a null character ‘\0’
  • chops these up into groups of 1000 and passes each group of key paths as arguments via xargs to the aws s3api delete-objects command. This command accepts an Objects array parameter, which is where the 1000 object key paths are fed in.
  • finally, quiet mode is disabled to show the result of the delete requests in the shell, but you can also set this to true to suppress that output.

Effectively you end up calling aws s3api delete-objects, passing in 1000 objects to delete at a time.
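
For clarity, each batch that the one-liner generates boils down to a call like this, shown here with just two made-up keys in the same shorthand syntax:

aws s3api delete-objects --bucket the-bucket-name-here \
  --delete 'Objects=[{Key=some/path/file-0001.log},{Key=some/path/file-0002.log}],Quiet=false'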

This is how it can get through the work so quickly.

Nice!

Streamlining AWS AMI image creation and management with Packer

If you want to set up quick and efficient provisioning and automation pipelines and you rely on machine images as a part of this framework, you’ll definitely want to prepare and maintain preconfigured images.

With AWS you can of course leverage Amazon’s AMIs for EC2 machine images. If you’re configuring autoscaling for an application, you definitely don’t want your launch configurations to launch new EC2 instances from base Amazon AMI images and then install any prerequisites your application needs at runtime. That would be slow and tedious and would lead to sluggish and unresponsive auto scaling.

Packer comes in at this point as a great tool to script, automate and pre-bake custom AMI images. (Packer is a tool by Hashicorp, of Terraform fame). Packer also enables us to store our image configuration in source control and set up pipelines to test our images at creation time, so that when it comes time to launching them, we can be confident they’ll work.

Packer doesn’t only work with Amazon AMIs. It supports tons of other image formats via different Builders, so if you’re on Azure or some other cloud or even on-premise platform you can also use it there.

Below I’ll be listing out the high-level steps to create your own custom AMI using Packer. It’ll be Windows Server 2012 R2 based, enable WinRM connections at build time (to allow Packer to remote in and run various setup scripts), handle sysprep and EC2 configuration (setting the administrator password, EC2 computer name, and so on), and will even run some provisioning tests with Pester.

You can grab the files / policies required to set this up on your own from my GitHub repo here.

Setting up credentials to run Packer and an IAM role for your Packer build machine to assume

First things first, you need to be able to run Packer with the minimum set of permissions it requires. You can run Packer on an EC2 instance that has an EC2 role attached providing the right permissions, or, if you’re running from a workstation, you’ll probably want to use an IAM user access/secret key.

Here is an IAM policy that you can use for either of these. Note it also includes an iam:PassRole statement that references an AWS account number and specific role. You’ll need to update the account number to your own, and create the Role called Packer-S3-Access in your own account.

IAM Policy for user or instance running Packer:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:AttachVolume",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CopyImage",
                "ec2:CreateImage",
                "ec2:CreateKeypair",
                "ec2:CreateSecurityGroup",
                "ec2:CreateSnapshot",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:DeleteKeypair",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteSnapshot",
                "ec2:DeleteVolume",
                "ec2:DeregisterImage",
                "ec2:DescribeImageAttribute",
                "ec2:DescribeImages",
                "ec2:DescribeInstances",
                "ec2:DescribeRegions",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSnapshots",
                "ec2:DescribeSubnets",
                "ec2:DescribeTags",
                "ec2:DescribeVolumes",
                "ec2:DetachVolume",
                "ec2:GetPasswordData",
                "ec2:ModifyImageAttribute",
                "ec2:ModifyInstanceAttribute",
                "ec2:ModifySnapshotAttribute",
                "ec2:RegisterImage",
                "ec2:RunInstances",
                "ec2:StopInstances",
                "ec2:TerminateInstances",
                "ec2:RequestSpotInstances",
                "ec2:CancelSpotInstanceRequests"
            ],
            "Resource": "*"
        },
        {
            "Effect":"Allow",
            "Action":"iam:PassRole",
            "Resource":"arn:aws:iam::YOUR_AWS_ACCOUNT_NUMBER_HERE:role/Packer-S3-Access"
        }
    ]
}
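
If you go the IAM user route, the policy above can be attached with a single CLI call. The user name and policy file name here are hypothetical; save the JSON above to the file first.

# Attach the build permissions above as an inline policy on the user that runs Packer.
aws iam put-user-policy \
  --user-name packer-builder \
  --policy-name packer-build-permissions \
  --policy-document file://packer-build-policy.json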

IAM Policy to attach to a new role called Packer-S3-Access (note: replace the referenced S3 bucket name with a bucket of your own that will hold the artifacts you provision into your AMI images). See a little further down for details on the bucket.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowS3BucketListing",
            "Action": [
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::YOUR-OWN-PROVISIONING-S3-BUCKET-HERE"
            ],
            "Condition": {
                "StringEquals": {
                    "s3:prefix": [
                        "",
                        "Packer/"
                    ],
                    "s3:delimiter": [
                        "/"
                    ]
                }
            }
        },
        {
            "Sid": "AllowListingOfdesiredFolder",
            "Action": [
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::YOUR-OWN-PROVISIONING-S3-BUCKET-HERE"
            ],
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "Packer/*"
                    ]
                }
            }
        },
        {
            "Sid": "AllowAllS3ActionsInFolder",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR-OWN-PROVISIONING-S3-BUCKET-HERE/Packer/*"
            ]
        }
    ]
}

This will allow Packer to use the iam_instance_profile configuration value to specify the Packer-S3-Access EC2 role in your image definition file. Essentially, this lets your temporary Packer EC2 instance assume the Packer-S3-Access role, which grants it just enough privileges to download the bootstrapping files / artifacts you may wish to bake into your custom AMI. It is all reasonably secure too: the policy only allows the Packer instance to pass this specific role, and the instance itself is temporary.
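
For reference, here is a minimal sketch of creating that role and its instance profile with the AWS CLI. The trust policy simply allows EC2 to assume the role, and the local policy file name is hypothetical; it should contain the S3 policy shown above.

# Create the role with an EC2 trust relationship.
aws iam create-role \
  --role-name Packer-S3-Access \
  --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

# Attach the S3 access policy shown earlier (saved locally as packer-s3-access-policy.json).
aws iam put-role-policy \
  --role-name Packer-S3-Access \
  --policy-name packer-s3-access \
  --policy-document file://packer-s3-access-policy.json

# Expose the role as an instance profile so iam_instance_profile can reference it.
aws iam create-instance-profile --instance-profile-name Packer-S3-Access
aws iam add-role-to-instance-profile \
  --instance-profile-name Packer-S3-Access \
  --role-name Packer-S3-Access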

Setting up your Packer image definition

Once the above policies and roles are in place, you can set up your main packer image definition file. This is a JSON file that will describe your image definition as well as the scripts and items to provision inside it.

Look at standardBaseImage.json in the GitHub repository to see how this is defined.

standardBaseImage.json

{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "instance_type": "t2.small",
    "ami_name": "Shogan-Server-2012-Build-{{isotime \"2006-01-02\"}}-{{uuid}}",
    "iam_instance_profile": "Packer-S3-Access",
    "user_data_file": "./ProvisionScripts/ConfigureWinRM.ps1",
    "communicator": "winrm",
    "winrm_username": "Administrator",
    "winrm_use_ssl": true,
    "winrm_insecure": true,
    "source_ami_filter": {
      "filters": {
        "name": "Windows_Server-2012-R2_RTM-English-64Bit-Base-*"
      },
      "most_recent": true
    }
  }],
  "provisioners": [
    {
        "type": "powershell",
        "scripts": [
            "./ProvisionScripts/EC2Config.ps1",
            "./ProvisionScripts/BundleConfig.ps1",
            "./ProvisionScripts/SetupBaseRequirementsAndTools.ps1",
            "./ProvisionScripts/DownloadAndInstallS3Artifacts.ps1"
        ]
    },
    {
        "type": "file",
        "source": "./Tests",
        "destination": "C:/Windows/Temp"
    },
    {
        "type": "powershell",
        "script": "./ProvisionScripts/RunPesterTests.ps1"
    },
    {
        "type": "file",
        "source": "PesterTestResults.xml",
        "destination": "PesterTestResults.xml",
        "direction": "download"
    }
  ],
  "post-processors": [
    {
        "type": "manifest"
    }
  ]
}

When Packer runs, it will build out an EC2 machine as per the definition file, copy across any content specified, and execute any provisioning scripts defined in the file.
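
With the roles and definition file in place, the build itself is just two commands run from the directory containing standardBaseImage.json, assuming your AWS credentials are available to Packer via the usual environment variables or an instance role:

# Check the template for errors, then kick off the AMI build.
packer validate standardBaseImage.json
packer build standardBaseImage.json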

The packer image definition in the repository I’ve linked above will:

  • Create a Server 2012 R2 base instance.
  • Enable WinRM for Packer to be able to connect to the temporary instance.
  • Run sysprep to generalize it.
  • Set up EC2 configuration.
  • Download a bunch of tools (including Pester for running tests once the image build is done).
  • Download any S3 artifacts you’ve placed in a specific bucket in your account and store them on the image.

S3 Downloads into your AMI during build

Create a new S3 bucket and give it a unique name of your choice. Keep it private, and create a new virtual folder inside the bucket called Packer. This bucket should have the same name you specified in the Packer-S3-Access role policy a few sections back.

Place any software installers or artifacts you would like to be baked into your image in the /Packer virtual folder.

Update the DownloadAndInstallS3Artifacts.ps1 script to reference any software installers and execute the installers. (See the commented out section for an example). This PowerShell script will download anything under the /Packer virtual folder and store it in your image under C:\temp\S3Downloads.
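
Here is a quick sketch of setting that bucket up and uploading an artifact with the AWS CLI. The bucket name matches the placeholder used in the policy earlier and the installer file is hypothetical; new buckets are private by default.

# Create the provisioning bucket and upload an installer under the Packer/ virtual folder.
aws s3api create-bucket --bucket your-own-provisioning-s3-bucket-here --region us-east-1
aws s3 cp ./SomeInstaller.msi s3://your-own-provisioning-s3-bucket-here/Packer/SomeInstaller.msi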

Testing

Finally, you can add your own Pester tests to validate tasks carried out during the Packer image creation.

Define any custom tests under the /Tests folder.

Here is a simple test that checks that the S3 download of items from /Packer was successful (the Read-S3Object cmdlet will create the folder and download items into it from your bucket):

Describe  'S3 Artifacts Downloads' {
    It 'downloads artifacts from S3' {
        "C:\temp\S3Downloads" | Should -Exist
    }
}

The main image definition file ensures that these are all copied into the image at build time (to the temp directory) and from there Pester executes them.

Hook up your image build process to a build system like TeamCity and you can get it to output the results of the tests from PesterTestResults.xml.

Have fun automating and streamlining your image builds with Packer and Pester!

Octopus Deploy Endpoint auto configuration on Azure VM deployment

I’ve been working on a very cool project that involves the use of Microsoft Azure, TeamCity and Octopus Deploy.

I have created an Azure PowerShell script that deploys VMs into an Azure Subscription (Web machines that run IIS) as a part of a single Azure Cloud Service with load balancing enabled. As such, the endpoint ports that I create for Octopus tentacle communication need to differ for each machine on the public interface.

I wanted to fully automate things from end-to-end, so I wrote a very small console application that uses the Octopus Client library NuGet package in order to be able to communicate with your Octopus Deploy server via the HTTP API.

Octopus Endpoint Configurator on GitHub

The OctopusConfigurator console application should be run in your Azure VM once it is deployed, with four parameters specified when run.

It will then establish communication with your Octopus Deploy server and register a new Tentacle endpoint using the details you pass it. The standard port number that gets assigned (10933) will then be replaced, if necessary, with the correct endpoint port number for that particular VM instance in your cloud service. For example, I usually start the first VM in my cloud service on 10933, then increment the port number by 1 for every extra VM in the cloud service. As the deployments happen, the console application registers each new machine’s Tentacle with the Octopus master server using the incremented port number.

Once the Azure VM deployment is complete, I tell the VMs in the cloud service to restart with a bit of Azure PowerShell. Once that is done, your Octopus environment page should show all the newly deployed Tentacles as online for your environment. Here is an example of an Invoke-Command script block that I execute remotely on my Azure VMs as soon as they have completed initial deployment. The VM deployment script waits for Windows to boot; once ready, the WinRM details for the VM are fetched using the Get-AzureWinRMUri cmdlet, which allows me to use Invoke-Command to run the script below inside the guest VM.

 

Invoke-Command -ConnectionUri $connectionString -Credential $creds -ArgumentList $vmname,$externalDNSName,$creds,$InstallTentacleFunction,$OctopusExternalPort,$OctopusEnvironmentName -ScriptBlock {
	
	$webServerName = $args[0]
    $DNSPassthrough = $args[1]
    $passedCredentials = $args[2]
    $scriptFunction = $args[3]
    $OctoPort = $args[4]
    $OctopusEnvironmentName = $args[5]
		
	function DownloadFileUrl($url, $destinationPath, $fileNameToSave)
	{
	    $fullPath = "$destinationPath\$fileNameToSave"

	    if (Test-Path -Path $destinationPath)
	    {
	        Invoke-WebRequest $url -OutFile $fullPath
	    }
	    else
	    {
	        mkdir $destinationPath
	        Invoke-WebRequest $url -OutFile $fullPath
	    }

	    Write-Host "Full path is: $fullPath"
	    return [string]$fullPath
	}
	
	# Download the Octopus Endpoint Configurator to C:\Temp
	[string]$ConfiguratorPath = DownloadFileUrl "https://dl.dropboxusercontent.com/u/xxxxxxx/Apps/OctopusConfigurator.zip" "C:\Temp" "OctopusConfigurator.zip"
	
	Write-Host "Unzipping OctopusConfigurator.zip" -ForegroundColor Green
    cd C:\Temp
    $shell_app=new-object -com shell.application
    $filename = "OctopusConfigurator.zip"
    $zip_file = $shell_app.namespace((Get-Location).Path + "\$filename")
    $destination = $shell_app.namespace((Get-Location).Path)
    $destination.Copyhere($zip_file.items())
	
    cd C:\Temp

    if (Test-Path -Path .\OctopusConfigurator.exe)
    {
        & .\OctopusConfigurator.exe http://theoctopusurl.domain API-XXXXXXXXXXXXXXXXXXXXXX $webServerName $OctoPort
        Write-Host "Reconfigured Octopus Machine URI to correct port number" -ForegroundColor Green
    }
    else
    {
        Write-Host "OctopusConfigurator not found!" -ForegroundColor Red
        Exit
    }
}