Minimal Cost Web Hosting With Spot, Graviton2, EFS, Traefik, & Let’s Encrypt

web

I’m constantly searching for minimal cost web hosting solutions. To clarify that statement, I mean ‘dynamic‘ websites, not static. At the moment I am running this blog and a bunch of others on a Raspberry Pi Kubernetes cluster at home. I got to thinking though, what happens if I need to move? I’ll have an inevitable period of downtime. Clearly self-hosting from home has it’s drawbacks.

I’ve run my personal dynamic websites from AWS before (EC2 with a single Docker instance), but used an application load balancer (ALB) to help with routing traffic to different hostnames. The load balancer itself adds a large chunk of cost, and storage was EBS, a little more difficult to manage when automating host provisioning.

A Minimal Cost Web Hosting Infrastructure in AWS

I wanted to find something that minimises costs in AWS. My goal was to go as cheap as possible. I’ve arrived at the following solution, which saves on costs for networking, compute, and storage.

minimal cost web hosting Infrastructure diagram
  • AWS spot EC2 single instance running on AWS Graviton2 (ARM).
  • EFS storage for persistence (a requirement is that containers have persistence, as I use wordpress and require MySQL etc…)
  • Elastic IP address
  • Simple Lambda Function that manages auto-attachment of a static, Elastic IP (EIP) to the single EC2 instance. (In case the spot instance is terminated due to demand/price changes for example).
  • Traefik v2 for reverse proxying of traffic hitting the single EC2 instance to containers. This allows for multiple websites / hosts on a single machine

It isn’t going to win any high availability awards, but I’m OK with that for my own self-hosted applications and sites.

One important requirement with this solution is the ability to run dynamic sites. I know I could be doing this all a lot easier with S3/CloudFront if I were to only be hosting static sites.

Using this setup also allows me to easily move workloads between my home Kubernetes cluster and the cloud. This is because the docker images and tags I am using are now compatible between ARM (on Raspberry Pi) and ARM on Graviton2 AWS docker instances.

The choices I have gone with allow me to avoid ‘cloud lock in’, as I can easily switch between the two setups if needed.

Cost Breakdown

I’ve worked out the monthly costs to be roughly as follows:

  • EC2 Graviton2 ARM based instance (t4g.medium), $7.92
  • 3GB EFS Standard Storage, $0.99
  • Lambda – will only invoke when an EC2 instance change occurs, so cost not even worth calculating
  • EIP – free, as it will remain attached to the EC2 instance at all times
minimal cost web hosting solution - spot instance pricing chart
Current Spot Instance pricing for t4g.medium instances

If you don’t need 4GB of RAM, you can drop down to a t4g.small instance type for half the cost.

Total monthly running costs should be around $8.91.

Keep in mind that this solution will provide multiple hostname support (multiple domains/sites hosted on the same system), storage persistence, and a pretty quick and responsive ARM based Graviton2 processor.

You will need to use ARM compatible Docker images, but there are plenty out there for all the standard software like MySQL, WordPress, Adminer, etc…

How it Works

The infrastructure diagram above pretty much explains how everything fits together. But at a high level:

  • An Autoscaling Group is created, in mixed mode, but only allows a single, spot instance. This EC2 instance uses a standard Amazon Linux 2 ARM based AMI (machine image).
  • When the new instance is created, a Lambda function (subscribed to EC2 lifecycle events) is invoked, locates a designated Elastic IP (EIP), and associates it with the new spot EC2 instance.
  • The EC2 machine mounts the EFS storage on startup, and bootstraps itself with software requirements, a base Traefik configuration, as well as your custom ‘dynamic’ Traefik configuration that you specify. It then launches the Traefik container.
  • You point your various A records in DNS to the public IP address of the EIP.
  • Now it doesn’t matter if your EC2 spot instance is terminated, you’ll always have the same IP address, and the same EFS storage mounted when the new one starts up.
  • There is the question of ‘what if the spot market goes haywire?’ By default the spot price will be allowed to go all the way up to the on-demand price. This means you could potentially pay more for the EC2 instance, but it is not likely. If it did happen, you could change the instance configuration or choose another instance type.

Deploying the Solution

As this is an AWS opinionated infrastructure choice, I’ve packaged everything into an AWS Cloud Development Kit (AWS CDK) app. AWS CDK is an open source software development framework that allows you to do infrastructure-as-code. I’ve used Typescript as my language of choice.

Clone the source from GitHub

Deploy Requirements

You’ll need the following requirements on your local machine to deploy this for yourself:

  • NodeJS installed, along with npm.
  • AWS CDK installed globally (npm install -g aws-cdk)
  • Define your own traefik_dynamic.toml configuration, and host it somewhere where the EC2 instance will be able to grab it with curl. Note, that the Traefik dashboard basic auth password is defined using htpasswd.
htpasswd -nb YourUsername YourSuperSecurePasswordGoesHere
  • An existing VPC in your account to use. The CDK app does not create a VPC (additional cost). You can definitely use your default account VPC that is already available in all accounts though.
  • An existing AWS Keypair
  • An existing Elastic IP address (EIP) created, and tagged with the key/value of Usage:Traefik (this is for the Lambda function to identify the right EIP to associate to the EC2 instance when it starts)
Tag requirement for the Elastic IP Address

I haven’t set up the CDK app to pass in parameters, so you’ll just need to modify a bunch of variables at the top of aws-docker-web-with-traefik-stack.ts to substitute your specific values for the aforementioned items. For example:

const vpcId = "your-vpc-id";
const instanceType = "t4g.medium"; // t4g.small for even more cost saving
const keypairName = "your-existing-keypair-name";
const managementLocationCidr = "1.1.1.1/32"; // your home / management network address that SSH access will be allowed from. Change this!
const traefikDynContentUrl = "https://gist.githubusercontent.com/Shogan/f96a5a20183e672f9c49f278ea67503b/raw/351c52b7f2bacbf7b8dae65404b61ff4e4313d81/example-traefik-dynamic.toml"; // this should point to your own dynamic traefik config in toml format.
const emailForLetsEncryptAcmeResolver = 'email = "youremail@example.com"'; // update this to your own email address for lets encrypt certs
const efsAutomaticBackups = false; // set to true to enable automatic backups for EFS

Build and Deploy

Build the Typescript project using npm run build. This compiles the CDK and the EIP Manager Lambda function typescript.

At this point you’re ready to deploy with CDK.

If you have not used CDK before, all you really need to know is that it takes the infrastructure described by the code (typescript in this case), and coverts it to CloudFormation language. The cdk deploy command deploys the stack (which is the collection of AWS resources defined in code).

Run:

# Check what changes will be made first
cdk diff

# Deploy
cdk deploy

Testing a Sample Application Stack

Here is a sample docker-compose stack that will install MySQL, Adminer, and a simple WordPress setup.

SSH onto the EC2 instance that is provisioned, and use docker-compose up -d deploy the compose example stack. Just remember to edit and change the template passwords in the two environment variables.

You’ll also need to update the hostnames to your own (from .example.com), and point those A records to your Elastic public IP address.

One more thing, there is a trick to running docker-compose on ARM systems. I personally prefer to grab a docker image that contains a pre-built docker-compose binary, and shell script that ties it together with the docker-compose command. Here are the steps if you need them (run on the EC2 instance that you SSH onto):

sudo curl -L --fail https://raw.githubusercontent.com/linuxserver/docker-docker-compose/master/run.sh -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

For your own peace of mind, make sure you inspect that githubusercontent run.sh script yourself before downloading, as well as the docker image(s) it references and pulls down to run docker-compose.

Tear Down

To destroy the stack, simply issue the cdk destroy command. The EFS storage is marked by default with a retain policy, so it will not be deleted automatically.

cdk destroy AwsDockerWebWithTraefikStack
deleting the minimal cost web hosting solution cdk stack

Closing

If you’re on the look out for a minimal cost web hosting solution, then give this a try.

The stack uses the new Graviton2 based t4g instance type (ARM) to help achieve a minimal cost web hosting setup. Remember to find compatible ARM docker images for your applications before you go all in with something like this.

The t4g instance family is also a ‘burstable’ type. This means you’ll get great performance as long as you don’t use up your burst credits. Performance will slow right down if that happens. Keep an eye on your burst credit balance with CloudWatch. For 99% of use cases you’ll likely be just fine though.

Also remember that you don’t need to stick to AWS. You could bolt together services from any other cloud provider to do something similiar, most likely at a similar cost too.

Hashicorp Waypoint Server on Raspberry Pi

waypoint server running on raspberry pi

This evening I finally got a little time to play around with Waypoint. This wasn’t a straightforward install of Waypoint on my desktop though. I wanted to run and test HashiCorp Waypoint Server on Raspberry Pi. Specifically on my Pi Kubernetes cluster.

Out of the box Waypoint is simple to setup locally, whether you’re on Windows, Linux, or Mac. The binary is written in the Go programming language, which is common across HashiCorp software.

There is even an ARM binary available which lets you run the CLI on Raspberry Pi straight out of the box.

Installing Hashicorp Waypoint Server on Raspberry Pi hosted Kubernetes

I ran into some issues initially when assuming that waypoint install --platform=kubernetes -accept-tos would ensure an ARM docker image was pulled down for my Pi based Kubernetes hosts though.

My Kubernetes cluster also has the nfs-client-provisioner setup, which fulfills PersistentVolumeClaim resources with storage from my home FreeNAS Server Build. I noticed that PVCs were not being honored because they did not have the specific storage-class of nfs-storage that my nfs-client-provisioner required.

Fixing the PVC Issue

Looking at the waypoint CLI command, it’s possible to generate the YAML for the Kubernetes resources it would deploy with a --platform=kubernetes flag. So I fetched a base YAML resource definition:

waypoint install --platform=kubernetes -accept-tos --show-yaml

I modified the volumeClaimTemplates section to include my required PVC storageClassName of nfs-storage.

volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: nfs-storage
      resources:
        requests:
          storage: 1Gi

That sorted out the pending PVC issue in my cluster.

Fixing the ARM Docker Issue

Looking at the Docker image that the waypoint install command for Kubernetes gave me, I could see right away that it was not right for ARM architecture.

To get a basic Waypoint server deployment for development and testing purposes on my Raspberry Pi Kubernetes Cluster, I created a simple Dockerfile for armhf builds.

Basing it on the hypriot/rpi-alpine image, to get things moving quickly I did the following in my Dockerfile.

  • Added few tools, such as cURL.
  • Added a RUN command to download the waypoint ARM binary (currently 0.1.3) from Hashicorp releases and place in /usr/bin/waypoint.
  • Setup a /data volume mount point.
  • Created a waypoint user.
  • Created the entrypoint for /usr/bin/waypoint.

You can get my ARM Waypoint Server Dockerfile on Github, and find the built armhf Docker image on Docker Hub.

Now it is just a simple case of updating the image in the generated YAML StatefulSet to use the ARM image with the ARM waypoint binary embedded.

containers:
- name: server
  image: shoganator/waypoint:0.1.3.20201026-armhf
  imagePullPolicy: Always

With the YAML updated, I simply ran kubectl apply to deploy it to my Kubernetes Cluster. i.e.

kubectl apply -f ./waypoint-armhf.yaml

Now Waypoint Server was up and running on my Raspberry Pi cluster. It just needed bootstrapping, which is expected for a new installation.

Hashicorp Waypoint Server on Raspberry Pi - pod started.

Configuring Waypoint CLI to Connect to the Server

Next I needed to configure my internal jumpbox to connect to Waypoint Server to verify everything worked.

Things may differ for you here slightly, depending on how your cluster is setup.

Waypoint on Kubernetes creates a LoadBalancer resource. I’m using MetalLB in my cluster, so I get a virtual LoadBalancer, and the EXTERNAL-IP MetalLB assigned to the waypoint service for me was 10.23.220.90.

My cluster is running on it’s own dedicated network in my house. I use another Pi as a router / jumpbox. It has two network interfaces, and the internal interface is on the Kubernetes network.

By getting an SSH session to this Pi, I could verify the Waypoint Server connectivity via it’s LoadBalancer resource.

curl -i --insecure https://10.23.220.90:9702

HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 3490
Content-Type: text/html; charset=utf-8
Last-Modified: Mon, 19 Oct 2020 21:11:45 GMT
Date: Mon, 26 Oct 2020 14:27:33 GMT

Bootstrapping Waypoint Server

On a first time run, you need to bootstrap Waypoint. This also sets up a new context for you on the machine you run the command from.

The Waypoint LoadBalancer has two ports exposed. 9702 for HTTPS, and 9701 for the Waypoint CLI to communicate with using TCP.

With connectivity verified using curl, I could now bootstrap the server with the waypoint bootstrap command, pointing to the LoadBalancer EXTERNAL-IP and port 9701.

waypoint server bootstrap -server-addr=10.23.220.90:9701 -server-tls-skip-verify
waypoint context list
waypoint context verify

This command gives back a token as a response and sets up a waypoint CLI context from the machine it ran from.

Waypoint context setup and verified from an internal kubernetes network connected machine.

Using Waypoint CLI from a machine external to the Cluster

I wanted to use Waypoint from a management or workstation machine outside of my Pi Cluster network. If you have a similar network setup, you could also do something similar.

As mentioned before, my Pi Router device has two interfaces. A wireless interface, and a phyiscal network interface. To get connectivity over ports 9701 and 9702 I used some iptables rules. Importantly, my Kubernetes facing network interface is on 10.0.0.1 in the example below:

sudo iptables -t nat -A PREROUTING -i wlan0 -p tcp --dport 9702 -j DNAT --to-destination 10.23.220.90:9702
sudo iptables -t nat -A POSTROUTING -p tcp -d 10.23.220.90 --dport 9702 -j SNAT --to-source 10.0.0.1
sudo iptables -t nat -A PREROUTING -i wlan0 -p tcp --dport 9701 -j DNAT --to-destination 10.23.220.90:9701
sudo iptables -t nat -A POSTROUTING -p tcp -d 10.23.220.90 --dport 9701 -j SNAT --to-source 10.0.0.1

These rules have the effect of sending traffic destined for port 9701 and 9702 hitting the wlan0 interface, to the MetalLB IP 10.23.220.90.

The source and destination network address translation will translate the ‘from’ address of the TCP reply packets to make them look like they’re coming from 10.0.0.1 instead of 10.23.220.90.

Now, I can simply setup a Waypoint CLI context on a machine on my ‘normal’ network. This network has visibility of my Raspberry Pi Router’s wlan0 interface. I used my previously generated token in the command below:

waypoint context create -server-addr=192.168.7.31:9701 -server-tls-skip-verify -server-auth-token={generated-token-here} rpi-cluster
waypoint context verify rpi-cluster
Connectivity verified from my macOS machine, external to my Raspberry Pi Cluster with Waypoint Server running there.

Concluding

Waypoint Server is super easy to get running locally if you’re on macOS, Linux or Windows.

With a little bit of extra work you can get HashiCorp Waypoint Server on Raspberry Pi working, thanks to the versatility of the Waypoint CLI!

Quick and Easy Local NPM Registry With Verdaccio and Docker

container storage

Sometimes it can be useful to be able to npm publish libraries or projects you’re working on to a local npm registry for use in other development projects.

This post is a quick how-to showing how you can get up and running with a private, local npm registry using Verdaccio and docker compose.

Verdaccio claims it is a zero config required NPM registry, and that is pretty much correct. You can have it up and running in under 5 minutes. Here’s how:

Local NPM Registry Quick Start

Clone verdaccio docker-examples and then change directory into the docker-examples/docker-local-storage-volume directory.

git clone https://github.com/verdaccio/docker-examples.git
cd docker-examples/docker-local-storage-volume

This particular sample docker-compose configuration gives you a locally run verdaccio instance along with persistence via local volume mount.

From here you can be up and running by simply issuing the following docker-compose command:

docker-compose up -d

However if you do want to make a few tweaks to the configuration, simply load up the conf/config.yaml file in your editor.

I wanted to change the max_body_size to a higher value to allow for larger npm packages to be published locally, so I added:

max_body_size: 500mb

If you haven’t yet started the local docker container, start it up with docker-compose up.

Usage

Now all you need to do is configure your local npm settings to use verdaccio on http://localhost:4873. This is default host and port that verdaccio is configured to listen on when running in docker locally).

Then add an npm user for local development:

npm adduser --registry http://localhost:4873

To use your new registry at a project level, you can create a .npmrc file in your local projects with the following content:

@shogan:registry=http://localhost:4873

Of course replace the scope of @shogan with the package scope of your choosing.

To publish a module / package locally:

npm publish --registry http://localhost:4873

Other Examples

There are lots more verdaccio samples and configurations that you can use in the docker-examples repository. Take a look to find these, including Kubernetes resources to deploy if you prefer running there for a local development setup.

Also refer to the verdaccio configuration page for more examples and documentation on the possible config options.

Using Placeholder Templates With Xargs In The Pipeline

pipes running along a wall

Using placeholder templates with xargs gives you a lot more power than simply using xargs to append args onto the end of a command.

I previously blogged about how to use xargs to append arguments to another command in the pipeline. This post goes into a bit more detail and shows you a more powerful way of using xargs.

Basic Xargs Example

With a typical, simple xargs command, you might append one argument onto the end of an existing command in the pipeline like this:

echo "this is a basic xargs example" | xargs echo "you said:"

The above command results in the output:

you said: this is a basic xargs example

Placeholder Templates With Xargs Example

Now, here is a simple example of using a placeholder, or template in your command, and passing your argument into that.

echo "this is a basic xargs example" | xargs -I {} echo "you said: {}"

The output is exactly the same as with the basic example. However, you can now use the curly braces placeholder to move the argument placement to anywhere in your command.

You’re no longer constrained to it being on the end of the echo (or whichever command you’re using). You can also do multiple placements of the argument in your command.

For example:

echo "FOO" | xargs -I {} echo "you said: {}. Here is another usage of your sample argument: {}. And here is yet another: {}"

A Slightly More Practical Example

Enough of the simple echo examples though. How about using this for a more practical, real world example?

In the following example, we want to list a bunch of AWS S3 buckets, and then do a summary output of their total size in GiB. We cut out the bucket name using cut from the initial listing that is returned with aws s3 ls.

aws s3 ls | cut -d' ' -f3 | xargs -n1 aws s3 ls --summarize --human-readable --recursive s3://

Using xargs to append the bucket name from the pipeline looks like it would work, as we only need it right at the end of the aws s3 ls command. There is an issue though, xargs would add a space, and we want the bucket name appended to s3:// without a space.

Using The Template or Placeholder

This is where using a placeholder or template with xargs can come in handy.

aws s3 ls | cut -d' ' -f3 | xargs -n1 -I {} aws s3 ls --summarize --human-readable --recursive s3://{}

It’s also worth noting that you can change your template or placeholder token with the -I parameter. It doesn’t have to be {} as in the examples above.

In summary, your usage of xargs can be levelled up by using the -I parameter to leverage placeholder or template tokens.

Select, match and pipe output into another command with jq, grep, and xargs

select-match-and-pipe-output-with-jq-grep-xargs

This is a quick reference post if you’re looking to pipe output into another command on Linux using xargs.

The pipeline is immensley powerful and you can leverage it to act on different stages of your full command to do specific selecting, matching, and manipulation.

Say you are running an executable that outputs a bunch of JSON and you want to select certain a certain subset of this data, pattern match it, and then send that matched data into another command.

This is the perfect use case for a mixture of the jq, grep and xargs commands in the pipeline.

Practical example with xargs

Here is a practical example where you might want to list all your AWS CodePipeline pipelines, match only on certain ones, and then execute (Release Changes) on each of them.

aws codepipeline list-pipelines | jq -r '.pipelines[].name' | grep project-xyz | xargs -n1 aws codepipeline start-pipeline-execution --name

The piped command above does the following:

  • Lists all AWS CodePipelines with the command aws codepipeline list-pipelines
  • Uses jq to ‘raw’ select the name from each pipeline object in the pipelines[] array that the above command outputs
  • Sends each pipeline name into grep to match only those containing the string “project-xyz”
  • Pipes the resulting pipeline names using xargs into the command aws codepipeline start-pipeline-execution --name. The -n1 argument tells xargs to use at most max-args of 1 per command line.

Cheap S3 Cloud Backup with BackBlaze B2

white and blue fiber optic cables in a FC storage switch

I’ve been constantly evolving my cloud backup strategies to find the ultimate cheap S3 cloud backup solution.

The reason for sticking to “S3” is because there are tons of cloud provided storage service implementations of the S3 API. Sticking to this means that one can generally use the same backup/restore scripts for just about any service.

The S3 client tooling available can of course be leveraged everywhere too (s3cmd, aws s3, etc…).

BackBlaze B2 gives you 10GB of storage free for a start. If you don’t have too much to backup you could get creative with lifecycle policies and stick within the 10GB free limit.

a lifecycle policy to delete objects older than 7 days.

Current Backup Solution

This is the current solution I’ve setup.

I have a bunch of files on a FreeNAS storage server that I need to backup daily and send to the cloud.

I’ve setup a private BackBlaze B2 bucket and applied a lifecycle policy that removes any files older than 7 days. (See example screenshot above).

I leveraged a FreeBSD jail to install my S3 client (s3cmd) tooling, and mount my storage to that jail. You can follow the steps below if you would like to setup something similar:

Step-by-step setup guide

Create a new jail.

Enable VNET, DHCP, and Auto-start. Mount the FreeNAS storage path you’re interested in backing up as read-only to the jail.

The first step in a clean/base jail is to get s3cmd compiled and installed, as well as gpg for encryption support. You can use portsnap to get everything downloaded and ready for compilation.

portsnap fetch
portsnap extract # skip this if you've already run extract before
portsnap update

cd /usr/ports/net/py-s3cmd/
make -DBATCH install clean
# Note -DBATCH will take all the defaults for the compile process and prevent tons of pop-up dialogs asking to choose. If you don't want defaults then leave this bit off.

# make install gpg for encryption support
cd /usr/ports/security/gnupg/ && make -DBATCH install clean

The compile and install process takes a number of minutes. Once complete, you should be able to run s3cmd –configure to set up your defaults.

For BackBlaze you’ll need to configure s3cmd to use a specific endpoint for your region. Here is a page that describes the settings you’ll need in addition to your access / secret key.

After gpg was compiled and installed you should find it under the path /usr/local/bin/gpg, so you can use this for your s3cmd configuration too.

Double check s3cmd and gpg are installed with simple version checks.

gpg --version
s3cmd --version
quick version checks of gpg and s3cmd

A simple backup shell script

Here is a quick and easy shell script to demonstrate compressing a directory path and all of it’s contents, then uploading it to a bucket with s3cmd.

DATESTAMP=$(date "+%Y-%m-%d")
TIMESTAMP=$(date "+%Y-%m-%d-%H-%M-%S")

tar --exclude='./some-optional-stuff-to-exclude' -zcvf "/root/$TIMESTAMP-backup.tgz" .
s3cmd put "$TIMESTAMP-backup.tgz" "s3://your-bucket-name-goes-here/$DATESTAMP/$TIMESTAMP-backup.tgz"

Scheduling the backup script is an easy task with crontab. Run crontab -e and then set up your desired schedule. For example, daily at 25 minutes past 1 in the morning:

25 1 * * * /root/backup-script.sh

My home S3 backup evolution

I’ve gone from using Amazon S3, to Digital Ocean Spaces, to where I am now with BackBlaze B2. BackBlaze is definitely the cheapest option I’ve found so far.

Amazon S3 is overkill for simple home cloud backup solutions (in my opinion). You can change to use infrequent access or even glacier tiered storage to get the pricing down, but you’re still not going to beat BackBlaze on pure storage pricing.

Digital Ocean Spaces was nice for a short while, but they have an annoying minimum charge of $5 per month just to use Spaces. This rules it out for me as I was hunting for the absolute cheapest option.

BackBlaze currently has very cheap storage costs for B2. Just $0.005 per GB and only $0.01 per GB of download (only really needed if you want to restore some backup files of course).

Concluding

You can of course get more technical and coerce a willing friend/family member to host a private S3 compatible storage service for you like Minio, but I doubt many would want to go to that level of effort.

So, if you’re looking for a cheap S3 cloud backup solution with minimal maintenance overhead, definitely consider the above.

This is post #4 in my effort towards 100DaysToOffload.