Cheap S3 Cloud Backup with BackBlaze B2

white and blue fiber optic cables in a FC storage switch

I’ve been constantly evolving my cloud backup strategies to find the ultimate cheap S3 cloud backup solution.

I stick to “S3” because many cloud storage services implement the S3 API. That means one can generally use the same backup/restore scripts with just about any of them.

The S3 client tooling available can of course be leveraged everywhere too (s3cmd, aws s3, etc…).

BackBlaze B2 gives you 10GB of storage free for a start. If you don’t have too much to back up, you could get creative with lifecycle policies and stick within the 10GB free limit.

a lifecycle policy to delete objects older than 7 days.

Current Backup Solution

This is the current solution I’ve set up.

I have a bunch of files on a FreeNAS storage server that I need to back up daily and send to the cloud.

I’ve set up a private BackBlaze B2 bucket and applied a lifecycle policy that removes any files older than 7 days. (See example screenshot above).

I leveraged a FreeBSD jail to install my S3 client (s3cmd) tooling, and mounted my storage to that jail. You can follow the steps below if you would like to set up something similar:

Step-by-step setup guide

Create a new jail.

Enable VNET, DHCP, and Auto-start. Mount the FreeNAS storage path you’re interested in backing up as read-only to the jail.

The first step in a clean/base jail is to get s3cmd compiled and installed, as well as gpg for encryption support. You can use portsnap to get everything downloaded and ready for compilation.

portsnap fetch
portsnap extract # skip this if you've already run extract before
portsnap update

cd /usr/ports/net/py-s3cmd/
make -DBATCH install clean
# Note -DBATCH will take all the defaults for the compile process and prevent tons of pop-up dialogs asking to choose. If you don't want defaults then leave this bit off.

# make install gpg for encryption support
cd /usr/ports/security/gnupg/ && make -DBATCH install clean

The compile and install process takes a number of minutes. Once complete, you should be able to run s3cmd --configure to set up your defaults.

For BackBlaze you’ll need to configure s3cmd to use a specific endpoint for your region. Here is a page that describes the settings you’ll need in addition to your access / secret key.

After gpg was compiled and installed you should find it under the path /usr/local/bin/gpg, so you can use this for your s3cmd configuration too.
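As a rough sketch, the B2-specific lines of the s3cmd configuration file (~/.s3cfg) might look like the output of the snippet below. The region us-west-002 and the helper name b2_s3cfg_fragment are assumptions for illustration only. Use the endpoint BackBlaze shows for your own bucket, and add your own access and secret keys.

```shell
# Print the B2-specific lines for ~/.s3cfg.
# The region argument is an assumption; substitute your bucket's region.
b2_s3cfg_fragment() {
    region="${1:-us-west-002}"
    cat <<EOF
host_base = s3.${region}.backblazeb2.com
host_bucket = %(bucket)s.s3.${region}.backblazeb2.com
use_https = True
gpg_command = /usr/local/bin/gpg
EOF
}

b2_s3cfg_fragment us-west-002
```

The gpg_command line simply points s3cmd at the gpg binary built from ports in the previous step.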

Double check s3cmd and gpg are installed with simple version checks.

gpg --version
s3cmd --version
quick version checks of gpg and s3cmd

A simple backup shell script

Here is a quick and easy shell script to demonstrate compressing a directory path and all of its contents, then uploading it to a bucket with s3cmd.

DATESTAMP=$(date "+%Y-%m-%d")
TIMESTAMP=$(date "+%Y-%m-%d-%H-%M-%S")

tar --exclude='./some-optional-stuff-to-exclude' -zcvf "/root/$TIMESTAMP-backup.tgz" .
s3cmd put "/root/$TIMESTAMP-backup.tgz" "s3://your-bucket-name-goes-here/$DATESTAMP/$TIMESTAMP-backup.tgz"

Scheduling the backup script is an easy task with crontab. Run crontab -e and then set up your desired schedule. For example, daily at 25 minutes past 1 in the morning:

25 1 * * * /root/backup-script.sh
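If you want the script to be a little more defensive under cron, one possible variation is sketched below. It wraps the same tar and s3cmd put steps in a function and only deletes the local archive once the upload has succeeded. The function name backup_dir and the WORKDIR and S3CMD overrides are hypothetical additions, not part of my original script.

```shell
# Sketch: archive a directory, upload it, and remove the local copy afterwards.
# WORKDIR and S3CMD are hypothetical overrides added for illustration.
backup_dir() {
    src_dir="$1"
    bucket="$2"
    work_dir="${WORKDIR:-/root}"
    s3cmd_bin="${S3CMD:-s3cmd}"

    datestamp=$(date "+%Y-%m-%d")
    timestamp=$(date "+%Y-%m-%d-%H-%M-%S")
    archive="$work_dir/$timestamp-backup.tgz"

    # -C changes into the source directory first, so archive paths are relative.
    tar -czf "$archive" -C "$src_dir" . || return 1

    # Only delete the local archive if the upload succeeded.
    if "$s3cmd_bin" put "$archive" "s3://$bucket/$datestamp/$timestamp-backup.tgz"; then
        rm -f "$archive"
    else
        echo "upload failed, keeping $archive" >&2
        return 1
    fi
}
```

A call such as backup_dir /mnt/data your-bucket-name-goes-here would then be the last line of the cron script (the /mnt/data path is just an example). Keep the working directory outside the directory being archived, otherwise tar will try to include its own output.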

My home S3 backup evolution

I’ve gone from using Amazon S3, to Digital Ocean Spaces, to where I am now with BackBlaze B2. BackBlaze is definitely the cheapest option I’ve found so far.

Amazon S3 is overkill for simple home cloud backup solutions (in my opinion). You can change to use infrequent access or even glacier tiered storage to get the pricing down, but you’re still not going to beat BackBlaze on pure storage pricing.

Digital Ocean Spaces was nice for a short while, but they have an annoying minimum charge of $5 per month just to use Spaces. This rules it out for me as I was hunting for the absolute cheapest option.

BackBlaze currently has very cheap storage costs for B2. Just $0.005 per GB per month for storage and only $0.01 per GB of download (only really needed if you want to restore some backup files of course).
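To get a feel for what that means in practice, here is a small estimate helper. It is only a sketch: b2_monthly_cost is a made-up name, the prices are the ones quoted above, and it assumes the first 10GB of storage are free and that no other fees apply.

```shell
# Estimate a monthly B2 bill in USD from the prices quoted above.
# Assumes the first 10GB of storage are free and ignores any other fees.
b2_monthly_cost() {
    stored_gb="$1"
    download_gb="${2:-0}"
    awk -v s="$stored_gb" -v d="$download_gb" 'BEGIN {
        billable = s - 10
        if (billable < 0) billable = 0
        printf "%.2f\n", billable * 0.005 + d * 0.01
    }'
}

# 110GB stored, nothing downloaded
b2_monthly_cost 110 0
```

With 110GB stored and no downloads this prints 0.50, so about fifty cents a month.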

Concluding

You can of course get more technical and coerce a willing friend/family member to host a private S3 compatible storage service for you like Minio, but I doubt many would want to go to that level of effort.

So, if you’re looking for a cheap S3 cloud backup solution with minimal maintenance overhead, definitely consider the above.

This is post #4 in my effort towards 100DaysToOffload.

AWS CodeBuild local with Docker

AWS have a handy post up that shows you how to get CodeBuild local by running it with Docker here.

Having a local CodeBuild environment available can be extremely useful. You can very quickly test your buildspec.yml files and build pipelines without having to push changes up to a remote repository or incur AWS charges by running pipelines in the cloud.

I found a few extra useful bits and pieces whilst running a local CodeBuild setup myself and thought I would document them here, along with a summarised list of steps to get CodeBuild running locally yourself.

Get CodeBuild running locally

Start by cloning the CodeBuild Docker git repository.

git clone https://github.com/aws/aws-codebuild-docker-images.git

Now, locate the Dockerfile for the CodeBuild image you are interested in using. I wanted to use the ubuntu standard 3.0 image. i.e. ubuntu/standard/3.0/Dockerfile.

Edit the Dockerfile to remove the ENTRYPOINT directive at the end.

# Remove this -> ENTRYPOINT ["dockerd-entrypoint.sh"]

Now run a docker build in the relevant directory.

docker build -t aws/codebuild/standard:3.0 .

The image will take a while to build and once done will of course be available to run locally.

Now grab a copy of this codebuild_build.sh script and make it executable.

curl -O https://gist.githubusercontent.com/Shogan/05b38bce21941fd3a4eaf48a691e42af/raw/da96f71dc717eea8ba0b2ad6f97600ee93cc84e9/codebuild_build.sh
chmod +x ./codebuild_build.sh

Place the shell script in your local project directory (alongside your buildspec.yml file).

Now it’s as easy as running this shell script with a few parameters to get your build going locally. Just use the -i option to specify the local docker CodeBuild image you want to run.

./codebuild_build.sh -c -i aws/codebuild/standard:3.0 -a output

The following three options are the ones I found most useful:

  • -c – passes in AWS configuration and credentials from the local host. Super useful if your buildspec.yml needs access to your AWS resources (most likely it will).
  • -b – use a buildspec.yml file elsewhere. By default the script will look for buildspec.yml in the current directory. Override with this option.
  • -e – specify a file to use as environment variable mappings to pass in.
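For the -e option, my understanding is that the file holds one VAR=VALUE pair per line with no quoting, but treat that as an assumption and check it against the script you downloaded. The file name and variables below are made up for illustration.

```shell
# Write a small environment file for the -e option.
# The variable names and values are made up for illustration.
cat > codebuild.env <<'EOF'
STAGE=test
AWS_DEFAULT_REGION=eu-west-1
EOF

# Then pass it to the build script alongside the other options:
# ./codebuild_build.sh -c -e codebuild.env -i aws/codebuild/standard:3.0 -a output
```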

Testing it out

Here is a really simple buildspec.yml if you want to test this out quickly and don’t have your own handy. Save the below YAML as simple-buildspec.yml.

version: 0.2

phases:
  install:
    runtime-versions:
      java: openjdk11
    commands:
      - echo This is a test.
  pre_build:
    commands:
      - echo This is the pre_build step
  build:
    commands:
      - echo This is the build step
  post_build:
    commands:
      - bash -c "if [ \"$CODEBUILD_BUILD_SUCCEEDING\" == \"0\" ]; then exit 1; fi"
      - echo This is the post_build step
artifacts:
  files:
    - '**/*'
  base-directory: './'

Now just run:

./codebuild_build.sh -b simple-buildspec.yml -c -i aws/codebuild/standard:3.0 -a output /tmp

You should see the script start up the docker container from your local image and ‘CodeBuild’ will start executing your buildspec steps. If all goes well you’ll get an exit code of 0 at the end.

aws codebuild test run output from a local Docker container.

Good job!

This post contributes to my effort towards 100DaysToOffload.

Saving £500 on a new Apple Mac Mini with 32GB RAM

mac mini internals

I purchased a new Apple Mac Mini recently and didn’t want to fall victim to Apple’s “RAM Tax”.

I used Apple’s site to configure a Mac Mini with a quad core processor, 32GB RAM, and a 512GB SSD.

I was shocked to see they added £600.00 to the price of a base model with 8GB RAM. They’re effectively charging all of this money for 24GB of extra RAM. This memory is nothing special: it’s pretty standard 2666MHz DDR4 SODIMM modules, the same stuff that is used in generic laptops.

I decided to cut back my order to the base model with 8GB of RAM. I ordered a Crucial 32GB Kit (2 x 16GB DDR4-2666 SODIMM modules running at 1.2 volts with a CAS latency of 19, i.e. CL19). This kit cost me just over £100.00 online.

The Crucial 2 x 16GB DDR4-2666 SODIMM kit

In total I saved around £500.00 for the trouble of about 30 minutes of work to open up the Mac Mini and replace the RAM modules myself.

The Teardown Process

Use the iFixit Guide

You can use my photos and brief explanations below if you would like to follow the steps I took to replace the RAM, but honestly, you’re better off following iFixit’s excellent guide here.

Follow along Here

If you want to compare or follow along in my format, then read on…

Get a good tool kit with hex screwdrivers. I used iFixit’s basic kit.

iFixit basic tool kit

Flip the Mac Mini upside down.

Carefully pry open the back cover with a plastic prying tool.

Undo the 6 x hex screws on the metal plate under the black plastic cover. Be careful to remember the positions of these, as there are 2 x different types. 3 x short screws, and 3 x longer.

opening the mac mini

Very carefully, move the cover to the side, revealing the WiFi antenna connector. Unscrew the small hex screw holding the metal tab on the cable. Use a plastic levering tool to carefully pop the antenna connector off.

Next, unscrew 4 x screws that hold the blower fan to the exhaust port. You can see one of the screws in the photo below. Two of the screws are angled at a 45 degree orientation, so carefully undo those, and use tweezers to catch them as they come out.

Carefully lift the blower fan up, and disconnect its cable using a plastic pick or prying tool. The trick is to lift from underneath the back of the cable’s connector and it’ll pop off.

mac mini blower fan removal

Next, disconnect the main power cable at the top right of the photo below. This requires a little bit of wiggling to loosen and lift it as evenly as possible.

Now disconnect the LED cable (two pin). It’s very delicate, so do this as carefully as possible.

There are two main hex screws to remove from the motherboard central area now. You can see them removed below near the middle (where the brass/gold coloured rings are).

With everything disconnected, carefully push the inner motherboard and its tray out, using your thumbs on the fan’s exhaust port. You should ideally position your thumbs on the screw hole areas of the fan exhaust port. It’ll pop out, then just very carefully push it all the way out.

The RAM area is protected by a metal ‘cage’. Unscrew its 4 x hex screws and slowly lift the cage off the RAM retainer clips.

Carefully push the RAM module retainer clips to the side (they have a rubber grommet type covering over them), and the existing SODIMM modules will pop loose.

mac mini SODIMM RAM modules and slots

Remove the old modules and replace with your new ones. Make sure you align the modules in the correct orientation. The slots are keyed, so pay attention to that. Push them down toward the board once aligned and the retainer clips will snap shut and lock them in place.

Replace the RAM ‘cage’ with its 4 x hex screws.

Reverse the steps you took above to insert the motherboard tray back into the chassis and re-attach all the cables and connectors in the correct order.

Make sure you didn’t miss any screws or cables when reconnecting everything.

Finally boot up and enjoy your cheap RAM upgrade.

Ingest CloudWatch Logs to a Splunk HEC with Lambda and Serverless

cloudwatch logs to splunk HEC via Lambda

I recently came across a scenario requiring CloudWatch log ingestion to a private Splunk HEC (HTTP Event Collector).

The first and preferred method of ingesting CloudWatch Logs into Splunk is by using AWS Firehose. The problem here though is that Firehose only seems to support an endpoint that is open to the public.

This is a problem if you have a Splunk HEC that is only available inside of a VPC and there is no option to proxy public connections back to it.

The next thing I looked at was the Splunk AWS Lambda function template to ingest CloudWatch logs from Log Group events. I had a quick look and it seems pretty out of date, with synchronous functions and libraries in use.

So, I decided to put together a small AWS Lambda Serverless project to improve on what is currently out there.

You can find the code over on Github.

The new version has:

  • async / await, with promises wrapping the libraries that don’t support it natively, like zlib.
  • A module that handles identification of Log Group names based on a custom regex pattern. If events come from log groups that don’t match the naming convention, then they get rejected. The idea is that you can write another small function that auto-subscribes Log Groups.
  • Secrets Manager integration for loading the Splunk HEC token from Secrets Manager. (Or fall back to a simple environment variable if you like).
  • Serverless framework wrapper. Pass in your Security Group ID, Subnet IDs and tags, and let serverless CLI deploy the function for you.
  • Lambda VPC support by default. You should deploy this Lambda function in a VPC. You could change that, but my idea here is that most enterprises would be running their own internal Splunk inside of their corporate / VPC network. Change it by removing the VPC section in serverless.yml if you do happen to have a public facing Splunk.

You deploy it using Serverless framework, passing in your VPC details and a few other options for customisation.

serverless deploy --stage test \
  --iamRole arn:aws:iam::123456789012:role/lambda-vpc-execution-role \
  --securityGroupId sg-12345 \
  --privateSubnetA subnet-123 \
  --privateSubnetB subnet-456 \
  --privateSubnetC subnet-789 \
  --splunkHecUrl https://your-splunk-hec:8088/services/collector \
  --secretManagerItemName your/secretmanager/entry/here

Once configured, it’ll pick up any log events coming in from Log Groups you’ve ‘subscribed’ it to (Lambda CloudWatch Logs Triggers).

Add your Lambda CloudWatch Logs triggers and enable them for automatic ingestion into Splunk

These events get enriched with extra metadata defined in the function. The metadata is derived by default from the naming convention used in the CloudWatch Log Groups. Take a close look at the included Regex pattern to ensure you name your Log Groups appropriately. Finally, they’re sent to your Splunk HEC for ingestion.
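Before any enrichment can happen, each incoming event has to be unpacked. CloudWatch Logs delivers its payload to Lambda in the awslogs.data field as base64-encoded, gzip-compressed JSON. The function in the repository does this in Node.js; the shell sketch below is only meant for inspecting a payload by hand, and the helper name and sample content are made up.

```shell
# Decode the base64 + gzip payload found in the event's awslogs.data field.
decode_awslogs_data() {
    printf '%s' "$1" | base64 -d | gunzip -c
}

# Build a sample payload the same way it arrives (illustrative content only).
sample='{"logGroup":"/example/app","logEvents":[{"message":"hello"}]}'
encoded=$(printf '%s' "$sample" | gzip -c | base64 | tr -d '\n')

decode_awslogs_data "$encoded"
```

The decoded JSON carries the logGroup name, which is what the naming-convention regex is matched against.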

For an automated Log Group ingestion story, write another small helper function that:

  • Looks for Log Groups that are not yet subscribed as CloudWatch Logs Triggers.
  • Adds them to your CloudWatch to Splunk HEC function as a trigger and enables it.
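As a starting point, the two steps above could be prototyped with the AWS CLI before turning them into a proper function. This is a sketch under assumptions: the function name, the splunk-hec filter name, and the DEST_ARN variable (the ARN of your CloudWatch to Splunk HEC Lambda function) are all placeholders of mine.

```shell
# Hypothetical helper: subscribe every log group that has no subscription filter yet.
# DEST_ARN is assumed to hold the ARN of the CloudWatch-to-Splunk Lambda function.
subscribe_missing_log_groups() {
    prefix="$1"
    aws logs describe-log-groups --log-group-name-prefix "$prefix" \
        --query 'logGroups[].logGroupName' --output text | tr '\t' '\n' |
    while read -r group; do
        [ -n "$group" ] || continue
        # Count the filters already attached to this log group.
        existing=$(aws logs describe-subscription-filters --log-group-name "$group" \
            --query 'length(subscriptionFilters)' --output text)
        if [ "$existing" = "0" ]; then
            # An empty filter pattern forwards every log event.
            aws logs put-subscription-filter --log-group-name "$group" \
                --filter-name splunk-hec --filter-pattern "" \
                --destination-arn "$DEST_ARN"
            echo "subscribed $group"
        fi
    done
}
```

The Lambda function also needs a resource policy allowing the CloudWatch Logs service to invoke it (for example via aws lambda add-permission), which this sketch leaves out.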

In the future I might add this ‘automatic trigger adding function’ to the Github repository, so stay tuned!

Synchronizing tmux Panes in a Window

Splitting a tmux session window into multiple panes can do wonders for productivity. Here you’ll see how easy synchronizing tmux panes is.

But, what if you would like to use this feature to automate a workflow across many machines?

You’ll be glad to know it is possible to synchronize panes in a tmux window.
This allows you to execute a series of commands or particular workflow across many machines.

synchronize panes in tmux

First of all, if you haven’t already, split your window into the panes you need. The commands for this of course (assuming CTRL + B is prefix for tmux commands) are:

Split vertical: CTRL + B, %
Split horizontal: CTRL + B, "

You may wish at this point to connect each pane to the relevant machine you want in each.

Synchronize the panes by entering tmux command mode with:

CTRL + B, :

Now type:

setw synchronize-panes on

Hit enter and you’ll immediately notice all the panes are now synchronized. At this point you can go wild and execute whichever workflow you want to automate across the subjects.

You can toggle the synchronization off again by entering command mode once more and typing:

:setw synchronize-panes off

You can also use the same command as a toggle by omitting the on/off parameter on the end:

:setw synchronize-panes
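The same setup can also be scripted from the shell instead of typed interactively. The sketch below opens one pane per host over ssh in a detached session and then switches synchronization on. The function name and the TMUX_BIN override are hypothetical, and the host names you pass in are your own.

```shell
# Open one pane per host in a detached session, then turn on synchronization.
# TMUX_BIN is a hypothetical override so the commands can be dry-run with echo.
sync_ssh_panes() {
    session="$1"
    shift
    tmux_bin="${TMUX_BIN:-tmux}"

    first_host="$1"
    shift
    "$tmux_bin" new-session -d -s "$session" "ssh $first_host"
    for host in "$@"; do
        "$tmux_bin" split-window -t "$session" "ssh $host"
        # Re-tile after each split so there is always room for the next pane.
        "$tmux_bin" select-layout -t "$session" tiled
    done
    "$tmux_bin" set-window-option -t "$session" synchronize-panes on
}
```

For example, sync_ssh_panes fleet host1 host2 host3 followed by tmux attach -t fleet drops you into a window where every keystroke goes to all three machines.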

Multipurpose FreeNAS Server Build

multipurpose freenas server build

There is something magical about building your own infrastructure from scratch. And when I say scratch, I mean using bare metal. This is a run through of my multipurpose FreeNAS server build process.

After scratching the itch recently with my Raspberry Pi Kubernetes Cluster, I got a hankering to do it again, and this build was soon in the works.

Part of my motivation came from my desire to reduce our reliance on cloud technology at home. Don’t get me wrong, I am an advocate for using the cloud where it makes sense. My day job revolves around designing and managing various clients’ cloud infrastructure.

At home, this was more about taking control of our own data.

I’ll skip to the juicy specifications part if you would like to know what hardware I used right away.

The initial hardware
Note: I got this Gigabyte B450 motherboard, but soon found out it did not support ECC.

Final specifications:

These are the final specifications I decided on. Scroll down to see the details about each area.

The Goals

The final home server build would need to meet many requirements:

  • It should provide a resilient, large shared storage pool for network file storage across multiple Windows PCs at home.
  • Support NFS storage for sharing persistent volumes to my Raspberry Pi Kubernetes Cluster.
  • It should be able to run Plex for home and remote media streaming.
  • It must be able to run Nextcloud for home and remote mobile file storage.
  • Run services in Virtual Machines, Jails, or Docker containers. For example, I like to run Pi-hole as a DNS server for all my home equipment and devices.

The Decision Process

I started out my search looking at two products. Unraid and FreeNAS.

I have had experience running FreeNAS in the past for home lab setups. I never really used it seriously with the goal of making it reliable though.

This time around, all my files would be at stake, so I did a fair bit of research into the features and offerings of both products.

Unraid performed quite well for me. But, what pushed me away from it was the fact that it is a paid for, closed source, commercial product.

Unraid does make it super easy to bundle storage together and expand that storage in future if need be. However, FreeNAS’ use of ZFS and its various other features were what won me over.

The Build Details

Having settled on FreeNAS, I went about researching which hardware I would need. My goal here was to not spend too much money, but at the same time not cheap out and compromise on reliability.

CPU, Motherboard, RAM

ECC (Error-Correcting Code) RAM is very important for ZFS, so this is basically what my build hinged on.

I found that AMD Ryzen CPUs support ECC, and so do most Ryzen compatible motherboards.

Importantly, in my research I found that Ryzen APU CPUs do not support ECC. Make sure you do not get an APU if ECC is important to you.

Additionally, many others report much better stability running FreeNAS on AMD Ryzen Generation 2 chips and above. With this in mind, I decided I would use at least an AMD Ryzen 2xxx CPU.

On the ECC topic, I only found evidence of single bit error correction working on AMD Ryzen systems.

I also made an initial mistake here in my build buying a Gigabyte B450M DS3H motherboard. The product specs seem to indicate that it supports ECC, and so did a review I found on Anandtech. In reality the Gigabyte board does not support the ECC feature. Rather it ‘supports’ ECC memory by allowing the system to boot with ECC RAM installed, but you don’t get the actual error checking and correction!

I figured this out after booting it up with Fedora Rawhide as well as a couple of Ubuntu Server distributions and running the edac-utils package. In all cases edac-utils failed to find ECC support or any memory controller.

Checking ECC support with edac-utils

The Asus board I settled on supports ECC and edac-utils confirmed this.
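If you don’t have edac-utils handy, a rough equivalent check on Linux is to look at the kernel’s EDAC sysfs tree directly, since each memory controller the EDAC driver recognises shows up there as an mcN directory. This is a sketch of that idea rather than what edac-utils does internally, and the function name is my own.

```shell
# Count memory controllers registered with the kernel's EDAC subsystem.
# The optional argument overrides the sysfs path (handy for testing).
count_edac_controllers() {
    edac_dir="${1:-/sys/devices/system/edac/mc}"
    count=0
    for mc in "$edac_dir"/mc[0-9]*; do
        [ -d "$mc" ] && count=$((count + 1))
    done
    echo "$count"
}

count_edac_controllers
```

A result of 0 suggests that no EDAC driver has picked up a memory controller, which matches what I saw on the Gigabyte board. When a controller is present, its ce_count and ue_count files hold the corrected and uncorrected error counts.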

The motherboard also has an excellent EFI BIOS. I found it easy to get to the ECC and Virtualization settings.

the Asus Prime X470-Pro EFI BIOS

Storage

I used 4 x Western Digital 3TB Red hard drives for the RAIDZ1 main storage pool.

Western Digital 3TB Red hard drives

The SSD storage pool consists of 2 x Crucial MX500 250GB SSD SATA drives in a mirror configuration. This configuration is for running Virtual Machines and the NFS storage for my Kubernetes cluster.

Graphics Card

The crossing out of APUs also meant I would need a discrete graphics card for console / direct access, and to install the OS initially. I settled on a cheap PCI Express graphics card off eBay for this.

A cheap AMD Radeon HD 6450 1GB DVI DisplayPort PCI-Express Graphics Card I used for the FreeNAS build.

Having chosen a beefy six core Ryzen 2600 CPU, I decided I didn’t need to get a fancy graphics card for live media encoding. (Plex does much better with one). If media encoding speed and efficiency is important to you, then consider something like an NVIDIA or AMD card.

For me, the six core CPU does a fine job at encoding media for home and remote streaming over Plex.

Network

I wanted to use this system to serve file storage for my home PCs and equipment. Besides this, I also wanted to export and share storage to my Raspberry Pi Kubernetes cluster, which runs on its own dedicated network.

The simple solution for me here was multihoming the server onto the two networks. So I would need two network interface cards, with at least 1Gbit/s capability.

The motherboard already has an Intel NIC onboard, so I added two more ports with an Intel Pro Dual Port Gigabit PCI Express x4 card.

Intel dual port NIC

Configuration Highlights

I’ll detail the highlights of my configuration for each service the multipurpose FreeNAS Server build hosts.

Main System Setup

The boot device is the 120GB M.2 NVMe SSD. I installed FreeNAS 11.3 using a bootable USB drive.

FreeNAS Configuration

I created two Storage Pools. Both are encrypted. Besides the obvious protection encryption provides, this also makes it easier to recycle drives later on if I need to.

FreeNAS storage pool configuration
  • Storage Pool 1
    • 4 x Western Digital Red 3TB drives, configured with RAIDZ1. (1 disk’s worth of storage is effectively lost for parity, giving roughly 8-9 TB of usable space).
    • Deduplication turned off
    • Compression enabled
  • Storage Pool 2
    • 2 x Crucial MX500 250GB SSD drives, configured in a Mirror (1 disk mirrors the other, providing redundancy if one fails).
    • Deduplication turned off
    • Compression enabled
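The usable space figure for Storage Pool 1 comes from simple arithmetic: RAIDZ1 gives up one disk’s worth of capacity to parity, and drive sizes are quoted in decimal terabytes while most tools report binary TiB. The helper below is a rough sketch of that calculation (the name raidz_usable is made up) and ignores ZFS metadata and padding overhead, so real-world figures will be somewhat lower.

```shell
# Rough usable capacity for a RAIDZ vdev, before ZFS metadata and padding overhead.
# Arguments: number of disks, disk size in TB (decimal), parity disks (1 for RAIDZ1).
raidz_usable() {
    awk -v n="$1" -v size_tb="$2" -v parity="${3:-1}" 'BEGIN {
        tb = (n - parity) * size_tb
        tib = tb * 1e12 / (1024 ^ 4)
        printf "%.0f TB (%.2f TiB)\n", tb, tib
    }'
}

# 4 x 3TB drives in RAIDZ1
raidz_usable 4 3 1
```

For the 4 x 3TB pool this works out to 9 TB, or about 8.19 TiB, which is where the 8-9 TB range comes from.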

The network is set to use the onboard NIC to connect to my main home LAN. One of the ports on the Intel dual port NIC connects to my Raspberry Pi Kubernetes Cluster network and is assigned a static IP address on that network.

Windows Shares

My home network’s storage shares are simple Windows SMB Shares.

I created a dedicated user in FreeNAS which I configured in the SMB share configuration ACLs to give access.

Windows machines then simply mount the network location / path as mapped drives.

I also enabled Shadow Copies, which FreeNAS supports for Windows clients.

FreeNAS Windows SMB share

Pi-hole Configuration

I set up a dedicated Ubuntu Server 18.04 LTS Virtual Machine using FreeNAS’ built-in VM support (bhyve). Before doing this, I enabled virtualization support in the motherboard BIOS settings. (SVM Mode = Enabled).

I used the standard installation method for Pi-hole. I made sure the VM was using a static IP address and was bridged to my home network. Then I reconfigured my home DHCP server to dish out the Pi-hole’s IP address as the primary DNS server to all clients.

For the DNS upstream servers that Pi-hole uses, I chose to use the Quad9 (filtered, DNSSEC) ones, and enabled DNSSEC.

pi-hole upstream DNS configuration with DNSSEC

NextCloud

NextCloud has a readily available plugin for FreeNAS. However, out of the box you get no SSL. You’ll need to set up your networking at home to allow remote access. Additionally, you’ll need to get an SSL certificate. I used Let’s Encrypt.

I detailed my full process in this blog post.

Plex

Plex was a simple setup. Simply install the Plex FreeNAS plugin from the main Plugins page and follow the wizard. It will install and configure a jail to run Plex.

To mount your media, you need to stop the Plex jail and edit it to add your media location on your storage. Here is an example of my mount point. It basically mounts the media directory I use to keep all my media into the Plex Jail’s filesystem.

Plex jail mount point

NFS Storage for Kubernetes

Lastly, I set up an NFS share / export for my Raspberry Pi Kubernetes Cluster to use for Persistent Volumes to attach to pods.

NFS shares for Kubernetes in FreeNAS

The key points here were that I allowed access only from the two network ranges that need this storage. (10.0.0.0/8 is my Kubernetes cluster network). I also configured a Mapall user of ‘root’, which makes the storage writeable when mounted by pods/containers in Kubernetes (or by any other clients that mount it).

I was happy with this level of access for this particular NFS storage share from these two networks.

Next, I installed the NFS External-storage provisioner for Kubernetes on my Pi Cluster. I needed to use the ARM specific deployment manifest as Pis of course have ARM CPUs.

I modified the deployment manifest to point it to my FreeNAS machine’s IP address and NFS share path.

The kubernetes nfs client provisioner manifest configured for NFS storage provisioning.

With that done, pods can now request persistent storage with a Persistent Volume Claim (PVC). The NFS client provisioner will create a directory for the claim on the NFS share (named after the namespace, PVC, and PV) and mount that into your pod.
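A claim against the provisioner might look something like the manifest printed by the snippet below. Treat it as a sketch: the helper name, the claim name, and the managed-nfs-storage storage class are assumptions, so substitute whatever StorageClass name your provisioner deployment actually defines.

```shell
# Print a PersistentVolumeClaim manifest for the NFS client provisioner.
# The storage class name is an assumption; use your provisioner's StorageClass name.
pvc_manifest() {
    claim_name="$1"
    size="${2:-1Gi}"
    storage_class="${3:-managed-nfs-storage}"
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${claim_name}
spec:
  storageClassName: ${storage_class}
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: ${size}
EOF
}

pvc_manifest demo-claim 1Gi
```

Piping the output into kubectl apply -f - creates the claim, and any pod that references it by name gets the NFS-backed directory mounted.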

Final Thoughts

So far the multipurpose FreeNAS server build has been very stable. It has been happily serving our home media streaming, storage, and shared storage needs.

It’s also providing persistent storage for my Kubernetes lab environment which is great, as I prefer not to use the not-so-durable microSD cards on the Raspberry Pis themselves for storage.

The disk configuration size seems fine for our needs. At the moment we’re only using ~20% of the total storage, so there is plenty of room to grow.

I’m also happy with the ability to run custom VMs or Jails for additional services, though I might need to add another 16GB of ECC RAM in the future to support more as ZFS does well with plenty of memory.