I’ve been constantly evolving my cloud backup strategies to find the ultimate cheap S3 cloud backup solution.
I stick to “S3” because many cloud storage providers implement the S3 API. This means you can generally use the same backup/restore scripts with just about any of these services.
The S3 client tooling available can of course be leveraged everywhere too (s3cmd, aws s3, etc…).
BackBlaze B2 gives you 10GB of storage free for a start. If you don’t have too much to back up, you could get creative with lifecycle policies and stay within the 10GB free limit.
Current Backup Solution
This is the current solution I’ve set up.
I have a bunch of files on a FreeNAS storage server that I need to back up daily and send to the cloud.
I’ve set up a private BackBlaze B2 bucket and applied a lifecycle policy that removes any files older than 7 days. (See example screenshot above).
I used a FreeBSD jail to install my S3 client tooling (s3cmd) and mounted my storage into that jail. You can follow the steps below if you would like to set up something similar:
Step-by-step setup guide
Create a new jail.
Enable VNET, DHCP, and Auto-start. Mount the FreeNAS storage path you’re interested in backing up as read-only to the jail.
The first step in a clean/base jail is to get s3cmd compiled and installed, as well as gpg for encryption support. You can use portsnap to get everything downloaded and ready for compilation.
portsnap fetch
portsnap extract # skip this if you've already run extract before
# s3cmd is in the ports tree under net/py-s3cmd
cd /usr/ports/net/py-s3cmd/ && make -DBATCH install clean
# Note: -DBATCH accepts all the defaults for the compile process and prevents tons of pop-up dialogs asking you to choose options. If you don't want the defaults, leave this bit off.
# install gpg for encryption support
cd /usr/ports/security/gnupg/ && make -DBATCH install clean
The compile and install process takes a number of minutes. Once complete, you should be able to run s3cmd --configure to set up your defaults.
For BackBlaze you’ll need to configure s3cmd to use a specific endpoint for your region. Here is a page that describes the settings you’ll need in addition to your access / secret key.
Once gpg is compiled and installed you should find it at /usr/local/bin/gpg, so you can use this path in your s3cmd configuration too.
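As a rough illustration, the relevant part of the s3cmd configuration file could look something like the excerpt below. The key names are standard s3cmd settings, but every value is a placeholder: the region in the endpoint (us-west-002 here) is only an example, so use the one shown for your own bucket, and substitute your own keys and passphrase.

```ini
# ~/.s3cfg (excerpt); every value below is a placeholder
access_key = YOUR_B2_KEY_ID
secret_key = YOUR_B2_APPLICATION_KEY
# Region-specific endpoint; us-west-002 is only an example region
host_base = s3.us-west-002.backblazeb2.com
host_bucket = %(bucket)s.s3.us-west-002.backblazeb2.com
use_https = True
# Path where the ports build installed gpg
gpg_command = /usr/local/bin/gpg
gpg_passphrase = choose-a-long-passphrase
```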
Double check s3cmd and gpg are installed with simple version checks.
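The version checks could be done along these lines. This is just a small sketch: check_tool is a hypothetical helper name, and it only confirms that each tool is on the PATH and prints the first line of its version output.

```shell
#!/bin/sh
# check_tool is a hypothetical helper, not part of s3cmd or gpg.
# It reports whether a tool is on PATH and prints its first version line.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        "$1" --version 2>&1 | head -n 1
    else
        echo "$1: not found in PATH" >&2
        return 1
    fi
}

for tool in s3cmd gpg; do
    check_tool "$tool" || echo "install $tool before continuing"
done
```

If either tool is reported as missing, re-run the corresponding port install before continuing.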
A simple backup shell script
Here is a quick and easy shell script to demonstrate compressing a directory path and all of its contents, then uploading it to a bucket with s3cmd.
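DATESTAMP=$(date "+%Y-%m-%d")
TIMESTAMP=$(date "+%Y-%m-%d-%H-%M-%S")
tar --exclude='./some-optional-stuff-to-exclude' -zcvf "/root/$TIMESTAMP-backup.tgz" . # run from the directory you want to back up
s3cmd put "/root/$TIMESTAMP-backup.tgz" "s3://your-bucket-name-goes-here/$DATESTAMP/$TIMESTAMP-backup.tgz"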
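If you want something a little more defensive, here is a sketch of how the same idea could be wrapped up. It is not the script I actually run: SOURCE_DIR, BUCKET, DRY_RUN and the run helper are names made up for this example, and the date formats are assumptions. By default it only prints the commands it would execute, so you can check the paths first. The --encrypt flag asks s3cmd to gpg-encrypt the archive before uploading.

```shell
#!/bin/sh
# Sketch only: SOURCE_DIR, BUCKET, DRY_RUN and run() are hypothetical names.
set -eu

SOURCE_DIR="${SOURCE_DIR:-/mnt/storage}"          # jail path of the read-only mount (assumed)
BUCKET="${BUCKET:-your-bucket-name-goes-here}"
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print commands, 0 = execute them

DATESTAMP=$(date "+%Y-%m-%d")
TIMESTAMP=$(date "+%Y-%m-%d-%H-%M-%S")
ARCHIVE="/root/$TIMESTAMP-backup.tgz"
REMOTE="s3://$BUCKET/$DATESTAMP/$TIMESTAMP-backup.tgz"

# Either print the command (dry run) or execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run tar -zcf "$ARCHIVE" -C "$SOURCE_DIR" .        # archive the directory contents
run s3cmd put --encrypt "$ARCHIVE" "$REMOTE"      # gpg-encrypt, then upload
run rm -f "$ARCHIVE"                              # remove the local copy afterwards
```

Run it once as-is to see the three commands, then set DRY_RUN=0 when you are happy with them. Because of set -e, a failed tar or upload stops the script before the local archive is removed.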
Scheduling the backup script is an easy task with crontab. Run crontab -e and then set up your desired schedule. For example, daily at 25 minutes past 1 in the morning:
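A crontab entry for that schedule could look like the following. The fields are minute, hour, day of month, month and day of week, followed by the command. /root/backup.sh is a hypothetical path, so point it at wherever you saved your script.

```
# minute hour day-of-month month day-of-week command
25 1 * * * /root/backup.sh
```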
Amazon S3 is overkill for simple home cloud backup solutions (in my opinion). You can switch to Infrequent Access or even Glacier tiered storage to get the pricing down, but you’re still not going to beat BackBlaze on pure storage pricing.
DigitalOcean Spaces was nice for a short while, but they have an annoying minimum charge of $5 per month just to use Spaces. This rules it out for me as I was hunting for the absolute cheapest option.
BackBlaze currently has very cheap storage costs for B2: just $0.005 per GB per month for storage and only $0.01 per GB of download (only really needed if you want to restore some backup files, of course).
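To put those rates into perspective, here is a quick back-of-the-envelope calculation. It assumes the storage rate is charged per GB per month and ignores the free allowance. The 200 GB figure and the cost helper are made up for the example, and the rates are simply the ones quoted above, so check current pricing before relying on them.

```shell
#!/bin/sh
# Rates quoted in this post (USD per GB); a snapshot, not necessarily current pricing.
STORAGE_RATE=0.005
DOWNLOAD_RATE=0.01

# cost is a hypothetical helper: prints N GB at a per-GB rate, to two decimals.
cost() {
    awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

echo "Storing 200 GB for a month: \$$(cost 200 "$STORAGE_RATE")"
echo "Downloading 200 GB once:    \$$(cost 200 "$DOWNLOAD_RATE")"
```

So 200 GB of backups works out to about $1.00 per month to store, and roughly $2.00 if you ever need to pull the whole lot back down.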
You can of course get more technical and coerce a willing friend/family member to host a private S3 compatible storage service for you like Minio, but I doubt many would want to go to that level of effort.
So, if you’re looking for a cheap S3 cloud backup solution with minimal maintenance overhead, definitely consider the above.
The other weekend I managed to get some spare time to do another update to my ESXi 5.0 / 5.1 Host Backup & Restore GUI utility, which is now at version 1.3. I didn’t post the changes at the time, as the update was done by special request from one of my blog readers (thanks Flavio!). However, after receiving more comments from a few others having a similar issue to Flavio’s, I thought I should definitely post the updated version here, which should hopefully solve the issues some people are seeing.
The changes are based on feedback in the comments about exceptions that some users hit when trying to back up their host configurations. Specifically, the exception message was “Exception caught: Get-VMHost VMHost with name ‘xxx’ was not found using the specified filter(s).”
A quick post today to just mention that I have updated my ESXi 5.0 / 5.1 Host Backup & Restore GUI utility to version 1.2.
There are a couple of improvements in 1.2 based on feedback received in the comments about the utility. The main improvement introduces a function in the script behind the GUI that checks ESX hosts are valid before attempting to back up or restore them. You can check the utility out over on its page here.
Updates (29-12-2012) – version 1.2:
Added ESX/ESXi host validation into utility – will now test that the host is valid and either connected or in maintenance mode before attempting backup or restore (See the script’s new “Check-VMHost” function for those interested)
This little host backup utility I created back in February 2012 has been receiving quite a bit of attention, and has already managed to get over 2000 downloads.
Someone asked the other day whether it was possible to restore a configuration file to a new host (i.e. new hardware). With version 1.0 of my utility, this was not possible due to mismatches that the PowerCLI cmdlet finds (i.e. MAC addresses on NICs etc… on the new hardware when compared to the existing backup). However, the Set-VMHostFirmware cmdlet allows the use of a -Force parameter, so I set about updating the utility to allow for this.
Here is a quick list of changes in version 1.1:
Allows restore to new hardware (tick the “Force restore to new hardware” checkbox). Please note that I have only very briefly tested this on virtualised ESXi hosts – it works, but I am not sure how networking configurations are applied to NICs and differing physical NIC orders – so it is best to test this thoroughly in a dev/test environment before using anywhere else!
Tested against single ESXi hosts as opposed to connecting to vCenter first.
Updated labels to neaten up a bit – connection box now shows that you can connect to single hosts or vCenter
PHD Virtual Backup & Replication is, as the name would suggest, a complete, all-in-one backup and replication package. It is available in both VMware and Citrix XenServer flavours. I have long been a user of other Virtualization Backup Solutions and up until recently, never had the chance to play with PHD’s offering. A couple of weeks ago, PHD Virtual asked me to take a look at their Backup offering and put down my thoughts in the form of a sponsored review. That being said, I got the appliance installed in my lab environment and set about putting down my thoughts and observations about the product whilst using it for various backup, recovery, and replication tasks in my lab over the last two weeks.
Thoughts and Observations
Getting PHD Virtual Backup up and running in my Virtual Lab environment was an absolute pleasure. Let’s just say the product definitely does what it says on the tin – installation was as simple as deploying the downloaded OVF file with the vSphere client (File -> Deploy OVF Template), powering up the “Virtual Backup Appliance” and setting up some basic network settings. I would say the longest part of the installation for me was finding the line in the installation steps that said “Press CTRL + N to enter the network settings in the console” (which wasn’t long at all)! After entering my network settings, I had the choice of either browsing to the IP address of my appliance, or running the PHDVB_Install.exe file to get the Virtual Appliance “Management” console installed. I simply ran the installer and within 8 minutes or so (from start to finish) I had PHD Virtual Backup & Replication up and running in my vSphere lab.
The product supports VMware and Citrix (XenServer) in terms of hypervisor platforms. As stated above, in this review I will be working with a VMware vSphere 5.0 environment, and have therefore put the VMware edition to the test.
What I liked this far into my experience was that I didn’t have to choose whether to run my backup solution on a physical or virtual machine – it’s simple – the product is a Virtual Appliance. You deploy the initial appliance, and if needed, scale by deploying more virtual appliances. This means you don’t need to worry about managing separate physical servers for your backup solution. This is just one of the reasons why PHD Virtual Backup is so easy to deploy.
The Virtual Appliance is pre-configured with the following specifications:
In terms of actual backup storage, you do of course have a few options.
Add a Virtual Disk to the Appliance itself (VMDK)
Configure Network storage (which could be):
a CIFS target
an NFS target
I chose to use a separate NFS mount on a Virtual Appliance I use for general purpose storage and backup in my lab, so I simply opened the appliance management console (right click in vSphere Client -> PHD Virtual Backup -> Console) and went to “Backup Storage” under “Configuration” to configure my NFS datastore as a backup target. You can also set up a couple of thresholds for warning / stop levels in terms of free disk space on your target, as well as enable/disable backup compression at this stage.
Backing up VMs
As the virtual appliance integrates with the vSphere client, dealing with configuration tasks and actually setting up backups for your VMs is simple. No need to remote to another server or open up a console to your backup appliance VM. For my testing I configured a couple of different backup jobs – one to back up my VC, Update Manager and other VI VMs and one to back up a couple of general purpose VMs in my lab.
Backup speeds themselves were of a good level and on par with what I would expect from a product that utilises the VMware vStorage APIs for Data Protection (VADP). The first job I ran took a little while to do the initial (full) backup, but subsequent runs of the backup job correctly used CBT (Changed Block Tracking) to pick up only the changed blocks and copy these up, significantly reducing backup times of my VMs. VMware HotAdd is also utilised to help with quicker VM backup times. Each job that runs gives you some detailed information on statistics such as:
Dedupe Ratios (Per VM and Per individual VM Disk)
Job average speed
Dedupe Ratios (Per Job)
Total amount of Data Written (useful for tracking how well CBT is working for example)
Scheduling / Time details
A nice feature I found at this stage was the ability to look at a detailed job log right from the console. Let’s say you have a job or VM in a job that gave a warning or error message for some reason, and you wished to find out the cause. All you need to do is right-click the job name and select “View Log”. This pops up a window with a detailed, timestamped job log, allowing you to dig in to each step of the backup process and see what happened at each stage of the particular backup job.
File Level Restore
Restoring files is also a simple task. From the main console, there is a “FLR” (File Level Recovery) section which handles this process. I tested restoring files from within two different VMs using this console. Both were Windows Guests (one Server 2003 Standard and one Server 2008 R2 Standard VM). The process went as follows:
Under “Backup Catalog” where your previous backup jobs are listed, select the VM / VM Disk you would like to restore from.
Click the “FLR” button.
Go through the “Backup to Share” wizard and tick on the option to “Add target to iSCSI Initiator on this computer”.
Finish Wizard, and the VM Disks are mounted on the local machine and are now accessible.
Following the Wizard through to mount the VM Disk/s on local machine for File Level Restore
If you take a look at the Microsoft iSCSI Initiator tool you can see the two targets that have been mounted…
Incidentally, doing file-level restores from Linux/Unix based VMs can also be done by PHD VB. You just need to supplement the restore process with a third-party tool such as “Ext2explore”. You will follow the same process to mount the VM disks using the FLR wizard, but then just use Ext2explore to actually browse the mounted disk/s instead of Windows Explorer.
Restoring full VMs
I must say that I really like the features available in PHD Virtual Backup & Replication when it comes to doing full/partial restores of VMs. The wizard you use is nicely laid out and functional. You also get some great restore options such as appending a “_restored” tag to the end of your restored VM name, auto-generation of a new MAC address for the restored VM, and changing of the default VM network (portgroup).
These are all great features when it comes to restoring VMs, especially if you are restoring back into a production environment alongside the original VM and would like to ensure that there are no network conflicts, for example. I have a dedicated, isolated VM network for testing (no vSwitch uplinks to physical adapters) so the option to change the default network on the VM to restore was perfect for me to test with.
PHD Virtual Backup also has replication functionality. Ideally you will want to have more than one VBA (Virtual Backup Appliance) running, for example one in your DR site and one in your production site. The appliance in your DR site essentially connects to the backup storage at your production site and hooks into the backup jobs done there to find the latest VM backup changes to replicate. So when you set up a particular replication job, you should ideally schedule it to start a short while after the relevant backup job completes. This ensures you get the latest changes replicated. The replication job will fetch only the changes since the last run. To enable replication, you just need to complete a one-off configuration task using the PHD VB Console: adding a Replication Datastore. This simply means pointing the appliance to an existing PHD VB backup storage area, which can be a CIFS, NFS or VMDK disk store that you are currently using for backups. As with VM restores, you also get some useful options when replicating to change VM networks (VM portgroups) or auto-generate new MAC addresses for replicated VMs. I should also mention that you can do replication even with just one VBA.
From the PHD Console, you are able to test your replicated VMs. This is quite a handy feature: after putting a replicated VM into “TESTING” mode, you can use the vSphere client to power up your replicated VM and perform any testing and validation you might require. A snapshot is added to the VM to ensure that the state of the VM pre-testing is preserved. Once testing is complete, you simply click “Stop Test” in the console. The VM is powered down and changes are rolled back to the pre-testing state.
“All in one” backup solution (everything you need in one Virtual Backup Appliance).
Simple and quick to deploy (or scale by adding more VBAs).
Good feature set (VM Backup, File Level Restore, Full VM restore, and Replication).
Easy to work with – simple/logical User Interface.
Integrates with the vSphere client for quick and easy access to Configuration, Backup, Restore and Replication options.
Great File-level restore – quick and easy access to files within VM backups (Windows or Linux/Unix).
Nice features available to change networking settings on restored VMs for testing or running alongside existing VMs.
Configurable VM Backup retention settings
Processing of multiple VMs at once in a backup job – allows VMs to be backed up in multiple streams instead of a “serial” fashion.
No network “fine tuning” options – example: fine tuning deduplication ratios when backing up over a WAN or LAN as opposed to direct disk storage. This would essentially allow you to have quicker backups for local storage jobs (albeit larger) or longer backups, but with smaller sizes to transmit over WAN links.
A couple of small caveats when using Replication (for example, configuration changes made to the original source VM are not replicated to the replicated VM).
No automation options – these would be nice to have for backup, restore, replication or reporting (a PowerShell module, for example).
At the end of the day, PHD Virtual Backup is a great integrated Backup and Recovery product, with a little bit of room for improvement to add some extra “nice to have” features. The VBA (Virtual Backup Appliance) is dead easy to deploy and manage, and so is managing your backup, restore and replication processes. I think these are the best parts of the appliance. Whilst using it I found that each of the various Backup and DR processes I needed were easy to use through the combination of a well laid out UI and interface that “just works”. Access to files in VM Backups via the file-level restore wizard was a highlight for me – it didn’t take long at all to get at historic files and restore them using the “FLR” Wizard.
The appliance offers a good selection of options, but these could be bettered by offering some form of automation (perhaps PowerShell access) and some more advanced settings for power-users. My thought was that some more advanced backup job options could be made available for power users to fine tune compression or deduplication ratios.
A free trial of the product is available and I would definitely encourage you to take a look at this – as mentioned above, being so easy to deploy and manage it won’t be long before you are up and running. This Backup & Replication product does offer everything you need to handle DR for your VMware Virtual Environment.
Installing PHD Virtual Backup & Replication for VMware vSphere
The other day I was asked to collect some statistics on our Veeam Backup & Replication server from as many VM backup jobs as possible. The environment has roughly 70 scheduled jobs that run either daily or weekly. After searching around a bit, I could not find any existing solution or built-in method to retrieve the info I needed in a quick or automated way. My first ideas were either to somehow grab the info via SQL queries from the Veeam database, or to take a sampling of 10-20 different types of jobs and their backup sessions over one normal incremental run day and one normal full backup day (manually collecting this data from email reports would be quite a slow process).
After browsing around the Veeam Community Forums I suddenly remembered that there was a PowerShell module that Veeam includes with B&R. I read the basic documentation and got acquainted with a few simple cmdlets. I wanted to build a report that would loop through every single Veeam B&R job we have and grab data from the last 7 backup sessions of each (daily backups), therefore giving me a good idea of both full and incremental backup run performance, times taken etc… My first attempt at a script got me almost all the way there (tried during spare time in my evenings!). However, I was having trouble matching backup session data with the right day’s backup file stats: sometimes the ordering was out, and I would get metrics back for a backup file that was not from the correct day. Before I was able to resolve this myself, help arrived from “ThomasMc” over at the Veeam Community Forums. (Thanks Thomas!) We got a script together that was able to match up sessions correctly. I then added a few more features, as well as some nice HTML formatting and the ability to grab statistics for all jobs instead of just one sample job. The resulting script gets the following info for you:
Index (1 = the last backup session, 2 = the day before that, etc)
Start time of job
Stop time of job
File Name (Allows you to determine if the job was a full or incremental run)
Average Speed MB – average processing speed of the job
Duration – time the job took to complete
Result – Success/Warning/Failed (Failed is highlighted in red)
Here is an example of the report run on my Veeam Backup & Replication lab environment at home (Thanks to Veeam for the NFR licenses they gave out to VCPs earlier this year!)
So, to run the above script, launch a PowerShell session from within Veeam B&R (Tools -> PowerShell). This will make sure your PowerShell session launches with the Veeam Automation/PowerShell snapin. Execute the script and you’ll get an HTML file output to the root of your C:\ drive. By default, all jobs you have in Veeam will be detailed. If you wish to sample a specific job, or a job with a certain word/phrase in it, adjust the -match parameter for the Get-VBRJob cmdlet line near the top of the script. The default setting is an empty string – i.e. “”. To change how many sessions the script fetches for each backup job, just change the “$sessionstofetch” variable defined at the top of the script.
I have added comments throughout the script for those interested in how it works. Lastly, you could also quite easily modify this script to e-mail you the report, or even run it as a scheduled task. Let me know if you need help doing this and I’ll gladly modify it as required.