Running an S3 API compatible object storage server (Minio) on the Raspberry Pi

I’ve recently become interested in hosting my own local S3 API compatible object storage server at home.

So tonight I set about setting up Minio.


Minio is an object storage server that is S3 API compatible. This means I’ll be able to use my working knowledge of the Amazon S3 API and tools, but to interact with my own, locally hosted storage service running on a Raspberry Pi.

I had heard about Zenko before (an S3 API compatible object storage server) but was searching around for something really lightweight that I could easily run on ARM architecture – i.e. my Raspberry Pi model 3 I have sitting on my desk right now. In doing so, Minio was the first that I found that could easily be compiled to run on the Raspberry Pi.

The goal right now is to have a local object storage service that is compatible with S3 APIs that I can use for home use. This has a bunch of cool use cases, and the ones I am specifically interested in right now are:

  • Being able to write scripts that interact with S3, but test them locally with Minio before even having to think about deploying them to the cloud. A local object storage API is going to be free and fast. Plus it’s great knowing that you’re fully in control of your own data.
  • Setting up a publicly exposable object storage service that I can target with serverless functions that I plan to run on demand in the cloud to do processing and then output artifacts to my home object storage service.

The second use case above is where I intend to send ffmpeg-processed video. Basically I want to be able to process video from online services using something like AWS Lambda (probably using ffmpeg bundled in with the function) and output the resulting files to my home storage system.

The object storage service will receive these output files from Lambda and I’ll have a cronjob or rsync setup to then sync the objects placed into my storage bucket(s) to my home Plex media share.

This means I’ll be able to remotely queue up stuff to watch via a simple interface I’ll expose (or a message queue of some sort) to be processed by Lambda, and by the time I’m home everything will be ready to watch in Plex.

Normally I would be more interested in running the Docker image for Minio, but at home I want something that is really cheap to run, so compiling Minio for the Raspberry Pi makes total sense to me here. This device is super cheap to leave powered on 24/7, as opposed to running something beefier as a Docker host or lightweight Kubernetes home cluster.

Here’s the quick start up guide to get it running on Raspberry Pi

You’ll basically download Go, extract it, set it up on your path, then use it to compile Minio’s source code into an ARM compatible binary that you can run on your pi.

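# Download go1.10.3.linux-armv6l.tar.gz from the official Go downloads page first
sudo tar -C /usr/local -xzf go1.10.3.linux-armv6l.tar.gz
export PATH=$PATH:/usr/local/go/bin # also add this line to ~/.profile
source ~/.profile
# Fetch and build Minio (package path assumed to be Minio's GitHub repository)
go get -u github.com/minio/minio
mkdir ~/minio-data
cd ~/go/bin
./minio server ~/minio-data/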

And you’re up and running! It’s that simple to get going quickly.

Running interactively you’ll get a default access and secret key in the terminal, so head on over to the Web UI / interface to check things out: http://your-raspberry-pi-ip-or-hostname:9000/minio/

Enter your credentials to log in.

Of course at this stage you can also start using your S3 API compatible command line tools to start working with your new object storage server too.
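As a rough sketch of what that could look like from Python, the snippet below points the boto3 S3 client at a Minio endpoint instead of AWS. It assumes boto3 is installed (pip install boto3). The hostname, bucket name and key placeholders are illustrative only, and the us-east-1 region is an assumption based on Minio's default, so substitute the values from your own server.

```python
def minio_client_kwargs(host, access_key, secret_key, port=9000, secure=False):
    """Build keyword arguments for boto3.client("s3", ...) aimed at Minio."""
    scheme = "https" if secure else "http"
    return {
        "endpoint_url": f"{scheme}://{host}:{port}",
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        # Assumption: the server uses Minio's default region, us-east-1.
        "region_name": "us-east-1",
    }


def make_client(host, access_key, secret_key, **options):
    """Create an S3 client that talks to the Minio server instead of AWS."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client(
        "s3", **minio_client_kwargs(host, access_key, secret_key, **options)
    )


def upload_hello(s3, bucket="test-bucket"):
    """Create a bucket, upload a small object, and return all bucket names."""
    s3.create_bucket(Bucket=bucket)
    s3.put_object(Bucket=bucket, Key="hello.txt", Body=b"hello from the Pi")
    return [b["Name"] for b in s3.list_buckets()["Buckets"]]
```

With the server running, something like upload_hello(make_client("raspberrypi.local", "YOUR-ACCESS-KEY", "YOUR-SECRET-KEY")) should create a bucket that then shows up in the Minio web UI. Because only the endpoint and credentials differ, the same script can later be pointed at Amazon S3 by dropping the endpoint_url argument.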


Scaling Web API 2 and back-end SQL databases in Azure

I recently created a small Web API 2 project running with a back-end SQL database (Entity Framework code first), and had it deployed to an Azure web app, along with Azure SQL.

Naturally, I started it off using the free web app and one of the cheapest possible Azure SQL tiers (S0 – 10 DTUs).

After I finished working on the API, I wanted to see what sort of performance I could get out of it, by using Azure’s various scaling options.

To test I used a really nice and easy to use load testing service by SendGrid Labs. The free edition allows me to set up various API endpoint tests and run many concurrent connections for up to 1 minute at a time.

All my tests below were done using the same GET request test. The request always returned a collection of 5 x objects from the /Animals endpoint to keep things consistent.

My initial test was against the F1 free app tier for the Web app, with the SQL database running on S0 (10 DTUs). Here are the results of sending 500 requests per second for 1 minute.


The API struggled to complete the full 60k requests over 1 minute, and only completed about 8k requests, with an average response time of 4638ms. Terrible, but then again we are running on very low performance, cheap tiers. I had a look at the database performance stats and noticed that the DTUs were capped out at 100% during the 1 minute load test. At this point it definitely seems to be the database performance holding things back.

Scaling the database up to the S1 tier (20 DTUs) gives a definite improvement in response times and number of requests able to be sent within one minute. If we look at the database performance stats in the portal, we can now see that the DTUs are still maxing out at 100% though.


Screenshot: 20 DTUs maxed out

At this point I decided I would increase database performance again, but throw more requests per second at the API (from 500/second up to 1000/second).

After scaling the database up to S2 (50 DTUs) and throwing more requests a second at the API, the total number of requests completed is higher now – up by about an extra 5k. Taking a look at the DTU performance stats, we can see they now max out at around 60%. At this point it is pretty clear that the database is no longer the bottleneck.

Screenshot: 50 DTUs maxed out at 60%, even with the requests per second doubled from 500 to 1000

Now I scaled the web app tier up from free, to the B1 (Basic) tier, which gives you 1 Core, 1.75GB RAM, and up to 3 x instances scaled manually. I started with just the default 1 instance and ran the 1000 req/second for 1 minute test again.

Screenshot: test failed with an error rate higher than 50%, due to timeouts

The results were pretty dismal compared to the free tier now. In fact the test failed due to an error rate of greater than 50% (all caused by timeouts). It is important to remember that we have not yet scaled out from the default 1 instance though.

Scaling out to 2 x instances on the B1 tier helped quite a bit. The test now completes, and has a much smaller timeout error rate. Many more responses were served, but response times were still quite slow. Taking a look at the distribution of CPU time over the two instances, we can also see that the traffic is indeed being split between the two instances we’ve scaled out with.


Screenshot: test finished with a much smaller error rate

Screenshot: processor time spread over two instances during load test

Taking this one step further to 3 x instances, and re-running the test nets us the best result so far. No timeout errors, and a response time averaging around 3000ms. Much better, but still quite a high response time, and not all 60k requests are being served.

I scaled up to the B2 tier for the following run. Each instance has 2 x cores and 3.5GB RAM this time. Starting at 1 x instance and running the test on these higher specification web instances seems to now handle things a lot better.

Little to no timeout errors, with about 5000ms avg response time, but using only 1 x instance this time!

Pushing things right up to 3 x instances (2 cores and 3.5GB RAM each) nets us the best result yet. The average response time is down to 1700ms and there are no timeout errors at all. The API was able to handle 49000 requests in the 1 minute test, which is the highest number of requests it has been able to handle so far.
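To put that figure in context, here is a small sketch of the arithmetic behind the test target. The function name and output fields are my own. The inputs are the numbers quoted above: 1000 requests per second for 60 seconds, with 49000 requests completed.

```python
def load_test_summary(requests_per_second, duration_seconds, completed):
    """Compare completed requests against the target for a fixed-rate load test."""
    target = requests_per_second * duration_seconds
    return {
        "target": target,
        "completed": completed,
        "completion_pct": round(100 * completed / target, 1),
        "achieved_rps": round(completed / duration_seconds, 1),
    }


summary = load_test_summary(1000, 60, 49000)
print(summary)
```

So the 3 x B2 instances served roughly 82% of the 60k target, or a little over 800 requests per second sustained.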


I scaled up to the B3 tier from here, and tried another few runs using 3 x instances (at 4 x cores and 7GB RAM each). This didn’t help things much, netting around 200ms better response time, for a much pricier tier. It therefore looks like the sweet spot for this kind of work is to scale out with medium sized instances (2 x cores each), rather than scaling up too much.

I changed the tier to S2 (2 x cores 3.5GB RAM each, but allowing up to 10 x instances scaled out) and this time, running the test gave very similar results to 3 x instances. Clearly, the instances were now no longer the bottleneck. Looking back at the database performance, I saw that the DTUs were maxing out at around 90%. It was clear that there must have been some throttling happening there now.

I changed the database DTUs to 100 using the S3 tier, and re-ran the test once more.


Bingo! We’re now managing to serve the test’s 1000 requests a second, and over the 1 minute test, we get all 60k requests served successfully, and have a reasonable average response time of roughly 300-400ms.

I made a quick change to the GET method in the API for this endpoint to gather items from the database asynchronously, and running the same test again, now gets us all the way down to an average response time of just 100ms over the 60k requests in one minute. Excellent!
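The actual change was made in the C# controller, which is not shown here. As a loose analogy only, the Python sketch below shows why awaiting I/O helps under load: while one request waits on the (simulated) database, the same worker can start serving others. The function names and the 50ms delay are made up for illustration.

```python
import asyncio
import time


async def fetch_animals(delay=0.05):
    """Stand-in for an awaited database query returning 5 objects."""
    await asyncio.sleep(delay)
    return ["animal"] * 5


async def handle_requests(count):
    """Serve `count` requests concurrently instead of one after another."""
    return await asyncio.gather(*(fetch_animals() for _ in range(count)))


start = time.perf_counter()
responses = asyncio.run(handle_requests(20))
elapsed = time.perf_counter() - start
print(f"{len(responses)} responses in {elapsed:.2f}s")
```

Handled one at a time, 20 requests with a 50ms wait each would take about a second. Handled concurrently, the total is close to a single wait.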


As you can see, by running load tests like this, and trying out different scaling options for the front end and back end, logically scaling each whenever you see bottlenecks in test results or performance metrics, you can after some time determine the best specification for your database and web apps.
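The loop I followed can be summed up in a few lines. This is only a sketch of my own reasoning, not an Azure API: the function name and the 90% DTU and 1% error thresholds are assumptions that you should tune for your own workload.

```python
def next_scaling_step(db_dtu_pct, timeout_error_pct, dtu_limit=90.0, error_limit=1.0):
    """Suggest which tier to scale next from one load test's metrics."""
    if db_dtu_pct >= dtu_limit:
        # DTUs at or near 100% mean the database is throttling requests.
        return "scale database"
    if timeout_error_pct > error_limit:
        # The database has headroom, so timeouts point at the web tier.
        return "scale web app"
    return "hold"


print(next_scaling_step(100, 0))   # S0 and S1 runs: DTUs pinned at 100%
print(next_scaling_step(60, 50))   # S2 database with a single B1 instance
print(next_scaling_step(60, 0))    # nothing obviously saturated
```

Re-run the load test after every change, since fixing one bottleneck usually exposes the next one.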


Simple Content Delivery Network (CDN) using Amazon AWS (S3 + CloudFront)


Content Delivery Networks

Having a content delivery network has many benefits for your users or clients. One of the most obvious reasons for having a CDN is the ability to serve up content to your users from multiple (often the most optimal) locations. Users access files that originate from one original source location, but the content is delivered by the closest location(s), often with the lowest latency and highest possible speed.

Using Amazon CloudFront, you can share dynamic, static, or even streamed content to users (including full websites), using Amazon’s global network of edge locations. This means that content can be served to users at the highest possible speeds, with the lowest possible latencies. In this blog post, I will cover the steps you need to take to deploy a basic CDN using Amazon AWS. For this purpose, we will leverage a combination of Amazon S3 + CloudFront.


Setting up Amazon S3

Amazon S3 (Amazon Simple Storage Service) is essentially Amazon’s “storage for the Internet”, and as explained above, CloudFront is a content delivery network service. As such, both products sit in Amazon’s “Storage & Content Delivery” stack.


  • To get started you will of course need an Amazon AWS account. Go to the AWS website and register. You will need to provide credit card details, but most products have some sort of free tier that you can utilise for initial testing (usually free for up to 1 year, based on certain utilisation thresholds).
  • Once you are all signed up, you’ll need to navigate to the AWS Web Console. This is the central location you can use to manage all AWS services (among other options such as the AWS SDK and Command Line).
The central, AWS Web Management Console
  • To start, we’ll need to define an origin location for our content. This is the location our original files are kept. For this purpose, we will use Amazon S3. It allows us easy access to files that we place in something Amazon call a “bucket”. I like to think of it as a folder, or container. You can have as many buckets as you wish, however each one’s name needs to be completely unique across Amazon S3. Click on “S3” under the “Storage & Content Delivery” heading of your AWS Console to get started.
  • From here, you will be greeted with a welcome page and some explanation of what S3 is. Simply click “Create Bucket” to get going.



  • Provide a unique bucket name, and specify a region to use. Regions have the benefit of allowing organisations to comply with storage regulation rules – for example, if you were storing client data that you were legally bound to keep within the EU, you would specify the Ireland region.



  • Your new bucket will appear in the S3 Management Console after being created. Simply click the name of the bucket to open it. For our simple CDN, we’ll just be serving up one single file – pretend this was a really large file that needed efficient distribution to many people – for example a large media file. At the top left, you’ll see an “Upload” button. Click this, and choose a file to upload as your test file. I will be using a simple image file. (By the way, Amazon have a service called “Amazon Import/Export”, which allows you to send really large amounts of data via post on portable media to Amazon for them to upload directly to your Amazon S3 or Glacier services).
  • Click “Start Upload” once you have chosen a file to test with.
  • After the file is finished uploading, it will appear in the console under your bucket name. (I called mine “image-for-distribution.png”).



  • Right-click the file, and choose the option “Make Public” for this test. This choice would be affected by the nature of the files you would want to deliver to users in your own configuration, but for this simple example, this is what I am choosing.
  • Right-click the file again, and choose “Properties“. Here you can get the direct, public link to your file and test access to it in your web browser. This is simple, direct access, and is not the access we are aiming for, as we will utilise our CDN with CloudFront to serve the file in our final configuration. This is just to test that the direct link is working.



Setting up CloudFront and your Distribution

  • Now that we know our basic file is being correctly served from Amazon S3, we’ll navigate to “CloudFront” from the main AWS Console. A quick way to get there is by clicking the orange cube icon in the top left of your AWS page; wherever you are in the console, it’ll take you back to the main AWS console. From there just click “CloudFront“.
  • In CloudFront, we’ll want to create something called a “Distribution“. Click the “Create Distribution” button to get started.



  • Make sure you select “Download” type for the “delivery method” when asked on the next page, then click “Continue“.



  • We’ll now select various options for our CloudFront Distribution.
    • For “Origin Domain Name“, click the text box and you’ll see a populated list of Amazon S3 buckets. Your bucket you created earlier should feature here. Click it to select it.
    • The “Origin ID” should auto populate based on your S3 bucket name you chose.
    • If you wish to restrict users to only access your content via CloudFront URLs, and not direct by S3 URLs, then choose “Yes” for “Restrict Bucket Access“.
    • If you chose “Yes” for restricting bucket access, you’ll also need to create a “Comment” and “Grant Read Permissions” on the bucket for CloudFront’s access to the S3 bucket. Click “Yes, Update Bucket Policy” to have CloudFront get read access automatically to the S3 bucket.
    • Select “HTTP and HTTPS” for “Viewer Protocol Policy“.
    • You can customise the object caching properties if you wish, but for this example, just leave the “Default Cache Behavior Settings” on their defaults.
    • Now you can set your “Distribution Settings“. Choose “Use All Edge Locations (Best Performance)” for “Price Class“. This will ensure that all edge locations around the world are used to distribute your content in the fastest, most efficient way to your users. You could also restrict this to other groups of regions e.g. only the US and Europe for example – this would be a cheaper option, but not as efficient for all users globally.
    • Next, we can add an alternate CNAME for the distribution. This is highly recommended so that you can provide your own domain name formatted URLs to users, instead of a long, ugly default Amazon CloudFront URL. Enter something now (for example, I will use a subdomain of a domain I own, so that I can create this CNAME record myself in DNS). Once you have completed this distribution setup, you should get the Distribution URL, and can point a new CNAME record to the full URL that CloudFront assigns to your distribution.
    • Leave all other options at their defaults for now, and make sure that the last option “Distribution State” is “Enabled“, then click the “Create Distribution” button at the very bottom.


  • Your Distribution should now be created. Use the Navigation menu on the left side of the screen and click “Distribution” to see a list of your CloudFront Distributions.



  • At first the “Status” will show “InProgress“. After a few minutes this should change to “Deployed“.
  • In the meantime, look for the “Domain Name” that this Distribution has been assigned, and go and create a CNAME record pointing the CNAME you specified when creating this distribution to that domain name. The assigned domain name will be a generated address ending in cloudfront.net. In my case, I will create a CNAME record linking the CNAME I specified earlier to this domain name.



Once your CNAME record is created, type in your new CNAME record, followed by a forward slash, and then the name of the file you originally uploaded to your S3 bucket that is linked to by this CloudFront distribution. For example, my file was called “image-for-distribution.png”, so to utilise my CloudFront CDN I would simply access the file at my CNAME followed by “/image-for-distribution.png”. If your DNS takes a while to apply/propagate, then you can simply use the CloudFront domain name assigned to your Distribution to test it out. Remember to ensure your distribution is in a deployed state before testing. You should now see your file served up in your web browser via your brand spanking new Amazon AWS powered CDN!
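If you plan to generate these links in a script, the URL is simply the CNAME plus the object's key in the bucket. The helper below is a minimal sketch: the function name is mine, and cdn.example.com is a hypothetical stand-in for your own CNAME or assigned CloudFront domain name.

```python
from urllib.parse import quote


def cdn_url(domain, object_key, scheme="http"):
    """Build the public URL for an S3 object served through the distribution."""
    # quote() escapes characters such as spaces but leaves "/" intact,
    # so keys inside "folders" still map to the right path.
    return f"{scheme}://{domain}/{quote(object_key.lstrip('/'))}"


print(cdn_url("cdn.example.com", "image-for-distribution.png"))
```

The default scheme here is plain HTTP, which works with the “HTTP and HTTPS” viewer policy chosen earlier. Serving HTTPS on your own CNAME needs a certificate set up for the distribution, which this post does not cover.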



That concludes the basic setup of an Amazon S3 + CloudFront powered Content Delivery Network. I hope this was useful for some. In forthcoming blog posts I will delve into setting up custom logging and monitoring / alerting for your CDN. Please remember to like/share/tweet this post out to friends if you thought it was useful.



Cloud Credibility challenges – blogging about my team members

So there is a fun website called “CloudCred” that allows individuals or teams to participate in various tasks and challenges –  everything from technical challenges to social and fun are covered and it is quite a good team building exercise, apart from the leaderboard challenge aspect!

One of the tasks is to blog about my team members and include links to their own blogs. We have quite a few team members so I can’t cover all of them, but here goes:

Of course this task is for our team – Xtravirt Limited, so we also have a company blog you can go and visit for some excellent content around the Cloud and Virtualisation industry.

Troubleshooting the Autolab vCloud Director 1.5.1 installation

I have had this issue twice now, where deploying vCD via the Autolab PXE boot option on the vCD VM fails. As far as I can tell, the process seems to fail on the Oracle Express DB installation, due to the RPM not being a valid package.  The vCloud Director steps seem to be the same for Autolab 1.0 or 1.1, so the following applies to both.

error: /root/oracle-xe-11.2.0-1.0.x86_64.rpm: not an rpm package (or package manifest)

You can see the error I was getting in the screenshot I captured during boot time below. I had checked the RPM file and everything else to ensure it was in place, and indeed it was. Even vCD installs via the script, although it of course does not work due to the database not being there.



Here is the process I used to correct my vCD install.

  • Allow VM to finish booting, even with the missing oracle DB.
  • Use PuTTY to SSH to the vCD VM, either directly from your VC or DC VM or, if you have the route set up, from your host machine (for example if you are using VMware Workstation). Default credentials are in the Autolab setup guide document.
  • Open up the “Build” share on the NAS VM, and locate the vcd-install script. Default location: \\\Build\Automate\CentOS\vcd-install (open this with a text editor)
  • Locate the method for each section of the install script. There is a section for each process in the script. For each method, copy out the entire block, paste it into a new text document, and remove any exclamation marks from any “echo” parts of the script. I found that manually tracking through this script using PuTTY gave me issues with the exclamation marks being misinterpreted by the shell, so I removed these. You’ll need to get a script block for each of the following sections and do this:
    • verify() {}
    • installOracle() {}
    • configureOracle() {}
    • generateCertificates() {}
    • installvCD() {}
    • configurevCD() {}
  • Remember to copy the whole block, including the start and end braces {} – paste these into a new text document, remove the exclamation marks, then copy-paste them back into your shell open in PuTTY. Hit enter, and the method will be entered and ready for use.
  • Once all the methods have been copied in, you can simply type the name of the method, followed by enter to execute them. By doing it this way, you can manually step through the process and figure out where any potential remaining issues may be. This script is normally executed during the PXE boot installation process so you don’t really get a chance to slowly track through it.
  • Type each method in, in this order, until you reach and complete the last “configurevCD” one:
    • verify
    • installOracle
    • configureOracle
    • generateCertificates
    • installvCD
    • configurevCD
  • You may find that the generateCertificates and installvCD methods complete and echo out that they had already been completed prior – this is fine.
  • After configurevCD finishes, all being well, you should now have vCD started, and you should be able to browse over to https://vcd.lab.local and finish the initial configuration via the vCD web page.
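If you would rather not remove the exclamation marks by hand, the clean-up can be scripted on your own machine before pasting the blocks into the shell. This is only a sketch with a function name of my own choosing. It assumes each echo statement starts its own line, so it will miss an echo that appears partway through a line.

```python
def strip_echo_bangs(script_text):
    """Remove '!' from echo lines only, leaving shell tests like [ ! -f x ] alone."""
    cleaned = []
    for line in script_text.splitlines():
        if line.lstrip().startswith("echo"):
            line = line.replace("!", "")
        cleaned.append(line)
    return "\n".join(cleaned)


# Hypothetical block in the style of the install script, not copied from it.
example = 'demo() {\n  if [ ! -f /tmp/done ]; then\n    echo "All finished!"\n  fi\n}'
print(strip_echo_bangs(example))
```

Only the echo line changes. The "!" inside the if test is left alone, because there it is the shell's negation operator and the script needs it.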


Other tips to try would be to:

  • MD5 hash check the RPM of the Oracle Express database that you download and place in your Build share – make sure it is not a corrupted file
  • Ensure you have the correct version of vCD and the Oracle Express database downloaded
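For the hash check, any MD5 tool will do. Here is a minimal Python sketch using only the standard library. The function name is mine, and the expected checksum has to come from wherever you downloaded the RPM.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Call it as md5_of_file("oracle-xe-11.2.0-1.0.x86_64.rpm") from the folder holding the download and compare the result with the published checksum. A mismatch would explain the “not an rpm package” error above.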


The latest trends in VMware and Cloud Computing



VMware promotes virtualization as a catalyst for cloud computing. Cloud infrastructures are built on and powered by VMware. VMware allows IT professionals to build solutions that are specifically tailored to a client’s individual needs. Internal and external clouds may be created to handle the needs of a growing business. Hybrid clouds are growing in popularity for businesses that want the convenience of both. Here are some of the benefits of VMware cloud virtualization:


  • Efficient Processes. VMware makes it possible to automate processes and employ utilization to increase IT performance. When IT professionals leverage existing resources and avoid expenses related to infrastructure investment, the total cost of ownership (TCO) is reduced tremendously.
  • Agility. End-users gain a more secure environment with cloud computing. With VMware, IT professionals can be assured that they will preserve IT authority, control and security while remaining compliant. Processes are also simplified to make the job easier. An IT organization is able to respond quickly to organizations with evolving business needs.
  • More Flexibility. IT professionals can use VMware in conjunction with traditional systems for maximum flexibility. The systems may be deployed internally or externally. When configuring VMware, IT professionals are not limited to using any one vendor or technology. The solutions are portable and are capable of using a common management and security framework.
  • Better Security. VMware solutions protect end-points, the network edge and applications through virtualization. The cloud based deployments of security patches and solutions are dynamic and constantly being updated.
  • Automation and Management. With VMware, a highly efficient, self-managing infrastructure can be created. Business rules and policies can be mapped to IT resources when the tools are virtually pooled.
  • Portable and Independent. Open standard VMware solutions provide more flexibility and reduce the dependence on a particular vendor. With this security model, applications are easily portable from internal datacenters to external service provider clouds. The applications are also dynamic, optimized and deployable on public clouds with VMware cloud application platforms.
  • Saves Time. A self-service cloud-based portal is capable of reducing time spent by deploying standardized solutions that have been pre-configured to operate off-the-shelf or out-of-the-box. This method promotes efficiency through automation and standardization. Tailored services are also popular and can be achieved with VMware solutions. IT can remain in compliance and preserve control over policies with VMware.
  • Virtual Pooling and Dynamic Resource Allocation. Virtual datacenters are created by pooling IT resources through abstraction. Logical storage building blocks, server units and network are integrated into the solution to power applications. This process is completed in accordance to regulations and business rules. User demand also plays a role in how these applications are deployed and hosted.


How Businesses are using VMware to transition to the Cloud

Dynamic businesses need a robust and affordable IT solution. Most businesses use 70 percent of their resources on maintaining servers and applications in a traditional system. With only 30 percent of the IT budget left for innovation, companies cannot grow and provide the type of service and products their clients need and desire. IT management is searching for a better strategy, and VMware seems to be a viable solution.

VMware provides users with faster response times. Faster response times lead to lower costs over time. Self-managed virtual infrastructures are efficient and preferred by many businesses.

IT professionals can identify which cloud-based solution is best for your company. The choices typically consist of a public, private or hybrid solution. Many companies have successfully implemented these solutions.

VMware’s cloud infrastructure and management application is commonly known as vCloud Director. This application allows a company to transition to the cloud at its own pace. The application was introduced in 2010 to provide companies with greater flexibility and efficiency in the cloud.

VMware’s solution allows companies to leverage their existing infrastructure. This saves business owners significant time and money, and the savings can then be reinvested in innovation. VMware’s cost-effective solution provides an answer to the pre-existing problem of spending 70 percent of resources on infrastructure maintenance.

NetApp has exceptional backup and recovery capabilities that are necessary for any company’s disaster recovery solution. Within minutes, VMware’s vCloud Director can recover data. The backup and recovery system is customizable, fast and accurate.

NetApp and VMware have a 24 hour per day and seven day per week global staff monitoring the applications and data stored in the cloud. This ensures the data is protected. Technical support constantly works with all parties to ensure issues are addressed promptly and efficiently. Additionally, VMware ensures that resources are available to meet service level agreements.


Consider How VMware Can Help Your Organization

VMware is a viable solution that can be beneficial in any organization. Consider VMware for your business and witness an increase in productivity, efficiency and mobility. VMware solutions are chosen frequently because they work.


Author Bio:

David Malmborg works with Dell. When David is not working, he enjoys spending time with his two kids.