The FreeNAS Nextcloud plugin installation works great with automatic configuration thanks to a recent pull request. But, you don’t get SSL enabled by default. This is critical, especially for a system exposed to the internet.
In this post you’ll see how to:
Install the Nextcloud plugin in a FreeNAS BSD jail
Add an extra NAT port for SSL to the jail
Configure NGINX inside the jail by adding a customised configuration with SSL enabled
Apply a free SSL certificate using Lets Encrypt and DNS-01 challenge validation
Look at some options for setting up home networking for public access
Start off by Installing the Nextcloud Plugin in a jail. Choose NAT for networking mode. It defaults to port 8282:80 (http).
Stop the jail once it’s running and edit it. Add another NAT rule to point 8443 to 443 for SSL.
The reason for selecting port 8443 for Nextcloud is because the FreeNAS web UI listens on port 443 for SSL too.
An alternative could be to use DHCP instead of NAT for the jail. I chose NAT for my setup as I prefer using one internal IP address for everything I run on the FreeNAS server.
Shell into the Nextcloud jail, and rename the default nginx configuration.
NGINX will load all .conf files in this directory. Hence the reason you’ll create a new configuration for your SSL setup here.
ee /usr/local/etc/nginx/conf.d/nextcloud.conf /usr/local/etc/nginx/conf.d/nextcloud-ssl.conf
Populate it with the contents of the gist below, but replace server_name, ssl_certificate, and ssl_certificate_key with your own hostname.
Generate a free SSL certificate with Lets Encrypt
To configure the Nextcloud plugin on FreeNAS with SSL you don’t need to break the bank on SSL certificate costs from traditional CAs. Lets Encrypt it free, but you’ll need to renew your certificate every three months.
DNS-01 challenge certificate generation for Lets Encrypt is a great way to get SSL certificates without a public web server.
It entails creating a TXT/SPF record on the domain you own, with a value set to a code that certbot gives you during the certbot request process.
Install certbot if you don’t already have it installed. On a debian based system:
sudo apt-get install certbot
Request a certificate for your desired hostname using certbot with dns as the preferred challenge.
sudo certbot -d yournextcloud.example.net --manual --preferred-challenges dns certonly
Follow the prompts until you receive a code to setup your own TXT record with. Go to your DNS provider control panel and create it with the code you’re given as the value.
After creating the record, finish the certificate request. Lets Encrypt will confirm the DNS TXT record and issue you a certificate. You’ll get a chain file called fullchain.pem, along with a private key file called privkey.pem.
Upload the SSL certificate files to Nextcloud
Upload both to your Nextcloud Jail. Use SCP to copy them up, renaming them as follows:
In this post I’ll explain how I created an AWS EC2 Spot Instance Termination Simulator to run on any EC2 instance.
It can signal typical monitoring scripts, applications, interrupt handlers, and more, just like a real Spot Termination signal would do.
Looking around, there aren’t really any straight forward ways of testing this behaviour.
The obvious way is to change the price you bid for spot instances to the threshold of the current market price. This is so that if there is a slight increase in demand your spot instance(s) will be terminated.
The problem with that approach is of course that you can’t easily predict when the price will move, or by how much.
How EC2 Spot Instance Termination warnings work
When your spot instance bid price is surpassed by the current market price, a Termination Notice becomes available on one of the instance’s metadata endpoints. For example, http://169.254.169.254/latest/meta-data/spot/termination-time.
There is one other, newer endpoint at http://169.254.169.254/latest/meta-data/spot/instance-action that could also be used.
At this point the endpoint returns an HTTP 200 response. A timestamp of when a shutdown signal is going to be sent to your instance’s OS is also returned.
In the case of the newer endpoint, a JSON string is returned with an action and time. Usually the endpoint usually simply returns a 404 not found response.
The tools out there tend to monitor this endpoint to warn and action in case of a Spot Termination notice being received.
The two minute warning feature was added by Amazon back in 2015, announced in this blog post.
The service is a NodePort service, exposing the port on the host it runs on.
List the service and the pod (and find the kubernetes node the pod is running on):
kubectl get svc ec2-spot-termination-simulator
kubectl get pod spot-term-simulator-xxxxxx -o wide
Take note of the NodePort the service is listening on for the Kubernetes node. E.g. port 30626.
docker run -p 30626:80 -e PORT=80 shoganator/ec2-spot-termination-simulator
Proxying the EC2 metadata service
Perform some trickery to proxy traffic destined to 169.254.169.254 on port 80 to localhost (where the container runs). This is so that the fake service can take the place of the real one.
SSH onto the Kubernetes node and run:
sudo ifconfig lo:0 169.254.169.254 up
sudo socat TCP4-LISTEN:80,fork TCP4:127.0.0.1:30626
Create an alias for the localhost interface at 169.254.169.254, effectively taking over the EC2 metadata service and sending that to 127.0.0.1 instead.
Forward TCP port 80 traffic (usually destined to the EC2 metadata service) to 127.0.0.1 on the NodePort 30626. This is where the ec2-spot-termination-simulator pod is running on this host. Substitute this with the correct port in your case.
As a result, it should return a 200 OK response with a timestamp from the fake service. This is the same as the real one would.
Looking at how the kube-spot-termination-notice-handler service works specifically:
This service runs as a DaemonSet, meaning one instance per host. The instance on the node you set the simualtor up on should immediately drain the node. The instance won’t actually be terminated as this was of course just a simulated termination.
If you’re not running on Kubernetes and are using a different spot termination handler, don’t worry. The system you’re using to monitor the EC2 instance metadata endpoint should still take action at this point.
The proxied web service is now returning a legitimate looking termination time notice on http://169.254.169.254/latest/meta-data/spot/termination-time.
Ephemeral Containers are an early-state alpha feature in Kubernetes 1.16 and offer some interesting new dynamics when it comes to tooling that we can use in day-to-day Kubernetes operations.
To see this feature live, in action, check out the demo shell session below:
Before we look at Ephemeral Containers, let’s go over what a Pod is in the Kubernetes world.
Remember that a Pod in Kubernetes is a group of one or more containers (e.g. Docker containers).
With that basic tidbit of information out of the way, we’ll look at some characteristics that Pods and their containers have always had in the past:
They’re meant to be disposable and easily replaced in a controlled manner with Deployments.
You could not add containers to pods at runtime.
Containers in pods can have ports assigned for network access and use things like liveness probes.
Troubleshooting containers in pods usually meant looking at logs or using kubectl exec to get into the running container and poke around. The latter of course being useless if your container had already crashed.
So here is where I see one of the best use cases for the new Ephemeral Containers feature – troubleshooting.
Ephemeral Containers can be inserted into a live, running pod at runtime.
This means they are great for live troubleshooting of your applications. How many times have you wished your base docker image you’ve built your application image on top of has curl, dig, or even ping in some cases…
If we’ve been following best practices, we’ve kept our docker images as slim as possible, and removed as much attack surface area as possible. This usually means all the useful diagnostic tools are missing.
Ephemeral Containers are great. We can now keep a diagnostic Docker image handy with all the tools we need and live insert a diagnostic container into a running pod to troubleshoot when the time arises.
When an Ephemeral Container runs, it executes within the namespace of the target pod. So you’ll be able to access, for example, the filesystems and other resources that containers in the the pods have.
In order to follow along with this demo, you’ll need Kubernetes 1.16 or higher, and you’ll need to use two pod related features:
EphemeralContainers (of course) – disabled by default in 1.16 as it’s alpha.
PodShareProcessNamespace – for sharing the process namespace in a pod (enabled by default in 1.16 as it’s a beta feature).
To enable the Ephemeral Containers feature, edit the following configuration files on your Kubernetes master nodes and restart each master:
Enable the EphemeralContainers alpha feature gate in the following places
by adding the following line inside the command section:
Create a new pod (the one I’m using is Rabbit MQ and specific to ARM architecture as I’m using a Raspberry Pi cluster here), but replace this image with anything you like as its just for testing:
Next, create an EphemeralContainer resource saving it as ephemeral-diagnostic-container.json
(Note that I’m using a Docker image I created, shoganator/rpi-alpine-tools with a bunch of diagnostic tools added, and that this image is specific to ARM architecture only). Replace the image in this file with anything else you like, e.g. busybox.
Otherwise, read on for step-by-step and more information…
There are a few guides floating around that detail how to install the Weave Net CNI plugin for Amazon Kubernetes clusters (EKS), however I’ve not seen them go into much detail.
Most tend to skip over some important steps and details when it comes to configuring weave and getting the pod networking functioning correctly.
There are also some important caveats that you should be aware of when replacing the AWS CNI Plugin with a different CNI, whether it be Weave, Calico, or any other.
Replacing CNI functionality
You should be 100% happy with what you’ll lose if completely replace the AWS CNI with another CNI. The AWS CNI has some very useful functionality such as:
Assigning IP addresses (via ENIs) to place pods directly into your VPC network
VPC flow logs that make sense
However, depending on your architecture and design decisions, as well as potential VPC network limitations, you may wish to opt out of the CNI that Amazon provides and instead use a different CNI that provides an overlay network with other functionality.
AWS CNI Limitations
One of the problems I have seen in VPCs is limited CIDR ranges, and therefore subnets that are carved up into smaller numbers of IP addresses.
The Amazon AWS CNI plugin is very IP address hungry and attaches multiple Secondary Private IP addresses to EKS worker nodes (EC2 instances) to provide pods in your cluster with directly assigned IPs.
This means that you can easily exhaust subnet IP addresses with just a few EKS worker nodes running.
This limitation also means that those who want high densities of pods running on worker nodes are in for a surprise. The IP address limit becomes an issue for maximum number of pods in these scenarios way before compute capacity becomes a problem.
This page shows the maximum number of ENI’s and Secondary IP addresses that can be used per EC2 instance: https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt
Removing the AWS CNI plugin
Note: This process will involve you needing to replace your existing EKS worker nodes (if any) in the cluster after installing the Weave Net CNI.
Assuming you have a connection to your cluster already, the first thing to do is to remove the AWS CNI.
kubectl -n=kube-system delete daemonset aws-node
With that gone, your future EKS workers will no longer assign multiple Secondary IP addresses from your VPC subnets.
Installing CNI Genie
With the AWS CNI plugin removed, your pods won’t be able to get a network connection when starting up from this point onward.
Installing a basic deployment of CNI Genie is a quick way to get automatic CNI selection working for containers that start from this point on.
CNI genie has tons of other great features like allowing you to customise which CNI containers use when starting up and more.
For now, you’re just using it to allow containers to start-up and use the Weave Net overlay network by default.
Install CNI Genie. This manifest works with Kubernetes 1.12, 1.13, and 1.14 on EKS.
Next, get a Weave Net CNI yaml manifest file. Decide what overlay network IP Range you are going to be using and fill it in for the env.IPALLOC_RANGE query string parameter value in the code block below before making the curl request.
Note: the env.IPALLOC_RANGE query string param added is to specify you want a config with a custom CIDR range. This should be chosen specifically not to overlap with any network ranges shared with the VPC you’ll be deploying into.
In the example above I had a VPC and VPC peers that shared the CIDR block 10.0.0.0/8). Therefore I chose to use 192.168.0.0/16 for the Weave overlay network.
You should be aware of the network ranges you’re using and plan this out appropriately.
The config you now have as weave-cni.yaml will contain the environment variable IPALLOC_RANGE with the correct value that the weave pods will use to setup networking on the EKS Worker nodes.
Apply the weave Net CNI resources:
Note: This manifest is pre-created to use an overlay network range of 192.168.0.0/16
Note: Don’t expect things to change suddenly. The current EKS worker nodes will need to be rotated out (e.g. drain, terminate, wait for new to appear) in order for the IP addresses that the AWS CNI has kept warm/allocated to be released.
If you have any existing EKS workers running, drain them now and terminate/replace them with new workers. This includes the source/destination check change made previously.
kubectl get nodes
kubectl drain nodename --ignore-daemonsets
Remove max pod limits on nodes:
Your worker nodes by default have a limit set on how many pods they can schedule. The EKS AMI sets this based on EC2 type (and the max pods due to the usual ENI limitations / IP address limitations with the AWS CNI).
Check your max pod limits with:
kubectl get nodes -o yaml | grep pods
If you’re using the standard EKS optimized AMI (or a derivative of it) then you can simply pass an option to the bootstrap.sh script located in the image that setup the kubelet and joins the cluster. Set –use-max-pods false as an argument to the script.
For example, your autoscale group launch configuration might get the EC2 worker nodes to join the cluster using the bootstrap.sh script. You can update it like so:
If you’re using the EKS Terraform module you can simply pass in bootstrap-extra-args – this will automatically setup your worker node userdata templates with extra bootstrap arguments for the kubelet. See example here
Checking max-pods limit again after applying this change, you should see the previous pod limit (based on prior AWS CNI max pods for your instance type) removed now.
You’re almost running Weave Net CNI on AWS EKS, but first you need to roll out new worker nodes.
With the Weave Net CNI installed, the kubelet service updated and your EC2 source/destination checks disabled, you can rotate out your old EKS worker nodes, replacing them with the new nodes.
kubectl drain node --ignore-daemonsets
Once the new nodes come up and start scheduling pods, if everything went to plan you should see that new pods are using the Weave overlay network. E.g. 192.168.0.0/16.
A quick run-down on weave IP addresses and routes
If you get a shell to a worker node running the weave overlay network and do a listing of routes, you might see something like the following:
# ip route show
default via 10.254.109.129 dev eth0
10.254.109.128/26 dev eth0 proto kernel scope link src 10.254.109.133
169.254.169.254 dev eth0
192.168.0.0/16 dev weave proto kernel scope link src 192.168.192.0
This routing table shows two main interfaces in use. One from the host (EC2) instance network interfaces itself, eth0, and one from weave called weave.
When network packets are destined for the 10.254.109.128/26 address space, then traffic is routed down eth0.
If traffic on the host is destined for any address on 192.168.0.0/16, it will instead route via the weave interface ‘weave’ and the weave system will handle routing that traffic appropriately.
Otherwise if the traffic is destined for some public IP address out on the wider internet, it’ll go down the default route which is down the interface, eth0. This is a default gateway in the VPC subnet in this case – 10.254.109.129.
Finally, metadata URL traffic for 169.254.169.254 goes down the main host eth0 interface of course.
For the most part everything should work great. Weave will route traffic between it’s overlay network and your worker node’s host network just fine.
However, some of your custom workloads or kubernetes tools might not like being on the new overlay network. For example they might need to talk to other Kubernetes nodes that do not run weave net.
This is now where the limitation of using a managed Kubernetes offering like EKS becomes a bit of a problem.
You can’t run weave on the Kubernetes master / API servers that are effectively the ‘managed’ control plane that AWS EKS hosts for you.
This means that your weave overlay network does not span the Kubernetes master nodes where the Kubernetes API runs.
If you have an application or container in the weave overlay network and the Kubernetes master node / API needs to talk to it, this won’t work.
One potential solution though is to use hostNetwork: true in your pod specification. However you should of course be aware of how this would affect your application and application security.
In my case, I was running metrics-server and it stopped working after it started using Weave. I found out that the Kubernetes API needs to talk to the metrics-server service and of course this won’t work in the overlay network.
Example EKS with Weave Net CNI cluster
You can use the source code I’ve uploaded here.
There are five simple steps to deploy this example EKS cluster in your own account.
Modify the example.tfvars file to fit your own parameters.
terraform plan -var-file="example.tfvars" -out="example.tfplan"
terraform apply "example.tfplan"
Warning: This will create a new VPC, subnets, NAT Gateway instance, Internet Gateway, EKS Cluster, and set of worker node autoscale groups. So be sure Terraform Destroy this if you’re just testing things out.
– Your wallet
After terraform creates all the resources, you can run the two included shell scripts. setup-weave.sh will remove the AWS CNI, install CNI genie, Weave, and deploy two simple example pods and services.
At this point you should terminate your existing worker nodes (that still use the AWS CNI) and wait for your new worker nodes to join the cluster.
test-weave.sh will wait for the hello-node test pods to become ready, and then execute a curl command inside one, talking to the other via the the service and vice versa. If successful, you’ll see a HTTP 200 OK response from each service.
This is a quick post showing a nice and fast batch S3 bucket object deletion technique.
I recently had an S3 bucket that needed cleaning up. It had a few million objects in it. With path separating forward slashes this means there were around 5 million or so keys to iterate.
The goal was to delete every object that did not have a .zip file extension. Effectively I wanted to leave only the .zip file objects behind (of which there were only a few thousand), but get rid of all the other millions of objects.
My first attempt was straight forward and naive. Iterate every single key, check that it is not a .zip file, and delete it if not. However, every one of these iterations ended up being an HTTP request and this turned out to be a very slow process. Definitely not fast batch S3 bucket object deletion…
I fired up about 20 shells all iterating over objects and deleting like this but it still would have taken days.
I then stumbled upon a really cool technique on serverfault that you can use in two stages.
Iterate the bucket objects and stash all the keys in a file.
Iterate the lines in the file in batches of 1000 and call delete-objects on these – effectively deleting the objects in batches of 1000 (the maximum for 1 x delete request).
In-between stage 1 and stage 2 I just had to clean up the large text file of object keys to remove any of the lines that were .zip objects. For this process I used sublime text and a simple regex search and replace (replacing with an empty string to remove those lines).
So here is the process I used to delete everything in the bucket except the .zip objects. This took around 1-2 hours for the object key path collection and then the delete run.
Get all the object key paths
Note you will need to have Pipe Viewer installed first (pv). Pipe Viewer is a great little utility that you can place into any normal pipeline between two processes. It gives you a great little progress indicator to monitor progress in the shell.
Remove any object key paths you don’t want to delete
Open your all-the-stuff.keys file in Sublime or any other text editor with regex find and replace functionality.
The regex search for sublime text:
Find and replace all .zip object paths with the above regex string, replacing results with an empty string. Save the file when done. Make sure you use the correctly edited file for the following deletion phase!
Iterate all the object keys in batches and call delete
tails the large text file (mine was around 250MB) of object keys
passes this into pipe viewer for progress indication
translates (tr) all newline characters into a null character ‘\0’ (effectively every line ending)
chops these up into groups of 1000 and passes the 1000 x key paths as an argument with xargs to the aws s3api delete-object command. This delete command can be passed an Objects array parameter, which is where the 1000 object key paths are fed into.
finally quiet mode is disabled to show the result of the delete requests in the shell, but you can also set this to true to remove that output.
Effectively you end up calling aws s3api delete-object passing in 1000 objects to delete at a time.
This is how it can get through the work so quickly.
This is a pattern I’ve used with success for access to apps running in a number of Kubernetes clusters that were restricted to only having a single ingress load balancer.
Kubernetes clusters (EKS) are on the internal network only (in this case private subnets in an AWS VPC).
IAM permissions are locked down to prevent creation of security groups (we can only use existing, pre-defined security groups) and so the LoadBalancer service type of Kubernetes is off-limits, as the k8s control plane needs to be able to create these automatically with security groups – this operation fails because of the restricted IAM permissions on the cluster. We have one Elastic Load Balancer created with the LoadBalancer service type when the cluster was initial bootstrapped with an nginx ingress controller + service type == LoadBalancer before the permissions were locked down again.
The Ingress Controller that is running is backed by an internal facing Elastic Load Balancer (ELB), created initially as described above.
Applications run across namespaces in each cluster, and the Ingress Controller must be able to provide dynamic access for users of these internal applications that sit on the network outside the k8s cluster.
DNS and ingress must be dynamic enough to allow the same apps to run in different namespaces, use the same URL path, but with differing hostnames. SSL must also be provided for all of these apps using a wildcard SSL certificate. E.g.
Once DNS wildcard CNAME record is created, it is difficult to change to point to a new location if needing changes (reliant on 3rd party to manage DNS).
A Solution with Reverse Proxying
There are of course a number of ways to approach this, like running under cert-manager inside the cluster with the letsencrypt issuer, or if you are running your own PKI with vault, the vault issuer.
cert-manager wouldn’t work well here as services are not publicly accessible for HTTP-01 certificate verification.
It could also be possible to terminate SSL at the ingress controller level in the cluster with the SSL certificate loaded there.
One additional requirement that I didn’t mention above though was that developers who are pushing their apps into the clusters need to be able to ‘dynamically’ configure their own personal ‘dev’ namespaces / ingress rules.
They configure their ingress easily enough with the Kubernetes Ingress resource when they deploy their apps (using Helm), however hostnames are not so easy for them to configure. Route53 is not in use here, and not allowed in this environment, and programmatic access to DNS is not possible.
A reverse proxy with NGINX
This layer exists more or less just to allow easy re-pointing of CNANE wildcard DNS entry to the Kubernetes cluster. As DNS is not easily configured (handled by another team/resource), we can simply leave it pointed to the NGINX elastic load balancer, and then just re-point requests using NGINX configuration if we need to.
It’s worth pointing out that this NGINX layer could be hosted on a multitude of places, including as a containerised solution, or it could even be replaced by a lambda function with API Gateway that could do the reverse proxying instead.
Environments are designated by namespaces in each ‘class’ of cluster. For example a non-production EKS cluster will have namespaces for non-production environments.
Hostnames need to be used to help the ingress rules match correctly with designated paths.
I configured an internal load balancer and setup a fleet of NGINX instances behind it.
Here is a quick runbook of how to setup NGINX and certbot on a vanilla Amazon Linux 2 EC2 instance. Use whichever automation you prefer such as baking your own AMI with packer, using Terraform, or ansible, but the runbook of steps to install NGINX and certbot is effectively:
sudo amazon-linux-extras install nginx1.12
sudo systemctl enable nginx
sudo systemctl start nginx
sudo wget -r --no-parent -A 'epel-release-*.rpm' https://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/
sudo rpm -Uvh dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/epel-release-*.rpm
sudo yum-config-manager --enable epel*
yum repolist all
yum install -y certbot
# request / generate letsencrypt wildcard cert using dns challenge interactively
certbot -d *.your.domain.here --manual --preferred-challenges dns certonly
# Interactive command above, choose to omit this in automation and do manually if you're using DNS-01 like I am here - certbot will give you a dynamically generated TXT record value for DNS-01 that you'll need to create.
systemctl restart nginx
Once NGINX is installed and your certs are generated, you’ll need to configure /etc/nginx/nginx.conf to point to the correct certificate files.
A wildcard CNAME record is created once-off that points anyhost.cluster.foo.bar to the internal ELB hostname for the reverse proxy NGINX instances (these sit outside of the cluster as standard EC2 hosts for now). For example:
I used certbot (letsencrypt) to issue a wildcard SSL certificate for the NGINX fleet servers for *.cluster.foo.bar. DNS-01 challenge type was used, as everything here is in a private, internal network, not accessible by letsencrypt services.
A TXT record just needs to be created with your DNS to verify to letsencrypt that you own the domain in question.
In the NGINX configuration, the generated certificate is loaded up for port 443 and the following location rule is setup to proxy_pass the requests sent to the NGINX fleet back to the Kubernetes Ingress Controller ELB.
The proxy_set_header directive is important, as it adds the host header that the NGINX fleet instance receives from the client, and sends it with the proxied request back to the Kubernetes ingress controller. The ingress rules need to match both hostname AND path in the requests to find the correct service inside the cluster/namespace.
SSL is now effectively terminated at the NGINX fleet layer with a wildcard SSL certificate and services inside the cluster don’t need to worry about configuring their own individual SSL certificates.
Ingress Rule Configuration
Now, developers can deploy their apps, and customise their ingress rules to use both hostname and path to setup access for their apps running in the cluster(s).
There are definitely other ways of doing this. Cleaner possibly, more automated in some ways, however with the constraints in play here (internal EKS, private only networks, no public internet access into the cluster), I think this is a good solution that makes life fairly pleasant for the developers that need to deploy their apps to these Kubernetes clusters.