Provision your own Kubernetes cluster with private network topology on AWS using kops and Terraform – Part 2

Getting Started

If you managed to follow and complete the previous blog post, then you managed to get a Kubernetes cluster up and running in your own private AWS VPC using kops and Terraform to assist you.

In this blog post, you’ll cover following items:

  • Setup upstream DNS for your cluster
  • Get a Kubernetes Dashboard service and deployment running
  • Deploy a basic metrics dashboard for Kubernetes using heapster, InfluxDB and Grafana

Upstream DNS

In order for services running in your Kubernetes cluster to be able to resolve services outside of your cluster, you’ll now configure upstream DNS.

Containers that are started in the cluster will have their local resolv.conf files automatically setup with what you define in your upstream DNS config map.

Create a ConfigMap with details about your own DNS server to use as upstream. You can also set some external ones like Google DNS for example (see example below):

apiVersion: v1
kind: ConfigMap
  name: kube-dns
  namespace: kube-system
  stubDomains: |
    {"yourinternaldomain.local": [""]}
  upstreamNameservers: |
    ["", "", ""]

Save your ConfigMap as kube-dns.yaml and apply it to enable it.

kubectl apply -f kube-dns.yaml

You should now see it listed in Config Maps under the kube-system namespace.

Kubernetes Dashboard

Deploying the Kubernetes dashboard is as simple as running one kubectl command.

kubectl apply -f

You can then start a dashboard proxy using kubectl to access it right away:

kubectl proxy

Head on over to the following URL to access the dashboard via the proxy you ran:


You can also access the Dashboard via the API server internal elastic load balancer that was set up in part 1 of this blog post series. E.g.


Heapster, InfluxDB and Grafana (now deprecated)

Note: Heapster is now deprecated and there are alternative options you could instead look at, such as what the official Kubernetes git repo refers you to (metrics-server). Nevertheless, here are the instructions you can follow should you wish to enable Heapster and get a nice Grafana dashboard that showcases your cluster, nodes and pods metrics…

Clone the official Heapster git repo down to your local machine:

git clone

Change directory to the heapster directory and run:

kubectl create -f deploy/kube-config/influxdb/
kubectl create -f deploy/kube-config/rbac/heapster-rbac.yaml

These commands will essentially launch deployments and services for grafana, heapster, and influxdb.

The Grafana service should attempt to get a LoadBalancer from AWS via your Kubernetes cluster, but if this doesn’t happen, edit the monitoring-grafana service YAML configuration and change the type to LoadBalancer. E.g.

"type": "LoadBalancer",

Save the monitoring-grafana service definition and your cluster should automatically provision a public facing ELB and set it up to point to the Grafana pod.

Note: if you want it available on an internal load balancer instead, you’ll need to create your grafana service using the aws-load-balancer-internal annotation instead.

Grafana dashboard for Kubernetes with Heapster

Now that you have Heapster running, you can also get some metrics displayed directly in your Kubernetes dashboard too.

You may need to restart the dashboard pods to access the new performance stats in the dashboard though. If this doesn’t work, delete the dashboard deployment, service, pods, role, and then re-deploy the dashboard using the same process you followed earlier.

Once its up and running, use the DNS for the new ELB to access grafana’s dashboard, login with admin/admin and change the default admin password to something secure and save. You can now access cluster stats/performance stats in kubernetes, as well as in Grafana.

Closing off

This concludes part two of this series. To sum up, you managed to configure upstream DNS, deploy the Kubernetes dashboard and set up Heapster to allow you to see metrics in the dashboard, as well as deploying InfluxDB for storing the metric data with Grafana as a front end service for viewing dashboards.

Provision your own Kubernetes cluster with private network topology on AWS using kops and Terraform – Part 1


In this post series I’ll be covering how to provision a brand new self-hosted Kubernetes environment provisioned into AWS (on top of EC2 instances) with a specific private networking topology as follows:

  • Deploy into an existing VPC
  • Use existing VPC Subnets
  • Use private networking topology (Calico), with a private/internal ELB to access the API servers/cluster
  • Don’t use Route 53 AWS DNS services or external DNS, instead use Kubernetes gossip DNS service for internal cluster name resolution, and allow for upstream DNS to be set up to your own private DNS servers for outside-of-cluster DNS lookups

This is a more secure set up than a more traditional / standard kops provisioned Kubernetes cluster,  placing API servers on a private subnet, yet still allows you the flexibility of using Load Balanced services in your cluster to expose web services or APIs for example to the public internet if you wish.

Set up your workstation with the right tools

You need a Linux or MacOS based machine to work from (management station/machine). This is because kops only runs on these platforms right now.

sudo apt install python-pip
  • Use pip to install the awscli
pip install awscli --upgrade --user
  • Create yourself an AWS credentials file (~/.aws/credentials) and set it up to use an access and secret key for your kops IAM user you created earlier.
  • Setup the following env variables to reference from, but make sure you fill in the values you require for this new cluster. So change the VPC ID, S3 state store bucket name, and cluster NAME.
export ZONES=us-east-1b,us-east-1c,us-east-1d
export KOPS_STATE_STORE=s3://your-k8s-state-store-bucket
export NAME=yourclustername.k8s.local
export VPC_ID=vpc-yourvpcidgoeshere
  • Note for the above exports above, ZONES is used to specify where the master nodes in the k8s cluster will be placed (Availability Zones). You’ll definitely want these spread out for maximum availability

Set up your S3 state store bucket for the cluster

You can either create this manually, or create it with Terraform. Here is a simple Terraform script that you can throw into your working directory to create it. Just change the name of the bucket to your desired S3 bucket name for this cluster’s state storage.

Remember to use the name for this bucket that you specified in your KOPS_STATE_STORE export variable.

resource "aws_s3_bucket" "state_store" {
  bucket        = "${}-${var.env}-state-store"
  acl           = "private"
  force_destroy = true

  versioning {
    enabled = true

  tags {
    Name        = "${}-${var.env}-state-store"
    Infra       = "${}"
    Environment = "${var.env}"
    Terraformed = "true"

Terraform plan and apply your S3 bucket if you’re using Terraform, passing in variables for name/env to name it appropriately…

terraform plan
terraform apply

Generate a new SSH private key for the cluster

  • Generate a new SSH key. By default it will be created in ~/.ssh/id_rsa
ssh-keygen -t rsa

Generate the initial Kubernetes cluster configuration and output it to Terraform script

Use the kops tool to create a cluster configuration, but instead of provisioning it directly, you’ll output it to terraform script. This is important, as you’ll be wanting to change values in this output file to provision the cluster into existing VPC and subnets. You also want to change the ELB from a public facing ELB to internal only.

kops create cluster --master-zones=$ZONES --zones=$ZONES --topology=private --networking=calico --vpc=$VPC_ID --target=terraform --out=. ${NAME}

Above you ran the kops create cluster command and specified to use a private topology with calico networking. You also designated an existing VPC Id, and told the tool to create terraform script as the output in the current directory instead of actually running the create cluster command against AWS right now.

Change your default editor for kops if you require a different one to vim. E.g for nano:

set EDITOR=nano

Edit the cluster configuration:

kops edit cluster ${NAME}

Change the yaml that references the loadBalancer value as type Public to be Internal.

While you are still in the editor for the cluster config, you need to also change the entire subnets section to reference your existing VPC subnets, and egress pointing to your NAT instances. Remove the current subnets section, and add the following template, (updating it to reference your own private subnet IDs for each region availability zone, and the correct NAT instances for each too. (You might possibly use one NAT instance for all subnets or you may have multiple). The Utility subnets should be your public subnets, and the Private subnets your private ones of course. Make sure that you reference subnets for the correct VPC you are deploying into.

- egress: nat-2xcdc5421df76341
  id: subnet-b32d8afg
  name: us-east-1b
  type: Private
  zone: us-east-1b
- egress: nat-04g7fe3gc03db1chf
  id: subnet-da32gge3
  name: us-east-1c
  type: Private
  zone: us-east-1c
- egress: nat-0cd542gtf7832873c
  id: subnet-6dfb132g
  name: us-east-1d
  type: Private
  zone: us-east-1d
- id: subnet-234053gs
  name: utility-us-east-1b
  type: Utility
  zone: us-east-1b
- id: subnet-2h3gd457
  name: utility-us-east-1c
  type: Utility
  zone: us-east-1c
- id: subnet-1gvb234c
  name: utility-us-east-1d
  type: Utility
  zone: us-east-1d
  • Save and exit the file from your editor.
  • Output a new terraform config over the existing one to update the script based on the newly changed ELB type and subnets section.
kops update cluster --out=. --target=terraform ${NAME}
  • The updated file is now output to in your working directory
  • Run a terraform plan from your terminal, and make sure that the changes will not affect any existing infrastructure, and will not create or change any subnets or VPC related infrastructure in your existing VPC. It should only list out a number of new infrastructure items it is going to create.
  • Once happy, run terraform apply from your terminal
  • Once terraform has run with the new file, the certificate will only allow the standard named cluster endpoint connection (cert only valid for api.internal.clustername.k8s.local for example). You now need to re-run kops update and output to terraform again.
kops update cluster $NAME --target=terraform --out=.
  • This will update the cluster state in your S3 bucket with new certificate details, but not actually change anything in the local file (you shouldn’t see any changes here). However you can now run a rolling update rolling update with the cloudonly and force –yes options:
kops rolling-update cluster $NAME --cloudonly --force --yes

This will roll all the masters and nodes in the cluster (the created autoscaling groups will initialise new nodes from the launch configurations) and when the ASGs initiate new instances, they’ll get the new certs applied from the S3 state storage bucket. You can then access the ELB endpoint on HTTPS, and you should get an auth prompt popup.

Find the endpoint on the internal ELB that was created. The rolling update may take around 10 minutes to complete, and as mentioned before, will terminate old instances in the Autoscaling group and bring new instances up with the new certificate configuration.

Tag your public subnets to allow auto provisioning of ELBs for Load Balanced Services

In order to allow Kubernetes to automatically create load balancers (ELBs) in AWS for services that use the LoadBalancer configuration, you need to tag your utility subnets with a special tag to allow the cluster to find these subnets automatically and provision ELBs for any services you create on-the-fly.

Tag the subnets that you are using as utility subnets (public) with the following tag:

Key: Value: (Don’t add a value, leave it blank)

Tag your private subnets for internal-only ELB provisioning for Load Balanced Services

In order to allow Kubernetes to automatically create load balancers (ELBs) in AWS for services that use the LoadBalancer configuratio and a private facing configuration, you need to tag the private subnets that the cluster operates in with a special tag to allow k8s to find these subnets automatically.

Tag the subnets that you are using as private (where your nodes and master nodes should be running now) with the following two tags:

Key:{yourclusternamehere.k8s.local} Value: shared
Key: Value: 1

As an example for the above, the key might end up with a value of “” if your cluster is named “yourclusternamehere.k8s.local” (remember you named your cluster when you created your local workstation EXPORT value for {NAME}.

Closing off

This concludes part one of this series for now.

As a summary, you should now have a kubernetes cluster up and running in your private subnets, spread across availability zones, and you’ve done it all using kops and Terraform.

Straighten things out by creating a git repository, and commiting your terraform artifacts for the cluster and storing them in version control. Watch out for the artifacts that kops output along with the Terraform script like the private certificate files – these should be kept safe.

Part two should be coming soon, where we’ll run through some more tasks to continue setting the cluster up like setting upstream DNS, provisioning the Kubernetes Dashboard service/pod and more…