
Reducing AWS data transfer cost — Going Multi-AZ to Single-AZ Kubernetes



The only silver lining for us during COVID-19 was that it gave us a chance to optimize our resources and cut costs, a long-pending activity. Teams across the organization were focused on reducing operating costs as much as possible, and for the DevOps team that meant the costs of our infrastructure.

Cross-AZ data transfer was one of the major cost contributors, which led to a discussion around moving non-critical workloads to a single AZ: less cross-AZ chatter between applications, and a smaller data transfer bill. This blog discusses the challenges we faced while migrating our Kubernetes workloads to a single AZ.

Our Stage Infrastructure

At Grofers, we have very short release cycles, i.e. new features can be released at any time. To maintain stability while giving developers freedom and flexibility, we provide each developer with a fully isolated development environment powered by Kubernetes. These isolated development environments account for the majority of our stage infrastructure.

This means every developer gets their own namespace with all the dependencies needed to develop and test their services. Developers build features in their isolated environments; the features are then tested in a freshly prepared integrated environment and regression tested before being merged to master.

A typical development environment contains the following Kubernetes resources (a rough provisioning sketch follows the list):

  • Deployments
  • Services
  • Ingress
  • Statefulsets
  • RoleBinding and Roles
  • Persistent Volume Claims
  • Volumes
  • Configuration in Consul
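
As a rough illustration (not our actual tooling), provisioning such an environment boils down to creating a namespace per developer and applying the same set of templated manifests into it. The developer name, manifest path, and Consul key below are hypothetical placeholders:

# Hypothetical sketch of per-developer environment provisioning.
# "alice", manifests/ and the Consul key are placeholders, not our real setup.
DEV=alice

# One isolated namespace per developer
kubectl create namespace "dev-${DEV}"

# Apply the templated resources (Deployments, Services, Ingress, StatefulSets,
# Roles/RoleBindings, PVCs, ...) into that namespace
kubectl apply -n "dev-${DEV}" -f manifests/

# Seed the developer's configuration in Consul
consul kv put "dev/${DEV}/config/feature-flags" "{}"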

Migration to Single AZ

Migration of Kubernetes Cluster to Single AZ seems pretty simple, right?

  • We just have to get a list of all available nodegroups in the cluster
  • Open AWS console, remove other subnets (to support single AZ) in nodegroups and let the ASG do its job.

This is indeed easy for stateless resources, which make up most of our stage environment: they simply get rescheduled onto new nodes in the new AZ without any issues. The same does not hold true for StatefulSets with volumes; let's discuss that scenario below.
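
The same console steps can also be scripted. A minimal sketch, assuming the nodegroups are backed by plain Auto Scaling groups (the ASG name and subnet ID are placeholders):

# Sketch: restrict a nodegroup's ASG to a single AZ (names are placeholders).

# Inspect which subnets (and hence AZs) the ASG currently spans
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names stage-nodegroup-asg \
  --query 'AutoScalingGroups[0].[VPCZoneIdentifier,AvailabilityZones]'

# Keep only the subnet in the target AZ; the ASG then replaces nodes in that AZ
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name stage-nodegroup-asg \
  --vpc-zone-identifier "subnet-0abc1234"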

Persistent volumes — the bummer

Persistent volumes come with the added luxury of being independent of the pod they are attached to, so they outlive the pod's lifecycle. They are also more flexible than standard volumes, supporting user-specified sizes and performance requirements. Kubernetes offers a multitude of persistent volume types to fit different needs; one such type is the AWS Elastic Block Store (EBS) volume.

How does Kubernetes provision EBS for every declared persistent volume?

Kubernetes has a construct called a storage class. A storage class provides a way for administrators to describe the "classes" of storage they offer. Every storage class has a provisioner that determines which volume plugin is used for provisioning persistent volumes (PVs); this field must be specified. In our case, for EBS, it is "kubernetes.io/aws-ebs": this provisioner ensures that a corresponding EBS volume is created with the correct parameters. Data persists as long as the corresponding PV resource exists; deleting the resource also deletes the corresponding EBS volume, which means all stored data is lost at that point.

So in our case, when we create a persistent volume claim against a storage class configured for EBS, the provisioner creates a matching EBS volume in AWS, and our pods use it as persistent storage.
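
For reference, a minimal StorageClass and PersistentVolumeClaim along these lines might look like the following sketch (names, size, and volume type are illustrative, not our exact configuration):

# Illustrative StorageClass backed by the in-tree EBS provisioner, plus a PVC
# that uses it. Names, size and volume type are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp2
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Delete   # backing EBS volume is deleted along with the claim/PV
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-example
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-gp2
  resources:
    requests:
      storage: 10Gi
EOF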

Problem with migrating PVs and EBS volumes from one AZ to another


PVs can be migrated easily as long as the associated EBS volumes stay in the same AZ, but the operation fails when the target is a different AZ: an EBS volume cannot be attached to an instance in another AZ. So if you run a pod in one AZ (say us-east-1a) while its volume lives in another (us-east-1b), you will get something like this:

Warning FailedAttachVolume  Pod 109 Multi-Attach error for volume "pvc-6096fcbf-abc1-11e7-940f-06c399d05922" 
Volume is already exclusively attached to one node and can't be attached to another
Warning FailedMount Pod 1 AttachVolume.Attach failed for volume "pvc-6096fcbf-abc1-11e7-940f-06c399d05922" :
Error attaching EBS volume "vol-03ea2cb51f21f9fac" to instance "i-0e8e0bbf7d97a15df":
IncorrectState: vol-03ea2cb51f21f9fac is not 'available'.
status code: 400, request id: 41707341-e239-4808-846f-8f9d19fd1563

Something similar happened when we tried to migrate our PVs: our EBS volumes were spread across all the previously supported AZs, and once we switched to a single AZ, the volumes created in other AZs could not be mounted on the new nodes in the target AZ. This made the migration significantly more challenging.
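
A quick way to see the mismatch is to list each PV along with the zone it was provisioned in; the EBS provisioner labels PVs with their zone (the label key depends on the Kubernetes version), and the AWS side can be cross-checked for any volume ID from the error above:

# Show each PV with the zone label set by the EBS provisioner.
# Older clusters use failure-domain.beta.kubernetes.io/zone,
# newer ones topology.kubernetes.io/zone; check whichever applies.
kubectl get pv \
  -L failure-domain.beta.kubernetes.io/zone \
  -L topology.kubernetes.io/zone

# Cross-check the AZ of a specific EBS volume on the AWS side
aws ec2 describe-volumes --volume-ids vol-03ea2cb51f21f9fac \
  --query 'Volumes[0].AvailabilityZone'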

So the real problem is: how do we migrate Kubernetes volumes across AZs?

We use a tool called Velero for all Kubernetes-related migration and backup-restore activities.

Velero takes a snapshot of all the resources/objects on the cluster and allows you to restore the cluster with the same snapshot. Fortunately, it also takes snapshots of your cluster’s Persistent Volumes using your cloud provider’s block storage snapshot features, and can then restore your cluster’s objects and Persistent Volumes to a previous state. Therefore, we decided to use Velero for moving our volumes.

But Velero only takes the snapshot; on restore it will try to recreate each volume in the same AZ where it was located earlier. Here comes the tiny hack we used to change the AZ recorded for all the PVs in the Velero snapshot. We found the script on one of the open issues on Velero's GitHub repo:

The script runs a find-replace on the volume snapshot metadata created by Velero, and you can use something similar to modify the AZ recorded in the snapshot. Whenever you restore a volume from that snapshot, Velero will create the new volume in the modified AZ.
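
The original script is not reproduced here, but the idea is roughly the following. This is a sketch under assumptions: it presumes Velero's default object-storage layout, where a <backup-name>-volumesnapshots.json.gz file lives under backups/<backup-name>/ in the backup bucket; the bucket, backup name, and AZs are placeholders, and the exact layout may differ across Velero versions:

# Sketch: rewrite the AZ recorded in a Velero backup's volume snapshot
# metadata so that restores provision volumes in the target AZ.
# Bucket, backup name and AZs are placeholders.
BUCKET=my-velero-bucket
BACKUP=pre-single-az
SRC_AZ=us-east-1b
DST_AZ=us-east-1a

# Download and unpack the snapshot metadata
aws s3 cp "s3://${BUCKET}/backups/${BACKUP}/${BACKUP}-volumesnapshots.json.gz" .
gunzip "${BACKUP}-volumesnapshots.json.gz"

# Find-replace the source AZ with the target AZ
sed -i "s/${SRC_AZ}/${DST_AZ}/g" "${BACKUP}-volumesnapshots.json"

# Repack and upload it back in place
gzip "${BACKUP}-volumesnapshots.json"
aws s3 cp "${BACKUP}-volumesnapshots.json.gz" \
  "s3://${BUCKET}/backups/${BACKUP}/${BACKUP}-volumesnapshots.json.gz"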

There you go: we have figured out a way to migrate volumes from one AZ to another. The complete migration then boils down to these steps.

Migration Steps

  • Take a Velero snapshot of all resources including statefulsets / volumes.
  • Remove all the namespaces
  • Open AWS console, remove other subnets (to support single AZ) from nodegroup ASGs and let it do its job.
  • Make sure all the new nodes are in a running state.
  • Find and replace the source AZs recorded in the Velero snapshot with the target AZ (the AZ where you intend to run the entire cluster).
  • Restore everything from the Velero snapshot.

You can learn more about how to set up and use Velero here: https://velero.io/docs/main/basic-install/.
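
At the command level, the backup and restore halves of the procedure map roughly onto the following sketch (the backup name is a placeholder, and flags may vary slightly across Velero versions):

# Rough command-level sketch of the backup/restore steps above.
# The backup name "pre-single-az" is a placeholder.

# 1. Back up everything, including persistent volume snapshots
velero backup create pre-single-az --snapshot-volumes=true

# ... delete the namespaces, shrink the nodegroups to one AZ, and patch the
#     snapshot metadata as described earlier ...

# 2. Restore everything from the (patched) backup
velero restore create --from-backup pre-single-az

# Watch progress
velero backup describe pre-single-az
velero restore get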

Reliability Impact & Recovery Procedures

While we took the drastic step of running our production out of a single AZ, we do not suggest doing so unless you are fully aware of what you are getting into and how you will recover from an outage impacting your AZ.

Running production services out of a single AZ is not a recommended practice. However, considering historical AZ-level outages in our region, the state of AWS infrastructure capacity at the time of COVID, and our well-documented procedures to shift services to other AZs or even set up a new Kubernetes cluster from scratch, we were confident enough to take the risk to meet our short-term goal of reducing costs.

An important lesson from this process was the value of our investment in Infrastructure as Code for setting up our clusters: having well-documented and scripted cluster setup and upgrade procedures was a big enabler for migrating to a single AZ quickly.

Investing early in well-scripted (Infrastructure as Code) and well-documented change management procedures gave us the confidence to operate under a rather extreme architecture.

