How to overcome Docker Hub rate limits in GitHub Actions

 1 year ago
source link: https://buildingvts.com/how-to-overcome-docker-hub-rate-limits-in-github-actions-3c16ce1a55db

Our learnings on how to solve different Docker Hub rate limiting scenarios in self-hosted runners

Docker Hub recently announced rate limiting, which has affected many organizations that run large Kubernetes clusters or have medium- to large-scale CI usage.

At VTS we use self-hosted runners (EC2 spot instances) for GitHub Actions. We experienced rate limiting with Docker Hub and found it quite disruptive to our workflows. We then experimented with a few solutions to overcome Docker Hub rate limiting and these could be useful if you’re facing similar constraints.

If you use GitHub-hosted runners, you need not worry: GitHub has already partnered with Docker Hub to waive rate limits for public containers pulled from those runners.

AWS Public ECR

Standard CI workflows usually depend on publicly accessible containers. Docker Hub, however, allows only 100 pulls per 6 hours per IP address for anonymous usage, which a busy set of CI runners can exhaust quickly.
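To see how close a runner is to that limit, Docker documents a pair of response headers (ratelimit-limit and ratelimit-remaining) that can be read without consuming a pull. The sketch below is only an illustration: the function names are ours, and check_docker_hub_limit assumes curl and jq are installed on the runner.

```shell
# Turn a rate-limit header value such as "100;w=21600"
# into "<pulls> <window in hours>".
parse_ratelimit() {
  value=$1
  pulls=${value%%;*}      # everything before the first ';'
  window=${value##*w=}    # window length in seconds, after 'w='
  echo "$pulls $((window / 3600))"
}

# Print the rate-limit headers for the current host (anonymous request).
# Defined but not called here; run it by hand on a runner.
check_docker_hub_limit() {
  token=$(curl -fsS "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
  curl -fsS --head -H "Authorization: Bearer $token" \
    "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" \
    | grep -i '^ratelimit-'
}
```

For example, parse_ratelimit '100;w=21600' prints "100 6", which matches the 100 pulls per 6 hours mentioned above. Real header lines end in a carriage return, so strip it (for instance with tr -d '\r') before parsing.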

Instead of relying on Docker Hub, consider Amazon’s public ECR registry, which hosts most popular public images and has higher rate limits for anonymous use.

[Code embed: GitHub Actions workflow example]
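As a rough illustration of what a workflow step could run, the sketch below rewrites a Docker Official Image name to its counterpart on ECR Public. It assumes the image is one of the Docker Official Images published under public.ecr.aws/docker/library; other images live under different ECR Public namespaces or may not be mirrored at all. The helper name ecr_public_ref is ours.

```shell
# Map a Docker Official Image reference (e.g. "node:18-alpine" or
# "library/node:18-alpine") to the same image on ECR Public.
ecr_public_ref() {
  image=${1#library/}     # drop an explicit "library/" prefix if present
  echo "public.ecr.aws/docker/library/$image"
}

# In a workflow `run:` step this could be used as:
#   docker pull "$(ecr_public_ref node:18-alpine)"
```

The same rewritten reference can also be used wherever a workflow names a job or service container image.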

For lower-scale use, this does the trick at zero cost. CI image pulls can also be split between a free Docker Hub account (200 pulls per 6 hours) and AWS public ECR.

AWS ECR pull through caching

For organizations already on AWS, pull through caching is a good solution to avoid Docker Hub rate limits without incurring additional expense.

AWS recently rolled out pull through caching for ECR: pulls of publicly accessible images go through your own ECR registry, which caches them, so future pulls are served from ECR instead of the upstream registry.

Great, right? You don’t need to depend on Docker Hub for public images and ECR will automatically pull any updates from the remote registry (Docker Hub, Quay etc.) once every 24 hours.

[Code embed: example command/step(s) using pull through cache]
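A minimal sketch of how such a pull might look, assuming a pull through cache rule already exists in your account. The account ID, region, and the ecr-public rule prefix are placeholders, and ptc_ref is a helper name of our own.

```shell
# Build the private ECR reference that fronts an upstream image through a
# pull through cache rule.
# Arguments: <aws account id> <region> <rule prefix> <upstream repo[:tag]>
ptc_ref() {
  echo "$1.dkr.ecr.$2.amazonaws.com/$3/$4"
}

# Typical use in a workflow step (placeholders, adjust to your account):
#   aws ecr get-login-password --region us-east-1 \
#     | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
#   docker pull "$(ptc_ref 123456789012 us-east-1 ecr-public docker/library/node:18)"
```

The first pull of a given image populates the cache; later pulls of the same reference are served from your registry.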

Note: Setting up pull through cache is out of scope for this article.

Authenticate to Docker Hub at workflow level

For Docker usage within self-hosted runners, you could create a free account and authenticate with Docker Hub in the workflow or use the login action.

[Code embed: Docker login step]
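A login step could run something like the sketch below. It assumes the workflow exposes the credentials as the environment variables DOCKERHUB_USERNAME and DOCKERHUB_TOKEN (for example from repository secrets); those variable names and the function name are ours.

```shell
# Log in to Docker Hub with credentials taken from the environment.
docker_hub_login() {
  if [ -z "${DOCKERHUB_USERNAME:-}" ] || [ -z "${DOCKERHUB_TOKEN:-}" ]; then
    echo "DOCKERHUB_USERNAME and DOCKERHUB_TOKEN must be set" >&2
    return 1
  fi
  # --password-stdin keeps the token out of the process list and shell history.
  printf '%s' "$DOCKERHUB_TOKEN" \
    | docker login --username "$DOCKERHUB_USERNAME" --password-stdin
}

# In a workflow `run:` step, call it before any docker pull:
#   docker_hub_login
```

An access token created in the Docker Hub account settings works in place of the account password and is easier to rotate.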

However, this only allows 200 pulls per 6 hours. For organizational use, we suggest the Pro plan: it costs $5/month at the time of writing and allows 5,000 pulls a day, which is cost effective and leaves room for scaling.

Authenticate to Docker Hub at runner level

The solutions discussed so far don’t cover every rate limiting scenario when running self-hosted runners at scale in GitHub Actions.

At VTS, we faced rate limits even after setting up authenticated usage because we were using Docker-based open source actions in many workflows.

Before we get to the solution, let’s understand the “why” first. We faced the following scenario in GitHub Actions:

The container pull for Docker-based custom actions happens in the first step of the build, before the Docker login step runs, so those pulls are still unauthenticated and rate limiting still applies.

[Code embed: edge case scenario that causes rate limits at scale]

Silver Bullet

At VTS, our CI runners are powered by AWS EC2 spot instances via a great Terraform module that does most of the scaling-related heavy lifting for us.

To solve all rate limiting cases we found our silver bullet in the form of EC2 post_install scripts that execute on runner startup.

[Code embed: EC2 user data script to log in to Docker Hub on startup]

We authenticate every self-hosted runner with Docker Hub on startup, using a script passed through the userdata_post_install parameter exposed by the underlying Terraform module. The script runs before the runner is registered with GitHub, so every pull the runner makes, including the early pulls for Docker-based actions, is authenticated.

We use Mozilla SOPS to safely pass Docker Hub login credentials but any other way of passing secrets can be used as well.
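A minimal sketch of such a startup script is shown below. It is not our exact script: it assumes the credentials have already been decrypted into DOCKERHUB_USERNAME and DOCKERHUB_TOKEN (with SOPS or any other secret store), that sudo is available on the instance, and that the runner process runs as the OS user passed in. The function name and the ec2-user account are assumptions.

```shell
# Log a given OS user in to Docker Hub so that later pulls made by that user
# (including pulls for Docker-based actions) are authenticated.
login_runner_user() {
  runner_user=${1:-}
  if [ -z "$runner_user" ] || [ -z "${DOCKERHUB_USERNAME:-}" ] || [ -z "${DOCKERHUB_TOKEN:-}" ]; then
    echo "usage: DOCKERHUB_USERNAME=... DOCKERHUB_TOKEN=... login_runner_user <user>" >&2
    return 1
  fi
  # User data runs as root. Running docker login as the runner user (-H sets
  # HOME) stores the credentials in that user's ~/.docker/config.json.
  printf '%s' "$DOCKERHUB_TOKEN" \
    | sudo -H -u "$runner_user" docker login --username "$DOCKERHUB_USERNAME" --password-stdin
}

# Example call from the post-install script (ec2-user is an assumed account):
#   login_runner_user ec2-user
```

Logging in as the runner's own user matters because the Docker CLI reads credentials from the invoking user's home directory; a login performed only as root would not help a runner that runs under a different account.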

Note: Those who don’t use the terraform-aws-github-runner module can run the same user data script on EC2 startup (before starting the GitHub self-hosted runner) to resolve Docker Hub rate limit issues.

We hope the above solutions help you solve your Docker Hub rate limit problems in GitHub Actions!

Shout out to

& for proofreading and giving suggestions to make this article better.

Dev works as a Senior SRE on the platform infrastructure team at VTS. He is passionate about building developer tooling, reliable systems and driving continuous improvements.

