Health-based traffic control with Kubernetes

Last time we covered how the liveness probe can be integrated with Spring Boot Actuator. Today, I’m going to show an example application for the readiness probe.

Readiness probe

The readiness probe is similar to the liveness probe, but it answers a different question: is the running application allowed to serve traffic? Think about an application that has just started, so the liveness probe reports that all is good, but before it can really respond to requests it has to process a huge file, fill its caches from the database, or contact an external service. In this case you don’t want the liveness probe to restart the application; you want Kubernetes to wait until it’s fully operational.

Another scenario is an app that does background processing on top of a normal HTTP API. If it gets overloaded with background work, it might not have enough resources left to reply to HTTP requests, at least not when response time is crucial. With the readiness probe you can signal this state, so no traffic is sent to the application while it lacks the necessary resources, and traffic resumes once they free up.

Configuring this type of probe is almost identical to configuring the other probes; the only difference is the name, readinessProbe.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
      - name: actuator-healthcheck-example
        image: actuator-healthcheck-example:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 1

I’m not going to go through all the probe settings; they work the same way as for the liveness probe.

Example – startup

I’m going to extend the example I’ve shown in the previous article so if you are out of context, make sure you check it here.

Back to writing code. We are going to simulate an application that has to load something at startup, which takes several seconds.

First of all, we need a holder for the state: whether the application is ready to serve traffic or not.

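import java.util.concurrent.atomic.AtomicBoolean;
import org.springframework.stereotype.Component;

@Component
public class ReadinessHolder {
    private final AtomicBoolean isReady = new AtomicBoolean(false);

    public boolean isReady() {
        return isReady.get();
    }
}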

Now the startup load simulation. I’m going to use the TaskExecutor interface from Spring to execute an asynchronous task that will set the isReady attribute to true after 20 seconds. The implementation looks like this:

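import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3+
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.task.TaskExecutor;
import org.springframework.stereotype.Component;

@Component
public class ReadinessHolder {
    private static final Logger logger = LoggerFactory.getLogger(ReadinessHolder.class);

    @Autowired
    private TaskExecutor taskExecutor;

    private final AtomicBoolean isReady = new AtomicBoolean(false);

    @PostConstruct
    public void init() {
        // Simulate the slow startup work on a background thread
        taskExecutor.execute(() -> {
            try {
                logger.info("Sleeping for 20 seconds..");
                Thread.sleep(TimeUnit.SECONDS.toMillis(20));
                isReady.set(true);
                logger.info("Application is ready to serve traffic");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public boolean isReady() {
        return isReady.get();
    }
}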

In the task, I log a simple message so we can verify the logs in Kubernetes, then put the thread to sleep for 20 seconds, and after that set the isReady state to true.
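In a real application the flag would flip when the startup work actually finishes rather than after a fixed sleep. Here is a minimal sketch of that idea with plain JDK classes and no Spring. The class name StartupReadinessSketch and its methods are hypothetical, and the blocking task in demo() only stands in for real loading work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: becomes ready only after the supplied startup work completes.
class StartupReadinessSketch {
    private final AtomicBoolean ready = new AtomicBoolean(false);
    private final CompletableFuture<Void> startup;

    StartupReadinessSketch(Runnable startupWork) {
        // Run the work off the calling thread; flip the flag only if it completes normally.
        this.startup = CompletableFuture.runAsync(startupWork)
                .thenRun(() -> ready.set(true));
    }

    boolean isReady() {
        return ready.get();
    }

    // Blocks until the startup work and the flag update are done (used by the demo only).
    void awaitStartup() {
        startup.join();
    }

    // Returns the readiness observed before and after the simulated work finishes.
    static boolean[] demo() {
        CountDownLatch workMayFinish = new CountDownLatch(1);
        StartupReadinessSketch sketch = new StartupReadinessSketch(() -> {
            try {
                workMayFinish.await(); // stands in for loading files or warming caches
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        boolean before = sketch.isReady();
        workMayFinish.countDown();
        sketch.awaitStartup();
        boolean after = sketch.isReady();
        return new boolean[] {before, after};
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println(result[0] + " " + result[1]); // prints "false true"
    }
}
```

If the startup work throws, thenRun never executes, so the flag stays false and the pod would never be marked ready. That is probably the behavior you want from a readiness signal.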

Next up, we need to expose this information on HTTP. I’m creating a new controller:

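import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ReadinessRestController {
    @Autowired
    private ReadinessHolder readinessHolder;

    // 200 lets the readiness probe pass, 503 makes it fail
    @GetMapping(value = "/ready", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> isReady() {
        if (readinessHolder.isReady()) {
            return new ResponseEntity<>("{\"status\":\"READY\"}", HttpStatus.OK);
        } else {
            return new ResponseEntity<>("{\"status\":\"NOT_READY\"}", HttpStatus.SERVICE_UNAVAILABLE);
        }
    }
}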

A single GET endpoint reads the value from the holder and, depending on it, responds with either HTTP 200

{"status":"READY"}

or HTTP 503

{"status":"NOT_READY"}

Now the deployment descriptor:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
      - name: actuator-healthcheck-example
        image: actuator-healthcheck-example:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
          failureThreshold: 2
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 1
---
apiVersion: v1
kind: Service
metadata:
  # a name is required; this particular one is an assumption
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  type: NodePort
  ports:
  - port: 8080
    targetPort: 8080
    nodePort: 31704
  selector:
    app: actuator-healthcheck-example

There are 2 changes I’ve made compared to the previous article. First, I’ve added the readinessProbe, mapped to the /ready endpoint we created. Second, I’ve changed the Service descriptor to NodePort so it’s easier to access for the test; you can use the original descriptor if you want, though. NodePort only means that the API can be accessed through the Kubernetes node directly. For minikube, you can use minikube ip to get the address, and then http://<ip>:31704 will be the root of the API.

Next up, let’s deploy the application. The usual exercise: build the jar, then the image, and apply the Kubernetes descriptor. Don’t forget to execute eval $(minikube docker-env) first if you are using minikube.

$ ./gradlew clean build

$ docker build . -t actuator-healthcheck-example

$ kubectl apply -f k8s-deployment.yaml

Observing the running pods:

$ kubectl get pods -w

The -w flag watches for changes. Inspecting the output:

$ kubectl get pods -w
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-92d7j   0/1     Running   0          3s
actuator-healthcheck-example-74bd59c574-92d7j   1/1     Running   0          29s

It’s clearly visible that the pod stayed not ready during the simulated 20-second load and only then changed its ready state, just like we implemented it (the 29s age also includes application startup and the wait for the next probe). While the pod is not ready, no request will be served. So if you execute, for example, the following command during startup, it fails:

$ curl <ip>:31704/actuator/health

curl: (7) Failed to connect to <ip> port 31704: Timed out

As soon as the readiness probe reports the pod as ready, executing the same command results in a proper response:

$ curl <ip>:31704/actuator/health

{"status":"UP"}

That’s it. The readiness probe is working properly and it doesn’t let traffic go to the pod until it’s reported ready.
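The gap between the 20-second sleep and the 29s in the watch output comes from probe timing. Here is a rough sketch of the arithmetic, assuming probes fire exactly at initialDelaySeconds and then every periodSeconds. A real kubelet adds scheduling delay and jitter, so treat the numbers as approximations. The class and method names are hypothetical, and the 27-second ready time is a made-up example (about 7 seconds of startup plus the 20-second sleep).

```java
// Hypothetical helper for reasoning about readiness probe timing.
class ProbeTimingSketch {

    // Seconds after container start at which the first probe that can succeed fires,
    // given the second at which the application itself becomes ready.
    static int firstSuccessfulProbeSeconds(int initialDelaySeconds, int periodSeconds,
                                           int appReadyAtSeconds) {
        if (appReadyAtSeconds <= initialDelaySeconds) {
            return initialDelaySeconds;
        }
        int remaining = appReadyAtSeconds - initialDelaySeconds;
        // Round up to the next probe tick.
        int probesAfterFirst = (remaining + periodSeconds - 1) / periodSeconds;
        return initialDelaySeconds + probesAfterFirst * periodSeconds;
    }

    // Upper bound, in seconds, on how long a pod that stopped being ready keeps
    // receiving traffic: failureThreshold consecutive probes must fail first.
    static int worstCaseUnreadyDetectionSeconds(int periodSeconds, int failureThreshold) {
        return periodSeconds * failureThreshold;
    }

    public static void main(String[] args) {
        // Settings from the descriptor: initialDelaySeconds 5, periodSeconds 5, failureThreshold 1.
        System.out.println(firstSuccessfulProbeSeconds(5, 5, 27)); // prints 30
        System.out.println(worstCaseUnreadyDetectionSeconds(5, 1)); // prints 5
    }
}
```

With failureThreshold set to 1, a single failed probe is enough to take the pod out of rotation. That reacts quickly, but it also means one slow response can do it.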

Example – background processing

The other use for the readiness probe is when the application is running low on resources, for example when background processing is part of the application and uses threads from a thread pool. If it gets overloaded, the remaining resources might not be sufficient to serve HTTP requests in an acceptable manner.

I hope you didn’t expect me to give you a full-blown background processing engine that starves the HTTP API of compute power. Instead, I’m just going to emulate the insufficient-resource state by setting a flag.
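For reference, a real signal could be derived from the thread pool itself. Here is a minimal sketch with plain JDK classes, assuming the background work runs on a ThreadPoolExecutor and that a backlog above some threshold means "too busy". The class name PoolReadinessSketch and the threshold of 2 are hypothetical and would need tuning for a real application.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: reports not ready while too many background tasks are waiting.
class PoolReadinessSketch {
    private final ThreadPoolExecutor pool;
    private final int maxQueuedTasks;

    PoolReadinessSketch(ThreadPoolExecutor pool, int maxQueuedTasks) {
        this.pool = pool;
        this.maxQueuedTasks = maxQueuedTasks;
    }

    // Ready while the backlog of waiting tasks stays at or below the threshold.
    boolean isReady() {
        return pool.getQueue().size() <= maxQueuedTasks;
    }

    // Returns readiness when idle, when backlogged, and after the backlog has drained.
    static boolean[] demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        PoolReadinessSketch readiness = new PoolReadinessSketch(pool, 2);
        boolean idle = readiness.isReady();

        CountDownLatch release = new CountDownLatch(1);
        Runnable blockedTask = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // One task occupies the only worker thread, the other four wait in the queue.
        for (int i = 0; i < 5; i++) {
            pool.execute(blockedTask);
        }
        boolean backlogged = readiness.isReady();

        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        boolean drained = readiness.isReady();
        return new boolean[] {idle, backlogged, drained};
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "true false true"
    }
}
```

A /ready endpoint could return the result of isReady() from such a class instead of a manually switched flag. Queue length is only one possible signal; active thread count or memory pressure are others.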

Compared to the previous example, we are making a single change for now: exposing an HTTP API to switch the ready flag manually.

The new holder class with the switchReady method:

@Component
public class ReadinessHolder {
    private static final Logger logger = LoggerFactory.getLogger(ReadinessHolder.class);

    @Autowired
    private TaskExecutor taskExecutor;

    private final AtomicBoolean isReady = new AtomicBoolean(false);

    @PostConstruct
    public void init() {
        // Simulate the slow startup work on a background thread
        taskExecutor.execute(() -> {
            try {
                logger.info("Sleeping for 20 seconds..");
                Thread.sleep(TimeUnit.SECONDS.toMillis(20));
                isReady.set(true);
                logger.info("Application is ready to serve traffic");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public boolean isReady() {
        return isReady.get();
    }

    // Flips the flag; reads and then writes, which is fine for this manual demo
    public void switchReady() {
        boolean newReadyValue = !isReady.get();
        logger.info("Switching the ready flag to {}", newReadyValue);
        isReady.set(newReadyValue);
    }
}
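One detail about switchReady: it reads the flag and then writes it in two separate steps, so two concurrent calls could both read the same value and flip it only once. For a manually triggered demo that hardly matters. If it did, the toggle could be made atomic, roughly like this JDK-only sketch (the class name AtomicToggleSketch is hypothetical):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical variant of the holder's toggle that flips the flag in one atomic step.
class AtomicToggleSketch {
    private final AtomicBoolean isReady = new AtomicBoolean(false);

    // Retries until the flag is swapped from the value that was actually observed.
    boolean switchReady() {
        boolean current;
        do {
            current = isReady.get();
        } while (!isReady.compareAndSet(current, !current));
        return !current; // the new value
    }

    boolean isReady() {
        return isReady.get();
    }

    public static void main(String[] args) {
        AtomicToggleSketch toggle = new AtomicToggleSketch();
        System.out.println(toggle.switchReady()); // prints "true"
        System.out.println(toggle.switchReady()); // prints "false"
    }
}
```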

And the controller with the new /readyswitch API:

@RestController
public class ReadinessRestController {
    private static final Logger logger = LoggerFactory.getLogger(ReadinessRestController.class);

    @Autowired
    private ReadinessHolder readinessHolder;

    @GetMapping(value = "/ready", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> isReady() {
        if (readinessHolder.isReady()) {
            return new ResponseEntity<>("{\"status\":\"READY\"}", HttpStatus.OK);
        } else {
            return new ResponseEntity<>("{\"status\":\"NOT_READY\"}", HttpStatus.SERVICE_UNAVAILABLE);
        }
    }

    // GET keeps the curl demo simple; a state-changing endpoint would normally be a POST
    @GetMapping("/readyswitch")
    public ResponseEntity<?> readySwitch() {
        readinessHolder.switchReady();
        return new ResponseEntity<>(HttpStatus.OK);
    }
}

Build and deploy the application again. After the readiness probe initially lets traffic reach the pod, we can simply switch the readiness flag so Kubernetes stops forwarding requests to it.

Verifying the API works after startup:

$ curl <ip>:31704/actuator/health

{"status":"UP"}

Switching the flag:

$ curl <ip>:31704/readyswitch

Verifying the API doesn’t respond anymore:

$ curl <ip>:31704/actuator/health

curl: (7) Failed to connect to <ip> port 31704: Timed out

And Kubernetes is showing the pod as not ready:

$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-bblsl   0/1     Running   0          30m

Of course, switching it back through the exposed port is not possible anymore, as Kubernetes has stopped sending HTTP traffic to the pod. We can still exec into the container, though, and switch the flag back:

$ kubectl exec -it actuator-healthcheck-example-74bd59c574-bblsl -- bash

bash-4.4# curl localhost:8080/readyswitch

Exiting the container and checking what Kubernetes thinks about the pod:

$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-bblsl   1/1     Running   0          35m

Now it’s back in operation and traffic is allowed.

The real benefit kicks in when you are running multiple instances of the application. To demonstrate this, let’s create a dummy endpoint that logs a single message.

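import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DummyRestController {
    private static final Logger logger = LoggerFactory.getLogger(DummyRestController.class);

    // The log line shows which pod served the request
    @GetMapping(value = "/dummy", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> dummy() {
        logger.info("Dummy call");
        return new ResponseEntity<>("{}", HttpStatus.OK);
    }
}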

Running the application in 2 instances needs only a small tweak (the replicas attribute):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 2
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
      - name: actuator-healthcheck-example
        image: actuator-healthcheck-example:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
          failureThreshold: 2
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 1
---
apiVersion: v1
kind: Service
metadata:
  # a name is required; this particular one is an assumption
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  type: NodePort
  ports:
  - port: 8080
    targetPort: 8080
    nodePort: 31704
  selector:
    app: actuator-healthcheck-example

Open 3 terminals: 2 for monitoring the logs of the 2 instances and 1 for executing the requests.

For watching the logs continuously:

$ kubectl logs <podid> -f --tail=10

Execute this command against the 2 pods you have. Then call the /dummy API from the 3rd terminal.

$ curl <ip>:31704/dummy

Now Kubernetes is balancing the requests between the 2 replicas as they are both ready. You can see it in the logs as well:

2020-04-11 12:45:19.826 INFO 1 --- [nio-8080-exec-4] c.a.b.healthcheck.DummyRestController : Dummy call

Sometimes the first pod is serving the request, sometimes the other.

And now the most exciting part: let’s switch one of the pods to not ready. The request below lands on whichever pod the Service picks, and that pod flips its flag.

$ curl <ip>:31704/readyswitch

From now on, calls to the dummy API are always served by the pod that is still ready, so all the log lines show up in a single terminal:

$ curl <ip>:31704/dummy

2020-04-11 12:48:55.554 INFO 1 --- [nio-8080-exec-9] c.a.b.healthcheck.DummyRestController : Dummy call

2020-04-11 12:48:56.230 INFO 1 --- [nio-8080-exec-1] c.a.b.healthcheck.DummyRestController : Dummy call

2020-04-11 12:48:56.743 INFO 1 --- [nio-8080-exec-3] c.a.b.healthcheck.DummyRestController : Dummy call

2020-04-11 12:48:57.182 INFO 1 --- [nio-8080-exec-5] c.a.b.healthcheck.DummyRestController : Dummy call

2020-04-11 12:48:57.520 INFO 1 --- [nio-8080-exec-7] c.a.b.healthcheck.DummyRestController : Dummy call

2020-04-11 12:48:57.905 INFO 1 --- [nio-8080-exec-9] c.a.b.healthcheck.DummyRestController : Dummy call

2020-04-11 12:48:58.298 INFO 1 --- [nio-8080-exec-1] c.a.b.healthcheck.DummyRestController : Dummy call

$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-fm7hx   0/1     Running   0          6m28s
actuator-healthcheck-example-74bd59c574-rxtbp   1/1     Running   0          6m28s

If you switch the non-ready pod back, it will start responding to requests again.

Conclusion

We’ve checked 2 scenarios where the readiness probe is useful. The whole purpose of health checks is to create more resilient applications, and I encourage you to invest some time into doing it properly. The investment will definitely pay off.

As usual, the code can be found on GitHub. If you liked the article, give it a thumbs up and share it. If you are interested in more, make sure you follow me on Twitter.
