
How to reduce your JVM app memory footprint in Docker and Kubernetes

source link: https://medium.com/wix-engineering/how-to-reduce-your-jvm-app-memory-footprint-in-docker-and-kubernetes-d6e030d21298


Photo by Franck V. on Unsplash

Recently, I managed to dramatically reduce the memory usage of a widely used JVM app container on Kubernetes and save a lot of money. I figured out which JVM flags matter most, how to set them correctly, and how to easily measure the impact of my changes on various parts of the app's memory usage. Here are my Cliff's Notes.

The story starts with a use case. I work at Wix, as part of the data-streams team, which is in charge of all our Kafka infrastructure. Recently I was tasked with creating a Kafka client proxy for our Node.js services.

Use Case: Kafka client sidecar with a wasteful memory usage

The idea was to delegate all Kafka-related actions (e.g. produce and consume) from a Node.js app to a separate JVM app. The motivation behind this is that a lot of our own infrastructure for Kafka clients is written in Scala (it's called Greyhound; the open source version can be found here).
With a sidecar, the Scala code doesn't need to be duplicated in other languages. Only a thin wrapper is needed.


Once we deployed the sidecar in production, we noticed that it consumes quite a lot of memory.


Metric used — container_memory_working_set_bytes

As you can see from the table above, the memory footprint of the sidecar alone (running OpenJDK 8) is 4–5 times bigger than that of the node-app container back when it still included the Kafka library.

I had to understand why and how to reduce it considerably.

Experimenting with production data

I set out to create a test-app that mimics the sidecar of this particular node app in order to be able to freely experiment on it without affecting production. The app contained all the consumers from the production app for the same production topics.

As a way to monitor memory consumption, I used metrics such as heapMemoryUsed and nonHeapMemoryUsed, exposed from MXBeans inside my application to Prometheus/Grafana, but you can also use jconsole or jvisualvm (both come bundled with JDK 8 and above).
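For reference, here is a minimal sketch (in Scala) of how these values can be read from the JVM's standard MemoryMXBean. The object and method names are mine for illustration, not the sidecar's actual code, and the Prometheus wiring is omitted:

    import java.lang.management.ManagementFactory

    object MemoryMetrics {
      private val memoryBean = ManagementFactory.getMemoryMXBean

      // Bytes currently used by the heap (young + old generations)
      def heapMemoryUsed: Long = memoryBean.getHeapMemoryUsage.getUsed

      // Bytes used outside the heap (Metaspace, code cache, etc.)
      def nonHeapMemoryUsed: Long = memoryBean.getNonHeapMemoryUsage.getUsed

      def main(args: Array[String]): Unit =
        // In the real app these values are scraped by Prometheus and graphed in Grafana;
        // here we simply print them.
        println(s"heapMemoryUsed=$heapMemoryUsed nonHeapMemoryUsed=$nonHeapMemoryUsed")
    }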

First, I tried to understand the impact of each consumer and producer, and of the gRPC client (that calls the node app), and I came to the conclusion that having one more consumer (or one less) does not affect the memory footprint in a meaningful way.

JVM Heap Flags

Then, I turned my attention to heap allocation.
There are two important JVM flags related to heap allocation: -Xms (initial heap size on startup) and -Xmx (maximum heap size).
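To make this concrete, here is a small sketch: the flags are passed on the java command line, and you can verify from inside the container that they took effect via the Runtime API. The flag values below are example numbers only, not my final settings:

    object HeapSettings {
      def main(args: Array[String]): Unit = {
        // Example launch command (illustrative values):
        //   java -Xms512m -Xmx512m -jar kafka-sidecar.jar
        val mb      = 1024 * 1024
        val runtime = Runtime.getRuntime
        // maxMemory roughly corresponds to -Xmx; totalMemory is the currently
        // committed heap, which starts around -Xms
        println(s"Max heap (~ -Xmx):     ${runtime.maxMemory / mb} MB")
        println(s"Committed heap now:    ${runtime.totalMemory / mb} MB")
        println(s"Free within committed: ${runtime.freeMemory / mb} MB")
      }
    }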

I’ve played around with many different combinations of the two and recorded the resulting container memory usage:


Container overall used memory with different heap flags

The first conclusion I drew from the heap flag variations was that if Xmx is higher than Xms and the app is under high memory pressure, the heap allocation will almost certainly keep growing up to the Xmx limit, causing the container's overall memory usage to grow as well (see the comparison in the charts below).


Xmx >> Xms

But if Xmx is the same as Xms, you can have much more control over the overall memory usage, as the heap will not gradually increase over time (see comparison below).


Xmx = Xms

The second conclusion I drew from the heap flags data was that you can lower Xmx dramatically as long as you don't see significant JVM pause durations due to Garbage Collection (GC), meaning more than 500ms for an extended period of time. I again used Grafana for monitoring GC, but you can also use visualgc or gceasy.io.


benign JVM pause times due to GC
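If you don't have Grafana dashboards handy, the same pause data can be pulled from the standard GarbageCollectorMXBeans. Here is a small sketch; the collector names you'll see depend on which GC the JVM runs (e.g. CMS vs G1):

    import java.lang.management.ManagementFactory
    import scala.jdk.CollectionConverters._

    object GcStats {
      def main(args: Array[String]): Unit =
        // One bean per collector, e.g. "ParNew"/"ConcurrentMarkSweep" with CMS,
        // or "G1 Young Generation"/"G1 Old Generation" with G1
        ManagementFactory.getGarbageCollectorMXBeans.asScala.foreach { gc =>
          println(s"${gc.getName}: collections=${gc.getCollectionCount}, totalPauseMs=${gc.getCollectionTime}")
        }
    }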

Please be careful with the number you set for Xmx: if your application has high variation in message-consuming throughput, it will be more susceptible to GC storms when it experiences a big burst of incoming messages.

Kafka-related tune-up

Our Greyhound (Kafka) consumer has an internal message buffer that can hold as many as 200 messages. When I reduced the maximum allowed size to 20, I noticed that heap memory usage oscillates in a much narrower band than with size=200 (and overall usage is also considerably lower):


Heap memory usage pattern when bufferMax=200


Heap memory usage pattern when bufferMax=20

Of course, reducing the buffer size means the app will not handle bursts as well, so this does not work for high-throughput applications. To mitigate this, I doubled the level of parallelism of the Greyhound consumer handlers per pod, i.e. I increased the number of threads that process Kafka messages from 3 to 6. In outlier cases either the app will require more pods, or the max buffer configuration will have to be altered.

Reducing the Kafka consumer's fetch.max.bytes from 50M to 5M (to reduce the total size of polled messages) did not have a noticeable effect on the memory footprint. Neither did extracting the Greyhound producer out of the sidecar app (it can reside in a DaemonSet so that it runs once per K8s node).
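For context, fetch.max.bytes is a standard Kafka consumer setting (Greyhound wraps the regular Kafka client). A sketch of the change on a plain KafkaConsumer, with placeholder broker and group values, looks like this:

    import java.util.Properties
    import org.apache.kafka.clients.consumer.{ConsumerConfig, KafkaConsumer}
    import org.apache.kafka.common.serialization.StringDeserializer

    object FetchSizeExample {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "sidecar-test-group")      // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, classOf[StringDeserializer].getName)
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, classOf[StringDeserializer].getName)
        // Default is ~50MB; this caps a single fetch response at 5MB. In my tests the
        // change did not noticeably reduce the container's memory footprint.
        props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, (5 * 1024 * 1024).toString)

        val consumer = new KafkaConsumer[String, String](props)
        // ... subscribe and poll as usual ...
        consumer.close()
      }
    }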

Summary — What helped with reducing memory usage

The optimizations I made reduced the container memory usage from 1000M to around 550–600M. Here are the changes that contributed to the lower footprint:

  • Maintain a consistent heap size allocation
    Make -Xms equal to -Xmx
  • Reduce the amount of discarded objects (garbage)
    E.g. buffer fewer Kafka messages
  • A little bit of GC goes a long way
    Keep lowering Xmx as long as GC (New Gen + Old Gen) doesn't take up a considerable percentage of CPU time (around 0.25%)

What didn’t help (substantially)

  • Reducing KafkaConsumer’s fetch.max.bytes
  • Removing Kafka producer
  • Switching from gRPC client to Wix’s custom json-RPC client

Future Work

  • Explore if GraalVM native image can help
  • Compare different GC implementations. (I’ve used CMS, but there’s G1)
  • Reduce the number of threads we use when consuming from Kafka by switching to our open-sourced ZIO-based version of Greyhound.
  • Reduce the memory allocated for each thread's stack (by default each thread is assigned 1MB), as sketched below
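On that last point, here is a hedged sketch of two possible routes. The values are illustrative, and the JVM is free to round or ignore the per-thread stack size hint:

    object ThreadStackSketch {
      // Route 1 (illustrative): lower the default stack size for all threads at launch:
      //   java -Xss512k -jar kafka-sidecar.jar

      // Route 2 (illustrative): request a smaller stack for specific threads via the
      // Thread constructor that accepts an explicit stackSize hint.
      def smallStackThread(name: String)(body: => Unit): Thread =
        new Thread(null, () => body, name, 256 * 1024)

      def main(args: Array[String]): Unit = {
        val t = smallStackThread("kafka-handler-1") {
          println("processing messages on a thread with a smaller stack")
        }
        t.start()
        t.join()
      }
    }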

More improvements (and a second blog post) are sure to come.

More information

Docker memory resource limits and a heap of Java — blog post

Memory Footprint of a Java Process (video from the GeekOUT conference)

