
Stripping dependency bloat in VictoriaMetrics Docker image


Let’s compare docker image sizes for popular time series database solutions:

The Docker image for VictoriaMetrics is the smallest: it occupies only 5MB. This is:

  • 6.8 times smaller than the TimescaleDB image
  • 8.6 times smaller than the Prometheus image
  • 10.2 times smaller than the InfluxDB image
  • 31.8 times smaller than the ClickHouse image

Let’s see how VictoriaMetrics achieves such a small image size compared to other TSDB solutions.

Step 1: creating a statically linked binary for the scratch image

VictoriaMetrics is written in Go. Go can build statically linked binaries without any external dependencies. Such binaries can run on the scratch image in Docker.

By default, with cgo enabled, Go doesn’t build a statically linked binary here:

$ go build ./app/victoria-metrics/
$ ldd ./victoria-metrics 
 linux-vdso.so.1 (0x00007ffcec9b8000)
 libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7369714000)
 libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7369323000)
 /lib64/ld-linux-x86-64.so.2 (0x00007f7369933000)
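
The same dependency check can be scripted in Go with the standard debug/elf package. The following is only a sketch, not part of the VictoriaMetrics build: the helper names importedLibs and describeLinkage are made up for illustration. It reads the DT_NEEDED entries of an ELF binary, so unlike ldd it does not list the vdso or the dynamic loader itself.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// importedLibs returns the shared libraries (DT_NEEDED entries) that an ELF
// binary asks the dynamic linker to load. An empty result means the binary
// has no shared library dependencies.
func importedLibs(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

// describeLinkage turns the library list into a short human-readable verdict.
func describeLinkage(libs []string) string {
	if len(libs) == 0 {
		return "no shared library dependencies"
	}
	return fmt.Sprintf("depends on %d shared libraries: %v", len(libs), libs)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: checkstatic <path-to-elf-binary>")
		os.Exit(2)
	}
	libs, err := importedLibs(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(describeLinkage(libs))
}
```

Run against the dynamically linked binary above, this should report libpthread.so.0 and libc.so.6 as dependencies.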

The built binary depends on two system libraries, libpthread and libc, which are missing in the scratch image. To build a statically linked binary, pass -ldflags "-extldflags '-static'" to go build:

$ go build -ldflags "-extldflags '-static'" ./app/victoria-metrics/
# github.com/valyala/VictoriaMetrics/app/victoria-metrics
/tmp/go-link-905380395/000004.o: In function `_cgo_7e1b3c2abc8d_C2func_getaddrinfo':
/tmp/go-build/cgo-gcc-prolog:57: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

WTF? Well, by default Go uses the system library for DNS resolution if the program uses C libraries via cgo. VictoriaMetrics uses gozstd, which depends on the upstream zstd C library. To force the Go-native DNS resolver in this case, pass the following option to go build: -tags netgo. Let’s try it:

$ go build -ldflags "-extldflags '-static'" -tags netgo ./app/victoria-metrics/
$ ldd ./victoria-metrics 
 not a dynamic executable
$ ./victoria-metrics --help
Usage of ./victoria-metrics:
  -httpListenAddr string
     TCP address to listen for http connections (default ":8428")
... skip ...
  -retentionPeriod int
     Retention period in months (default 1)
... skip ...
  -storageDataPath string
     Path to storage data (default "victoria-metrics-data")

Great! Now we have a working statically linked binary that can run in the scratch Docker image. Here is the complete Dockerfile for building the VictoriaMetrics image:

FROM scratch
COPY --from=local/certs:1.0.1 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY bin/victoria-metrics-prod .
EXPOSE 8428
ENTRYPOINT ["/victoria-metrics-prod"]

Here bin/victoria-metrics-prod is the statically linked binary built in the previous step.

The following line puts root certificates into the Docker image, so VictoriaMetrics can talk to the outside world over https:

COPY --from=local/certs:1.0.1 /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt

Now we have a small Docker image for VictoriaMetrics. But its size can be reduced further.

Step 2: removing unneeded Go dependencies

We use Go modules for building VictoriaMetrics. Initially it had a big go.mod file with a ton of external dependencies, like the go.mod from Prometheus. The majority of these dependencies were transitive and weren’t used by VictoriaMetrics directly. They had a negative impact on build times and the resulting binary size.

We started investigating how to remove unneeded dependencies and worked out the following approach:

  • Use small self-contained packages without big transitive dependencies.
  • Extract the required functionality from bloated packages or packages with big transitive dependencies.

Sometimes it was hard to extract the required functionality from a bloated package. In such cases we implemented the functionality from scratch. For example, we removed the github.com/prometheus/client_golang dependency for exposing metrics in Prometheus format, since it was bloated. We created a small self-contained package, lib/metrics, with the required functionality. We plan to open source this package later, so Go developers can choose between the comprehensive but bloated github.com/prometheus/client_golang and the slim package from VictoriaMetrics for exposing metrics in Prometheus format :)

Now the go.mod file for VictoriaMetrics contains only essential small third-party packages:

module github.com/valyala/VictoriaMetrics
require (
        github.com/VictoriaMetrics/fastcache v1.4.6
        github.com/cespare/xxhash v1.1.0
        github.com/golang/snappy v0.0.1
        github.com/valyala/fastjson v1.4.1
        github.com/valyala/fastrand v0.0.0-20170531153657-19dd0f0bf014
        github.com/valyala/gozstd v1.3.0
        github.com/valyala/quicktemplate v1.0.2
        golang.org/x/sys v0.0.0-20190318195719-6c81ef8f67ca
)

This reduced VictoriaMetrics build times from 5 seconds to 1.5 seconds. The size of the resulting statically linked binary was reduced from 23MB to 11MB. The binary is compressed to 5MB when put into the scratch Docker image.

Conclusions

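It is easy to create small Docker images using the following rules:

  • Build statically linked binaries and put them into the scratch image.
  • Remove unneeded dependencies from the code.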

And try single-node VictoriaMetrics. It can substitute for a moderately sized cluster built with competing solutions such as Thanos, Uber M3, Cortex, InfluxDB or TimescaleDB.