
腾讯云开源 Kvass 项目,可轻松让 Prometheus 支持横向自动扩缩容

source link: https://github.com/tkestack/kvass


Chinese version

Kvass is a Prometheus horizontal auto-scaling solution. It uses a Sidecar to generate, for every Prometheus shard, a special config file that contains only the targets assigned to that shard by the Coordinator.

The Coordinator does service discovery, manages the Prometheus shards, and assigns targets to each shard. Thanos (or another storage solution) is used for a global data view.

Table of Contents

Overview

Kvass is a Prometheus horizontal auto-scaling solution with the following features.

  • Easy to use
  • Tens of millions of series supported (thousands of Kubernetes nodes)
  • A single Prometheus configuration file
  • Auto scaling
  • Sharding according to the actual target load instead of a label hash
  • Multiple replicas supported

Architecture

(architecture diagram)

Components

Coordinator

See the Coordinator flags in the code.

  • The Coordinator loads the origin config file and does all Prometheus service discovery.
  • For every active target, the Coordinator applies all "relabel_configs" and explores the target's series scale.
  • The Coordinator periodically tries to assign explored targets to Sidecars according to the Head Block Series of each Prometheus shard.
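The assignment step above can be sketched as capacity-based packing: each target carries its explored series count, and no shard may exceed the configured max series. This is a simplified first-fit illustration, not Kvass's actual algorithm; the target names and numbers are made up (they mirror the demo below).

```go
package main

import "fmt"

// target is a hypothetical, simplified view of a scrape target:
// Kvass shards by each target's actual series count, not by a label hash.
type target struct {
	Name   string
	Series int
}

// assignByLoad packs targets into shards so that no shard exceeds
// maxSeries (first-fit; a sketch of load-based sharding only).
func assignByLoad(targets []target, maxSeries int) [][]target {
	var shards [][]target
	var loads []int
	for _, t := range targets {
		placed := false
		for i := range loads {
			if loads[i]+t.Series <= maxSeries {
				shards[i] = append(shards[i], t)
				loads[i] += t.Series
				placed = true
				break
			}
		}
		if !placed {
			// No existing shard has room: open a new one.
			shards = append(shards, []target{t})
			loads = append(loads, t.Series)
		}
	}
	return shards
}

func main() {
	targets := []target{
		{"app-0", 10045}, {"app-1", 10045}, {"app-2", 10045},
		{"app-3", 10045}, {"app-4", 10045}, {"app-5", 10045},
	}
	shards := assignByLoad(targets, 30000)
	fmt.Println(len(shards)) // 3: two ~10k-series targets fit per 30k shard
}
```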

(Coordinator workflow diagram)

Sidecar

See the Sidecar flags in the code.

  • The Sidecar receives targets from the Coordinator. The labels of each target after the relabel process are also sent to the Sidecar.

  • The Sidecar generates a new Prometheus config file that uses only "static_configs" service discovery, with all "relabel_configs" removed.

  • All Prometheus scrape requests are proxied through the Sidecar so it can gather target series statistics.
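To make the Sidecar's rewrite concrete, here is a hedged before/after illustration (the job name, target address, and labels are made up; the exact generated file is an internal detail of Kvass):

```yaml
# Origin config the Coordinator loads: a discovery-based job with relabeling.
scrape_configs:
  - job_name: node
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - source_labels: [__meta_kubernetes_node_name]
        target_label: node
---
# Config the Sidecar generates for its shard: only the targets assigned to
# this shard, as static_configs, with relabeling already applied upstream.
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ["10.0.0.12:9100"]
        labels:
          node: worker-1
```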

    (Sidecar workflow diagram)

Kvass + Thanos

Since the data of Prometheus is now distributed across shards, we need a way to get a global data view.

Thanos is a good choice. All we need to do is add a Kvass sidecar beside the Thanos sidecar and set up a Kvass coordinator.
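A rough sketch of what a shard Pod's container list might look like in this setup (container names, images, and flags here are assumptions for illustration, not the project's actual manifests):

```yaml
containers:
  - name: prometheus
    image: prom/prometheus
    args: ["--config.file=/etc/prometheus/config_out/config.yaml"]
  - name: kvass-sidecar        # generates the per-shard config, proxies scrapes
    image: tkestack/kvass      # image name is an assumption
    args: ["sidecar"]
  - name: thanos-sidecar       # exposes this shard's data to Thanos Query
    image: thanosio/thanos
    args: ["sidecar", "--prometheus.url=http://localhost:9090"]
```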

(Kvass + Thanos architecture diagram)

Kvass + Remote storage

If you want to use remote storage such as InfluxDB, just set "remote_write" in the origin Prometheus config.
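For example, a standard remote_write section in the origin config is enough; the endpoint URL below is only an illustration (InfluxDB 1.x exposes a Prometheus remote-write endpoint of this shape):

```yaml
remote_write:
  - url: http://influxdb:8086/api/v1/prom/write?db=prometheus
```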

Multiple replicas

The Coordinator uses a label selector to select the shard StatefulSets. Every StatefulSet is a replica; Kvass puts Pods with the same index from different StatefulSets together into one shard group.

--shard.selector=app.kubernetes.io/name=prometheus
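The grouping rule can be sketched as follows: StatefulSet pods end with an ordinal index, and pods sharing that ordinal form one shard group. This is an illustration of the rule only (pod names are hypothetical), not Kvass's implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// shardGroups groups pod names by their StatefulSet ordinal (the suffix
// after the last "-"), sketching how replica StatefulSets line up:
// pods with the same ordinal belong to the same shard group.
func shardGroups(pods []string) map[string][]string {
	groups := map[string][]string{}
	for _, p := range pods {
		i := strings.LastIndex(p, "-")
		ordinal := p[i+1:]
		groups[ordinal] = append(groups[ordinal], p)
	}
	for _, g := range groups {
		sort.Strings(g) // deterministic order inside each group
	}
	return groups
}

func main() {
	pods := []string{"prometheus-a-0", "prometheus-a-1", "prometheus-b-0", "prometheus-b-1"}
	fmt.Println(shardGroups(pods)["0"]) // [prometheus-a-0 prometheus-b-0]
}
```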

Demo

Here is an example that shows how Kvass works.

git clone https://github.com/tkestack/kvass

cd kvass/example

kubectl create -f ./examples

You will find a Deployment named "metrics" with 6 Pods; each Pod generates 10045 series (45 of them from Go's default metrics).

We will scrape metrics from them.


The max series each Prometheus shard can scrape is a flag of the Coordinator Pod.

In this example we set it to 30000.

--shard.max-series=30000

Now we have 6 targets with 60000+ series in total, and each shard can scrape 30000 series, so 3 shards are needed to cover all targets.

The Coordinator automatically changes the replicas of the Prometheus StatefulSet to 3 and assigns targets to them.
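The replica count follows directly from the demo's numbers; a one-line ceiling division reproduces it (the function name here is ours, not Kvass's):

```go
package main

import "fmt"

// neededShards computes how many shards are required when the total series
// count must be split across shards capped at maxSeries each.
func neededShards(totalSeries, maxSeries int) int {
	return (totalSeries + maxSeries - 1) / maxSeries // ceiling division
}

func main() {
	total := 6 * 10045 // 60270 series across the 6 demo targets
	fmt.Println(neededShards(total, 30000)) // 3
}
```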


Only 20000+ series appear in the prometheus_tsdb_head of each shard.


But we can get a global data view using thanos-query.


Flag values suggestion

The memory usage of every Prometheus is associated with its max head series.

The recommended max series is 750000; set the Coordinator flag:

--shard.max-series=750000

The memory request of a Prometheus with 750000 max series is 8G.

License

Apache License 2.0, see LICENSE .

