
Kafka Stress Testing

 3 years ago
source link: http://mp.weixin.qq.com/s?__biz=MzU2NDc4MjE2Ng%3D%3D&%3Bmid=2247484883&%3Bidx=1&%3Bsn=27e7fa401784086c78dd9d7d883e41a9

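I. Test Objective

This performance test stress-tests how well Kafka on a single production server handles MQ messages. It covers both writing MQ messages to Kafka and consuming them. Based on the results at 100,000, 1,000,000, and 10,000,000 messages, it assesses whether Kafka's performance meets the project's needs. (The project expects Kafka to handle MQ messages on the order of hundreds of millions.)

II. Test Scope and Method

2.1 Scope Overview

The test uses the scripts that ship with Kafka to issue write and consume requests from the command line. It simulates writing and consuming MQ messages at different orders of magnitude, and the results are used to assess whether Kafka can handle 100 million or more messages.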

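2.2 Performance Test Scenarios

2.2.1 Kafka message write stress test

| Scenario   | MQ messages | Target write rate (messages/s) | Record size (bytes) |
|------------|-------------|--------------------------------|---------------------|
| Write test | 100,000     | 2,000                          | 1,000               |
| Write test | 1,000,000   | 5,000                          | 1,000               |
| Write test | 10,000,000  | 5,000                          | 1,000               |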

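2.2.2 Kafka message consume stress test

| Scenario                   | MQ messages consumed |
|----------------------------|----------------------|
| Kafka message consume test | 100,000              |
| Kafka message consume test | 1,000,000            |
| Kafka message consume test | 10,000,000           |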

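2.3 Test Method Summary

2.3.1 Objective

Verify the message write and consume capacity of Kafka on a single server, and use the results to assess whether the current Kafka cluster setup can handle message volumes in the hundreds of millions.

2.3.2 Method

On the server, use the test scripts that ship with Kafka to simulate write requests of 100,000, 1,000,000, and 10,000,000 messages, and observe how Kafka performs at each scale: messages written per second, throughput, and message latency. The topic created by the write test is named test_perf. Consume requests are then issued against that topic from the command line to observe how Kafka performs when consuming messages at each scale.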

Stress test commands:

Write MQ messages

100,000 messages:
./kafka-producer-perf-test.sh --topic test_perf --num-records 100000 --record-size 1000 --throughput 2000 --producer-props bootstrap.servers=10.150.30.60:9092

1,000,000 messages:
./kafka-producer-perf-test.sh --topic test_perf --num-records 1000000 --record-size 1000 --throughput 5000 --producer-props bootstrap.servers=10.150.30.60:9092

10,000,000 messages:
./kafka-producer-perf-test.sh --topic test_perf --num-records 10000000 --record-size 1000 --throughput 5000 --producer-props bootstrap.servers=10.150.30.60:9092

Consume MQ messages

100,000 messages:
./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 100000 --threads 1

1,000,000 messages:
./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 1000000 --threads 1

10,000,000 messages:
./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 10000000 --threads 1

Run the scripts from the bin directory of the Kafka installation on the server.
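The three write scenarios differ only in message count and target rate, so the commands can be generated rather than typed by hand. The following is a minimal sketch, not part of the original test: `producer_cmd` is a hypothetical helper that only prints the command (a dry run), and the `localhost:9092` default broker address and the fixed 1000-byte record size are assumptions taken from the scenario table and the runs shown later.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) the producer perf-test
# command for a given message count and target rate.
producer_cmd() {
  num_records=$1
  throughput=$2
  broker=${3:-localhost:9092}   # assumed default broker address
  printf './kafka-producer-perf-test.sh --topic test_perf --num-records %s --record-size 1000 --throughput %s --producer-props bootstrap.servers=%s\n' \
    "$num_records" "$throughput" "$broker"
}

# The three write scenarios as "message count, target rate" pairs.
for scenario in "100000 2000" "1000000 5000" "10000000 5000"; do
  set -- $scenario              # split the pair into $1 and $2
  producer_cmd "$1" "$2"
done
```

Printing the commands first makes it easy to review them before running them from Kafka's bin directory.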

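III. Test Environment

3.1 Test machine configuration

| Host                          | Count | Resources                                                                                  | Operating system            |
|-------------------------------|-------|--------------------------------------------------------------------------------------------|-----------------------------|
| MQ message service/processing | 1     | Hardware: 1 core, 4 GB memory, 40 GB disk. Software: standalone Kafka (kafka_2.12-2.1.0) | ubuntu-16.04.5-server-amd64 |

3.2 Test tools

Kafka stress test tool: the stress test scripts bundled with Kafka.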

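3.3 Setting up the test environment

Only a standalone Kafka instance is used here. For a quick setup, it uses the ZooKeeper bundled with Kafka.

Create a directory

mkdir /opt/kafka_server_test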

dockerfile

FROM ubuntu:16.04
# Switch the apt sources to the Aliyun mirror
ADD sources.list /etc/apt/sources.list
# ADD unpacks the local tarball into /kafka_2.12-2.1.0
ADD kafka_2.12-2.1.0.tgz /
# Install the JDK
RUN apt-get update && apt-get install -y openjdk-8-jdk --allow-unauthenticated && apt-get clean all
EXPOSE 9092
# Add the startup script
ADD run.sh .
RUN chmod 755 run.sh
ENTRYPOINT ["/run.sh"]

run.sh

#!/bin/bash
# Start the bundled ZooKeeper in the background
cd /kafka_2.12-2.1.0
bin/zookeeper-server-start.sh config/zookeeper.properties &
# Give ZooKeeper a few seconds, then start Kafka in the foreground
sleep 3
bin/kafka-server-start.sh config/server.properties

sources.list

deb http://mirrors.aliyun.com/ubuntu/ xenial main restricted
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted
deb http://mirrors.aliyun.com/ubuntu/ xenial universe
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates universe
deb http://mirrors.aliyun.com/ubuntu/ xenial multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu xenial-security main restricted
deb http://mirrors.aliyun.com/ubuntu xenial-security universe
deb http://mirrors.aliyun.com/ubuntu xenial-security multiverse

The directory layout is as follows:

./
├── dockerfile
├── kafka_2.12-2.1.0.tgz
├── run.sh
└── sources.list

Build the image

docker build -t kafka_server_test /opt/kafka_server_test

Start Kafka

docker run -d -it kafka_server_test

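IV. Test Results

4.1 About the results

This test stress-tests Kafka's message-handling capacity on one server of the Kafka cluster. The write test focuses on whether message write latency meets requirements; the consume test verifies Kafka's message-processing capacity.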

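4.2.1 Writing MQ messages

| Test item         | Total messages | Message size (bytes) | Target send rate (messages/s) | Messages written per second | Latency for 95% of messages (ms) |
|-------------------|----------------|----------------------|-------------------------------|-----------------------------|----------------------------------|
| Write MQ messages | 100,000        | 1000                 | 2000                          | 1999.84                     | 1                                |
| Write MQ messages | 1,000,000      | 1000                 | 5000                          | 4999.84                     | 1                                |
| Write MQ messages | 10,000,000     | 1000                 | 5000                          | 4999.99                     | 1                                |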

Stress test results

The Kafka container was started above; check that it is running:

root@ubuntu:/opt# docker ps
CONTAINER ID   IMAGE               COMMAND     CREATED          STATUS          PORTS                                            NAMES
5ced2eb77349   kafka_server_test   "/run.sh"   34 minutes ago   Up 34 minutes   0.0.0.0:2181->2181/tcp, 0.0.0.0:9092->9092/tcp   youthful_bhaskara

Enter Kafka's bin directory:

root@ubuntu:/opt# docker exec -it 5ced2eb77349 /bin/bash
root@5ced2eb77349:/# cd /kafka_2.12-2.1.0/
root@5ced2eb77349:/kafka_2.12-2.1.0# cd bin/

1. Results for writing 100,000 messages

Command:

./kafka-producer-perf-test.sh --topic test_perf --num-records 100000 --record-size 1000 --throughput 2000 --producer-props bootstrap.servers=localhost:9092

Output:

records sent, 1202.4 records/sec (1.15 MB/sec), 1678.8 ms avg latency, 2080.0 max latency.
records sent, 2771.8 records/sec (2.64 MB/sec), 1300.4 ms avg latency, 2344.0 max latency.
records sent, 2061.6 records/sec (1.97 MB/sec), 17.1 ms avg latency, 188.0 max latency.
records sent, 1976.6 records/sec (1.89 MB/sec), 10.0 ms avg latency, 177.0 max latency.
records sent, 2025.2 records/sec (1.93 MB/sec), 15.4 ms avg latency, 253.0 max latency.
records sent, 2000.8 records/sec (1.91 MB/sec), 6.1 ms avg latency, 163.0 max latency.
records sent, 1929.7 records/sec (1.84 MB/sec), 3.7 ms avg latency, 128.0 max latency.
records sent, 2072.0 records/sec (1.98 MB/sec), 14.1 ms avg latency, 163.0 max latency.
records sent, 2001.6 records/sec (1.91 MB/sec), 4.5 ms avg latency, 116.0 max latency.
records sent, 1997.602877 records/sec (1.91 MB/sec), 290.41 ms avg latency, 2344.00 ms max latency, 2 ms 50th, 1992 ms 95th, 2177 ms 99th, 2292 ms 99.9th.

2. Results for writing 1,000,000 messages

Command:

./kafka-producer-perf-test.sh --topic test_perf --num-records 1000000 --record-size 1000 --throughput 5000 --producer-props bootstrap.servers=localhost:9092

Output:

records sent, 2158.5 records/sec (2.06 MB/sec), 2134.9 ms avg latency, 2869.0 max latency.
records sent, 7868.4 records/sec (7.50 MB/sec), 1459.2 ms avg latency, 2815.0 max latency.
records sent, 4991.0 records/sec (4.76 MB/sec), 20.3 ms avg latency, 197.0 max latency.
records sent, 4972.3 records/sec (4.74 MB/sec), 61.8 ms avg latency, 395.0 max latency.
records sent, 4880.2 records/sec (4.65 MB/sec), 64.7 ms avg latency, 398.0 max latency.
records sent, 5085.9 records/sec (4.85 MB/sec), 17.7 ms avg latency, 180.0 max latency.
records sent, 5030.8 records/sec (4.80 MB/sec), 14.7 ms avg latency, 157.0 max latency.
records sent, 5056.0 records/sec (4.82 MB/sec), 1.4 ms avg latency, 58.0 max latency.
records sent, 5001.0 records/sec (4.77 MB/sec), 0.8 ms avg latency, 58.0 max latency.
records sent, 5002.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 25.0 max latency.
records sent, 5000.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 14.0 max latency.
records sent, 5002.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 19.0 max latency.
records sent, 5005.0 records/sec (4.77 MB/sec), 1.2 ms avg latency, 57.0 max latency.
records sent, 5003.0 records/sec (4.77 MB/sec), 1.3 ms avg latency, 55.0 max latency.
records sent, 5000.0 records/sec (4.77 MB/sec), 0.9 ms avg latency, 44.0 max latency.
records sent, 5003.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 49.0 max latency.
records sent, 4988.0 records/sec (4.76 MB/sec), 1.1 ms avg latency, 49.0 max latency.
records sent, 5014.0 records/sec (4.78 MB/sec), 0.8 ms avg latency, 44.0 max latency.
records sent, 5001.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 10.0 max latency.
records sent, 5009.8 records/sec (4.78 MB/sec), 0.5 ms avg latency, 25.0 max latency.
records sent, 5001.2 records/sec (4.77 MB/sec), 0.5 ms avg latency, 7.0 max latency.
records sent, 5002.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 49.0 max latency.
records sent, 5005.0 records/sec (4.77 MB/sec), 0.6 ms avg latency, 25.0 max latency.
records sent, 5006.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 5005.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 19.0 max latency.
records sent, 4976.1 records/sec (4.75 MB/sec), 0.6 ms avg latency, 14.0 max latency.
records sent, 5036.0 records/sec (4.80 MB/sec), 0.6 ms avg latency, 18.0 max latency.
records sent, 4999.8 records/sec (4.77 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 4980.2 records/sec (4.75 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 5026.0 records/sec (4.79 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 5003.0 records/sec (4.77 MB/sec), 0.4 ms avg latency, 10.0 max latency.
records sent, 5000.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 16.0 max latency.
records sent, 5007.0 records/sec (4.78 MB/sec), 0.5 ms avg latency, 42.0 max latency.
records sent, 5001.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 24.0 max latency.
records sent, 5002.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 14.0 max latency.
records sent, 5009.0 records/sec (4.78 MB/sec), 0.5 ms avg latency, 10.0 max latency.
records sent, 5006.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 18.0 max latency.
records sent, 5001.0 records/sec (4.77 MB/sec), 0.4 ms avg latency, 6.0 max latency.
records sent, 5000.0 records/sec (4.77 MB/sec), 128.2 ms avg latency, 955.0 max latency.
records sent, 4999.375078 records/sec (4.77 MB/sec), 88.83 ms avg latency, 2869.00 ms max latency, 1 ms 50th, 327 ms 95th, 2593 ms 99th, 2838 ms 99.9th.

3. Results for writing 10,000,000 messages

Command:

./kafka-producer-perf-test.sh --topic test_perf --num-records 10000000 --record-size 1000 --throughput 5000 --producer-props bootstrap.servers=localhost:9092

Output:

records sent, 1053.0 records/sec (1.00 MB/sec), 1952.7 ms avg latency, 3057.0 max latency.
records sent, 4173.8 records/sec (3.98 MB/sec), 4585.7 ms avg latency, 5256.0 max latency.
records sent, 9765.2 records/sec (9.31 MB/sec), 2621.9 ms avg latency, 4799.0 max latency.
...
records sent, 5000.8 records/sec (4.77 MB/sec), 0.6 ms avg latency, 79.0 max latency.
records sent, 4999.2 records/sec (4.77 MB/sec), 0.5 ms avg latency, 54.0 max latency.
records sent, 5003.0 records/sec (4.77 MB/sec), 0.5 ms avg latency, 19.0 max latency.
records sent, 4996.445029 records/sec (4.76 MB/sec), 310.11 ms avg latency, 22474.00 ms max latency, 1 ms 50th, 1237 ms 95th, 7188 ms 99th, 20824 ms 99.9th.
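The last line of each producer run carries the latency percentiles, which are easier to compare across runs once extracted. Here is a small sketch of one way to do that with sed; `latency_percentiles` is a hypothetical helper name, and the pattern assumes the summary line has the shape shown in the outputs above.

```shell
#!/bin/sh
# Hypothetical helper: extract the 50th/95th/99th percentile latencies (ms)
# from a kafka-producer-perf-test.sh summary line passed as $1.
latency_percentiles() {
  printf '%s\n' "$1" |
    sed -n 's/.*, \([0-9]*\) ms 50th, \([0-9]*\) ms 95th, \([0-9]*\) ms 99th.*/p50=\1 p95=\2 p99=\3/p'
}

# Summary line from the 1,000,000-message run.
summary='records sent, 4999.375078 records/sec (4.77 MB/sec), 88.83 ms avg latency, 2869.00 ms max latency, 1 ms 50th, 327 ms 95th, 2593 ms 99th, 2838 ms 99.9th.'
latency_percentiles "$summary"
# prints: p50=1 p95=327 p99=2593
```

Lines that do not match the summary shape (the periodic progress lines) produce no output, so the helper can be fed a whole log line by line.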

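Parameters of the kafka-producer-perf-test.sh script (using the 1,000,000-message write as the example):

--topic: the topic name, test_perf here
--num-records: total number of messages to send, 1000000 here
--record-size: size of each record in bytes, 1000 here
--throughput: number of records to send per second, 5000 here
--producer-props bootstrap.servers=localhost:9092: producer configuration. In this test one of the cluster servers acts as the sender. The broker address can be found in Kafka's config directory (/usr/local/kafka/config in this project) by checking the listeners setting in server.properties; the default port is 9092.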

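Interpreting the write test results:

Taking the 1,000,000-message write as an example: on average 4.77 MB of data was written to Kafka per second, about 4999.375 messages per second; the average latency per write was 88.83 ms and the maximum latency was 2869 ms.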
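The MB/sec figure follows directly from the message rate and the record size. As a quick check, the sketch below recomputes it with awk; `mb_per_sec` is a hypothetical helper, and it assumes 1 MB = 1024 × 1024 bytes, which is the convention the reported figures are consistent with.

```shell
#!/bin/sh
# Hypothetical helper: convert records/sec ($1) and record size in bytes ($2)
# into MB/sec, assuming 1 MB = 1024 * 1024 bytes.
mb_per_sec() {
  awk -v rps="$1" -v size="$2" 'BEGIN { printf "%.2f\n", rps * size / (1024 * 1024) }'
}

mb_per_sec 4999.375078 1000   # 1,000,000-message run, prints 4.77
mb_per_sec 1997.602877 1000   # 100,000-message run, prints 1.91
```

Both values match the MB/sec reported in the corresponding summary lines.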

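4.2.2 Consuming MQ messages

| Test item           | Messages consumed | Data consumed (MB) | Data consumed per second (MB) | Messages consumed per second | Elapsed time (s) |
|---------------------|-------------------|--------------------|-------------------------------|------------------------------|------------------|
| Consume MQ messages | 100,000           | 95.36              | 137                           | 143899.3                     | 0.695            |
| Consume MQ messages | 1,000,000         | 953.66             | 177.19                        | 185804.5                     | 5.38             |
| Consume MQ messages | 10,000,000        | 9536.73            | 198.25                        | 207878.6                     | 48.11            |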

Stress test results

1. Results for consuming 100,000 messages

./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 100000 --threads 1

Note: this script has no --zookeeper option; the reference link is wrong on this point.

The command above can only be run after the 100,000-message write has completed. Otherwise it fails with the following error:

[2018-12-06 05:47:52,832] WARN [Consumer clientId=consumer-1, groupId=perf-consumer-19548] Error while fetching metadata with correlation id 18 : {test_perf=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
WARNING: Exiting before consuming the expected number of messages: timeout (10000 ms) exceeded. You can use the --timeout option to increase the timeout.

Normal output:

start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2018-12-06 05:50:41:276, 2018-12-06 05:50:45:281, 95.3674, 23.8121, 100000, 24968.7890, 78, 3927, 24.2851, 25464.7313

2. Results for consuming 1,000,000 messages

./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 1000000 --threads 1

Output:

start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2018-12-06 05:59:32:360, 2018-12-06 05:59:51:624, 954.0758, 49.5264, 1000421, 51932.1532, 41, 19223, 49.6320, 52042.9173

3. Results for consuming 10,000,000 messages

./kafka-consumer-perf-test.sh --broker-list localhost:9092 --topic test_perf --fetch-size 1048576 --messages 10000000 --threads 1

Output:

start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms, fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
2018-12-06 06:35:54:143, 2018-12-06 06:38:05:585, 9536.9539, 72.5564, 10000221, 76080.8646, 39, 131403, 72.5779, 76103.4451

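Parameters of the kafka-consumer-perf-test.sh script:

--broker-list: the Kafka connection information, localhost:9092 here
--topic: the topic name, test_perf here, that is, the messages written in 4.2.1
--fetch-size: the amount of data to fetch per request, 1048576 here (1 MB)
--messages: total number of messages to consume, 1000000 (one million) here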

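Taking the 1,000,000-message consume run as an example: 954.07 MB of data was consumed in total at 49.52 MB per second, and 1000421 messages were consumed in total at 51932.15 messages per second.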
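The per-second figures in the consumer output can be cross-checked from the start and end timestamps and the message count. A minimal sketch follows; `consumer_rate` is a hypothetical helper that takes only the time-of-day portion of each timestamp (HH:MM:SS:mmm) and assumes the run does not cross midnight.

```shell
#!/bin/sh
# Hypothetical helper: given start time ($1) and end time ($2) in
# HH:MM:SS:mmm form plus a message count ($3), print the elapsed seconds
# and the resulting messages per second.
consumer_rate() {
  awk -v s="$1" -v e="$2" -v n="$3" 'BEGIN {
    split(s, a, ":"); split(e, b, ":")
    t0 = a[1] * 3600 + a[2] * 60 + a[3] + a[4] / 1000
    t1 = b[1] * 3600 + b[2] * 60 + b[3] + b[4] / 1000
    printf "%.3f s, %.1f msg/s\n", t1 - t0, n / (t1 - t0)
  }'
}

consumer_rate 05:59:32:360 05:59:51:624 1000421    # prints: 19.264 s, 51932.2 msg/s
consumer_rate 06:35:54:143 06:38:05:585 10000221   # prints: 131.442 s, 76080.9 msg/s
```

Both rates agree with the nMsg.sec column of the corresponding output lines.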

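V. Analysis of Results

With the write rate set to 5,000 messages per second, message latency is typically 1 ms or less, which is within the acceptable range and indicates that messages are written promptly.

When Kafka consumes MQ messages, a throughput above 200,000 messages per second with 10,000,000 messages pending is considered an ideal result.

From Kafka's performance at the 100,000, 1,000,000, and 10,000,000 message levels, one can assess whether the Kafka cluster is capable of handling messages on the order of hundreds of millions.

This test ran on a single server, so network bandwidth is largely not a factor. The single-server results are therefore a useful reference for assessing whether the cluster will meet real-world needs after going live.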
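One way to relate the measured rates to the hundred-million-message target is a simple linear extrapolation. The sketch below is only a rough estimate under the assumption that the rates from the 10,000,000-message runs hold steady at larger volumes; `seconds_for` is a hypothetical helper, and the write rate used here is capped by the --throughput 5000 setting rather than by what the broker could sustain.

```shell
#!/bin/sh
# Hypothetical helper: time needed to process $1 messages at $2 messages/sec,
# assuming the rate stays constant.
seconds_for() {
  awk -v n="$1" -v rate="$2" 'BEGIN { t = n / rate; printf "%.0f s (%.2f h)\n", t, t / 3600 }'
}

seconds_for 100000000 4996.445029   # write rate of the 10,000,000 run, prints: 20014 s (5.56 h)
seconds_for 100000000 76080.8646    # consume rate of the 10,000,000 run, prints: 1314 s (0.37 h)
```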

