Kafka's ZooKeeper removal work has landed in trunk

 3 years ago
source link: https://blog.gslin.org/archives/2021/04/01/10100/kafka-%e6%8b%94%e6%8e%89-zookeeper-%e7%9a%84%e9%80%b2%e5%b1%95%e5%b7%b2%e7%b6%93%e9%80%b2%e5%88%b0-trunk/

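About a year ago I mentioned Kafka's plan to remove ZooKeeper (in the post 「Kafka 拔掉 ZooKeeper 的計畫」, "Kafka's plan to remove ZooKeeper"), and yesterday Confluent published a progress update: "Apache Kafka Made Simple: A First Glimpse of a Kafka Without ZooKeeper".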

The design is described in "KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum", and the work is tracked on Jira under KAFKA-9119.
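To make the "self-managed metadata quorum" idea a bit more concrete, here is a minimal Python sketch of the majority arithmetic that a Raft-style quorum relies on. The `id@host:port` voter string follows the format of Kafka's `controller.quorum.voters` setting; the function names and the `ctrl1`..`ctrl3` host names are hypothetical and only for illustration.

```python
def parse_voters(voters):
    """Parse 'id@host:port,id@host:port,...' into {id: (host, port)}."""
    result = {}
    for entry in voters.split(","):
        node_id, _, endpoint = entry.strip().partition("@")
        host, _, port = endpoint.rpartition(":")
        result[int(node_id)] = (host, int(port))
    return result


def majority(n_voters):
    """Smallest number of voters that forms a strict majority."""
    return n_voters // 2 + 1


def tolerated_failures(n_voters):
    """How many voters can be down while a majority is still reachable."""
    return n_voters - majority(n_voters)


# Hypothetical three-controller quorum (host names are made up).
voters = parse_voters("1@ctrl1:9093,2@ctrl2:9093,3@ctrl3:9093")
print(len(voters), majority(len(voters)), tolerated_failures(len(voters)))
```

With three voters, two must agree and one may be down; a single-voter quorum tolerates no failures at all, which is worth keeping in mind for single-process setups.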

The good news is that the KIP-500 code has now been merged into trunk, with the goal of making it usable when 2.8 is released:

So we’re very pleased to say that the early access of the KIP-500 code has been committed to trunk and is expected to be included in the upcoming 2.8 release.

The documentation refers to this mode as "Kafka Raft Metadata mode", or "KRaft" for short:

For the first time, you can run Kafka without ZooKeeper. We call this the Kafka Raft Metadata mode, typically shortened to KRaft (pronounced like craft) mode.

Some features are still missing, and there is the usual reminder that this is an early-access release and should not be used in production:

Beware, there are some features that are not available in this early-access release. We do not yet support the use of ACLs and other security features or transactions. Also, both partition reassignment and JBOD are unsupported in KRaft mode (these are anticipated to be available in an Apache Kafka release later in the year). Hence, consider the quorum controller experimental software—we don’t advise subjecting it to production workloads. If you do try out the software, however, you’ll find a host of new advantages: It’s simpler to deploy and operate, you can run Kafka in its entirety as a single process, and it can accommodate significantly more partitions per cluster (see measurements below).
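As a rough illustration of the "single process" claim in the quote, the sketch below renders a properties file in which one node acts as both broker and controller. The property names are taken from the sample KRaft configuration shipped with Kafka and should be checked against the release in use; the helper name `kraft_combined_properties`, the port numbers and the log directory are assumptions made for this example.

```python
def kraft_combined_properties(node_id=1, host="localhost",
                              broker_port=9092, controller_port=9093,
                              log_dir="/tmp/kraft-combined-logs"):
    """Render a minimal single-node KRaft config as properties text.

    Hypothetical helper: the defaults are illustrative, not recommendations.
    """
    props = {
        # One process plays both roles, so no separate ZooKeeper is needed.
        "process.roles": "broker,controller",
        "node.id": str(node_id),
        # The only voter in the metadata quorum is this node itself.
        "controller.quorum.voters": f"{node_id}@{host}:{controller_port}",
        "listeners": f"PLAINTEXT://:{broker_port},CONTROLLER://:{controller_port}",
        "inter.broker.listener.name": "PLAINTEXT",
        "controller.listener.names": "CONTROLLER",
        "log.dirs": log_dir,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


print(kraft_combined_properties(), end="")
```

Given the limitations listed in the quote, a configuration like this is only suitable for trying the feature out, not for production workloads.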

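The test numbers released so far show a large improvement, but on closer inspection they turn out to measure shutdown and recovery time (the chart itself is in the source article).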

It is unclear how much actual performance is affected (for better or worse), so we will have to wait and see...


Author: Gea-Suan Lin
Posted on: April 1, 2021
Categories: Computer, Murmuring, Software
Tags: apache, confluent, engine, kafka, kafka-9119, kip-500, kraft, metadata, mode, raft, stream, streaming, trunk, zookeeper
