

Meta (Facebook) moves MySQL replication onto its in-house Raft system
source link: https://blog.gslin.org/archives/2023/05/19/11191/meta-facebook-%e6%8a%8a-mysql-replication-%e4%b8%9f%e4%b8%8a%e8%87%aa%e8%a3%bd%e7%9a%84-raft-%e7%b3%bb%e7%b5%b1/

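I came across "Building and deploying MySQL Raft at Meta", which describes how Meta (Facebook) replaced MySQL's replication architecture with a system built on its own Raft implementation.
The old system relied on MySQL's semisync replication: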
Previously, our replication solution used the MySQL semisynchronous (semisync) replication protocol.
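Semisync replication was added in MySQL 5.5. The source reports success only after at least one remote replica has received the replication log (the required number is configurable); see "Semisynchronous Replication":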
Semisynchronous replication falls between asynchronous and fully synchronous replication. The source waits until at least one replica has received and logged the events (the required number of replicas is configurable), and then commits the transaction.
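The acknowledgement rule quoted above can be sketched as a toy model. This is only an illustration, not MySQL's implementation: the `Replica` class, `semisync_commit`, and the `required_acks` parameter are hypothetical names standing in for the configurable replica count.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica that logs events to a relay log and acks them."""
    name: str
    reachable: bool = True
    relay_log: list = field(default_factory=list)

    def receive(self, event: str) -> bool:
        # An ack means "received and logged", not "applied to the engine".
        if not self.reachable:
            return False
        self.relay_log.append(event)
        return True


def semisync_commit(event: str, replicas: list, required_acks: int = 1) -> bool:
    """Send `event` to every replica; succeed if enough of them acked it."""
    acks = sum(1 for replica in replicas if replica.receive(event))
    # Real MySQL waits with a timeout and then falls back to asynchronous
    # replication; this sketch simply reports failure instead.
    return acks >= required_acks


replicas = [Replica("a", reachable=False), Replica("b"), Replica("c")]
print(semisync_commit("txn-1", replicas, required_acks=1))
```

Note that an ack only guarantees the event reached a replica's log; deciding which replica to promote after a failure is still left to external tooling.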
The old system also used a set of Python software to manage the various failover operations across these machines:
The control plane operations (e.g., promotions, failover, and membership change) would be the responsibility of a set of Python daemons (henceforth called automation).
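A common problem with this approach is that switching the primary server (formerly called the master server) can fail because the binlog positions cannot be lined up.
MySQL later introduced GTID, which mitigates this problem, but different secondary servers (formerly called slave servers) can still end up with different data.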
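A rough way to see why GTIDs help but do not fully solve the problem: if each server tracks the set of transaction IDs it has executed, a set difference shows what a replica is missing and what it has that the new primary never saw. The sketch below is a simplification under stated assumptions: GTIDs are modelled as plain `(server_uuid, transaction_id)` tuples, and `gtid_diff` is a hypothetical helper, not a MySQL function.

```python
def gtid_diff(new_primary: set, replica: set):
    """Compare executed-GTID sets of a promoted primary and one replica."""
    # Transactions the replica can fetch from the new primary to catch up.
    to_send = new_primary - replica
    # Transactions only the replica has: its data has diverged.
    errant = replica - new_primary
    return to_send, errant


# Hypothetical state after the old primary "A" failed mid-replication.
new_primary = {("A", 1), ("A", 2), ("A", 3)}
replica = {("A", 1), ("A", 2), ("A", 4)}

to_send, errant = gtid_diff(new_primary, replica)
print(sorted(to_send))  # transactions the replica still needs
print(sorted(errant))   # transactions that exist only on the replica
```

With file-and-offset binlog positions there is no such global identity to compare, which is why lining replicas up after a failover is harder. GTIDs make the divergence visible, but they do not prevent it.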
In the architecture Meta built, replication data is written directly into a Raft-synchronized system, which replicates it to the other secondary servers:
In MySQL Raft:
- Primary writes to binlog via Raft, and Raft sends binlog to followers/replicas.
- Replicas/followers receive in binlog and apply the transactions to the engine. An apply log is created during apply.
- Binlog is the replicated log from the Raft point of view.
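The flow in the list above can be sketched as a toy model: the primary appends a transaction to the binlog, the entry counts as committed once a majority of members have it, and members then apply committed entries to the engine. This assumes the textbook Raft majority rule; the quoted text does not spell out Meta's quorum details. The `Member` class and `replicate` function are hypothetical, and terms, elections, and follower catch-up are omitted.

```python
class Member:
    """Hypothetical cluster member holding a binlog and a storage engine."""

    def __init__(self, name: str, up: bool = True):
        self.name = name
        self.up = up
        self.binlog = []  # the replicated log from Raft's point of view
        self.engine = []  # transactions actually applied

    def append(self, txn: str) -> bool:
        if not self.up:
            return False
        self.binlog.append(txn)
        return True

    def apply_committed(self, commit_index: int) -> None:
        # Apply binlog entries, in order, up to the commit index.
        while len(self.engine) < min(commit_index, len(self.binlog)):
            self.engine.append(self.binlog[len(self.engine)])


def replicate(leader: Member, followers: list, txn: str) -> bool:
    """Append `txn` everywhere; commit and apply it if a majority has it."""
    leader.append(txn)
    acks = 1 + sum(1 for f in followers if f.append(txn))
    cluster_size = 1 + len(followers)
    committed = acks > cluster_size // 2
    if committed:
        commit_index = len(leader.binlog)
        for member in [leader, *followers]:
            if member.up:
                member.apply_committed(commit_index)
    return committed


leader = Member("primary")
followers = [Member("replica-1"), Member("replica-2", up=False)]
print(replicate(leader, followers, "txn-1"))
```

The point of the design is that the same log that carries the data also decides what is committed, so promotion no longer depends on external tooling reconciling binlog positions.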
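This is an architecture most organizations will rarely run into, and people at other companies facing a similar problem probably won't solve it this way...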
Author: Gea-Suan Lin. Posted on May 19, 2023. Categories: Computer, Database, Murmuring, MySQL, Network, Software. Tags: database, db, facebook, meta, mysql, raft, rdbms, replication, semisync.