Streaming MySQL Binlogs to S3 (or Any Object Storage)
source link: https://www.percona.com/blog/streaming-mysql-binlogs-to-s3-or-any-object-storage/
Problem Statement
Having backups of binary logs is fairly normal these days. The sooner binary logs are copied offsite, the better the RPO (Recovery Point Objective) that can be achieved. I was asked multiple times recently if something could be done to “stream” the binary logs to S3 as close to real-time as possible. Unfortunately, there is no readily available solution that does this. Here, I show what can be done, along with a proof of concept implementation, which is not suitable for production use.
In this example, the instance has two binary log files (mysql-bin.000001 and mysql-bin.000002) that are already closed, and mysql-bin.000003 that is being written. A trivial solution for backing up these binary log files would be to back up just the closed ones (the ones that are no longer being written). The default size of a binary log file is 1 GB. This means that with this solution, in the worst case, we would have 1 GB of binlog data not backed up; on average, 500 MB would not be backed up. These numbers can be improved by lowering the max_binlog_size parameter, but that leads to a slew of files and frequent rotation.
Uploading to Object Storage
It is possible to upload files to S3 in chunks using multipart uploads. With a multipart upload, the file is uploaded in parts, and every part except the last must be at least 5 MB. This means that a binlog can be read by another process while it is being written, and uploaded to S3 in 5 MB parts. That is definitely better than the 500 MB average exposure of copying only closed files, or setting max_binlog_size to 5M. Another thing our backup solution can do is stream the binary logs to a remote location before uploading them to S3.
The chunks are assembled together again on S3 when the multipart upload is finished.
The files produced by the mysqlbinlog command can be read while they are being written, and whenever 5 MB of new data is available, it can be uploaded to S3 as a part. The last part of a multipart upload is allowed to be smaller than 5 MB, which guarantees that the whole file can be uploaded.
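As a rough sketch of this multipart flow in Python (the language of the PoC linked below), the following uploads a stream in 5 MB parts; the injected `s3` client (a boto3-style client such as `boto3.client("s3")`), bucket, and key are assumptions for illustration, not part of the PoC itself:

```python
MIN_PART = 5 * 1024 * 1024  # S3 minimum size for all parts except the last


def multipart_upload(s3, bucket, key, stream, part_size=MIN_PART):
    """Upload a readable binary stream to S3 in parts of part_size bytes.

    Only the final part may be smaller than 5 MB, which is exactly what
    the S3 multipart API allows.
    """
    upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
    parts = []
    part_number = 1
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            break
        resp = s3.upload_part(
            Bucket=bucket,
            Key=key,
            UploadId=upload["UploadId"],
            PartNumber=part_number,
            Body=chunk,
        )
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1
    # S3 reassembles the parts into a single object on completion.
    s3.complete_multipart_upload(
        Bucket=bucket,
        Key=key,
        UploadId=upload["UploadId"],
        MultipartUpload={"Parts": parts},
    )
    return parts
```

Note that the part list with ETags must be kept client-side and passed to `complete_multipart_upload`; that call is what triggers reassembly on S3.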
Proof of Concept Implementation
The flow of the proof of concept implementation is the following.
- Start a mysqlbinlog process in the background to stream the binary logs to a temporary directory.
- Read the files in the temporary directory in chunks. A read() call may return less than 5 MB worth of data, so the reads are accumulated in a buffer. Once the buffer reaches the minimum part size for a multipart upload (5 MB), it is uploaded. It can therefore happen that 4.5 MB has already accumulated from several small reads, and the next read() call returns a full 5 MB; in that case, the part will be 9.5 MB. This is totally fine, since parts can vary in size, and because the goal is to upload the data as soon as possible, it is better to do it in one request. In this proof of concept implementation, part sizes will therefore be between 5 MB and 10 MB.
- Once the end of a closed file is reached, the final part is uploaded regardless of its size (the final part of a multipart upload can be less than 5 MB), the file is closed, and reading continues with the next file. After a file is fully and successfully uploaded to S3, it is deleted from the local temporary directory. So, the local temporary directory only holds files that are either being uploaded or have not started uploading yet.
- If the reader is on the last, still open file, it simply waits for more data; whenever the buffer fills, it continues uploading parts.
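The buffering described in the steps above can be sketched as a small generator; `read_parts` is a hypothetical helper written for this illustration (not taken from the PoC repo), and the mysqlbinlog streaming itself (step one) would typically be started separately with something like `mysqlbinlog --raw --read-from-remote-server --stop-never`:

```python
MIN_PART = 5 * 1024 * 1024  # minimum S3 multipart part size


def read_parts(stream, min_part=MIN_PART):
    """Yield upload-ready parts from a stream that may return short reads.

    Short read() results are accumulated in a buffer; once the buffer
    reaches min_part, it is yielded as one part. Because a single read can
    add up to min_part bytes on top of an almost-full buffer, parts range
    between min_part and 2 * min_part. The final part may be smaller.
    """
    buf = bytearray()
    while True:
        chunk = stream.read(min_part)
        if not chunk:
            break
        buf.extend(chunk)
        if len(buf) >= min_part:
            yield bytes(buf)
            buf.clear()
    if buf:  # the last part of a multipart upload may be < 5 MB
        yield bytes(buf)
```

In a real implementation, the `if not chunk` branch would distinguish “file closed, move to the next one” from “still the active binlog, wait for more data”, as the steps above describe.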
Example
In this example, I have a server with two binlogs:
The max_binlog_size is 100M for the sake of convenience.
The temporary directory has the binary logs:
In this case, streaming the binary logs from the beginning is much faster than uploading them to S3 (because I am streaming from a virtual machine locally, and I am uploading to S3 on a home internet connection).
Soon enough the binlog will be uploaded:
Part 17 is bigger because less than 5 MB was sitting in the buffer from before the reader had to wait for new binary log data, and once new data became available, the next read could add up to an additional 5 MB on top of that.
The uploaded parts are accessible with the S3 API (and they can be assembled back into binlogs):
The S3 bucket can have a lifecycle policy that periodically aborts unfinished multipart uploads (for example, those older than 7 days).
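Such a policy can be expressed with the S3 lifecycle API's AbortIncompleteMultipartUpload rule; the sketch below applies it via a boto3-style client passed in by the caller, and the bucket name and rule ID are illustrative assumptions:

```python
# Lifecycle rule that aborts multipart uploads left unfinished for 7 days.
LIFECYCLE = {
    "Rules": [
        {
            "ID": "abort-stale-multipart-uploads",  # illustrative rule ID
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to the whole bucket
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}


def apply_lifecycle(s3, bucket):
    """Apply the rule with a boto3-style client, e.g. boto3.client("s3")."""
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=LIFECYCLE
    )
```

Aborting a stale multipart upload frees the storage used by its already-uploaded parts, which would otherwise keep accruing charges invisibly.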
The proof of concept code is available at https://github.com/pboros/binlog2s3.