5

Repoint Replica Servers in MySQL/Percona Server for MySQL 8.0

 2 years ago
source link: https://www.percona.com/blog/repoint-replica-servers-in-mysql-percona-server-for-mysql-8-0/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Repoint Replica Servers in MySQL/Percona Server for MySQL 8.0

When doing migrations or failovers in MySQL, there is usually a need to do a topology change and repoint replica servers to obtain replication data from a different server.

For example, given servers {A, B, and C} and the following topology:

MySQL Topology

If you need to repoint C to be a replica of B, i.e:

repoint mysql

You can follow the next steps:

Note: log_replica_updates should be enabled on the soon-to-be primary as it is a prerequisite for chain replication.

Note: It is assumed that both replicas only stream from Server A and there are no conflicting replication filters in place that might break replication later on.

If Using File/Position-Based Replication:

Shell
1) Stop B and C
STOP REPLICA;
2) If replicas are multi-threaded, correct MTS gaps and make them single-threaded until all changes are applied. To do so, execute the following commands on BOTH nodes:
START REPLICA UNTIL SQL_AFTER_MTS_GAPS;
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"
STOP REPLICA;
SELECT @@global.replica_parallel_workers; -- take note to restore later
SET GLOBAL replica_parallel_workers=0; -- disable MTS during the operations
3) Then check which is the node that is more up to date by looking at Relay_Source_Log_File and Exec_Source_Log_Pos. Run on BOTH nodes:
SHOW REPLICA STATUS\G
# Take note of Relay_Source_Log_File/Exec_Source_Log_Pos from the most up to date node.
4) Sync replicas with UNTIL. Run on the most delayed node with above outputs:
START REPLICA UNTIL SOURCE_LOG_FILE='<Relay_Source_Log_File>', SOURCE_LOG_POS=<Exec_Source_Log_Pos>;
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"
5) If followed above steps, at this point both replicas should have the exact same data set and should be in sync at the same point in time.
# Double check that both replicas are stopped and with the same coords as doing topology changes while replication is ongoing and with diffs coords can cause inconsistencies:
SHOW REPLICA STATUS\G
# Replica_IO_Running must be “NO” in both replicas
# Replica_SQL_Running must be “NO” in both replicas
# Relay_Source_Log_File must match in both replicas
# Exec_Source_Log_Pos must match in both replicas
6) Get current coordinates from B (new intermediate primary). Execute on B:
SHOW MASTER STATUS \G
# Take note of File and Position
7) Repoint C to B. Execute on C with coords from previous step:
CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-B>', SOURCE_LOG_FILE='<File>', SOURCE_LOG_POS='<Position>';
8) If you had disabled MTS, you should re-enable here for both B and C;
SET GLOBAL replica_parallel_workers=X; -- see output of step 2 for correct value
9) Restart replication normally. Run on both nodes:
START REPLICA;

If Using GTID-Based Replication:

Shell
1) Stop B and C:
STOP REPLICA;
2) If replicas are multi-threaded, correct MTS gaps and make them single-threaded until all changes are applied. Run on BOTH nodes:
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"
STOP REPLICA;
SELECT @@global.replica_parallel_workers; -- take note to restore later
SET GLOBAL replica_parallel_workers=0; -- disable MTS during the operations
3) Then check which is the node that is more up to date by looking at sequence numbers in Executed_Gtid_Set. Run on BOTH nodes:
SHOW REPLICA STATUS\G
# Take note of Executed_Gtid_Set with the largest sequence number. If there is a mismatch in the gtid sets it means there were either local writes or writes coming from some other server. In that case you should check data consistency between the servers, for example with pt-table-checksum . Then you need to fix gtid differences by either restoring the replica from scratch or fix errant transactions as explained on this other blogpost
4) Bring up all nodes to the same point in time. Run on node with smallest GTID sequence number;
START REPLICA UNTIL SQL_AFTER_GTIDS='<Executed_Gtid_Set>';
SHOW REPLICA STATUS\G -- repeat this until you see "Replica_SQL_Running: No"
5) If followed above steps, at this point both replicas should have the exact same data set and should be in sync at the same point in time.
# Double check that both replicas are stopped and with the same coords as doing topology changes while replication is ongoing and with diffs coords can cause inconsistencies:
SHOW REPLICA STATUS\G
# Replica_IO_Running must be “NO” in both replicas
# Replica_SQL_Running must be “NO” in both replicas
# Executed_Gtid_Set must match in both replicas
6) Now both replicas have identical data, so you can re-point C to replicate from B. Run on C:
CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-B>'
7) If you had disabled MTS, you should re-enable here for both B and C;
SET GLOBAL replica_parallel_workers=X; -- see output of step 2 for correct value
8) Restart replication normally. Run on both nodes
START REPLICA;

Doing the opposite replication change from chain replication (A->B->C) into one primary with two replicas should be simpler:

If Using File/Position-Based Replication:

Shell
1) Stop replication on B and make sure B is not receiving any write activity:
STOP REPLICA;
2) Check current binary log position on B:
SHOW MASTER STATUS \G
3) On C check replication until C does catch up with B. On C:
SHOW REPLICA STATUS \G
# For C to have catch up with B, the following conditions should be met:
# “File” from B on step 2) should match Relay_Source_Log_File from 3)
# “Position” from B on step2) should match Exec_Source_Log_Pos from 3)
# After catchup, both servers will be in sync with the same data set.
4) Check current replication coords from B:
SHOW REPLICA STATUS \G
# Write down Relay_Source_Log_File and Exec_Source_Log_Pos from B, as we will be using this coords on C
5) Re point C to replicate from A. File and positions used should be the ones taken from B on last step:
CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-A>', SOURCE_LOG_FILE='<File>', SOURCE_LOG_POS='<Position>'
6) Restart replication normally. Run on both nodes:
START REPLICA;

If Using GTID-Based Replication:

Shell
1) Stop replication on B and make sure B is not receiving any write activity:
STOP REPLICA;
2) Check current binary log position on B:
SHOW MASTER STATUS \G
3) On C check replication until C does catch up with B. On C:
SHOW REPLICA STATUS \G
# For C to have catch up with B, the following conditions should be met:
# Executed_Gtid_Set from B step 2) should match Executed_Gtid_Set from 3)
# After catchup, both servers will be in sync with the same data set.
4) Re point C to replicate from A:
CHANGE REPLICATION SOURCE TO SOURCE_HOST='<ip-address-of-A>'
5) Restart replication normally. Run on both nodes
START REPLICA;

Conclusion:

Doing topology changes might seem hard at first, but with the above procedure, it should be easy and error-free! If you do not want to do the manual approach, then you can consider using tools like Orchestrator which allows for automatic failover and promotions.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure, open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Download Percona Distribution for MySQL Today

STAY UP-TO-DATE With Percona!

p

Join 50,000+ of your fellow open-source enthusiasts! Our newsletter provides updates on Percona open source software releases, technical resources, and valuable MySQL, MariaDB, PostgreSQL, and MongoDB-related articles. Get information about Percona Live, our technical webinars, and upcoming events and meetups where you can talk with our experts.

Enter your work email address:*

By submitting my information I agree that Percona may use my personal data in send communication to me about Percona services. I understand that I can unsubscribe from the communication at any time in accordance with the Percona Privacy Policy.

Author

Carlos Tutte

Computer engineer from Montevideo, Uruguay, joined Percona on February 2018, first as a support engineer, then moving to the consulting team. Working in complex IT solutions for more than 10 years, Carlos now specializes in MySQL and related technologies


Leave a Reply Cancel reply


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK