Using Jobs to Perform Schema Changes Against MySQL Databases on K8s
source link: https://www.percona.com/blog/using-jobs-to-perform-schema-changes-against-mysql-databases-on-k8s/
October 10, 2023
Performing schema change operations is always challenging when dealing with K8s.
On-premises, or on a DBaaS such as RDS or Cloud SQL, it is relatively straightforward to apply a change: you can perform a DIRECT ALTER, use a tool such as pt-osc, or even, in certain cases where async replication is in use, perform the changes on the replicas and then fail over.
In this blog post, I’ll provide some thoughts on how schema changes can be performed when running MySQL on Kubernetes.
I won’t focus on DIRECT ALTERs, as applying them is pretty straightforward: you can just connect to MySQL through the service (or directly to a pod) and perform the ALTER.
But how can we apply changes in more complex scenarios where we may want to benefit from pt-osc, gain better control over the operation, or take advantage of the K8s features?
One convenient way that I’ve found working well is the use of an external POD running the percona-toolkit. If you need other tools or more flexibility, you can, of course, use your own image.
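For an ad-hoc run, such a toolkit pod can also be launched interactively. A minimal sketch (the pod name is illustrative; this assumes the same toolkit image used later in this post):

```
# Start a throwaway pod with the percona-toolkit image and open a shell in it;
# --rm removes the pod automatically when the session ends.
kubectl run pt-toolkit --rm -it \
  --image=perconalab/percona-toolkit:latest \
  --restart=Never -- bash
```

From that shell you can run pt-online-schema-change (or any other tool bundled in the image) manually against the cluster service. The Job approach below is preferable for repeatable, auditable changes, though.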
Assuming that there is already a Secrets object holding all system user passwords, the concept is really simple: we are going to create a Job.
The use of Jobs provides a convenient way for both Devs and DBAs to apply changes, taking advantage of the following benefits:
- Full track of the event lifecycle, including execution logs
- Auditing
- ConfigMap options, if any, can be reused
- You don’t have to explicitly pass credentials or endpoints to the execution, as these are all available through the Secrets objects and environment variables
- Execution can be scheduled in a predefined maintenance window (you can easily convert a Job to a CronJob)
- A task can be easily and consistently executed multiple times across multiple environments
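The scheduling idea can be sketched by wrapping the same pod template in a CronJob. The schedule below is a hypothetical maintenance window, and the container spec is elided for brevity:

```
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pt-osc-atsaloux-request-123456
spec:
  schedule: "0 3 * * 6"        # hypothetical window: Saturdays at 03:00
  concurrencyPolicy: Forbid    # never run two schema changes at once
  jobTemplate:
    spec:
      backoffLimit: 1
      template:
        spec:
          # containers: same pt-online-schema-change container spec
          # as in the Job shown in this post
          restartPolicy: Never
```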
Let’s proceed to the proof of concept now.
I deployed a three-node Percona XtraDB Cluster (PXC) using the Percona Operator for MySQL. HAProxy acted as a load balancer in front of the cluster.
```
ubuntu@ip-172-31-25-177:~$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS      AGE
minimal-cluster-haproxy-0                          2/2     Running   0             27m
minimal-cluster-haproxy-1                          2/2     Running   0             26m
minimal-cluster-haproxy-2                          2/2     Running   0             26m
minimal-cluster-pxc-0                              3/3     Running   0             27m
minimal-cluster-pxc-1                              3/3     Running   0             26m
minimal-cluster-pxc-2                              3/3     Running   0             26m
percona-xtradb-cluster-operator-6f4f56f5d5-pvx92   1/1     Running   2 (18h ago)   3d20h
```
These are my Secrets:
```
ubuntu@ip-172-31-25-177:~$ kubectl get secret
NAME                       TYPE     DATA   AGE
internal-minimal-cluster   Opaque   7      3d20h
minimal-cluster-secrets    Opaque   7      3d20h
```
The following simple table was created in the PXC cluster:
```
ubuntu@ip-172-31-25-177:~$ kubectl run -it --rm percona-client --image=percona:8.0 --restart=Never bash
If you don't see a command prompt, try pressing enter.
[mysql@percona-client /]$ mysql -h minimal-cluster-pxc -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 256
Server version: 8.0.32-24.2 Percona XtraDB Cluster (GPL), Release rel24, Revision 2119e75, WSREP version 26.1.4.3

Copyright (c) 2009-2023 Percona LLC and/or its affiliates
Copyright (c) 2000, 2023, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database if not exists atsaloux;
Query OK, 1 row affected (0.02 sec)

mysql> use atsaloux;
Database changed
mysql> CREATE TABLE `authors` (
    ->   `id` INT(11) NOT NULL AUTO_INCREMENT,
    ->   `first_name` VARCHAR(50) NOT NULL,
    ->   `last_name` VARCHAR(50) NOT NULL,
    ->   `email` VARCHAR(100) NOT NULL,
    ->   `birthdate` DATE NOT NULL,
    ->   `added` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    ->   PRIMARY KEY (`id`),
    ->   UNIQUE INDEX `email` (`email`)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected, 1 warning (0.04 sec)

mysql> show create table authors\G
*************************** 1. row ***************************
       Table: authors
Create Table: CREATE TABLE `authors` (
  `id` int NOT NULL AUTO_INCREMENT,
  `first_name` varchar(50) NOT NULL,
  `last_name` varchar(50) NOT NULL,
  `email` varchar(100) NOT NULL,
  `birthdate` date NOT NULL,
  `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user1','lastname1','[email protected]','1980/01/01');
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user2','lastname2','[email protected]','1980/01/01');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user3','lastname3','[email protected]','1980/01/01');
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user4','lastname4','[email protected]','1980/01/01');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into authors (first_name,last_name,email,birthdate) values ('user5','lastname5','[email protected]','1980/01/01');
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> select * from authors;
+----+------------+-----------+-----------------------+------------+---------------------+
| id | first_name | last_name | email                 | birthdate  | added               |
+----+------------+-----------+-----------------------+------------+---------------------+
|  1 | user1      | lastname1 | [email protected] | 1980-01-01 | 2023-07-26 12:18:40 |
|  4 | user2      | lastname2 | [email protected] | 1980-01-01 | 2023-07-26 12:18:49 |
|  7 | user3      | lastname3 | [email protected] | 1980-01-01 | 2023-07-26 12:18:58 |
| 10 | user4      | lastname4 | [email protected] | 1980-01-01 | 2023-07-26 12:19:08 |
| 13 | user5      | lastname5 | [email protected] | 1980-01-01 | 2023-07-26 12:19:16 |
+----+------------+-----------+-----------------------+------------+---------------------+
5 rows in set (0.00 sec)
```
What I wanted to do was apply the following simple schema change using pt-osc:
```
ALTER TABLE atsaloux.authors ADD INDEX email_idx(email)
```
To do so, I created a K8s Job YAML file as below:
```
ubuntu@ip-172-31-25-177:~$ cat pt-osc-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pt-osc-atsaloux-request-123456
spec:
  template:
    spec:
      containers:
      - name: pt-osc-request-123456
        image: perconalab/percona-toolkit:latest
        command: ["pt-online-schema-change"]
        args:
        - --user
        - root
        - --print
        - --progress
        - time,10
        - --alter=$(PT_OSC_ALTER)
        - --tries
        - create_triggers:10000:0.25,drop_triggers:10000:0.25,copy_rows:10000:0.25,swap_tables:10000:0.25,update_foreign_keys:10000:0.25,analyze_table:10000:0.25
        - --set-vars
        - lock_wait_timeout=1
        - h=minimal-cluster-pxc,D=$(PT_OSC_DATABASE),t=$(PT_OSC_TABLE)
        - --recursion-method=none
        - --max-flow-ctl=0
        - --critical-load
        - threads_running=99999
        - --max-load
        - Threads_running=50
        - --chunk-time=1
        - --nocheck-unique-key-change
        - --execute
        env:
        - name: DBI_PASS
          valueFrom:
            secretKeyRef:
              name: minimal-cluster-secrets
              key: root
        - name: PT_OSC_DATABASE
          value: atsaloux
        - name: PT_OSC_TABLE
          value: authors
        - name: PT_OSC_ALTER
          value: ADD INDEX email_idx(email)
      restartPolicy: Never
  backoffLimit: 1
```
I’m not going to explain the YAML definition above in detail, as I believe it is straightforward to understand and adjust.
I then applied the YAML file, and the Job was created:
```
ubuntu@ip-172-31-25-177:~$ kubectl apply -f pt-osc-job.yaml
job.batch/pt-osc-atsaloux-request-123456 created
ubuntu@ip-172-31-25-177:~$ kubectl get pods
NAME                                               READY   STATUS              RESTARTS      AGE
minimal-cluster-haproxy-0                          2/2     Running             0             68m
minimal-cluster-haproxy-1                          2/2     Running             0             67m
minimal-cluster-haproxy-2                          2/2     Running             0             67m
minimal-cluster-pxc-0                              3/3     Running             0             68m
minimal-cluster-pxc-1                              3/3     Running             0             68m
minimal-cluster-pxc-2                              3/3     Running             0             67m
percona-xtradb-cluster-operator-6f4f56f5d5-pvx92   1/1     Running             4 (68m ago)   5d
pt-osc-atsaloux-request-123456-h6t9p               0/1     ContainerCreating   0             2s
ubuntu@ip-172-31-25-177:~$ kubectl get pods
NAME                                               READY   STATUS      RESTARTS      AGE
minimal-cluster-haproxy-0                          2/2     Running     0             68m
minimal-cluster-haproxy-1                          2/2     Running     0             67m
minimal-cluster-haproxy-2                          2/2     Running     0             67m
minimal-cluster-pxc-0                              3/3     Running     0             68m
minimal-cluster-pxc-1                              3/3     Running     0             68m
minimal-cluster-pxc-2                              3/3     Running     0             67m
percona-xtradb-cluster-operator-6f4f56f5d5-pvx92   1/1     Running     4 (68m ago)   5d
pt-osc-atsaloux-request-123456-h6t9p               0/1     Completed   0             3s
```
As you can see above, after a few seconds the pt-osc-atsaloux-request-123456-h6t9p pod transitioned to the Completed status, indicating that the execution was successful.
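Instead of polling kubectl get pods, the wait can also be scripted, which is handy in CI pipelines. A small sketch, using the Job name from this post:

```
# Block until the Job reports the Complete condition, or give up after 10 minutes
kubectl wait --for=condition=complete --timeout=600s \
  job/pt-osc-atsaloux-request-123456
```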
I’m now able to review the logs using the following:
```
ubuntu@ip-172-31-25-177:~$ kubectl logs pt-osc-atsaloux-request-123456-h6t9p
No slaves found.  See --recursion-method if host minimal-cluster-pxc-0 has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
  analyze_table, 10000, 0.25
  copy_rows, 10000, 0.25
  create_triggers, 10000, 0.25
  drop_triggers, 10000, 0.25
  swap_tables, 10000, 0.25
  update_foreign_keys, 10000, 0.25
Altering `atsaloux`.`authors`...
Creating new table...
CREATE TABLE `atsaloux`.`_authors_new` (
  `id` int NOT NULL AUTO_INCREMENT,
  `first_name` varchar(50) NOT NULL,
  `last_name` varchar(50) NOT NULL,
  `email` varchar(100) NOT NULL,
  `birthdate` date NOT NULL,
  `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Created new table atsaloux._authors_new OK.
Altering new table...
ALTER TABLE `atsaloux`.`_authors_new` ADD INDEX email_idx(email)
Altered `atsaloux`.`_authors_new` OK.
2023-07-26T12:22:42 Creating triggers...
-----------------------------------------------------------
Event  : DELETE
Name   : pt_osc_atsaloux_authors_del
SQL    : CREATE TRIGGER `pt_osc_atsaloux_authors_del` AFTER DELETE ON `atsaloux`.`authors` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `atsaloux`.`_authors_new` WHERE `atsaloux`.`_authors_new`.`id` <=> OLD.`id`; END
Suffix : del
Time   : AFTER
-----------------------------------------------------------
-----------------------------------------------------------
Event  : UPDATE
Name   : pt_osc_atsaloux_authors_upd
SQL    : CREATE TRIGGER `pt_osc_atsaloux_authors_upd` AFTER UPDATE ON `atsaloux`.`authors` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `atsaloux`.`_authors_new` WHERE !(OLD.`id` <=> NEW.`id`) AND `atsaloux`.`_authors_new`.`id` <=> OLD.`id`; REPLACE INTO `atsaloux`.`_authors_new` (`id`, `first_name`, `last_name`, `email`, `birthdate`, `added`) VALUES (NEW.`id`, NEW.`first_name`, NEW.`last_name`, NEW.`email`, NEW.`birthdate`, NEW.`added`); END
Suffix : upd
Time   : AFTER
-----------------------------------------------------------
-----------------------------------------------------------
Event  : INSERT
Name   : pt_osc_atsaloux_authors_ins
SQL    : CREATE TRIGGER `pt_osc_atsaloux_authors_ins` AFTER INSERT ON `atsaloux`.`authors` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; REPLACE INTO `atsaloux`.`_authors_new` (`id`, `first_name`, `last_name`, `email`, `birthdate`, `added`) VALUES (NEW.`id`, NEW.`first_name`, NEW.`last_name`, NEW.`email`, NEW.`birthdate`, NEW.`added`);END
Suffix : ins
Time   : AFTER
-----------------------------------------------------------
2023-07-26T12:22:43 Created triggers OK.
2023-07-26T12:22:43 Copying approximately 4 rows...
INSERT LOW_PRIORITY IGNORE INTO `atsaloux`.`_authors_new` (`id`, `first_name`, `last_name`, `email`, `birthdate`, `added`) SELECT `id`, `first_name`, `last_name`, `email`, `birthdate`, `added` FROM `atsaloux`.`authors` LOCK IN SHARE MODE /*pt-online-schema-change 1 copy table*/
2023-07-26T12:22:43 Copied rows OK.
2023-07-26T12:22:43 Analyzing new table...
2023-07-26T12:22:43 Swapping tables...
RENAME TABLE `atsaloux`.`authors` TO `atsaloux`.`_authors_old`, `atsaloux`.`_authors_new` TO `atsaloux`.`authors`
2023-07-26T12:22:43 Swapped original and new tables OK.
2023-07-26T12:22:43 Dropping old table...
DROP TABLE IF EXISTS `atsaloux`.`_authors_old`
2023-07-26T12:22:43 Dropped old table `atsaloux`.`_authors_old` OK.
2023-07-26T12:22:43 Dropping triggers...
DROP TRIGGER IF EXISTS `atsaloux`.`pt_osc_atsaloux_authors_del`
DROP TRIGGER IF EXISTS `atsaloux`.`pt_osc_atsaloux_authors_upd`
DROP TRIGGER IF EXISTS `atsaloux`.`pt_osc_atsaloux_authors_ins`
2023-07-26T12:22:43 Dropped triggers OK.
Successfully altered `atsaloux`.`authors`.
```
And finally confirmed that the schema changes had been successfully applied:
```
bash-5.1$ mysql -h minimal-cluster-haproxy -u root -p -e 'show create table atsaloux.authors\G'
Enter password:
*************************** 1. row ***************************
       Table: authors
Create Table: CREATE TABLE `authors` (
  `id` int NOT NULL AUTO_INCREMENT,
  `first_name` varchar(50) NOT NULL,
  `last_name` varchar(50) NOT NULL,
  `email` varchar(100) NOT NULL,
  `birthdate` date NOT NULL,
  `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `email` (`email`),
  KEY `email_idx` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
```
For the purposes of this blog post, I repeated the operation with a variation of the Job above. The difference is that this time the ALTER statement was intentionally incorrect.
```
ubuntu@ip-172-31-25-177:~$ cat pt-osc-job2.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pt-osc-atsaloux-request-78910
spec:
  template:
    spec:
      containers:
      - name: pt-osc-request-123456
        image: perconalab/percona-toolkit:latest
        command: ["pt-online-schema-change"]
        args:
        - --user
        - root
        - --print
        - --progress
        - time,10
        - --alter=$(PT_OSC_ALTER)
        - --tries
        - create_triggers:10000:0.25,drop_triggers:10000:0.25,copy_rows:10000:0.25,swap_tables:10000:0.25,update_foreign_keys:10000:0.25,analyze_table:10000:0.25
        - --set-vars
        - lock_wait_timeout=1
        - h=minimal-cluster-pxc,D=$(PT_OSC_DATABASE),t=$(PT_OSC_TABLE)
        - --recursion-method=none
        - --max-flow-ctl=0
        - --critical-load
        - threads_running=99999
        - --max-load
        - Threads_running=50
        - --chunk-time=1
        - --nocheck-unique-key-change
        - --execute
        env:
        - name: DBI_PASS
          valueFrom:
            secretKeyRef:
              name: minimal-cluster-secrets
              key: root
        - name: PT_OSC_DATABASE
          value: atsaloux
        - name: PT_OSC_TABLE
          value: authors
        - name: PT_OSC_ALTER
          value: ADD INDEX email_idx_new(email_not_exists)
      restartPolicy: Never
  backoffLimit: 1
```
I then applied it again to create the new Job and noticed that it errored out this time:
```
ubuntu@ip-172-31-25-177:~$ kubectl apply -f pt-osc-job2.yaml
job.batch/pt-osc-atsaloux-request-78910 created
ubuntu@ip-172-31-25-177:~$ kubectl get pods
NAME                                               READY   STATUS      RESTARTS       AGE
minimal-cluster-haproxy-0                          2/2     Running     0              120m
minimal-cluster-haproxy-1                          2/2     Running     0              119m
minimal-cluster-haproxy-2                          2/2     Running     0              119m
minimal-cluster-pxc-0                              3/3     Running     0              120m
minimal-cluster-pxc-1                              3/3     Running     0              120m
minimal-cluster-pxc-2                              3/3     Running     0              119m
percona-xtradb-cluster-operator-6f4f56f5d5-pvx92   1/1     Running     4 (120m ago)   5d1h
pt-osc-atsaloux-request-123456-h6t9p               0/1     Completed   0              52m
pt-osc-atsaloux-request-78910-q2lk4                0/1     Error       0              2s
ubuntu@ip-172-31-25-177:~$ kubectl get jobs
NAME                             COMPLETIONS   DURATION   AGE
pt-osc-atsaloux-request-123456   1/1           4s         52m
pt-osc-atsaloux-request-78910    0/1           6s         6s
```
But why? The logs show the error: "Key column 'email_not_exists' doesn't exist in table".
```
ubuntu@ip-172-31-25-177:~$ kubectl logs pt-osc-atsaloux-request-78910-q2lk4
No slaves found.  See --recursion-method if host minimal-cluster-pxc-0 has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
  analyze_table, 10000, 0.25
  copy_rows, 10000, 0.25
  create_triggers, 10000, 0.25
  drop_triggers, 10000, 0.25
  swap_tables, 10000, 0.25
  update_foreign_keys, 10000, 0.25
Altering `atsaloux`.`authors`...
Creating new table...
CREATE TABLE `atsaloux`.`_authors_new` (
  `id` int NOT NULL AUTO_INCREMENT,
  `first_name` varchar(50) NOT NULL,
  `last_name` varchar(50) NOT NULL,
  `email` varchar(100) NOT NULL,
  `birthdate` date NOT NULL,
  `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `email` (`email`),
  KEY `email_idx` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Created new table atsaloux._authors_new OK.
Altering new table...
ALTER TABLE `atsaloux`.`_authors_new` ADD INDEX email_idx_new(email_not_exists)
2023-07-26T13:14:44 Dropping new table... (in cleanup)
Error altering new table `atsaloux`.`_authors_new`: DBD::mysql::db do failed: Key column 'email_not_exists' doesn't exist in table [for Statement "ALTER TABLE `atsaloux`.`_authors_new` ADD INDEX email_idx_new(email_not_exists)"] at /usr/bin/pt-online-schema-change line 9537.
Error altering new table `atsaloux`.`_authors_new`: DBD::mysql::db do failed: Key column 'email_not_exists' doesn't exist in table [for Statement "ALTER TABLE `atsaloux`.`_authors_new` ADD INDEX email_idx_new(email_not_exists)"] at /usr/bin/pt-online-schema-change line 9537.
DROP TABLE IF EXISTS `atsaloux`.`_authors_new`;
2023-07-26T13:14:44 Dropped new table OK.
`atsaloux`.`authors` was not altered.
```
Now I can delete the Job if needed, make adjustments, and re-run. The Jobs documentation has quite a few details on failure handling, retries, deadlines, etc., that you may find useful.
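As a sketch, the Job spec fields most relevant to failure handling and cleanup look like this (the values are illustrative, not recommendations):

```
spec:
  backoffLimit: 1                 # retry a failed pod at most once
  activeDeadlineSeconds: 3600     # kill the Job if it runs longer than one hour
  ttlSecondsAfterFinished: 86400  # garbage-collect the Job a day after it finishes
  template:
    spec:
      restartPolicy: Never        # let the Job controller, not the kubelet, handle retries
```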
```
ubuntu@ip-172-31-25-177:~$ kubectl get jobs
NAME                             COMPLETIONS   DURATION   AGE
pt-osc-atsaloux-request-123456   1/1           4s         53m
pt-osc-atsaloux-request-78910    0/1           83s        83s
ubuntu@ip-172-31-25-177:~$ kubectl delete job pt-osc-atsaloux-request-78910
job.batch "pt-osc-atsaloux-request-78910" deleted
ubuntu@ip-172-31-25-177:~$ kubectl get jobs
NAME                             COMPLETIONS   DURATION   AGE
pt-osc-atsaloux-request-123456   1/1           4s         53m
```
Again, this is a high-level idea, and operations other than schema changes can be performed in a similar way. You can, of course, further improve on this or follow other approaches. In K8s, there are several ways to achieve the same result, so feel free to comment!
Percona Distribution for MySQL is the most complete, stable, scalable, and secure open source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!
Try Percona Distribution for MySQL today!