1

Fixing Misplaced Rows in a Partitioned Table

 1 year ago
source link: https://www.percona.com/blog/fixing-misplaced-rows-in-a-partitioned-table/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Fixing Misplaced Rows in a Partitioned Table

March 29, 2023

Smit Arora

A partitioned table in MySQL has its data separated into different tablespaces while still being viewed as a single table. Partitioning can be a useful approach in some cases when handling huge sets of data. Deleting huge data sets could be quickened up in a partitioned table, but if not handled properly, it can misplace your data in the table. In this blog, I will share how to check and fix the data in such a table with minimal disruption to the table.

In this example, we use a table partitioned based on a date range.

MySQL
mysql> show create table salariesG
*************************** 1. row ***************************
       Table: salaries
Create Table: CREATE TABLE `salaries` (
  `emp_no` int(11) NOT NULL,
  `salary` int(11) NOT NULL,
  `from_date` date NOT NULL,
  `to_date` date NOT NULL,
  PRIMARY KEY (`emp_no`,`from_date`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50500 PARTITION BY RANGE  COLUMNS(from_date)
(PARTITION p01 VALUES LESS THAN ('1985-12-31') ENGINE = InnoDB,
 PARTITION p02 VALUES LESS THAN ('1986-12-31') ENGINE = InnoDB,
 PARTITION p03 VALUES LESS THAN ('1987-12-31') ENGINE = InnoDB,
 PARTITION p04 VALUES LESS THAN ('1988-12-31') ENGINE = InnoDB,
 PARTITION p05 VALUES LESS THAN ('1989-12-31') ENGINE = InnoDB,
 PARTITION p06 VALUES LESS THAN ('1990-12-31') ENGINE = InnoDB,
 PARTITION p07 VALUES LESS THAN ('1991-12-31') ENGINE = InnoDB,
 PARTITION p08 VALUES LESS THAN ('1992-12-31') ENGINE = InnoDB,
 PARTITION p09 VALUES LESS THAN ('1993-12-31') ENGINE = InnoDB,
 PARTITION p10 VALUES LESS THAN ('1994-12-31') ENGINE = InnoDB,
 PARTITION p11 VALUES LESS THAN ('1995-12-31') ENGINE = InnoDB,
 PARTITION p12 VALUES LESS THAN ('1996-12-31') ENGINE = InnoDB,
 PARTITION p13 VALUES LESS THAN ('1997-12-31') ENGINE = InnoDB,
 PARTITION p14 VALUES LESS THAN ('1998-12-31') ENGINE = InnoDB,
 PARTITION p15 VALUES LESS THAN ('1999-12-31') ENGINE = InnoDB,
 PARTITION p16 VALUES LESS THAN ('2000-12-31') ENGINE = InnoDB,
 PARTITION p17 VALUES LESS THAN ('2001-12-31') ENGINE = InnoDB,
 PARTITION p18 VALUES LESS THAN ('2002-12-31') ENGINE = InnoDB,
 PARTITION p19 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */

If we query an arbitrary data set, MySQL returns it quickly and without any issue.    

MySQL
mysql> SELECT * FROM salaries where emp_no=10001 and from_date='1999-06-23'; 
+--------+--------+------------+------------+ 
| emp_no | salary | from_date  | to_date    | 
+--------+--------+------------+------------+ 
|  10001 |  84917 | 1999-06-23 | 2000-06-22 | 
+--------+--------+------------+------------+ 
1 row in set (0.00 sec)

But once the data is misplaced, it does not return any rows.  

MySQL
mysql> SELECT * FROM salaries where emp_no=10001 and from_date='1999-06-23';
Empty set (0.00 sec)

At first glance, it looks like the data has been deleted; even EXPLAIN plan returns no result.

MySQL
mysql> EXPLAIN SELECT * FROM salaries where emp_no=10001 and from_date='1999-06-23'G
*************************** 1. row ***************************
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: no matching row in const table
1 row in set, 1 warning (0.00 sec)

But, the data still exists in the table; it’s just misplaced in a different partition.

MySQL
mysql> SELECT * FROM salaries PARTITION(p08) LIMIT 1;
+--------+--------+------------+------------+
| emp_no | salary | from_date  | to_date    |
+--------+--------+------------+------------+
|  10001 |  84917 | 1999-06-23 | 2000-06-22 |
+--------+--------+------------+------------+
1 row in set (0.00 sec)

How to check if there are more misplaced rows?

To check for more misplaced rows in the table, we could execute the CHECK command to find any more misplaced rows.

MySQL
mysql> CHECK TABLE salaries;
+--------------------+-------+----------+---------------------------------------------------------------------------------------+
| Table              | Op    | Msg_type | Msg_text                                                                              |
+--------------------+-------+----------+---------------------------------------------------------------------------------------+
| employees.salaries | check | error    | Found a misplaced row in part 7 should be in part 14:  from_date:1999-06-23           |
| employees.salaries | check | error    | Partition p08 returned error                                                          |
| employees.salaries | check | error    | Table upgrade required. Please do "REPAIR TABLE `salaries`" or dump/reload to fix it! |
+--------------------+-------+----------+---------------------------------------------------------------------------------------+
3 rows in set (0.83 sec)

How can rows be misplaced?

The above observation does raise one question: how can MySQL allow this to happen, and how can it be fixed? The answer to the former is that MySQL allows users to exchange a partition with a separate table; as explained in my previous blog Quick Data Archival in MySQL Using Partitions, by default, it checks every row before inserting data into the table. 

MySQL
mysql> ALTER TABLE salaries EXCHANGE PARTITION p15 WITH TABLE salaries_p15;
Query OK, 0 rows affected (0.04 sec)

In the above query, salaries_p15 is an empty table with the same structure as salaries, except having no partitions in it. 

MySQL
mysql> show create table salaries_p15G
*************************** 1. row ***************************
       Table: salaries_p15
Create Table: CREATE TABLE `salaries_p15` (
  `emp_no` int(11) NOT NULL,
  `salary` int(11) NOT NULL,
  `from_date` date NOT NULL,
  `to_date` date NOT NULL,
  PRIMARY KEY (`emp_no`,`from_date`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

After exchanging the partition, partition p15 is empty, and salaries_p15 has data of partition p15.

MySQL
mysql> SELECT count(*) from salaries_p15;
+----------+
| count(*) |
+----------+
|   260926 |
+----------+
1 row in set (0.02 sec)
mysql> SELECT count(*) from salaries PARTITION(p15);
+----------+
| count(*) |
+----------+
+----------+
1 row in set (0.06 sec)

MySQL allows exchanging partitions without checking every row so that exchanging data can be sped up.

MySQL
mysql> ALTER TABLE salaries EXCHANGE PARTITION p08 WITH TABLE salaries_p15 WITHOUT VALIDATION;
Query OK, 0 rows affected (0.04 sec)

The above query will exchange the data in salaries_p15 quickly with an empty partition p08, assuming that the data is correct. In this case, it will result in misplaced rows in the wrong partition.

MySQL
mysql> SELECT * FROM salaries where emp_no=10001 and from_date='1999-06-23'; 
+--------+--------+------------+------------+ 
| emp_no | salary | from_date  | to_date    | 
+--------+--------+------------+------------+ 
|  10001 |  84917 | 1999-06-23 | 2000-06-22 | 
+--------+--------+------------+------------+ 
1 row in set (0.00 sec)

How to fix it

There are two ways to fix it – first, run the MySQL recommended REPAIR TABLE command, or second recreate the table by running an empty alter.

When the REPAIR TABLE is executed, the misplaced rows are arranged to their proper positions by analyzing all the rows, but it acquires a SYSTEM LOCK on the table. 

MySQL
mysql> repair table salaries;
+--------------------+--------+----------+-----------------------------+
| Table              | Op     | Msg_type | Msg_text                    |
+--------------------+--------+----------+-----------------------------+
| employees.salaries | repair | warning  | Moved 260926 misplaced rows |
| employees.salaries | repair | status   | OK                          |
+--------------------+--------+----------+-----------------------------+
2 rows in set (5.30 sec)

The other way is to run an empty alter that will recreate the table.

One can run a direct alter on the table that will recreate the table and lock it until the alter is completed.

MySQL
ALTER  employees.salaries ENGINE=Innodb;

One can also use pt-online schema change to recreate the table, which will not lock the table and have other benefits (Using Percona Toolkit to Alter Database Tables Online: A Controlled Approach) as well.

MySQL
pt-online-schema-change --execute  --alter "ENGINE=InnoDb" h=localhost,D=employees,t=salaries

Conclusion

It is always recommended to be extra cautious while working with partitions and avoid using WITHOUT VALIDATION in the production environment to avoid any misplaced data.

Percona Distribution for MySQL is the most complete, stable, scalable, and secure open-source MySQL solution available, delivering enterprise-grade database environments for your most critical business applications… and it’s free to use!

Try Percona Distribution for MySQL today!

Share This Post!

Subscribe
Connect with
guest
Label
0 Comments

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK