ALTERS, Foreign Keys, and Metadata Locks, oh my!

qaUFjeE.jpg!web I’m sure you’ve seen it – you kick off an ALTER and you get the dreaded “waiting on metadata lock”. In many cases, this is expected if you are actively working on the table. However, I recently had a case with a client where the table being altered was rarely updated and very small (<100 rows). The ALTER just sat for hours during a load test (more on this shortly) and never completed until the load test was stopped. Upon stopping the load test, the ALTER completed in less than a second as expected. So what was going on here?

Check Foreign Keys

My first instinct, whenever there is odd locking, is to check foreign keys. Naturally, this table had some FKs referencing a much busier table. However, this behavior still seemed rather strange. When running an ALTER against a table, there is a request for a SHARED_UPGRADEABLE metadata lock against the child table. There is also a SHARED_READ_ONLY metadata lock against the parent and this is where things can get messy.

Let’s take a look at how MDLs are acquired per the documentation ( https://dev.mysql.com/doc/refman/en/metadata-locking.html :

If there are multiple waiters for a given lock, the highest-priority lock request is satisfied first , with an exception related to the max_write_lock_count system variable. Write lock requests have higher priority than read lock requests.

It is important to note that the lock order is serialized:

Statements acquire metadata locks one by one, not simultaneously, and perform deadlock detection in the process.

Normally, when thinking of a queue, we think of a FIFO process. If I issue the following three statements (in this order), they would complete in this order:

INSERT INTO parent…
ALTER TABLE child…
INSERT INTO parent…

However, as the child ALTER statement requests a read lock against parent, the two inserts will complete PRIOR to the ALTER despite the ordering. Here is a sample scenario in which this can be demonstrated:

Table setup and initial population:

CREATE TABLE `parent` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `val` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
 
CREATE TABLE `child` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `parent_id` int(11) DEFAULT NULL,
  `val` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_parent` (`parent_id`),
  CONSTRAINT `fk_parent` FOREIGN KEY (`parent_id`) REFERENCES `parent` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB;
 
INSERT INTO `parent` VALUES (1, "one"), (2, "two"), (3, "three"), (4, "four");

Session 1:

start transaction;
update parent set val = "four-new" where id = 4;

Session 2:

alter table child add index `idx_new` (val);

Session 3:

start transaction;
update parent set val = "three-new" where id = 3;

At this point, session 1 has an open transaction and is in sleep with a write metadata lock granted on parent. Session 2 has an upgradeable (write) lock granted on child and is waiting on a read lock on the parent. Finally, session 3 has a granted write lock against parent:

mysql> select * from performance_schema.metadata_locks;
+-------------+-------------+-------------------+---------------+-------------+
| OBJECT_TYPE | OBJECT_NAME | LOCK_TYPE         | LOCK_DURATION | LOCK_STATUS |
+-------------+-------------+-------------------+---------------+-------------+
| TABLE       | child       | SHARED_UPGRADABLE | TRANSACTION   | GRANTED     | <- ALTER (S2)
| TABLE       | parent      | SHARED_WRITE      | TRANSACTION   | GRANTED     | <- UPDATE (S1)
| TABLE       | parent      | SHARED_WRITE      | TRANSACTION   | GRANTED     | <- UPDATE (S3)
| TABLE       | parent      | SHARED_READ_ONLY  | STATEMENT     | PENDING     | <- ALTER (S2)
+-------------+-------------+-------------------+---------------+-------------+

Notice, the only session with a pending lock status is session 2 (the ALTER). Session 1 and session 3 (issued before and after the ALTER respectively) both have been granted the write locks. Where the ordering breaks down is when the commit happens on session 1. When thinking about an ordered queue, one would expect session 2 to acquire the lock and things would just move on. However, due to the priority nature of the metadata lock system, session 2 still waits and now session 3 has the lock.

If another write session comes in and starts a new transaction and acquires a write lock against the parent table, then even when session 3 completes, the ALTER will still be blocked. You can see where this is going…

As long as I keep an active transaction that has an MDL against the parent table open, the ALTER on the child table will never complete. Making this worse, since the write lock on the child table was successful (but the full statement is waiting on acquiring the parent read lock), all incoming read requests against the child table will be blocked!!

Also, think about how you normally try to troubleshoot a statement that won’t complete. You look at transactions (both in the processlist and InnoDB status) that have been open for a longer time. But since the blocking thread is now younger than the ALTER thread, the oldest transaction/thread you will see is the ALTER. Queue hair pulling!!

This is exactly what was happening in this scenario. In preparation for a release, our client was running their ALTER statements in conjunction with a load test (a very good practice!) to ensure a smooth release. The problem was that the load test kept an active write transaction open against the parent table. That isn’t to say it just kept writing, but rather there were multiple threads and one was ALWAYS active. This prevented the ALTER from ever completing and blocked ensuing read requests to the relatively static child table.

Fortunately, there is a solution to this problem (other than banishing FKs from the schema). The variable max_write_lock_count ( https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_max_write_lock_count ) can be used to allow read locks to be granted ahead of write locks after a threshold of consecutive write locks. By default, this variable is set to 18,446,744,073,709,551,615. So you just have to wait for 18 quintillion write requests to complete before the read would be granted. To put that in perspective, if you were issuing 10,000 writes/sec against that table, your read would be blocked for 58 million years .

To prevent this from happening, you can simply reduce max_write_lock_count to a small number (say 10?) and after every 10 write locks get acquired, the MDL subsystem will look for pending read locks, grant one, then move back to writes. Problem solved!

Being a dynamic variable, this can be adjusted at runtime to allow the waiting ALTER to complete. In general, this is more of an edge case as there normally is some time in between writes to a table for read locks to get acquired. However, if your use case keeps concurrent sessions running that ALWAYS have a transaction against a table that is referenced as an FK, you could see this situation crop up. Fortunately, the fix is straightforward and can be done on the fly!

NOTE: This troubleshooting was made possible through the performance schema and enabling the metadata_locks table as described here: https://dev.mysql.com/doc/refman/5.7/en/metadata-locks-table.html

Check Foreign Keys

Session 1:

Session 2:

Session 3:

Recommend

A in-complete Bird app clone

火绒5.0七大硬核功能介绍——应用加固 - 知乎

Vector, a High-Performance Log and Metric Router Written in Rust

Introduction to cloud-native application environment architecture

在白板上写写画画，集成AutoML的数据分析也能如此简单

一键“脱衣”应用遭下架！吴恩达称，“DeepNude死了，我很高兴” - 知乎

Anomaly Detection for Dummies

Python Tutorial For Researchers Who use R

Optimizing the Future of Global Humanitarian Efforts

You can now speak using someone else’s voice with Deep Learning

About Joyk