44

Dropping useless MySQL indexes

 4 years ago
source link: https://www.tuicool.com/articles/zqYZJvu
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
U3uANn6.jpg!web

This railway viaduct near Liverpool Street (London)

seems to allow to reach some places quickly, but like many indexes, it is unused.

Before reading this article, I suggest to read Indexes bad practices , if you missed it. In that article I mentioned that useless indexes should be deleted, and in the comments Vipul asked how to find such indexes. Here I explain how to do it.

Finding useless indexes

First of all, let’s define what a useless index is. There are two types of them:

  • Duplicate indexes;
  • Unused indexes.

Duplicate indexes

If you don’t know what duplicate indexes are, you should really check the previous article. We’re not discussing the matter again here.

There is at least one tool to find such indexes: pt-duplicate-key-checker .

This great tool finds duplicate indexes, and outputs a list of them, as well as the ALTER TABLE s to drop them. It also outputs the index definitions, so you will easily find out why a certain index is a duplicate.

The tool is safe with respect to constraints. UNIQUE indexes are not considered duplicates, unless they duplicate other UNIQUE indexes. Foreign keys are considered duplicates only if they are identical.

Paranoid modes

pt-duplicate-key-checker is usually called without any arguments, and that is what I do. For completeness, I will mention here some additional checks that could be done – but probably no one does.

  • --all-structs : Ignored the index types. An index can duplicate another even if they are of different types – for example, a FULLTEXT index and a regular BTREE. This is normally wrong, but there is an exception: a BTREE could duplicate a HASH index. If your database has improper HASH indexes, you may want to try this parameter. But then, you will have to check manually the output.
  • --ignore-order : Normally, an index on (a, b) should not be considered as a duplicate of (b, a). The reason is that some queries can only be server by one of these indexes, as explained in the previous article. This parameter causes these indexes to be considered as duplicates. This is only true if we compare both these columns with = and don’t use GROUP BY and ORDER BY on them. If you have these non-common case you may use --ignore-order to spot duplicates, but you will have to check manually the output.

Unused indexes

If you use MariaDB or Percona Server, you have a plugin called user_statistics . It can also be installed in MySQL.

The user_statistics method

If you enable it, you will have a table called INDEX_STATISTICS , in the information_schema database. All indexes not present in that table were not used since the last restart of MySQL. To spot them you can use this query:

SELECT st.TABLE_SCHEMA, st.TABLE_NAME, st.INDEX_NAME
    FROM information_schema.STATISTICS st
    LEFT JOIN information_schema.INDEX_STATISTICS idx
        ON  idx.INDEX_NAME    = st.INDEX_NAME
        AND idx.TABLE_NAME    = st.TABLE_NAME
        AND idx.TABLE_SCHEMA  = st.TABLE_SCHEMA
    WHERE
        (idx.INDEX_NAME IS NULL OR idx.ROWS_READ = 0)
        AND st.NON_UNIQUE = 1
    ORDER BY 1, 2, 3
;

The performance_schema method

There is a less safe alternative. For that, you need to have the performance_schema enabled. If it is disabled, you need to restart your server to enable it.

Unfortunately, the performance_schema is disabled by default in MariaDB since version 10.0. Let’s hope that they’ll change their mind, for a lot of reasons.

With the performance_schema enabled, you can run this query to find unused indexes:

SELECT object_schema, object_name, index_name
    FROM performance_schema.table_io_waits_summary_by_index_usage
    WHERE index_name IS NOT NULL AND count_star = 0
    ORDER BY object_schema, object_name, index_name;

Why is it less safe? Because it assumes that an index is only unused if it never caused an IO wait.

Risks

Duplicate indexes are not necessarily unused. This is not a problem if another index can be used in place of the duplicate. But there is an exception, that I mentioned in The perils of ALTER TABLE in MySQL/MariaDB . MySQL supports the IGNORE/USE/FORCE INDEX syntax, which mentions an index by name. If the index doesn’t exist, the query fails – even in the care of IGNORE INDEX .

Dropping an unused index is safe. As mentioned, indexes found with the user_statistics method are surely unused. But indexes found with the performance_schema method could theoretically be used.

To reduce these risks, MySQL 8 has a useful feature: invisible indexes. You can make an index invisible in this way:

ALTER TABLE t ALTER INDEX idx_a INVISIBLE;

If an index is invisible, MySQL will pretend that it doesn’t exist. Therefore, if it is mentioned by a query with IGNORE/USE/FORCE INDEX , the query will return an error. And if it is used by a query, the query will become slow – or it will use another index. The advantage is that making an index invisible or visible is an instantaneous operation. Therefore, even if the table is very big, if a problem appears, we can make the index visible again in a moment.

Conclusions

We saw how to find duplicate indexes, including some edge cases that are not reported by default by pt-duplicate-key-checker. Then we saw how to find unused indexes: the performance_schema database should always be enabled, but for this particular task the user_statistics plugin is better. Finally, we discussed how to use invisible indexes to reduce risks.

As usual my conclusion is: what do you think? If you have questions, disagree with something or have more ideas, please comment.

Toodle pip,

Federico

Photo credit


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK