58

Improvements to ROLLUP in MySQL

 5 years ago
source link: https://www.tuicool.com/articles/hit/M3a6Bzm
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

ROLLUP has been supported in MySQL for sometime now. But until now its use has come with two implementation restrictions: it cannot be combined with DISTINCT or ORDER BY in the same query expression. Starting in 8.0.12, these limitations have been lifted.

Introduction

Rows from the ROLLUP operation are added to the result set at the very end. MySQL previously had no post processing of data after ROLLUP. However things changed when window functions were introduced in 8.0. MySQL had to start processing result set after ROLLUP was done to avoid windowing being blocked by ROLLUP. So we made a few changes to accommodate this.  Once post processing of the result set after ROLLUP had been added, along came the support for ORDER BY and DISTINCT with ROLLUP.

ROLLUP with ORDER BY

In MySQL 5.7, GROUP BY would sort the result set on the grouping expressions. It sorts ascendingly if no ordering were specified, but an explicit “ASC” or “DESC” could be added to each grouping expression. Still, the user could not order the final result set in any other way without resorting to wrapping the query in a subquery. Let us look at an example to see what happened in MySQL 5.7:

mysql-5.7> create table t1 (a integer, b integer, c integer);
Query OK, 0 rows affected (0.03 sec)
 
mysql-5.7> insert into t1 values (111,11,11);
Query OK, 1 row affected (0.01 sec)
 
mysq-5.7> insert into t1 values (222,22,22);
Query OK, 1 row affected (0.01 sec)
 
mysql-5.7> insert into t1 values (111,12,12);
Query OK, 1 row affected (0.01 sec)
 
mysql-5.7> insert into t1 values (222,23,23);
Query OK, 1 row affected (0.01 sec)
 
mysql-5.7> SELECT a, b, SUM(c) as SUM FROM t1 GROUP BY a,b WITH ROLLUP; 
+------+------+------+
| a    | b    | SUM  |
+------+------+------+
|  111 |   11 |   11 |
|  111 |   12 |   12 |
|  111 | NULL |   23 |
|  222 |   22 |   22 |
|  222 |   23 |   23 |
|  222 | NULL |   45 |
| NULL | NULL |   68 |
+------+------+------+
7 rows in set (0.01 sec)

Note that the super-aggregate rows appear just after the rows used to compute them.  This is because internally data is sorted before the super-aggregate rows are generated. So, ROLLUP adds rows to the already sorted data.  In MySQL 5.7 and earlier versions, GROUP BY always sorts. So the above query is interpreted as “SELECT a, b, SUM(c) as SUM FROM t1 GROUP BY a ASC, b ASC WITH ROLLUP;”. However implicit and explicit sorting for GROUP BY has been removed in 8.0.

Let us look at what happens when user specifies explicit ASC/DESC for the grouped column with ROLLUP in MySQL-5.7.

mysql-5.7> SELECT a, b, SUM(c) as SUM FROM t1 GROUP BY a ASC , b DESC WITH ROLLUP;
+------+------+------+
| a    | b    | SUM  |
+------+------+------+
|  111 |   12 |   12 |
|  111 |   11 |   11 |
|  111 | NULL |   23 |
|  222 |   23 |   23 |
|  222 |   22 |   22 |
|  222 | NULL |   45 |
| NULL | NULL |   68 |
+------+------+------+
7 rows in set, 2 warnings (0.00 sec)

As can be seen, the aggregated rows are ordered as specified in the query for the grouped columns. But super-aggregate rows do not respect the ordering specified in the query. The reason is the same as mentioned earlier. Super-aggregate rows are added after the data is sorted. Again, note that explicit use of ASC/DESC with GROUP BY has  been removed in 8.0.

In MySQL 8.0, users can specify ORDER BY with ROLLUP to get data ordered correctly.

mysql> SELECT a, b, SUM(c) as SUM FROM t1 GROUP BY a, b WITH ROLLUP ORDER BY a,b;
+------+------+------+
| a    | b    | SUM  |
+------+------+------+
| NULL | NULL |   68 |
|  111 | NULL |   23 |
|  111 |   11 |   11 |
|  111 |   12 |   12 |
|  222 | NULL |   45 |
|  222 |   22 |   22 |
|  222 |   23 |   23 |
+------+------+------+
7 rows in set (0.00 sec)
 
mysql> SELECT a, b, SUM(c) as SUM FROM t1 GROUP BY a, b WITH ROLLUP ORDER BY a, b DESC;
+------+------+------+
| a    | b    | SUM  |
+------+------+------+
| NULL | NULL |   68 |
|  111 |   12 |   12 |
|  111 |   11 |   11 |
|  111 | NULL |   23 |
|  222 |   23 |   23 |
|  222 |   22 |   22 |
|  222 | NULL |   45 |
+------+------+------+
7 rows in set (0.01 sec)

As seen above, data is now ordered as specified in the query.  If you want the ordering  the same way as in MySQL-5.7 which is more readable since the super-aggregate rows are placed right after the aggregated rows, you could make use of the GROUPING function added in 8.0.

ROLLUP with GROUPING function in ORDER BY

I have previously written a bloghere about GROUPING function and its uses. Here is one more use of the function w.r.t. ordering of data for ROLLUP. For the rows which are not super-aggregates, GROUPING() has a value of 0, where as super-aggregate row has value of one. Hence, ordering on it ascendingly puts the super aggregates last.

mysql> SELECT a, b, SUM(c) as SUM FROM t1 GROUP BY a ,b WITH ROLLUP ORDER BY GROUPING(a), a, GROUPING(b), b;
+------+------+------+
| a    | b    | SUM  |
+------+------+------+
|  111 |   11 |   11 |
|  111 |   12 |   12 |
|  111 | NULL |   23 |
|  222 |   22 |   22 |
|  222 |   23 |   23 |
|  222 | NULL |   45 |
| NULL | NULL |   68 |
+------+------+------+
7 rows in set (0.00 sec)

Voila! we get back to the order that user would get to see in MySQL 5.7.

Along with ORDER BY, MySQL 8.0 now allows usage of DISTINCT with ROLLUP. This fixes Bug#76229 .

Conclusion

Beside these improvements, we have also fixed a number of bugs related to ROLLUP in 8.0. Please use the new improvements and let us know your feedback. Thanks for using MySQL!


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK