52

JSON specific window functions in MySQL 8.0

 5 years ago
source link: https://www.tuicool.com/articles/hit/UNNJzmi
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

In 8.0, we introduced window functions in MySQL server. A detailed explanation of how window functions can be used and what is supported in 8.0 is explained here .

Now, let’s see  what is new in window functions in the latest version 8.0. Starting in 8.0.14, JSON aggregate functions can be used as window functions, i.e they can be specified with an OVER() clause. An explanation of JSON grouped aggregate functions (i.e without the OVER() clause) can be found here .

JSON_ARRAYAGG() as window function:

While a grouped aggregate returns a JSON_ARRAY for each group, the corresponding aggregate used as a window function would return a JSON_ARRAY for every row. The JSON_ARRAY constructed for each row will differ or be the same based on the frame, partition and order specification for the window. Let us look at an example.

mysql> CREATE TABLE employee (emp_no INTEGER, first_name VARCHAR(15), last_name VARCHAR(15), hire_date DATE, dept_id INTEGER, PRIMARY KEY (emp_no) );
Query OK, 0 rows affected (0.36 sec)
 
mysql> insert into employee values (1, "Samanyu", "Kishore",'2010-10-10',1);
Query OK, 1 row affected (0.22 sec)
 
mysql> insert into employee values (2, "Nethra", "Deepak", '2003-08-08', 1);
Query OK, 1 row affected (0.07 sec)
 
mysql> insert into employee values (3, "Sushma", "Sunil", '2008-01-02', 1);
Query OK, 1 row affected (0.10 sec)
 
mysql> insert into employee values (4, "Kishore", "Krishna", '1999-06-01', 2);
Query OK, 1 row affected (0.03 sec)
 
mysql> insert into employee values (5, "Muni", "Raju", '2000-10-01', 2);
Query OK, 1 row affected (0.05 sec)
 
mysql> insert into employee values (6, "Keshav", "Basappa", '1999-08-01', 2);
Query OK, 1 row affected (0.07 sec)
 
mysql> insert into employee values (7, "Bhagath", "Reddy", '2000-01-01', 2);
Query OK, 1 row affected (0.04 sec)

We will retrieve the names of all those employees who have been hired on or before a particular date in a department using the JSON_ARRAYAGG window function. To achieve this we use the window function’s PARTITION BY and ORDER BY clause.

mysql>  select dept_id, hire_date, json_arrayagg(first_name) over (partition by dept_id  order by hire_date) from employee;
+---------+------------+---------------------------------------------------------------------------+
| dept_id | hire_date  | json_arrayagg(first_name) over (partition by dept_id  order by hire_date) |
+---------+------------+---------------------------------------------------------------------------+
|       1 | 2003-08-08 | ["Nethra"]                                                                |
|       1 | 2008-01-02 | ["Nethra", "Sushma"]                                                      |
|       1 | 2010-10-10 | ["Nethra", "Sushma", "Samanyu"]                                           |
|       2 | 1999-06-01 | ["Kishore"]                                                               |
|       2 | 1999-08-01 | ["Kishore", "Keshav"]                                                     |
|       2 | 2000-01-01 | ["Kishore", "Keshav", "Bhagath"]                                          |
|       2 | 2000-10-01 | ["Kishore", "Keshav", "Bhagath", "Muni"]                                  |
+---------+------------+---------------------------------------------------------------------------+
7 rows in set (0.00 sec)

JSON_OBJECTAGG as window function:

Similar to the case of JSON_ARRAYAGG, when JSON_OBJECTAGG  is used as a window function, it returns a JSON object having key-value pairs for every row. The result for each row depends on the frame, partition and order specification for that window.

For the same table as used above,  if we want to get the hiring dates and the employee names in key-value pairs , we could use JSON_OBJECTAGG and get the desired result.

mysql> select dept_id, json_objectagg(first_name, hire_date) over (partition by dept_id order by hire_date) from employee;
+---------+--------------------------------------------------------------------------------------------------+
| dept_id | json_objectagg(first_name, hire_date) over (partition by dept_id order by hire_date)             |
+---------+--------------------------------------------------------------------------------------------------+
|       1 | {"Nethra": "2003-08-08"}                                                                         |
|       1 | {"Nethra": "2003-08-08", "Sushma": "2008-01-02"}                                                 |
|       1 | {"Nethra": "2003-08-08", "Sushma": "2008-01-02", "Samanyu": "2010-10-10"}                        |
|       2 | {"Kishore": "1999-06-01"}                                                                        |
|       2 | {"Keshav": "1999-08-01", "Kishore": "1999-06-01"}                                                |
|       2 | {"Keshav": "1999-08-01", "Bhagath": "2000-01-01", "Kishore": "1999-06-01"}                       |
|       2 | {"Muni": "2000-10-01", "Keshav": "1999-08-01", "Bhagath": "2000-01-01", "Kishore": "1999-06-01"} |
+---------+--------------------------------------------------------------------------------------------------+
7 rows in set (0.01 sec)

What happens when we have duplicate keys? The value that gets picked for a particular key is non-deterministic in case of a grouped aggregate i.e JSON_OBJECTAGG without the OVER() clause (The reason for the same is explained here ). However, when used as a window function, we can use order by to make the result deterministic.

Let us look at another example:

mysql> insert into employee values (10, '1973-04-07', "Nethra", "Reddy", 'F', '2004-01-02', 1);
Query OK, 1 row affected (0.00 sec)
 
mysql> select dept_id, json_objectagg(first_name, hire_date) over (partition by dept_id order by hire_date) from employee;
+---------+--------------------------------------------------------------------------------------------------+
| dept_id | json_objectagg(first_name, hire_date) over (partition by dept_id order by hire_date)             |
+---------+--------------------------------------------------------------------------------------------------+
|       1 | {"Nethra": "2003-08-08"}                                                                         |
|       1 | {"Nethra": "2004-01-02"}                                                                         |
|       1 | {"Nethra": "2004-01-02", "Sushma": "2008-01-02"}                                                 |
|       1 | {"Nethra": "2004-01-02", "Sushma": "2008-01-02", "Samanyu": "2010-10-10"}                        |
|       2 | {"Kishore": "1999-06-01"}                                                                        |
|       2 | {"Keshav": "1999-08-01", "Kishore": "1999-06-01"}                                                |
|       2 | {"Keshav": "1999-08-01", "Bhagath": "2000-01-01", "Kishore": "1999-06-01"}                       |
|       2 | {"Muni": "2000-10-01", "Keshav": "1999-08-01", "Bhagath": "2000-01-01", "Kishore": "1999-06-01"} |
+---------+--------------------------------------------------------------------------------------------------+
8 rows in set (0.01 sec)

One can see that based on the order,  the last value for the  key “Nethra” gets picked. As of now JSON_OBJECTAGG() without OVER() clause does not support ordering.  So, using window function could be an alternative to get over that limitation.

That is all, folks.  Please try out the new feature and let us know your feedback. Thanks for using MySQL!


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK