52

Selecting all columns of a database table with an SQL GROUP BY expression

 5 years ago
source link: https://www.tuicool.com/articles/hit/mAJvui6
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Suppose we have an SQL database table named “temperatures” with the following contents:

LOCATION  TIME        CELSIUS
inside    2018-08-01  24
inside    2018-08-02  28
inside    2018-08-03  21
inside    2018-08-04  28
outside   2018-08-01  29
outside   2018-08-02  31
outside   2018-08-03  25
outside   2018-08-04  30

We want to find the highest temperature for each location. We use the MAX aggregate function and a GROUP BY expression:

SELECT location, MAX(celsius) celsius
FROM temperatures
GROUP BY location;

As expected, the result is:

LOCATION  CELSIUS
outside   31
inside    28

Now suppose we also want to know when each of these extreme temperatures occured. Naively, we try the following query:

SELECT location, time, MAX(celsius) celsius
FROM temperatures
GROUP BY location;

The response is an error: “not a GROUP BY expression”. In a GROUP BY expression all selected columns must be either part of the GROUP BY clause or an aggregate.

To achieve what we want we can use a JOIN:

SELECT
  t.location, t.time, t.celsius
FROM
  temperatures t
JOIN (SELECT location, MAX(celsius) celsius
      FROM temperatures
      GROUP BY location) tmax
ON
  t.location=tmax.location AND t.celsius=tmax.celsius;

This query results in multiple rows per location if the maximum temperature was recorded at different times:

LOCATION  TIME        CELSIUS
outside   2018-08-02  31
inside    2018-08-04  28
inside    2018-08-02  28

If we are only interested in the first occurrence of the maximum temperature per location, we can use the following query:

SELECT
  location,
  MIN(time) KEEP (DENSE_RANK LAST ORDER BY celsius) time,
  MAX(celsius) celsius
FROM
  temperatures
GROUP BY
  location;
LOCATION  TIME        CELSIUS
inside    2018-08-02  28
outside   2018-08-02  31

Here we don’t need a JOIN anymore, because select clause for the time column is an aggregate as well.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK