4

The Oracle SQL Limit Clause and teaching a New Dog Old Tricks

 1 year ago
source link: https://mikesmithers.wordpress.com/2023/01/31/the-oracle-sql-limit-clause-and-teaching-a-new-dog-old-tricks/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

It’s been a few weeks now and I’ve finally gotten over England’s latest World Cup penalty drama.
I’m not so sure about Teddy though…

thoughtful_teddy.jpg?w=1024

“Merde !”

Anyhow, he has decided that I need to know about the SQL Limit Clause that Oracle introduced in 12c and has decided to use data from the Tournament in the examples that follow…

Environment

These examples have all been run on my OCI Free Tier database ( Oracle 19c at the time of writing).
The table we’re going to use contains details of players who scored or gave an assist during the recent Men’s World Cup Finals and looks like this :

create table goal_scorers (
player varchar2(4000),
team varchar2(500),
goals number,
assists number)
/

The table contains 176 rows.

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
/



PLAYER                         TEAM                    GOALS ASSISTS
-----------------------        ----------------------- ----- -------
MBAPPE                         FRANCE                   8       2
MESSI                          ARGENTINA                7       3
ALVAREZ                        ARGENTINA                4       0
GIROUD                         FRANCE                   4       0
MORATA                         SPAIN                    3       1
GONCALO RAMOS                  PORTUGAL                 3       1
SAKA                           ENGLAND                  3       0
RASHFORD                       ENGLAND                  3       0
GAKPO                          NETHERLANDS              3       0
RICHARLISON                    BRAZIL                   3       0
E VALENCIA                     ECUADOR                  3       0

... snip ...

PRECIADO                       ECUADOR                  0       1
MCGREE                         AUSTRALIA                0       1
RODRYGO                        BRAZIL                   0       1

Before we get into the new(ish) syntax, let’s have a quick history lesson…

Before 12c

Back in the mists of time, if you wanted to retrieve the first five rows in a table based on an order you specified, you’d need to do something like this :

select player, team, goals, assists
from (
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc, player, team)
where rownum <= 5;
PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
MBAPPE               FRANCE                   8       2
MESSI                ARGENTINA                7       3
ALVAREZ              ARGENTINA                4       0
GIROUD               FRANCE                   4       0
GONCALO RAMOS        PORTUGAL                 3       1

5 rows selected. 

Teddy reckons that you’ll only come across code like this if it’s really old…or written by a really old programmer.

Ignoring the cheeky young pup, we move on to the point were Analytic Functions become available (8i if you’re counting), which allows us to dispense with the – in this context – artificial order criteria of player and team and ensure that our query returns any tied records :

with top_scorers as (
select player, team, goals, assists,
rank() over ( order by goals desc, assists desc) as recnum
from goal_scorers )
select *    
from top_scorers   
where recnum <= 5
order by recnum;
PLAYER               TEAM                 GOALS ASSISTS     RECNUM
-------------------- -------------------- ----- ------- ----------
MBAPPE               FRANCE                   8       2          1
MESSI                ARGENTINA                7       3          2
ALVAREZ              ARGENTINA                4       0          3
GIROUD               FRANCE                   4       0          3
MORATA               SPAIN                    3       1          5
GONCALO RAMOS        PORTUGAL                 3       1          5

6 rows selected. 

In the latest Oracle versions, Teddy reckons that you can get that same analytical function goodness but with a bit less typing…

Fetch First

Let’s re-write our top 5 query using the limit clause…

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 5 rows only
/
PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
MBAPPE               FRANCE                   8       2
MESSI                ARGENTINA                7       3
ALVAREZ              ARGENTINA                4       0
GIROUD               FRANCE                   4       0
MORATA               SPAIN                    3       1

5 rows selected. 

As we’ve already seen, that’s ever so slightly different from the top 5 goal scorers. Like Morata, Goncalo Ramos also has 3 goals and 1 assist but he has been arbitrarily excluded from the result set.
Remember, Oracle does not guarantee the order of a result set of a SELECT statement other than that specified in the ORDER BY clause. Therefore, there’s no guarantee a future run of this query won’t include Goncalo Ramos and exclude Morata.

A more reliable query would include all of the tied records for 5th place, as with the RANK() query above.
Happily, we can achieve the same effect with a limit clause :

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 5 rows with ties
/
PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
MBAPPE               FRANCE                   8       2
MESSI                ARGENTINA                7       3
ALVAREZ              ARGENTINA                4       0
GIROUD               FRANCE                   4       0
MORATA               SPAIN                    3       1
GONCALO RAMOS        PORTUGAL                 3       1

6 rows selected. 

As well as specifying a set number of rows, you can also specify a percentage using either the ONLY clause…

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 5 percent rows only
/
PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
MBAPPE               FRANCE                   8       2
MESSI                ARGENTINA                7       3
ALVAREZ              ARGENTINA                4       0
GIROUD               FRANCE                   4       0
MORATA               SPAIN                    3       1
GONCALO RAMOS        PORTUGAL                 3       1
SAKA                 ENGLAND                  3       0
RASHFORD             ENGLAND                  3       0
GAKPO                NETHERLANDS              3       0

9 rows selected. 

… or the WITH TIES clause…

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 5 percent rows with ties
/
PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
MBAPPE               FRANCE                   8       2
MESSI                ARGENTINA                7       3
ALVAREZ              ARGENTINA                4       0
GIROUD               FRANCE                   4       0
MORATA               SPAIN                    3       1
GONCALO RAMOS        PORTUGAL                 3       1
SAKA                 ENGLAND                  3       0
RASHFORD             ENGLAND                  3       0
GAKPO                NETHERLANDS              3       0
RICHARLISON          BRAZIL                   3       0
E VALENCIA           ECUADOR                  3       0

11 rows selected. 

Offset

If we want to skip the first n rows we can use the OFFSET clause :

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
offset 10 rows
/
PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
E VALENCIA           ECUADOR                  3       0
KANE                 ENGLAND                  2       3
BRUNO FERNANDES      PORTUGAL                 2       1
LEWANDOWSKI          POLAND                   2       1
NEYMAR               BRAZIL                   2       1
...snip...
RODRYGO              BRAZIL                   0       1

166 rows selected.

We can also specify the number of rows to fetch in the offset clause :

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
offset 10 rows fetch next 10 rows only
/
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
offset 10 rows fetch next 10 rows with ties
/

If you’re thinking of “paging” your results using this method, but you are not certain that your order by clause will not result in any ties, you may get some unexpected results.

For example, when I ran :

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
fetch first 10 rows only
/

I got :

PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
MBAPPE               FRANCE                   8       2
MESSI                ARGENTINA                7       3
ALVAREZ              ARGENTINA                4       0
GIROUD               FRANCE                   4       0
MORATA               SPAIN                    3       1
GONCALO RAMOS        PORTUGAL                 3       1
E VALENCIA           ECUADOR                  3       0
SAKA                 ENGLAND                  3       0
RICHARLISON          BRAZIL                   3       0
GAKPO                NETHERLANDS              3       0

10 rows selected. 

However, when I then ran…

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc
offset 10 rows fetch next 10 rows only
/

Gapko appeared again…

PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
GAKPO                NETHERLANDS              3       0
KANE                 ENGLAND                  2       3
NEYMAR               BRAZIL                   2       1
LEWANDOWSKI          POLAND                   2       1
BRUNO FERNANDES      PORTUGAL                 2       1
TAREMI               IRAN                     2       1
ABOUBAKAR            CAMEROON                 2       1
CHO                  SOUTH KOREA              2       0
KRAMARIC             CROATIA                  2       0
AL DAWSARI           SAUDI ARABIA             2       0

10 rows selected. 

In this example, if I want to eliminate duplicates then I need to make sure that my order by clause does not allow any ties :

select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc, player, team
fetch first 10 rows only
/
PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
MBAPPE               FRANCE                   8       2
MESSI                ARGENTINA                7       3
ALVAREZ              ARGENTINA                4       0
GIROUD               FRANCE                   4       0
GONCALO RAMOS        PORTUGAL                 3       1
MORATA               SPAIN                    3       1
E VALENCIA           ECUADOR                  3       0
GAKPO                NETHERLANDS              3       0
RASHFORD             ENGLAND                  3       0
RICHARLISON          BRAZIL                   3       0

10 rows selected. 
select player, team, goals, assists
from goal_scorers
order by goals desc, assists desc, player, team
offset 10 rows fetch next 10 rows only
/

PLAYER               TEAM                 GOALS ASSISTS
-------------------- -------------------- ----- -------
SAKA                 ENGLAND                  3       0
KANE                 ENGLAND                  2       3
ABOUBAKAR            CAMEROON                 2       1
BRUNO FERNANDES      PORTUGAL                 2       1
LEWANDOWSKI          POLAND                   2       1
NEYMAR               BRAZIL                   2       1
TAREMI               IRAN                     2       1
AL DAWSARI           SAUDI ARABIA             2       0
CHO                  SOUTH KOREA              2       0
DE ARRASCAETA        URUGUAY                  2       0

10 rows selected. 

So that’s it then, give the dog a biscuit and be on our way ?
Well, it’s just possible that the rownum method from olden times is not ready to be consigned to history quite yet…

Comparative Performance

Let’s say we have a table with lots of rows…

create table bigtab as
    select rownum as id from dual connect by rownum <= 10000000;

If we run the following queries against it, the relative performance may be a bit of a surprise…

set timing on
with ordered_recs as (
    select id
    from bigtab
    order by 1)
select id
from bigtab
where rownum <=5;

       ID
----------
         1
         2
         3
         4
         5

Elapsed: 00:00:00.042

select id
from bigtab
order by id
fetch first 5 rows only;

        ID
----------
         1
         2
         3
         4
         5

Elapsed: 00:00:00.608


select id 
from (
    select id, rank() over (order by id) as recnum
    from bigtab)
    where recnum <= 5;    
    
        ID
----------
         1
         2
         3
         4
         5

Elapsed: 00:00:00.636

Note that this is the second run of each query to account for the effects of caching.

The similarity in runtime between the limit clause and the analytic function is not that surprising. If you look at an explain plan for a limit clause query, it appears to be using analytical functions under the covers.

The big news here is that the shonky old rownum query is about 14 times faster than the shiny new limit clause equivalent.
Remember, I’m running this on an OCI 19c instance managed by Oracle so there’s nothing odd in the database configuration.

It appears that there is a patch to address an optimizer issue with this type of query, details of which can be found in this article by Nigel Bayliss.

However, if you do come across a performance issue with a limit clause query, it may be worth seeing if good old-fashioned rownum will dig you out of a hole.

Loading...

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK