The Oracle SQL Limit Clause and teaching a New Dog Old Tricks
source link: https://mikesmithers.wordpress.com/2023/01/31/the-oracle-sql-limit-clause-and-teaching-a-new-dog-old-tricks/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
It’s been a few weeks now and I’ve finally gotten over England’s latest World Cup penalty drama.
I’m not so sure about Teddy though…
“Merde !”
Anyhow, he has decided that I need to know about the SQL Limit Clause that Oracle introduced in 12c and has decided to use data from the Tournament in the examples that follow…
Environment
These examples have all been run on my OCI Free Tier database ( Oracle 19c at the time of writing).
The table we’re going to use contains details of players who scored or gave an assist during the recent Men’s World Cup Finals and looks like this :
create table goal_scorers ( player varchar2(4000), team varchar2(500), goals number, assists number) / |
The table contains 176 rows.
select player, team, goals, assists from goal_scorers order by goals desc, assists desc / PLAYER TEAM GOALS ASSISTS ----------------------- ----------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 SAKA ENGLAND 3 0 RASHFORD ENGLAND 3 0 GAKPO NETHERLANDS 3 0 RICHARLISON BRAZIL 3 0 E VALENCIA ECUADOR 3 0 ... snip ... PRECIADO ECUADOR 0 1 MCGREE AUSTRALIA 0 1 RODRYGO BRAZIL 0 1
Before we get into the new(ish) syntax, let’s have a quick history lesson…
Before 12c
Back in the mists of time, if you wanted to retrieve the first five rows in a table based on an order you specified, you’d need to do something like this :
select player, team, goals, assists from ( select player, team, goals, assists from goal_scorers order by goals desc , assists desc , player, team) where rownum <= 5; |
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 GONCALO RAMOS PORTUGAL 3 1 5 rows selected.
Teddy reckons that you’ll only come across code like this if it’s really old…or written by a really old programmer.
Ignoring the cheeky young pup, we move on to the point were Analytic Functions become available (8i if you’re counting), which allows us to dispense with the – in this context – artificial order criteria of player and team and ensure that our query returns any tied records :
with top_scorers as ( select player, team, goals, assists, rank() over ( order by goals desc , assists desc ) as recnum from goal_scorers ) select * from top_scorers where recnum <= 5 order by recnum; |
PLAYER TEAM GOALS ASSISTS RECNUM -------------------- -------------------- ----- ------- ---------- MBAPPE FRANCE 8 2 1 MESSI ARGENTINA 7 3 2 ALVAREZ ARGENTINA 4 0 3 GIROUD FRANCE 4 0 3 MORATA SPAIN 3 1 5 GONCALO RAMOS PORTUGAL 3 1 5 6 rows selected.
In the latest Oracle versions, Teddy reckons that you can get that same analytical function goodness but with a bit less typing…
Fetch First
Let’s re-write our top 5 query using the limit clause…
select player, team, goals, assists from goal_scorers order by goals desc , assists desc fetch first 5 rows only / |
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 5 rows selected.
As we’ve already seen, that’s ever so slightly different from the top 5 goal scorers. Like Morata, Goncalo Ramos also has 3 goals and 1 assist but he has been arbitrarily excluded from the result set.
Remember, Oracle does not guarantee the order of a result set of a SELECT statement other than that specified in the ORDER BY clause. Therefore, there’s no guarantee a future run of this query won’t include Goncalo Ramos and exclude Morata.
A more reliable query would include all of the tied records for 5th place, as with the RANK() query above.
Happily, we can achieve the same effect with a limit clause :
select player, team, goals, assists from goal_scorers order by goals desc , assists desc fetch first 5 rows with ties / |
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 6 rows selected.
As well as specifying a set number of rows, you can also specify a percentage using either the ONLY clause…
select player, team, goals, assists from goal_scorers order by goals desc , assists desc fetch first 5 percent rows only / |
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 SAKA ENGLAND 3 0 RASHFORD ENGLAND 3 0 GAKPO NETHERLANDS 3 0 9 rows selected.
… or the WITH TIES clause…
select player, team, goals, assists from goal_scorers order by goals desc , assists desc fetch first 5 percent rows with ties / |
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 SAKA ENGLAND 3 0 RASHFORD ENGLAND 3 0 GAKPO NETHERLANDS 3 0 RICHARLISON BRAZIL 3 0 E VALENCIA ECUADOR 3 0 11 rows selected.
Offset
If we want to skip the first n rows we can use the OFFSET clause :
select player, team, goals, assists from goal_scorers order by goals desc , assists desc offset 10 rows / |
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- E VALENCIA ECUADOR 3 0 KANE ENGLAND 2 3 BRUNO FERNANDES PORTUGAL 2 1 LEWANDOWSKI POLAND 2 1 NEYMAR BRAZIL 2 1 ...snip... RODRYGO BRAZIL 0 1 166 rows selected.
We can also specify the number of rows to fetch in the offset clause :
select player, team, goals, assists from goal_scorers order by goals desc , assists desc offset 10 rows fetch next 10 rows only / |
select player, team, goals, assists from goal_scorers order by goals desc , assists desc offset 10 rows fetch next 10 rows with ties / |
If you’re thinking of “paging” your results using this method, but you are not certain that your order by clause will not result in any ties, you may get some unexpected results.
For example, when I ran :
select player, team, goals, assists from goal_scorers order by goals desc , assists desc fetch first 10 rows only / |
I got :
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 MORATA SPAIN 3 1 GONCALO RAMOS PORTUGAL 3 1 E VALENCIA ECUADOR 3 0 SAKA ENGLAND 3 0 RICHARLISON BRAZIL 3 0 GAKPO NETHERLANDS 3 0 10 rows selected.
However, when I then ran…
select player, team, goals, assists from goal_scorers order by goals desc , assists desc offset 10 rows fetch next 10 rows only / |
Gapko appeared again…
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- GAKPO NETHERLANDS 3 0 KANE ENGLAND 2 3 NEYMAR BRAZIL 2 1 LEWANDOWSKI POLAND 2 1 BRUNO FERNANDES PORTUGAL 2 1 TAREMI IRAN 2 1 ABOUBAKAR CAMEROON 2 1 CHO SOUTH KOREA 2 0 KRAMARIC CROATIA 2 0 AL DAWSARI SAUDI ARABIA 2 0 10 rows selected.
In this example, if I want to eliminate duplicates then I need to make sure that my order by clause does not allow any ties :
select player, team, goals, assists from goal_scorers order by goals desc , assists desc , player, team fetch first 10 rows only / |
PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- MBAPPE FRANCE 8 2 MESSI ARGENTINA 7 3 ALVAREZ ARGENTINA 4 0 GIROUD FRANCE 4 0 GONCALO RAMOS PORTUGAL 3 1 MORATA SPAIN 3 1 E VALENCIA ECUADOR 3 0 GAKPO NETHERLANDS 3 0 RASHFORD ENGLAND 3 0 RICHARLISON BRAZIL 3 0 10 rows selected.
select player, team, goals, assists from goal_scorers order by goals desc, assists desc, player, team offset 10 rows fetch next 10 rows only / PLAYER TEAM GOALS ASSISTS -------------------- -------------------- ----- ------- SAKA ENGLAND 3 0 KANE ENGLAND 2 3 ABOUBAKAR CAMEROON 2 1 BRUNO FERNANDES PORTUGAL 2 1 LEWANDOWSKI POLAND 2 1 NEYMAR BRAZIL 2 1 TAREMI IRAN 2 1 AL DAWSARI SAUDI ARABIA 2 0 CHO SOUTH KOREA 2 0 DE ARRASCAETA URUGUAY 2 0 10 rows selected.
So that’s it then, give the dog a biscuit and be on our way ?
Well, it’s just possible that the rownum method from olden times is not ready to be consigned to history quite yet…
Comparative Performance
Let’s say we have a table with lots of rows…
create table bigtab as
select rownum as id from dual connect by rownum <= 10000000;
If we run the following queries against it, the relative performance may be a bit of a surprise…
set timing on with ordered_recs as ( select id from bigtab order by 1) select id from bigtab where rownum <=5; ID ---------- 1 2 3 4 5 Elapsed: 00:00:00.042 select id from bigtab order by id fetch first 5 rows only; ID ---------- 1 2 3 4 5 Elapsed: 00:00:00.608 select id from ( select id, rank() over (order by id) as recnum from bigtab) where recnum <= 5; ID ---------- 1 2 3 4 5 Elapsed: 00:00:00.636
Note that this is the second run of each query to account for the effects of caching.
The similarity in runtime between the limit clause and the analytic function is not that surprising. If you look at an explain plan for a limit clause query, it appears to be using analytical functions under the covers.
The big news here is that the shonky old rownum query is about 14 times faster than the shiny new limit clause equivalent.
Remember, I’m running this on an OCI 19c instance managed by Oracle so there’s nothing odd in the database configuration.
It appears that there is a patch to address an optimizer issue with this type of query, details of which can be found in this article by Nigel Bayliss.
However, if you do come across a performance issue with a limit clause query, it may be worth seeing if good old-fashioned rownum will dig you out of a hole.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK