Automatic Indexing: JSON Expressions Part I (Making Plans For Nigel)
source link: https://richardfoote.wordpress.com/2022/04/13/automatic-indexing-json-expressions-part-i-making-plans-for-nigel/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Automatic Indexing: JSON Expressions Part I (Making Plans For Nigel) April 13, 2022
Posted by Richard Foote in Automatic Indexing, Autonomous Database, CBO, Exadata, Function Based Indexes, Index statistics, JSON, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Virtual Columns.
trackback
When Automatic Indexing was first released, one of the restrictions was that automatic indexes on JSON expressions were NOT supported.
However, the Oracle Database 21c doco mentions:
“Automatic indexes can be single or multi-column. They are considered for the following: Selected expressions (for example, JSON expressions)“.
So on my (admittedly dodgy) “Exadata” VM, I thought I’ll check out how AI now indeed deals with JSON expressions.
I start by creating a simple little table that uses the new 21c JSON datatype and populate it with some JSON documents (note the PONumber key has effectively unique numeric values assigned):
SQL> CREATE TABLE bowie_json
(id number,
bowie_date date,
bowie_order JSON);
SQL> insert into bowie_json
select
rownum,
sysdate,
'{"PONumber" : '
|| rownum || ',
"Reference"
:
"2022' || rownum || 'DBOWIE"
,
"Requestor"
:
"David Bowie"
,
"User"
:
"DBOWIE"
,
"CostCenter"
:
"A42"
,
"ShippingInstructions"
: {
"name"
:
"David Bowie"
,
"Address"
: {
"street"
:
"42 Ziggy Street"
,
"city"
:
"Canberra"
,
"state"
:
"ACT"
,
"zipCode"
:
2601
,
"country"
:
"Australia"
},
"Phone"
: [{
"type"
:
"Office"
,
"number"
:
"417-555-7777"
},
{
"type"
:
"Mobile"
,
"number"
:
"417-555-1234"
}]},
"Special Instructions"
: null,
"AllowPartialShipment"
: true,
"LineItems"
: [{
"ItemNumber"
:
1
,
"Part"
: {
"Description"
:
"Hunky Dory"
,
"UnitPrice"
:
10.95
},
"Quantity"
:
5.0
},
{
"ItemNumber"
:
2
,
"Part"
: {
"Description"
:
"Pin-Ups"
,
"UnitPrice"
:
10.95
},
"Quantity"
:
3.0
}]}'
from dual connect by
level
<=
2000000
;
2000000
rows created.
SQL> commit;
Commit complete
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>
'BOWIE_JSON'
);
PL/SQL procedure successfully completed.
As always, it’s important to ensure the table has statistics, as AI does not work properly without them.
I then run a number of SQL statements, with different JSON expression based predicates, including:
SQL> select * from bowie_json where json_value(bowie_order,
'$.PONumber'
)=
'42'
;
SQL> select * from bowie_json z where z.bowie_order.PONumber.number()=
4242
;
SQL> select * from bowie_json where json_value(bowie_order,
'$.PONumber'
returning number)=
42
;
Execution Plan
----------------------------------------------------------
Plan hash value:
1196930810
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
|
0
| SELECT STATEMENT | |
20000
|
12
M |
34476
(
1
) |
00:
00:
02
|
|*
1
| TABLE ACCESS FULL | BOWIE_JSON |
20000
|
12
M |
34476
(
1
) |
00:
00:
02
|
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1
- filter(JSON_VALUE(
"BOWIE_ORDER"
/*+ LOB_BY_VALUE */
FORMAT OSON
,
'$.PONumber'
RETURNING NUMBER NULL ON ERROR)=
42
)
Statistics
----------------------------------------------------------
0
recursive calls
0
db
block
gets
259127
consistent gets
200279
physical reads
0
redo size
1595
bytes sent via SQL*Net to client
526
bytes received via SQL*Net from client
3
SQL*Net roundtrips to/from client
0
sorts (memory)
0
sorts (disk)
1
rows processed
They all return just the one row, but must currently use a Full Table Scan with no indexes present.
So what does AI make of things?
The first thing to note is that running the AI last activity report generates the following error:
SQL> select dbms_auto_index.report_last_activity() report from dual;
ERROR:
ORA
-30954:
char
0
is invalid in json_value(BOWIE_ORDER,
'$.PONumber'
returning VA
ORA
-06512:
at
"SYS.DBMS_AUTO_INDEX"
, line
177
ORA
-06512:
at
"SYS.DBMS_AUTO_INDEX"
, line
107
ORA
-06512:
at
"SYS.DBMS_AUTO_INDEX_INTERNAL"
, line
8676
ORA
-06512:
at
"SYS.DBMS_AUTO_INDEX_INTERNAL"
, line
8676
ORA
-06512:
at
"SYS.DBMS_AUTO_INDEX_INTERNAL"
, line
9226
ORA
-06512:
at
"SYS.DBMS_AUTO_INDEX"
, line
89
ORA
-06512:
at
"SYS.DBMS_AUTO_INDEX"
, line
167
ORA
-06512:
at line
1
no rows selected
If we look at the indexes now present with the table:
SQL> select index_name, index_type,
auto
, visibility, status, num_rows, leaf_blocks, clustering_factor from user_indexes
where table_name=
'BOWIE_JSON'
;
INDEX_NAME INDEX_TYPE AUT VISIBILIT STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------- ------------------------- --- --------- -------- ---------- ----------- -----------------
SYS_IL
0000081096
C
00003
$$ LOB NO VISIBLE VALID
SYS_AI_ayvj
257
jd
93
cv FUNCTION-BASED NORMAL YES VISIBLE VALID
2000000
5141
380000
SYS_AI_gpdkwzugdn
055
FUNCTION-BASED NORMAL YES VISIBLE VALID
2000000
4596
200000
SQL> select index_name, column_expression from user_ind_expressions where table_name=
'BOWIE_JSON'
;
INDEX_NAME COLUMN_EXPRESSION
------------------------- --------------------------------------------------------------------------------
SYS_AI_ayvj
257
jd
93
cv JSON_VALUE(
"BOWIE_ORDER"
FORMAT OSON ,
'$.PONumber'
RETURNING VARCHAR
2
(
4000
) ERR
OR ON ERROR NULL ON EMPTY)
SYS_AI_gpdkwzugdn
055
JSON_VALUE(
"BOWIE_ORDER"
FORMAT OSON ,
'$.PONumber'
RETURNING NUMBER ERROR ON ER
ROR NULL ON EMPTY)
We can see that AI has indeed created two new automatic indexes, one on the VARCHAR2 JSON expression and one on the NUMBER JSON expression.
If we re-run the SQLs, we notice 3 very important points. Note the following example was run soon after the automatic indexes were created:
SQL> select * from bowie_json where json_value(bowie_order,
'$.PONumber'
)=
'42'
;
Execution Plan
----------------------------------------------------------
Plan hash value:
832017402
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
------------------------------------------------------------------------------------------------------------
|
0
| SELECT STATEMENT | |
20000
|
12
M |
1524
(
1
) |
00:
00:
01
|
|
1
| TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_JSON |
20000
|
12
M |
1524
(
1
) |
00:
00:
01
|
|*
2
| INDEX RANGE SCAN | SYS_AI_ayvj
257
jd
93
cv |
8000
| |
3
(
0
) |
00:
00:
01
|
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2
- access(JSON_VALUE(
"BOWIE_ORDER"
/*+ LOB_BY_VALUE */
FORMAT OSON ,
'$.PONumber'
RETURNING
VARCHAR
2
(
4000
) ERROR ON ERROR NULL ON EMPTY)=
'42'
)
Statistics
----------------------------------------------------------
0
recursive calls
0
db
block
gets
234168
consistent gets
200279
physical reads
0
redo size
1595
bytes sent via SQL*Net to client
526
bytes received via SQL*Net from client
3
SQL*Net roundtrips to/from client
0
sorts (memory)
0
sorts (disk)
1
rows processed
The first point to note is that the CBO now chooses to use the newly created automatic index. As only one row is return, this is as one would hope.
But there are two other very important points/issues worth making about the above execution plan and associated costs and statistics. One is associated with new AI behaviour introduced in 21c and the other is associated with an old trap in relation to function-based indexes.
I’ll leave it to the discernible reader to spot these issues, before I cover them in Part II in the coming days…
Comments»
1. Automatic Indexing: JSON Expressions Part II (“Without You I’m Nothing” | Richard Foote's Oracle Blog - April 14, 2022
[…] my previous post on Automatic Indexing (AI) and JSON expressions, I mentioned two additional issues with the execution plan generated after the automatic indexes […]
Leave a Reply Cancel reply
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK