source link: https://mikesmithers.wordpress.com/2021/04/05/using-the-interval-data-type-and-some-pictures-to-tell-the-difference-between-two-timestamps-in-oracle/
Using the INTERVAL data type ( and some pictures) to tell the difference between two Timestamps in Oracle
Looking after an overnight batch process can be a bit fraught at times.
If it’s a good day, you might begin the morning with a nice cup of coffee and a leisurely scroll through the logs to confirm that all is well.
In contrast, if the batch has overrun you may well find yourself scrambling through those same logs whilst gulping down large quantities of caffeine in a desperate attempt to hot-wire your brain into working out the elapsed time between each entry. Not great. Especially when you consider that, as Terry Pratchett put it,
“Coffee is a way of stealing time that should by rights belong to your older self”.
A better approach might be to get Oracle to do it for you.
What we’re going to look at here is :
- the INTERVAL data type that holds the difference between two TIMESTAMP values
- using the LAG SQL function to report the interval time between timestamps across rows of a query
- creating some graphs using SQLDeveloper User Defined Reports for when it’s just too early for words and numbers.
The Batch Log table
Running on Oracle 18cXE, our overnight batch logs to the BATCH_LOG table :
create table batch_log (
    id number generated always as identity,
    log_ts timestamp not null,
    job_name varchar2(50),
    status varchar2(25))
/
I’ve used the following script to populate the table with some test data :
set serverout on size unlimited
clear screen

declare
    tbl_jobs dbms_utility.name_array;
    tbl_statuses dbms_utility.name_array;

    start_ts timestamp := to_timestamp('16-MAR-2021 01:00', 'DD-MON-YYYY HH24:MI');
    v_ts timestamp;
    v_duration pls_integer;
begin
    tbl_jobs(1) := 'EXTRACT';
    tbl_jobs(2) := 'TRANSFORM';
    tbl_jobs(3) := 'LOAD';
    tbl_jobs(4) := 'REPORT';

    tbl_statuses(1) := 'STARTED';
    tbl_statuses(2) := 'COMPLETED';

    for i in 1..7 loop
        v_ts := start_ts;
        for j in 1..tbl_jobs.count loop
            v_duration :=
                case mod(j, 4)
                    when 3 then trunc(dbms_random.value(8,11))
                    when 2 then trunc(dbms_random.value(30,41))
                    when 1 then trunc(dbms_random.value(25,31))
                    else trunc(dbms_random.value(15,21))
                end;
            for k in 1..tbl_statuses.count loop
                insert into batch_log( log_ts, job_name, status)
                values(
                    case
                        when tbl_statuses(k) = 'STARTED' then v_ts
                        else v_ts + (v_duration/1440)
                    end,
                    tbl_jobs(j),
                    tbl_statuses(k));
            end loop;
            v_ts := v_ts + (v_duration/1440);
        end loop;
        start_ts := start_ts + interval '1' day;
    end loop;
end;
/
-- Now simulate a long running Load job
update batch_log
set log_ts = log_ts + (60/1440)
where id >= (
    select max(id)
    from batch_log
    where job_name = 'LOAD'
    and status = 'COMPLETED');

commit;
As a result, the table now has a week’s worth of log entries.
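To confirm the setup, a quick aggregate (my own sanity check, not part of the original script) shows the spread of entries per day :

```sql
-- each day should have 8 rows : 4 jobs x (STARTED + COMPLETED)
select trunc(log_ts) as run_date, count(*) as entries
from batch_log
group by trunc(log_ts)
order by 1;
```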
The INTERVAL data type
You can find lots of information about Oracle's DATETIME and INTERVAL data types in the documentation. Here's the 18c version, for example (all of the code here is written and tested on 18cXE).
You will see from this that we have two distinct INTERVAL data types – Year to Month and Day to Second.
It’s the latter of these which is relevant in the current context, unless you’re having extreme performance issues on your overnight batch.
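To see the difference in action, note that subtracting one TIMESTAMP from another automatically returns an INTERVAL DAY TO SECOND. A quick sketch using literal timestamps (the exact display format will depend on your client settings) :

```sql
-- Subtracting TIMESTAMPs yields an INTERVAL DAY TO SECOND
select timestamp '2021-03-16 02:28:00' - timestamp '2021-03-16 01:00:00' as elapsed
from dual;

-- An INTERVAL YEAR TO MONTH, by contrast, only tracks years and months
select interval '1-3' year to month as yrs_months
from dual;
```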
If you have any scheduler jobs running, you can see an example of the INTERVAL data type in action in the _scheduler_jobs views :
select owner, job_name, last_run_duration
from dba_scheduler_jobs t
where run_count > 0;
OWNER           JOB_NAME                                 LAST_RUN_DURATION
--------------- ---------------------------------------- -------------------
SYS             PURGE_LOG                                +00 00:00:00.924991
SYS             PMO_DEFERRED_GIDX_MAINT_JOB              +00 00:00:00.117572
SYS             CLEANUP_NON_EXIST_OBJ                    +00 00:00:00.221356
SYS             CLEANUP_ONLINE_IND_BUILD                 +00 00:00:00.382708
SYS             CLEANUP_TAB_IOT_PMO                      +00 00:00:00.129863
SYS             CLEANUP_TRANSIENT_TYPE                   +00 00:00:00.071285
SYS             CLEANUP_TRANSIENT_PKG                    +00 00:00:00.094254
SYS             CLEANUP_ONLINE_PMO                       +00 00:00:00.024660
SYS             FILE_SIZE_UPD                            +00 00:00:00.166189
SYS             ORA$AUTOTASK_CLEAN                       +00 00:00:00.151000
SYS             DRA_REEVALUATE_OPEN_FAILURES             +00 00:00:00.387121
SYS             BSLN_MAINTAIN_STATS_JOB                  +00 00:00:00.443146
SYS             RSE$CLEAN_RECOVERABLE_SCRIPT             +00 00:00:00.324392
SYS             SM$CLEAN_AUTO_SPLIT_MERGE                +00 00:00:00.023483
APEX_180200     ORACLE_APEX_PURGE_SESSIONS               +00 00:00:00.868204
APEX_180200     ORACLE_APEX_MAIL_QUEUE                   +00 00:00:00.026859
APEX_180200     ORACLE_APEX_WS_NOTIFICATIONS             +00 00:00:01.685787
APEX_180200     ORACLE_APEX_DAILY_MAINTENANCE            +00 00:00:00.943306
ORDS_METADATA   CLEAN_OLD_ORDS_SESSIONS                  +00 00:00:00.640826

19 rows selected.
According to the documentation, INTERVALs are not subject to a format model in the same way that dates and timestamps are. However, you can re-format their contents using the EXTRACT function.
For example, if we want to convert the LAST_RUN_DURATION from our previous query into seconds, then we can run :
select owner, job_name,
    (extract(minute from last_run_duration) * 60) +
    extract(second from last_run_duration) as last_run_secs
from dba_scheduler_jobs
where run_count > 0
order by 3 desc
/
…which returns…
OWNER           JOB_NAME                                 LAST_RUN_SECS
--------------- ---------------------------------------- -------------
APEX_180200     ORACLE_APEX_WS_NOTIFICATIONS                  1.685787
APEX_180200     ORACLE_APEX_DAILY_MAINTENANCE                  .943306
SYS             PURGE_LOG                                      .924991
APEX_180200     ORACLE_APEX_PURGE_SESSIONS                     .868204
ORDS_METADATA   CLEAN_OLD_ORDS_SESSIONS                        .640826
SYS             BSLN_MAINTAIN_STATS_JOB                        .443146
SYS             DRA_REEVALUATE_OPEN_FAILURES                   .387121
SYS             CLEANUP_ONLINE_IND_BUILD                       .382708
SYS             RSE$CLEAN_RECOVERABLE_SCRIPT                   .324392
SYS             CLEANUP_NON_EXIST_OBJ                          .221356
SYS             FILE_SIZE_UPD                                  .166189
SYS             ORA$AUTOTASK_CLEAN                             .151
SYS             CLEANUP_TAB_IOT_PMO                            .129863
SYS             PMO_DEFERRED_GIDX_MAINT_JOB                    .117572
SYS             CLEANUP_TRANSIENT_PKG                          .094254
SYS             CLEANUP_TRANSIENT_TYPE                         .071285
APEX_180200     ORACLE_APEX_MAIL_QUEUE                         .026859
SYS             CLEANUP_ONLINE_PMO                             .02466
SYS             SM$CLEAN_AUTO_SPLIT_MERGE                      .023483

19 rows selected.
What we need now is a way to calculate the durations between rows in a single query…
The LAG function
As ever, OracleBase has a pretty good explanation of the LEAD and LAG functions.
All we need to do is to use the LAG function to find the elapsed time between the timestamp of the current record and that of its predecessor in the result set :
select trunc(log_ts) as run_date,
    job_name,
    status,
    log_ts,
    case when status = 'COMPLETED' then
        log_ts - lag(log_ts) over(order by log_ts)
    end as duration,
    case when status = 'COMPLETED' then
        extract(hour from log_ts - lag(log_ts) over(order by log_ts)) * 60 +
        extract(minute from log_ts - lag(log_ts) over(order by log_ts))
    end as duration_mins
from batch_log
order by log_ts
/
We could use what we have so far as the basis for an ad-hoc script to give us information about batch runtimes. Then again, if it’s one of those mornings where you’re struggling a bit, looking at a picture might make things a bit easier. That last listing was from SQLDeveloper, which is handy considering where we’re going next …
SQLDeveloper User Defined Chart Reports – The Line Chart
We can see the total runtime for the batch for each day using the query :
with batch_runs as (
select job_name, status, log_ts,
case status when 'COMPLETED' then
(extract( hour from log_ts - lag( log_ts, 1) over( order by log_ts)) * 60) +
extract(minute from log_ts - lag(log_ts, 1) over( order by log_ts))
end as duration_mins
from batch_log
where (
(job_name = 'EXTRACT' and status = 'STARTED')
or (job_name = 'REPORT' and status = 'COMPLETED'))
order by log_ts)
select trunc(log_ts) as run_date,
duration_mins
from batch_runs
where duration_mins is not null
order by 1
/
…which returns…
RUN_DATE    DURATION_MINS
----------- -------------
16-MAR-2021            88
17-MAR-2021            83
18-MAR-2021            90
19-MAR-2021            84
20-MAR-2021            94
21-MAR-2021            91
22-MAR-2021           155

7 rows selected.
Jeff Smith has a nifty way of turning a SQLDeveloper query grid into a chart report, which I’ve largely plagiarised below.
First, we run the query in a SQLDeveloper worksheet. Then, in the result grid, right-click and select Save Grid as Report :
This will take you to the Create Report Window :
As you can see, SQLDeveloper has reformatted the query a bit.
Now we can add a Name for the report.
Before we go any further, it’s probably a good idea to specify a connection at this point. This will help us when we’re configuring the report as we’ll be able to use live data to see how things look.
The other thing we need to do is to specify a series name so that we can produce a line graph.
This means that the report query now looks like this :
SELECT "RUN_DATE", "DURATION_MINS",
    'Overnight Runs' as series_name
FROM (
    with batch_runs as (
        select job_name, status, log_ts,
            case status when 'COMPLETED' then
                (extract(hour from log_ts - lag(log_ts, 1) over(order by log_ts)) * 60) +
                extract(minute from log_ts - lag(log_ts, 1) over(order by log_ts))
            end as duration_mins
        from batch_log
        where (
            (job_name = 'EXTRACT' and status = 'STARTED')
            or (job_name = 'REPORT' and status = 'COMPLETED'))
        order by log_ts)
    select trunc(log_ts) as run_date,
        duration_mins
    from batch_runs
    where duration_mins is not null
    order by 1
)
…and with a connection specified, I can check everything is still working as expected by clicking the Test Report button…
…which provides the expected output.
Now I know that my query still works, I can change the report Style from Table to Chart :
Now we need to click on the Property node of the navigation tree in the left-hand pane of the Create Report Window.
Here, we change the Chart Type to Line :
Next we need to specify the Data for the report so we need to click on the Data Tree Node.
As we’re connected to a database, we can Fetch Column Names so we can select the Group, Series and Value columns from a drop-down.
Now we can check the Use Live Data checkbox and click Refresh Report to see what the output will look like :
Finally, click the Apply button in the bottom right of the Create Report window and the new report will appear in the User Defined Reports node of the main Reports pane :
Now we can run the report in the usual way…
A picture paints a thousand words…but in this case just says “Hmmm…”
If we want a more granular view of what’s happening during the batch run, but we’re not quite ready to deal with actual text, we can try using…
A Stacked Bar Chart
This time, we want to know the elapsed time for each process in the batch. Therefore, we’ll use this query as the basis for our report :
select trunc(log_ts) as run_date,
    job_name,
    status,
    log_ts,
    case when status = 'COMPLETED' then
        log_ts - lag(log_ts) over(order by log_ts)
    end as duration,
    case when status = 'COMPLETED' then
        extract(hour from log_ts - lag(log_ts) over(order by log_ts)) * 60 +
        extract(minute from log_ts - lag(log_ts) over(order by log_ts))
    end as duration_mins
from batch_log
order by log_ts
The Report creation is as before, except that this time we choose a Chart Type of Bar – Vertical Stack.
Additionally, in order to view the total time for each day, we go to the Plot Area Property and check the Cumulative box in the Data Tips section :
Running this new report, we can see that the increase in duration is due to the Load process taking rather longer than normal :
Should it really be that colour?
Of course, you’ll need to investigate further to find out just exactly what’s going on but, thanks to some handy SQL features and a couple of SQLDeveloper Chart Reports, at least you know where to start looking.