
Using the INTERVAL data type (and some pictures) to tell the difference between two Timestamps in Oracle

Posted on April 5, 2021

Looking after an overnight batch process can be a bit fraught at times.
If it’s a good day, you might begin the morning with a nice cup of coffee and a leisurely scroll through the logs to confirm that all is well.
In contrast, if the batch has overrun you may well find yourself scrambling through those same logs whilst gulping down large quantities of caffeine in a desperate attempt to hot-wire your brain into working out the elapsed time between each entry. Not great. Especially when you consider that, as Terry Pratchett put it,
“Coffee is a way of stealing time that should by rights belong to your older self”.
A better approach might be to get Oracle to do it for you.

What we’re going to look at here is :

  • the INTERVAL data type that holds the difference between two TIMESTAMP values
  • using the LAG SQL function to report the interval time between timestamps across rows of a query
  • creating some graphs using SQLDeveloper User Defined Reports for when it’s just too early for words and numbers.

The Batch Log table

Running on Oracle 18cXE, our overnight batch logs to the BATCH_LOG table :

create table batch_log (
    id number generated always as identity,
    log_ts timestamp not null,
    job_name varchar2(50),
    status varchar2(25))
/

I’ve used the following script to populate the table with some test data :

set serverout on size unlimited
clear screen
declare
    tbl_jobs dbms_utility.name_array;
    tbl_statuses dbms_utility.name_array;
    start_ts timestamp := to_timestamp('16-MAR-2021 01:00', 'DD-MON-YYYY HH24:MI');
    v_ts timestamp;
    v_duration pls_integer;
begin
    tbl_jobs(1) := 'EXTRACT';
    tbl_jobs(2) := 'TRANSFORM';
    tbl_jobs(3) := 'LOAD';
    tbl_jobs(4) := 'REPORT';
    tbl_statuses(1) := 'STARTED';
    tbl_statuses(2) := 'COMPLETED';
    -- one batch run per day for a week...
    for i in 1..7 loop
        v_ts := start_ts;
        for j in 1..tbl_jobs.count loop
            -- ...assigning each job a random duration (in minutes)...
            v_duration :=
                case mod(j, 4)
                    when 3 then trunc(dbms_random.value(8,11))
                    when 2 then trunc(dbms_random.value(30,41))
                    when 1 then trunc(dbms_random.value(25,31))
                    else trunc(dbms_random.value(15,21))
                end;
            -- ...and logging a STARTED and a COMPLETED entry for each job
            for k in 1..tbl_statuses.count loop
                insert into batch_log( log_ts, job_name, status)
                values(
                    case when tbl_statuses(k) = 'STARTED' then v_ts else v_ts + (v_duration/1440) end,
                    tbl_jobs(j),
                    tbl_statuses(k));
            end loop;
            v_ts := v_ts + (v_duration/1440);
        end loop;
        start_ts := start_ts + interval '1' day;
    end loop;
end;
/
-- Now simulate a long-running Load job on the final day
update batch_log
set log_ts = log_ts + (60/1440)
where id >= (select max(id) from batch_log where job_name = 'LOAD' and status = 'COMPLETED');
commit;

As a result, the table now has a week’s worth of log entries.
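A quick sanity check should confirm eight entries per day – a STARTED and a COMPLETED row for each of the four jobs :

select trunc(log_ts) as run_date, count(*) as entries
from batch_log
group by trunc(log_ts)
order by 1;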

The INTERVAL data type

You can find lots of information about Oracle’s DATETIME and INTERVAL data types in the documentation. Here’s the 18c version, for example (all of the code here is written and tested on 18cXE).

You will see from this that we have two distinct INTERVAL data types – Year to Month and Day to Second.
It’s the latter of these which is relevant in the current context, unless you’re having extreme performance issues on your overnight batch.
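For illustration, here’s an interval literal of each type (the values themselves are arbitrary) :

select
    interval '2-3' year to month as yrs_to_months,
    interval '0 01:30:00' day to second as ninety_mins
from dual;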

If you have any scheduler jobs running, you can see an example of the INTERVAL data type in action in the _scheduler_jobs views :

select owner, job_name, last_run_duration
from dba_scheduler_jobs t
where run_count > 0;
OWNER           JOB_NAME                                 LAST_RUN_DURATION  
--------------- ---------------------------------------- -------------------
SYS             PURGE_LOG                                +00 00:00:00.924991
SYS             PMO_DEFERRED_GIDX_MAINT_JOB              +00 00:00:00.117572
SYS             CLEANUP_NON_EXIST_OBJ                    +00 00:00:00.221356
SYS             CLEANUP_ONLINE_IND_BUILD                 +00 00:00:00.382708
SYS             CLEANUP_TAB_IOT_PMO                      +00 00:00:00.129863
SYS             CLEANUP_TRANSIENT_TYPE                   +00 00:00:00.071285
SYS             CLEANUP_TRANSIENT_PKG                    +00 00:00:00.094254
SYS             CLEANUP_ONLINE_PMO                       +00 00:00:00.024660
SYS             FILE_SIZE_UPD                            +00 00:00:00.166189
SYS             ORA$AUTOTASK_CLEAN                       +00 00:00:00.151000
SYS             DRA_REEVALUATE_OPEN_FAILURES             +00 00:00:00.387121
SYS             BSLN_MAINTAIN_STATS_JOB                  +00 00:00:00.443146
SYS             RSE$CLEAN_RECOVERABLE_SCRIPT             +00 00:00:00.324392
SYS             SM$CLEAN_AUTO_SPLIT_MERGE                +00 00:00:00.023483
APEX_180200     ORACLE_APEX_PURGE_SESSIONS               +00 00:00:00.868204
APEX_180200     ORACLE_APEX_MAIL_QUEUE                   +00 00:00:00.026859
APEX_180200     ORACLE_APEX_WS_NOTIFICATIONS             +00 00:00:01.685787
APEX_180200     ORACLE_APEX_DAILY_MAINTENANCE            +00 00:00:00.943306
ORDS_METADATA   CLEAN_OLD_ORDS_SESSIONS                  +00 00:00:00.640826

19 rows selected. 

According to the documentation, INTERVALs are not subject to a format model in the same way that dates and timestamps are. However, you can re-format their contents using the EXTRACT function.
For example, if we want to convert the LAST_RUN_DURATION from our previous query into seconds, then we can run :

select owner, job_name,
(extract ( minute from last_run_duration) * 60) +
extract( second from last_run_duration) as last_run_secs
from dba_scheduler_jobs
where run_count > 0
order by 3 desc
/

…which returns…

OWNER           JOB_NAME                                 LAST_RUN_SECS
--------------- ---------------------------------------- -------------
APEX_180200     ORACLE_APEX_WS_NOTIFICATIONS                  1.685787
APEX_180200     ORACLE_APEX_DAILY_MAINTENANCE                  .943306
SYS             PURGE_LOG                                      .924991
APEX_180200     ORACLE_APEX_PURGE_SESSIONS                     .868204
ORDS_METADATA   CLEAN_OLD_ORDS_SESSIONS                        .640826
SYS             BSLN_MAINTAIN_STATS_JOB                        .443146
SYS             DRA_REEVALUATE_OPEN_FAILURES                   .387121
SYS             CLEANUP_ONLINE_IND_BUILD                       .382708
SYS             RSE$CLEAN_RECOVERABLE_SCRIPT                   .324392
SYS             CLEANUP_NON_EXIST_OBJ                          .221356
SYS             FILE_SIZE_UPD                                  .166189
SYS             ORA$AUTOTASK_CLEAN                                .151
SYS             CLEANUP_TAB_IOT_PMO                            .129863
SYS             PMO_DEFERRED_GIDX_MAINT_JOB                    .117572
SYS             CLEANUP_TRANSIENT_PKG                          .094254
SYS             CLEANUP_TRANSIENT_TYPE                         .071285
APEX_180200     ORACLE_APEX_MAIL_QUEUE                         .026859
SYS             CLEANUP_ONLINE_PMO                              .02466
SYS             SM$CLEAN_AUTO_SPLIT_MERGE                      .023483

19 rows selected.
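Incidentally, these durations all happen to be under a minute. For intervals that may span hours or even days, you’d need to extract those components as well. Something along these lines :

select owner, job_name,
    (extract( day from last_run_duration) * 86400) +
    (extract( hour from last_run_duration) * 3600) +
    (extract( minute from last_run_duration) * 60) +
    extract( second from last_run_duration) as last_run_secs
from dba_scheduler_jobs
where run_count > 0
order by 3 desc
/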

What we need now is a way to calculate the durations between rows in a single query…

The LAG function

As ever, OracleBase has a pretty good explanation of the LEAD and LAG functions.

All we need to do is to use the LAG function to find the elapsed time between the timestamp of the current record and that of its predecessor in the result set :

select trunc(log_ts) as run_date,
    job_name,
    status,
    log_ts,
    case when status = 'COMPLETED' then
        log_ts - lag( log_ts) over( order by log_ts)
    end as duration,
    case when status = 'COMPLETED' then
        extract( hour from log_ts - lag(log_ts) over (order by log_ts)) * 60 +
        extract( minute from log_ts - lag(log_ts) over (order by log_ts))
    end as duration_mins
from batch_log
order by log_ts
/

We could use what we have so far as the basis for an ad-hoc script to give us information about batch runtimes. Then again, if it’s one of those mornings where you’re struggling a bit, looking at a picture might make things a bit easier. That last listing was from SQLDeveloper, which is handy considering where we’re going next …

SQLDeveloper User Defined Chart Reports – The Line Chart

We can see the total runtime for the batch for each day using the query :

with batch_runs as (
    select job_name, status, log_ts,
        case status when 'COMPLETED' then
            (extract( hour from log_ts - lag( log_ts, 1) over( order by log_ts)) * 60) +
            extract(minute from log_ts - lag(log_ts, 1) over( order by log_ts))
        end as duration_mins
    from batch_log
    where (
        (job_name = 'EXTRACT' and status = 'STARTED')
        or (job_name = 'REPORT' and status = 'COMPLETED'))
    order by log_ts)
select trunc(log_ts) as run_date,
    duration_mins
from batch_runs
where duration_mins is not null
order by 1
/

…which returns…

RUN_DATE    DURATION_MINS
----------- -------------
16-MAR-2021            88
17-MAR-2021            83
18-MAR-2021            90
19-MAR-2021            84
20-MAR-2021            94
21-MAR-2021            91
22-MAR-2021           155

7 rows selected. 

Jeff Smith has a nifty way of turning a SQLDeveloper query grid into a chart report, which I’ve largely plagiarised below.

First, we run the query in a SQLDeveloper worksheet. Then, in the result grid, right-click and select Save Grid as Report :

This will take you to the Create Report Window :

As you can see, SQLDeveloper has reformatted the query a bit.

Now we can add a Name for the report.
Before we go any further, it’s probably a good idea to specify a connection at this point. This will help us when we’re configuring the report as we’ll be able to use live data to see how things look.
The other thing we need to do is to specify a series name so that we can produce a line graph.

This means that the report query now looks like this :

SELECT "RUN_DATE", "DURATION_MINS",
'Overnight Runs' as series_name
FROM(
with batch_runs as (
select job_name, status, log_ts,
case status when 'COMPLETED' then
(extract( hour from log_ts - lag( log_ts, 1) over( order by log_ts)) * 60) +
extract(minute from log_ts - lag(log_ts, 1) over( order by log_ts))
end as duration_mins
from batch_log
where (
(job_name = 'EXTRACT' and status = 'STARTED')
or (job_name = 'REPORT' and status = 'COMPLETED'))   
order by log_ts)
select trunc(log_ts) as run_date,
duration_mins
from batch_runs   
where duration_mins is not null
order by 1
)

…and with a connection specified, I can check everything is still working as expected by clicking the Test Report button…

…which provides the expected output.

Now that I know my query still works, I can change the report Style from Table to Chart :

Now we need to click on the Property node of the navigation tree in the left-hand pane of the Create Report Window.
Here, we change the Chart Type to Line :

Next, we need to specify the Data for the report, so we click on the Data tree node.

As we’re connected to a database, we can Fetch Column Names so we can select the Group, Series and Value columns from a drop-down.

Now we can check the Use Live Data checkbox and click Refresh Report to see what the output will look like :

Finally, click the Apply button in the bottom right of the Create Report window and the new report will appear in the User Defined Reports node of the main Reports pane :

Now we can run the report in the usual way…

A picture paints a thousand words…but in this case just says “Hmmm…”

If we want a more granular view of what’s happening during the batch run, but we’re not quite ready to deal with actual text, we can try using…

A Stacked Bar Chart

This time, we want to know the elapsed time for each process in the batch. Therefore, we’ll use this query as the basis for our report :

select trunc(log_ts) as run_date,
    job_name,
    status,
    log_ts,
    case when status = 'COMPLETED' then
        log_ts - lag( log_ts) over( order by log_ts)
    end as duration,
    case when status = 'COMPLETED' then
        extract( hour from log_ts - lag(log_ts) over (order by log_ts)) * 60 +
        extract( minute from log_ts - lag(log_ts) over (order by log_ts))
    end as duration_mins
from batch_log
order by log_ts
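
Only the COMPLETED rows carry a duration, so if you’d rather the report query return just the rows the chart will actually plot, one optional variation is to filter them in an in-line view (the durations must still be calculated over all rows first) :

select run_date, job_name, duration_mins
from (
    select trunc(log_ts) as run_date,
        job_name,
        status,
        case when status = 'COMPLETED' then
            extract( hour from log_ts - lag(log_ts) over (order by log_ts)) * 60 +
            extract( minute from log_ts - lag(log_ts) over (order by log_ts))
        end as duration_mins
    from batch_log)
where status = 'COMPLETED'
order by run_date;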

The Report creation is as before, except that this time we choose a Chart Type of Bar – Vertical Stack.
Additionally, in order to view the total time for each day, we go to the Plot Area Property and check the Cumulative box in the Data Tips section :

Running this new report, we can see that the increase in duration is due to the Load process taking rather longer than normal :

Should it really be that colour ?

Of course, you’ll need to investigate further to find out just exactly what’s going on but, thanks to some handy SQL features and a couple of SQLDeveloper Chart Reports, at least you know where to start looking.
