DBMS_SCHEDULER – Fiddling with File Watchers | The Anti-Kyte
source link: https://mikesmithers.wordpress.com/2020/09/30/dbms_scheduler-fiddling-with-file-watchers/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
DBMS_SCHEDULER – Fiddling with File Watchers
I recently updated my blog profile picture. Let’s face it, it was about time. The old one was taken 14 years ago.
This time, I thought I’d go for a contemporary portrait, which has the added benefit of being less likely to frighten animals and small children.
Unfortunately, it is also crying out for a caption :
Just popping into the bank. Keep the engine running
One or two alternatives captions may well crop up in the following paragraphs.
Primarily though, I’ll be talking about the file watcher capabilities of DBMS_SCHEDULER.
Introduced in 11g, it enables developers to trigger actions when a file arrives without having to leave the comfort of their database.
There are a few moving parts and getting them to work together can take a bit of, well, fiddling.
Specifically, we’ll be looking at :
- configuring the frequency at which filewatchers are run
- creating a credential to use for file watching
- creating a file watcher to determine when a file has arrived
- creating a job for the file watcher to execute
- adding a program so that we can get some information about the file
- getting the filewatcher to wait for a large file transfer to complete
Before we get into all of that, we need to have a quick chat about environments…
Don’t try this at home
More specifically, don’t try this on Oracle 18c Express Edition.
The examples that follow have been tested on Oracle Enterprise Edition 12cR2 ( 12.2.0.1.0) running on Oracle Linux 7.3. I would normally run something like this on Oracle 18cXE. However, Express Edition is missing one of the components required for the File Watcher mechanism to work, namely the DBMS_ISCHED.FILE_WATCH_JOB procedure.
Finding and Configuring File Watcher settings
It’s probably worth taking a whistle-stop tour through the data dictionary views that provide some detail about how Oracle’s file watcher functionality hangs together.
Let’s start with the main scheduler job itself :
select
program_owner, program_name, schedule_name,
comments
from
dba_scheduler_jobs
where
owner =
'SYS'
and
job_name =
'FILE_WATCHER'
/
PROGRAM_OWNER PROGRAM_NAME SCHEDULE_NAME COMMENTS
--------------- -------------------- ------------------------- ------------------------------
SYS FILE_WATCHER_PROGRAM FILE_WATCHER_SCHEDULE File watcher job
We can see that the job is associated with both a Program and a Schedule.
If we look at the program, we can find out more details about what Oracle runs under the covers when a file watcher is initiated :
select
program_type, program_action, comments
from
dba_scheduler_programs
where
owner =
'SYS'
and
program_name =
'FILE_WATCHER_PROGRAM'
/
PROGRAM_TYPE PROGRAM_ACTION COMMENTS
-------------------- ------------------------------ ------------------------------
STORED_PROCEDURE dbms_isched.file_watch_job File Watcher program
Of more immediate interest is the schedule, which will tell us how often a filewatcher will run :
select
schedule_type, repeat_interval
from
dba_scheduler_schedules
where
schedule_name =
'FILE_WATCHER_SCHEDULE'
/
SCHEDULE_TYPE REPEAT_INTERVAL
-------------------- ------------------------------
CALENDAR freq=minutely;interval=10
What this means is that the filewatcher will check every 10 minutes to see if any file it’s looking for has arrived since it last checked.
Actually, what it means is that the filewatcher will check to see if a file it’s looking for exists and has a timestamp later than the last time it checked.
This distinction will become more relevant when we start testing. Speaking of which, testing will be rather less tedious if we can make the filewatcher run more regularly.
Therefore, I’m going to set it to run every minute.
To achieve this, I need to connect as SYSDBA and run…
alter
session
set
container = pdb1;
begin
dbms_scheduler.set_attribute(
name
=>
'file_watcher_schedule'
,
attribute =>
'repeat_interval'
,
value =>
'freq=minutely;interval=1'
);
end
;
/
…where pdb1 is the name of your container database.
If we check again we should see that the FILE_WATCHER_SCHEDULE now reflects the new repeat_interval :
select
repeat_interval
from
dba_scheduler_schedules
where
schedule_name =
'FILE_WATCHER_SCHEDULE'
/
REPEAT_INTERVAL
------------------------------
freq=minutely;interval=1
Cautious Credential Creation
We’re going to need access to an OS account that can read the files as they come in.
If we’re going to do anything with the files (e.g. load data) then the OS user that started Oracle (usually oralce) will also need to read these files.
The easiest way to achieve this is simply to create a credential for the oracle OS user.
There was an exploit back in 11gR2, which meant that it was possible to discover the password of the OS user used in a credential.
This particular vulnerability seems to have been fixed in later versions. It doesn’t work on the 12c R2 database I’m using here for example.
Whilst I’m not aware of any such vulnerabilities extant for more recent versions, it would seem prudent to use a somewhat less privileged OS user for our credential…
Sorry about the bag, didn’t have time to shave!
You’re not obliged to laugh at that rather oblique Blackadder reference in this heading, which I thought might be suitable as a caption for that photo. Anyhow, it does explain the name of the OS user I’ve created to use for our credential…
sudo
useradd
-m ponsonby
sudo
passwd
ponsonby
…creating a password for ponsonby when prompted.
Whilst I’m here, I may as well create the directory to hold the files under ponsonby’s home directory, which will ensure the user has access to these files.
To allow oracle to also see the files, I’ll set the directory group to be oinstall, the oracle user’s primary group.
Note that I don’t need to make ponsonby a member of this group.
Finally, I need to set the others executable bit on ponsonby’s home directory so that oracle can access the in_files sub-directory :
sudo
mkdir
/home/ponsonby/in_files
sudo
chown
ponsonby
/home/ponsonby/in_files
chgrp
oinstall
/home/ponsonby/in_files
sudo
chmod
a+x
/home/ponsonby
Now we have a landing directory for our files, we can create a directory object for it in the database. Whilst this is not required for the filewatcher itself, I’m going to want to read the contents of the file from within the database at some point so…
create
directory in_files
as
'/home/ponsonby/in_files'
/
Finally, we can create our credential…
begin
dbms_credential.create_credential(
credential_name =>
'caption_cred'
,
username =>
'ponsonby'
,
password
=>
'a-fiendishly-difficult-to-guess-password'
,
comments =>
'Credentials for file watcher on IN_FILES directory'
);
end
;
/
…replacing “a-fiendishly-difficult-to-guess-password” with the password we set for ponsonby.
We can now see our credential in USER_SCHEDULER_CREDENTIALS :
select
credential_name, username, comments
from
user_scheduler_credentials
where
credential_name =
'CAPTION_CRED'
/
CREDENTIAL_NAME USERNAME COMMENTS
--------------- ---------- --------------------------------------------------
CAPTION_CRED ponsonby Credentials
for
file watcher
on
IN_FILES directory
Right, now that’s all sorted we can…
Create a File Watcher
To start with we’re going to look for files called captionsomething.txt :
begin
dbms_scheduler.create_file_watcher(
file_watcher_name =>
'caption_competition_fw'
,
directory_path =>
'/home/ponsonby/in_files'
,
file_name =>
'caption*.txt'
,
credential_name =>
'fw_credential'
,
enabled =>
false
,
comments =>
'Watching for "witty" captions'
);
end
;
/
Note that we’ve created the filewatcher in a disabled state at the moment, as we can see here :
select
enabled, directory_path, file_name,
credential_name, comments
from
user_scheduler_file_watchers
where
file_watcher_name =
'CAPTION_COMP_FW'
/
ENABLED DIRECTORY_PATH FILE_NAME CREDENTIAL_NAME COMMENTS
---------- ------------------------- --------------- --------------- --------------------------------------------------
FALSE
/home/ponsonby/in_files caption*.txt CAPTION_CRED Watching
for
"witty"
captions
This is because we’ve still got to create all the other stuff we need for it to do anything useful.
Speaking of which…
Yes, that is my Death Star parked outside
When the file watcher runs it will return an object of type SCHEDULER_FILEWATCHER_RESULT.
We’re going to take the scalar properties of that object and dump them into a table :
create
table
incoming_files(
destination VARCHAR2(4000),
directory_path VARCHAR2(4000),
actual_file_name VARCHAR2(4000),
file_size NUMBER,
file_timestamp
TIMESTAMP
WITH
TIME
ZONE)
/
…using this stored procedure :
create
or
replace
procedure
save_incoming_file( i_result sys.scheduler_filewatcher_result)
as
begin
insert
into
incoming_files(
destination,
directory_path,
actual_file_name,
file_size,
file_timestamp)
values
(
i_result.destination,
i_result.directory_path,
i_result.actual_file_name,
i_result.file_size,
i_result.file_timestamp);
end
;
/
NOTE – when this gets executed from the scheduler job, the transaction will commit.
You do not need to do this explicitly. Additionally if you attempt to use an autonomous transaction, your change is likely to disappear into the ether.
We need a means of calling this stored procedure when the filewatcher is triggered. This requires two further objects…
The Dentist will see you now
First, we need a scheduler program, which we’ll use to call the procedure. As we want to read the output of the filewatcher, we’ll need to add an argument to the program.
Then, we’ll create a scheduler job to tie the program to the file watcher.
I know, let’s do all of that in one fell swoop :
begin
dbms_scheduler.create_program(
program_name =>
'caption_prog'
,
program_type =>
'stored_procedure'
,
program_action =>
'save_incoming_file'
,
number_of_arguments => 1,
enabled =>
false
);
-- need to make sure this program can see the message sent by the filewatcher...
dbms_scheduler.define_metadata_argument(
program_name =>
'caption_prog'
,
metadata_attribute =>
'event_message'
,
argument_position => 1);
-- Create a job that links our filewatcher to our program...
dbms_scheduler.create_job(
job_name =>
'caption_job'
,
program_name =>
'caption_prog'
,
event_condition =>
null
,
queue_spec =>
'caption_comp_fw'
,
auto_drop =>
false
,
enabled =>
false
);
end
;
/
We can now see the program is in the data dictionary…
select
program_type, program_action, number_of_arguments,
enabled
from
user_scheduler_programs
where
program_name =
'CAPTION_PROG'
/
PROGRAM_TYPE PROGRAM_ACTION NUMBER_OF_ARGUMENTS ENABLED
-------------------- -------------------- ------------------- ----------
STORED_PROCEDURE save_incoming_file 1
FALSE
…along with the job…
select
program_name, schedule_type, file_watcher_name,
enabled
from
user_scheduler_jobs
where
job_name =
'CAPTION_JOB'
/
PROGRAM_NAME SCHEDULE_TYPE FILE_WATCHER_NA ENABLED
-------------------- -------------------- --------------- ----------
CAPTION_PROG FILE_WATCHER CAPTION_COMP_FW
FALSE
Now we’ve created all of the components, we can enable everything :
begin
dbms_scheduler.enable(
'caption_comp_fw'
);
dbms_scheduler.enable(
'caption_prog'
);
dbms_scheduler.enable(
'caption_job'
);
end
;
/
To confirm everything in our stack is enabled :
select
file_watcher_name
as
scheduler_object, enabled
from
user_scheduler_file_watchers
where
file_watcher_name =
'CAPTION_COMP_FW'
union
all
select
program_name, enabled
from
user_scheduler_programs
where
program_name =
'CAPTION_PROG'
union
all
select
job_name, enabled
from
user_scheduler_jobs
where
program_name =
'CAPTION_PROG'
and
file_watcher_name =
'CAPTION_COMP_FW'
/
SCHEDULER_OBJECT ENABLED
------------------------------ --------------------
CAPTION_COMP_FW
TRUE
CAPTION_PROG
TRUE
CAPTION_JOB
TRUE
To test it, we need to create a file with an appropriate name in the directory we’re watching :
cd
/home/ponsonby/in_files
sudo
su
ponsonby
echo
"Inside every old person is a young person wondering what happened."
>caption01.txt
You’ll need to give it a minute from the file creation to the job running.
As this is a scheduler job, you can check how the run went in the usual places :
select
log_id, job_name, log_date, status
from
user_scheduler_job_log
where
job_name =
'CAPTION_JOB'
/
LOG_ID JOB_NAME LOG_DATE STATUS
---------- ------------------------------ ----------------------------------- ---------------
27010 CAPTION_JOB 28-SEP-20 20.51.01.508628000 +01:00 SUCCEEDED
For further details :
select
job_name, run_duration, status,
error#, additional_info, errors
from
user_scheduler_job_run_details
where
log_id = 27010
/
JOB_NAME RUN_DURATION STATUS ERROR# ADDITIONAL_INFO ERRORS
------------------------------ ------------------- --------------- ---------- -------------------- --------------------
CAPTION_JOB +00 00:00:00.000000 SUCCEEDED 0
Of course, the important bit here is what, if anything, has been inserted into the table :
select
destination, directory_path, actual_file_name,
file_size, file_timestamp
from
incoming_files
/
DESTINATION DIRECTORY_PATH ACTUAL_FILE_NAM FILE_SIZE FILE_TIMESTAMP
--------------- ------------------------- --------------- ---------- --------------------------------
localhost /home/ponsonby/in_files caption01.txt 67 28-SEP-20 19.49.23.000000000 GMT
Wait for it !
All of this is fine if you want to process a small file, or even use it solely as a trigger to kick something off, but what about a larger file or, at least, one that takes more than a few seconds to transfer into the directory we’re watching ?
Fortunately, we can use the filewatcher’s steady_state_duration parameter to specify a minimum time the filewatcher should wait after the file is last written to before it considers the file to have arrived :
begin
dbms_scheduler.set_attribute(
name
=>
'caption_comp_fw'
,
attribute =>
'steady_state_duration'
,
value => interval
'20'
second
);
end
;
/
To confirm the change :
select
steady_state_duration
from
user_scheduler_file_watchers
where
file_watcher_name =
'CAPTION_COMP_FW'
/
STEADY_STATE_DURATION
------------------------------
+00 00:00:20.000000
This means that the filewatcher should now wait until a file has not changed for 20 seconds before it kicks anything off.
Note that the steady_state_duration must be set to a value less than the interval at which the file watchers are set to poll ( 1 minute in our case).
Just as well as we’ve added a table to our application…
create
table
captions(
file_name varchar2(4000),
caption_text varchar2(4000))
/
…and made some changes to the procedure we’re calling…
create
or
replace
procedure
save_incoming_file( i_result sys.scheduler_filewatcher_result)
as
fh utl_file.file_type;
buffer varchar2(4000);
fname varchar2(4000);
C_DIR constant all_directories.directory_name%type :=
'IN_FILES'
;
begin
insert
into
incoming_files(
destination,
directory_path,
actual_file_name,
file_size,
file_timestamp)
values
(
i_result.destination,
i_result.directory_path,
i_result.actual_file_name,
i_result.file_size,
i_result.file_timestamp);
fname := i_result.actual_file_name;
fh := utl_file.fopen(C_DIR, fname,
'R'
);
loop
begin
-- nested block to read the file
utl_file.get_line(fh, buffer);
if buffer
is
null
then
exit;
end
if;
insert
into
captions( file_name, caption_text)
values
( fname, buffer);
exception
when
no_data_found
then
-- no more data to read
exit;
end
;
-- file read block
end
loop;
utl_file.fclose(fh);
end
;
/
Now we just need something to create our file…slowly. In my case I’m using this bash script ( running as ponsonby from /home/ponsonby) :
#!/bin/sh
outfile=
'in_files/caption02.txt'
echo
'Dress-down day at the Sith Order'
>$outfile
sleep
19
echo
'Tall...and handsome (in the dark)'
>>$outfile
exit
Once this has completed, we can check our application and confirm that the filewatcher has found the file…
select
actual_file_name, file_size, file_timestamp
from
incoming_files
where
actual_file_name =
'caption02.txt'
/
ACTUAL_FILE_NAME FILE_SIZE FILE_TIMESTAMP
-------------------- ---------- --------------------------------
caption02.txt 67 29-SEP-20 20.47.21.000000000 GMT
…and waited long enough for it to finish being written before processing it…
select
caption_text
from
captions
where
file_name =
'caption02.txt'
/
CAPTION_TEXT
------------------------------------------------------------
Dress down
day
at
the Sith
Order
Tall...
and
handsome(
in
the dark)
Acknowledgements
As ever Oracle Base was a veritable font of wisdom on the issue of file watchers.
I also found lots of useful info in Aychin’s Oracle RDBMS Blog.
Finally, this Stack Exchange article was invaluable in solving the mystery of *nix directory permissions.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK