4

DBMS_SCHEDULER – Fiddling with File Watchers | The Anti-Kyte

 3 years ago
source link: https://mikesmithers.wordpress.com/2020/09/30/dbms_scheduler-fiddling-with-file-watchers/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

DBMS_SCHEDULER – Fiddling with File Watchers

Posted on September 30, 2020

I recently updated my blog profile picture. Let’s face it, it was about time. The old one was taken 14 years ago.
This time, I thought I’d go for a contemporary portrait, which has the added benefit of being less likely to frighten animals and small children.
Unfortunately, it is also crying out for a caption :

Just popping into the bank. Keep the engine running

One or two alternatives captions may well crop up in the following paragraphs.
Primarily though, I’ll be talking about the file watcher capabilities of DBMS_SCHEDULER.
Introduced in 11g, it enables developers to trigger actions when a file arrives without having to leave the comfort of their database.
There are a few moving parts and getting them to work together can take a bit of, well, fiddling.
Specifically, we’ll be looking at :

  • configuring the frequency at which filewatchers are run
  • creating a credential to use for file watching
  • creating a file watcher to determine when a file has arrived
  • creating a job for the file watcher to execute
  • adding a program so that we can get some information about the file
  • getting the filewatcher to wait for a large file transfer to complete

Before we get into all of that, we need to have a quick chat about environments…

Don’t try this at home

More specifically, don’t try this on Oracle 18c Express Edition.
The examples that follow have been tested on Oracle Enterprise Edition 12cR2 ( 12.2.0.1.0) running on Oracle Linux 7.3. I would normally run something like this on Oracle 18cXE. However, Express Edition is missing one of the components required for the File Watcher mechanism to work, namely the DBMS_ISCHED.FILE_WATCH_JOB procedure.

Finding and Configuring File Watcher settings

It’s probably worth taking a whistle-stop tour through the data dictionary views that provide some detail about how Oracle’s file watcher functionality hangs together.

Let’s start with the main scheduler job itself :

select program_owner, program_name, schedule_name,
comments
from dba_scheduler_jobs
where owner = 'SYS'
and job_name = 'FILE_WATCHER'
/
PROGRAM_OWNER   PROGRAM_NAME         SCHEDULE_NAME             COMMENTS                     
--------------- -------------------- ------------------------- ------------------------------
SYS             FILE_WATCHER_PROGRAM FILE_WATCHER_SCHEDULE     File watcher job             

We can see that the job is associated with both a Program and a Schedule.

If we look at the program, we can find out more details about what Oracle runs under the covers when a file watcher is initiated :

select program_type, program_action, comments
from dba_scheduler_programs
where owner = 'SYS'
and program_name = 'FILE_WATCHER_PROGRAM'
/
PROGRAM_TYPE         PROGRAM_ACTION                 COMMENTS                     
-------------------- ------------------------------ ------------------------------
STORED_PROCEDURE     dbms_isched.file_watch_job     File Watcher program         

Of more immediate interest is the schedule, which will tell us how often a filewatcher will run :

select schedule_type, repeat_interval
from dba_scheduler_schedules
where schedule_name = 'FILE_WATCHER_SCHEDULE'
/
SCHEDULE_TYPE        REPEAT_INTERVAL              
-------------------- ------------------------------
CALENDAR             freq=minutely;interval=10     

What this means is that the filewatcher will check every 10 minutes to see if any file it’s looking for has arrived since it last checked.
Actually, what it means is that the filewatcher will check to see if a file it’s looking for exists and has a timestamp later than the last time it checked.
This distinction will become more relevant when we start testing. Speaking of which, testing will be rather less tedious if we can make the filewatcher run more regularly.

Therefore, I’m going to set it to run every minute.
To achieve this, I need to connect as SYSDBA and run…

alter session set container = pdb1;
begin
dbms_scheduler.set_attribute( 
name => 'file_watcher_schedule',
attribute => 'repeat_interval',
value => 'freq=minutely;interval=1');
end;
/

…where pdb1 is the name of your container database.

If we check again we should see that the FILE_WATCHER_SCHEDULE now reflects the new repeat_interval :

select repeat_interval
from dba_scheduler_schedules
where schedule_name = 'FILE_WATCHER_SCHEDULE'
/
REPEAT_INTERVAL              
------------------------------
freq=minutely;interval=1    

Cautious Credential Creation

We’re going to need access to an OS account that can read the files as they come in.
If we’re going to do anything with the files (e.g. load data) then the OS user that started Oracle (usually oralce) will also need to read these files.

The easiest way to achieve this is simply to create a credential for the oracle OS user.
There was an exploit back in 11gR2, which meant that it was possible to discover the password of the OS user used in a credential.
This particular vulnerability seems to have been fixed in later versions. It doesn’t work on the 12c R2 database I’m using here for example.
Whilst I’m not aware of any such vulnerabilities extant for more recent versions, it would seem prudent to use a somewhat less privileged OS user for our credential…

Sorry about the bag, didn’t have time to shave!

You’re not obliged to laugh at that rather oblique Blackadder reference in this heading, which I thought might be suitable as a caption for that photo. Anyhow, it does explain the name of the OS user I’ve created to use for our credential…

sudo useradd -m ponsonby
sudo passwd ponsonby

…creating a password for ponsonby when prompted.

Whilst I’m here, I may as well create the directory to hold the files under ponsonby’s home directory, which will ensure the user has access to these files.
To allow oracle to also see the files, I’ll set the directory group to be oinstall, the oracle user’s primary group.
Note that I don’t need to make ponsonby a member of this group.
Finally, I need to set the others executable bit on ponsonby’s home directory so that oracle can access the in_files sub-directory :

sudo mkdir /home/ponsonby/in_files
sudo chown ponsonby /home/ponsonby/in_files
chgrp oinstall /home/ponsonby/in_files
sudo chmod a+x /home/ponsonby

Now we have a landing directory for our files, we can create a directory object for it in the database. Whilst this is not required for the filewatcher itself, I’m going to want to read the contents of the file from within the database at some point so…

create directory in_files as '/home/ponsonby/in_files'
/

Finally, we can create our credential…

begin
dbms_credential.create_credential(
credential_name => 'caption_cred',
username => 'ponsonby',
password => 'a-fiendishly-difficult-to-guess-password',
comments => 'Credentials for file watcher on IN_FILES directory');
end;
/

…replacing “a-fiendishly-difficult-to-guess-password” with the password we set for ponsonby.

We can now see our credential in USER_SCHEDULER_CREDENTIALS :

select credential_name, username, comments
from user_scheduler_credentials
where credential_name = 'CAPTION_CRED'
/
CREDENTIAL_NAME USERNAME   COMMENTS                                         
--------------- ---------- --------------------------------------------------
CAPTION_CRED    ponsonby   Credentials for file watcher on IN_FILES directory

Right, now that’s all sorted we can…

Create a File Watcher

To start with we’re going to look for files called captionsomething.txt :

begin
dbms_scheduler.create_file_watcher(
file_watcher_name => 'caption_competition_fw',
directory_path => '/home/ponsonby/in_files',
file_name => 'caption*.txt',
credential_name => 'fw_credential',
enabled => false,
comments => 'Watching for "witty" captions');
end;
/

Note that we’ve created the filewatcher in a disabled state at the moment, as we can see here :

select enabled, directory_path, file_name,
credential_name, comments
from user_scheduler_file_watchers
where file_watcher_name = 'CAPTION_COMP_FW'
/
ENABLED    DIRECTORY_PATH            FILE_NAME       CREDENTIAL_NAME COMMENTS                                         
---------- ------------------------- --------------- --------------- --------------------------------------------------
FALSE      /home/ponsonby/in_files   caption*.txt    CAPTION_CRED    Watching for "witty" captions                    

This is because we’ve still got to create all the other stuff we need for it to do anything useful.
Speaking of which…

Yes, that is my Death Star parked outside

When the file watcher runs it will return an object of type SCHEDULER_FILEWATCHER_RESULT.
We’re going to take the scalar properties of that object and dump them into a table :

create table incoming_files(
destination VARCHAR2(4000),
directory_path VARCHAR2(4000),
actual_file_name VARCHAR2(4000),
file_size NUMBER,
file_timestamp TIMESTAMP WITH TIME ZONE)
/   

…using this stored procedure :

create or replace procedure save_incoming_file( i_result sys.scheduler_filewatcher_result)
as
begin
insert into incoming_files(
destination,
directory_path,
actual_file_name,
file_size,
file_timestamp)
values(
i_result.destination,
i_result.directory_path,
i_result.actual_file_name,
i_result.file_size,
i_result.file_timestamp);
end;
/

NOTE – when this gets executed from the scheduler job, the transaction will commit.
You do not need to do this explicitly. Additionally if you attempt to use an autonomous transaction, your change is likely to disappear into the ether.

We need a means of calling this stored procedure when the filewatcher is triggered. This requires two further objects…

The Dentist will see you now

First, we need a scheduler program, which we’ll use to call the procedure. As we want to read the output of the filewatcher, we’ll need to add an argument to the program.
Then, we’ll create a scheduler job to tie the program to the file watcher.
I know, let’s do all of that in one fell swoop :

begin
dbms_scheduler.create_program(
program_name => 'caption_prog',
program_type => 'stored_procedure',
program_action => 'save_incoming_file',
number_of_arguments => 1,
enabled => false);
-- need to make sure this program can see the message sent by the filewatcher...
dbms_scheduler.define_metadata_argument(
program_name => 'caption_prog',
metadata_attribute => 'event_message',
argument_position => 1);
-- Create a job that links our filewatcher to our program...
dbms_scheduler.create_job(
job_name => 'caption_job',
program_name => 'caption_prog',
event_condition => null,
queue_spec => 'caption_comp_fw',
auto_drop => false,
enabled => false);
end;
/

We can now see the program is in the data dictionary…

select program_type, program_action, number_of_arguments,
enabled
from user_scheduler_programs
where program_name = 'CAPTION_PROG'
/
PROGRAM_TYPE         PROGRAM_ACTION       NUMBER_OF_ARGUMENTS ENABLED  
-------------------- -------------------- ------------------- ----------
STORED_PROCEDURE     save_incoming_file                     1 FALSE    

…along with the job…

select program_name, schedule_type, file_watcher_name,
enabled
from user_scheduler_jobs
where job_name = 'CAPTION_JOB'
/
PROGRAM_NAME         SCHEDULE_TYPE        FILE_WATCHER_NA ENABLED  
-------------------- -------------------- --------------- ----------
CAPTION_PROG         FILE_WATCHER         CAPTION_COMP_FW FALSE    

Now we’ve created all of the components, we can enable everything :

begin
dbms_scheduler.enable('caption_comp_fw');
dbms_scheduler.enable('caption_prog');
dbms_scheduler.enable('caption_job');
end;
/

To confirm everything in our stack is enabled :

select file_watcher_name as scheduler_object, enabled
from user_scheduler_file_watchers
where file_watcher_name = 'CAPTION_COMP_FW'
union all
select program_name, enabled
from user_scheduler_programs
where program_name = 'CAPTION_PROG'
union all
select job_name, enabled
from user_scheduler_jobs
where program_name = 'CAPTION_PROG'
and file_watcher_name = 'CAPTION_COMP_FW'
/
SCHEDULER_OBJECT               ENABLED            
------------------------------ --------------------
CAPTION_COMP_FW                TRUE               
CAPTION_PROG                   TRUE               
CAPTION_JOB                    TRUE               

To test it, we need to create a file with an appropriate name in the directory we’re watching :

cd /home/ponsonby/in_files
sudo su ponsonby
echo "Inside every old person is a young person wondering what happened.">caption01.txt

You’ll need to give it a minute from the file creation to the job running.

As this is a scheduler job, you can check how the run went in the usual places :

select log_id, job_name, log_date, status
from user_scheduler_job_log
where job_name = 'CAPTION_JOB'
/
LOG_ID JOB_NAME                       LOG_DATE                            STATUS        
---------- ------------------------------ ----------------------------------- ---------------
27010 CAPTION_JOB                    28-SEP-20 20.51.01.508628000 +01:00 SUCCEEDED     

For further details :

select job_name, run_duration, status,
error#, additional_info, errors
from user_scheduler_job_run_details
where log_id = 27010
/
JOB_NAME                       RUN_DURATION        STATUS              ERROR# ADDITIONAL_INFO      ERRORS             
------------------------------ ------------------- --------------- ---------- -------------------- --------------------
CAPTION_JOB                    +00 00:00:00.000000 SUCCEEDED                0                                         

Of course, the important bit here is what, if anything, has been inserted into the table :

select destination, directory_path, actual_file_name,
file_size, file_timestamp
from incoming_files
/
DESTINATION     DIRECTORY_PATH            ACTUAL_FILE_NAM  FILE_SIZE FILE_TIMESTAMP                 
--------------- ------------------------- --------------- ---------- --------------------------------
localhost       /home/ponsonby/in_files   caption01.txt           67 28-SEP-20 19.49.23.000000000 GMT

Wait for it !

All of this is fine if you want to process a small file, or even use it solely as a trigger to kick something off, but what about a larger file or, at least, one that takes more than a few seconds to transfer into the directory we’re watching ?

Fortunately, we can use the filewatcher’s steady_state_duration parameter to specify a minimum time the filewatcher should wait after the file is last written to before it considers the file to have arrived :

begin
dbms_scheduler.set_attribute(
name => 'caption_comp_fw',
attribute => 'steady_state_duration',
value => interval '20' second);
end;
/

To confirm the change :

select steady_state_duration
from user_scheduler_file_watchers
where file_watcher_name = 'CAPTION_COMP_FW'
/
STEADY_STATE_DURATION        
------------------------------
+00 00:00:20.000000

This means that the filewatcher should now wait until a file has not changed for 20 seconds before it kicks anything off.
Note that the steady_state_duration must be set to a value less than the interval at which the file watchers are set to poll ( 1 minute in our case).

Just as well as we’ve added a table to our application…

create table captions(
file_name varchar2(4000),
caption_text varchar2(4000))
/

…and made some changes to the procedure we’re calling…

create or replace procedure save_incoming_file( i_result sys.scheduler_filewatcher_result)
as
fh utl_file.file_type;
buffer varchar2(4000);
fname varchar2(4000);
C_DIR constant all_directories.directory_name%type := 'IN_FILES';
begin
insert into incoming_files(
destination,
directory_path,
actual_file_name,
file_size,
file_timestamp)
values(
i_result.destination,
i_result.directory_path,
i_result.actual_file_name,
i_result.file_size,
i_result.file_timestamp);
fname := i_result.actual_file_name;
fh := utl_file.fopen(C_DIR, fname, 'R');
loop
begin -- nested block to read the file
utl_file.get_line(fh, buffer);   
if buffer is null then
exit;
end if;
insert into captions( file_name, caption_text)
values( fname, buffer);
exception when no_data_found then
-- no more data to read
exit;
end; -- file read block
end loop;
utl_file.fclose(fh);
end;
/

Now we just need something to create our file…slowly. In my case I’m using this bash script ( running as ponsonby from /home/ponsonby) :

#!/bin/sh
outfile='in_files/caption02.txt'
echo 'Dress-down day at the Sith Order' >$outfile
sleep 19
echo 'Tall...and handsome (in the dark)'>>$outfile
exit

Once this has completed, we can check our application and confirm that the filewatcher has found the file…

select actual_file_name, file_size, file_timestamp
from incoming_files
where actual_file_name = 'caption02.txt'
/
ACTUAL_FILE_NAME      FILE_SIZE FILE_TIMESTAMP                 
-------------------- ---------- --------------------------------
caption02.txt                67 29-SEP-20 20.47.21.000000000 GMT

…and waited long enough for it to finish being written before processing it…

select caption_text
from captions
where file_name = 'caption02.txt'
/
CAPTION_TEXT                                               
------------------------------------------------------------
Dress down day at the Sith Order
Tall...and handsome( in the dark)

Acknowledgements

As ever Oracle Base was a veritable font of wisdom on the issue of file watchers.

I also found lots of useful info in Aychin’s Oracle RDBMS Blog.
Finally, this Stack Exchange article was invaluable in solving the mystery of *nix directory permissions.

Advertisements
Report this ad

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK