Generating CSV files from PL/SQL the Very Easy Way.
source link: https://mikesmithers.wordpress.com/2020/02/13/generating-csv-files-from-pl-sql-the-very-easy-way/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Generating CSV files from PL/SQL the Very Easy Way.
This post is dedicated to Morten Braten and William Robertson.
Thanks to both of you for saving me a lot of time (and typing)…
XML, YAML, JSON. When it comes to transferring data between systems, they’ve all been flavour of the month at one time or another. However, good old CSV is still the format of choice when the batch window opens.
Not only is it simple, it adds comparatively little overhead to the size of the data file being generated.
OK – CSV used to mean “Comma-delimited” but these days it’s become synonymous with delimited data.
Whatever separator character you use, generating CSV is considerably easier these days…unless you’re using PL/SQL in a batch (as opposed to interactive) process.
That said, Connor MacDonald does have a clever way of utilising the functionality available in a command line tool such as SQL*Plus by means of a DBMS_SCHEDULER job.
If for some reason that won’t work for you, it looks like you’re going to have to hack out some custom PL/SQL to get the job done…or you could have a look at a couple of the options available in what other technologies would refer to as “the ecosystem”.
What I’m going to cover is :
Hand-cranking delimited code
We’re working in a Data Warehouse running on Oracle. We need to provide a data feed to a downstream system.
The feed is comma separated with values enclosed by quotes.
The query to generate the data for the feed is reasonably straight forward :
select
dept.department_id, dept.department_name,
loc.city, coun.country_id
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
;
-- this will be passed as a parameter in the run.
All ok so far. However, when it comes to tweaking the query to generate the output, things get a bit more fiddly.
The query in the package will have to change to something like this :
select
'"'
||dept.department_id||
'",'
||
'"'
||dept.department_name||
'",'
||
'"'
||loc.city||
'",'
||
'"'
||coun.country_id||
'"'
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
;
Typically, these sorts of files are generated with dozens (sometimes hundreds) of attributes. If looking at all of those quotes and commas is going to give you a headache, you might seek to mitigate the worst effects by replacing the characters with their ASCII values :
select
chr(34)||dept.department_id||chr(34)||chr(44)
||chr(34)||dept.department_name||chr(34)||chr(44)
||chr(34)||loc.city||chr(34)||chr(44)
||chr(34)||coun.country_id||chr(34)
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
;
The double-quotes surrounding the data values are designed to escape any commas in the data and prevent them from being read as field terminators. Unfortunately, this doesn’t help much if the data also contains double-quotes. For this reason, I tend to prefer a pipe character as a delimiter. This allows us to dispense with the double-quotes. As a double-pipe is the SQL concatenation operator, let’s use the ASCII value instead for the delimiter itself :
select
dept.department_id||chr(124)
||dept.department_name||chr(124)
||loc.city||chr(124)
||coun.country_id
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
;
Wouldn’t it be nice if there was some way we could just persuade Oracle to magically write our delimiters for us.
Well, I don’t know about Oracle, but Morten Braten will have a go for you…
The Alexandria CSV_UTIL_PKG package
As with most Alexandria packages, there’s a demo on how to use CSV_UTIL_PKG in demos/csv_util_pkg_demo.sql.
However, as explained in Morten’s post on this subject, you need to create some types first …
create
type t_str_array
as
table
of
varchar2(4000);
/
create
type t_csv_line
as
object (
line_number number,
line_raw varchar2(4000),
c001 varchar2(4000),
c002 varchar2(4000),
c003 varchar2(4000),
c004 varchar2(4000),
c005 varchar2(4000),
c006 varchar2(4000),
c007 varchar2(4000),
c008 varchar2(4000),
c009 varchar2(4000),
c010 varchar2(4000),
c011 varchar2(4000),
c012 varchar2(4000),
c013 varchar2(4000),
c014 varchar2(4000),
c015 varchar2(4000),
c016 varchar2(4000),
c017 varchar2(4000),
c018 varchar2(4000),
c019 varchar2(4000),
c020 varchar2(4000)
);
/
create
type t_csv_tab
as
table
of
t_csv_line;
/
With these types in place, we can install the package from the Alexandria GitHub repo.
The files we’re looking for are under the ora directory:
- csv_util_pkg.pks
- csv_util_pkg.pkb
Download them and run them in the order they are listed here ( the .pks is the header and the .pkb is the body).
Now, we can take some of the commas out of our code…
select
csv_util_pkg.array_to_csv(
t_str_array(
dept.department_id, dept.department_name,
loc.city, coun.country_id))
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
;
By default, the results are output using a comma as the separator. However, we can change this easily enough:
select
csv_util_pkg.array_to_csv(
t_str_array(dept.department_id, dept.department_name, loc.city, coun.country_id),
chr(124))
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
;
The output is pretty much what we’re looking for :
60|IT|Southlake|US
50|Shipping|South San Francisco|US
10|Administration|Seattle|US
30|Purchasing|Seattle|US
90|Executive|Seattle|US
100|Finance|Seattle|US
110|Accounting|Seattle|US
120|Treasury|Seattle|US
130|Corporate Tax|Seattle|US
140|Control And Credit|Seattle|US
150|Shareholder Services|Seattle|US
160|Benefits|Seattle|US
170|Manufacturing|Seattle|US
180|Construction|Seattle|US
190|Contracting|Seattle|US
200|Operations|Seattle|US
210|IT Support|Seattle|US
220|NOC|Seattle|US
230|IT Helpdesk|Seattle|US
240|Government Sales|Seattle|US
250|Retail Sales|Seattle|US
260|Recruiting|Seattle|US
270|Payroll|Seattle|US
20|Marketing|Toronto|CA
Now all you need to do is put this into your PL/SQL procedure and write some code to save it to a file ( possibly using the Alexandria FILE_UTIL_PKG package).
If that seems like too much effort though…
William Robertson’s CSV package
Whilst the Alexandria package is a bit of an all-rounder, William Robertson has produced something that’s rather more tailored to producing CSV files.
The package – simply called CSV – has a function that produces delimited format from a refcursor. It also contains a procedure to write csv data to a file.
Let’s take a closer look…
Installation of the package simply involves downloading the code from here, and running it. Both the package header and body are in a single file – csv.pkg.
The package’s REPORT function takes a slightly different approach in that it takes a ref cursor as an argument :
select
column_value
from
table
(csv.report(
cursor
(
select
dept.department_id, dept.department_name,
loc.city, coun.country_id
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
)))
/
Once again, the delimiter character is comma by default. Once again, we can override this :
select
column_value
from
table
(csv.report(
cursor
(
select
dept.department_id, dept.department_name,
loc.city, coun.country_id
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
),
chr(124)))
/
We can go even further. You can add common things that you may need to include in a csv that’s serving as a feed file for another system. These include :
- a header record with a delimited list of attributes
- a label attribute for each row of data to make these rows easier to identify for the program loading the csv
- a rowcount as a trailer record
All of which can be accomplished thus :
select
column_value
from
table
(csv.report(
cursor
(
select
dept.department_id, dept.department_name,
loc.city, coun.country_id
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name =
'Americas'
),
p_separator => chr(124),
p_label =>
'DATA'
,
p_heading =>
'Y'
,
p_rowcount =>
'Y'
))
/
[show
output
]
Not only that, but the package also facilitates creating the file itself…
Generating a file from the CSV package
First of all, we need access to a directory. So, the DBA needs to do something like :
grant
read
, write
on
directory app_files
to
hr;
Now, connected as HR, we can create a PL/SQL procedure to generate our csv files like this :
create
or
replace
procedure
depts_by_region( i_region regions.region_name%type)
as
l_fname varchar2(100);
rc sys_refcursor;
begin
l_fname := i_region||
'_depts.csv'
;
open
rc
for
select
dept.department_id, dept.department_name,
loc.city, coun.country_id
from
departments dept
inner
join
locations loc
on
dept.location_id = loc.location_id
inner
join
countries coun
on
loc.country_id = coun.country_id
inner
join
regions reg
on
reg.region_id = coun.region_id
where
reg.region_name = i_region;
csv.write_file(
p_dataset => rc,
p_separator => chr(124),
p_label =>
'DATA'
,
p_heading =>
'Y'
,
p_rowcount =>
'Y'
,
p_directory =>
'APP_FILES'
,
p_filename => l_fname);
end
;
/
…and run it like this :
begin
depts_by_region(
'Americas'
);
end
;
/
Sure enough, when we look on the server, we can see :
[mike@frea ~]$
cd
/opt/oracle/app_files/
[mike@frea app_files]$
ls
-l *.csv
-rw-r--r-- 1 oracle oinstall 840 Feb 11 20:56 Americas_depts.csv
…and the file looks like this :
HEADING|DATA|DEPARTMENT_ID|DEPARTMENT_NAME|CITY|COUNTRY_ID
DATA|60|IT|Southlake|US
DATA|50|Shipping|South San Francisco|US
DATA|10|Administration|Seattle|US
DATA|30|Purchasing|Seattle|US
DATA|90|Executive|Seattle|US
DATA|100|Finance|Seattle|US
DATA|110|Accounting|Seattle|US
DATA|120|Treasury|Seattle|US
DATA|130|Corporate Tax|Seattle|US
DATA|140|Control And Credit|Seattle|US
DATA|150|Shareholder Services|Seattle|US
DATA|160|Benefits|Seattle|US
DATA|170|Manufacturing|Seattle|US
DATA|180|Construction|Seattle|US
DATA|190|Contracting|Seattle|US
DATA|200|Operations|Seattle|US
DATA|210|IT Support|Seattle|US
DATA|220|NOC|Seattle|US
DATA|230|IT Helpdesk|Seattle|US
DATA|240|Government Sales|Seattle|US
DATA|250|Retail Sales|Seattle|US
DATA|260|Recruiting|Seattle|US
DATA|270|Payroll|Seattle|US
DATA|20|Marketing|Toronto|CA
ROW_COUNT|DATA|24
There are some limitations. For example, William points out that any row that’s longer than 4000 characters will cause the package to break. However, unless you’re generating fairly “wide” lines, he’s pretty much written your code for you.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK