Generating CSV files from PL/SQL the Very Easy Way.

Posted on February 13, 2020

This post is dedicated to Morten Braten and William Robertson.
Thanks to both of you for saving me a lot of time (and typing)…

XML, YAML, JSON. When it comes to transferring data between systems, they’ve all been flavour of the month at one time or another. However, good old CSV is still the format of choice when the batch window opens.
Not only is it simple, it adds comparatively little overhead to the size of the data file being generated.

OK – CSV used to mean “Comma-delimited” but these days it’s become synonymous with delimited data.
Whatever separator character you use, generating CSV is considerably easier these days…unless you’re using PL/SQL in a batch (as opposed to interactive) process.
That said, Connor MacDonald does have a clever way of utilising the functionality available in a command line tool such as SQL*Plus by means of a DBMS_SCHEDULER job.

If for some reason that won’t work for you, it looks like you’re going to have to hack out some custom PL/SQL to get the job done…or you could have a look at a couple of the options available in what other technologies would refer to as “the ecosystem”.
What I’m going to cover is :

The pain of hand-cranking delimited SQL A solution offered in Morten Braten’s Alexandria PL/SQL Utility Library An alternative solution made available by William Robertson The final PL/SQL procedure

Hand-cranking delimited code

We’re working in a Data Warehouse running on Oracle. We need to provide a data feed to a downstream system.
The feed is comma separated with values enclosed by quotes.
The query to generate the data for the feed is reasonably straight forward :

select dept.department_id, dept.department_name,

loc.city, coun.country_id

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas'; -- this will be passed as a parameter in the run.

All ok so far. However, when it comes to tweaking the query to generate the output, things get a bit more fiddly.
The query in the package will have to change to something like this :

select '"'||dept.department_id||'",'

||'"'||dept.department_name||'",'

||'"'||loc.city||'",'

||'"'||coun.country_id||'"'

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas';

Typically, these sorts of files are generated with dozens (sometimes hundreds) of attributes. If looking at all of those quotes and commas is going to give you a headache, you might seek to mitigate the worst effects by replacing the characters with their ASCII values :

select chr(34)||dept.department_id||chr(34)||chr(44)

||chr(34)||dept.department_name||chr(34)||chr(44)

||chr(34)||loc.city||chr(34)||chr(44)

||chr(34)||coun.country_id||chr(34)

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas';

The double-quotes surrounding the data values are designed to escape any commas in the data and prevent them from being read as field terminators. Unfortunately, this doesn’t help much if the data also contains double-quotes. For this reason, I tend to prefer a pipe character as a delimiter. This allows us to dispense with the double-quotes. As a double-pipe is the SQL concatenation operator, let’s use the ASCII value instead for the delimiter itself :

select dept.department_id||chr(124)

||dept.department_name||chr(124)

||loc.city||chr(124)

||coun.country_id

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas';

Wouldn’t it be nice if there was some way we could just persuade Oracle to magically write our delimiters for us.
Well, I don’t know about Oracle, but Morten Braten will have a go for you…

The Alexandria CSV_UTIL_PKG package

As with most Alexandria packages, there’s a demo on how to use CSV_UTIL_PKG in demos/csv_util_pkg_demo.sql.
However, as explained in Morten’s post on this subject, you need to create some types first …

create type t_str_array as table of varchar2(4000);

/

create type t_csv_line as object (

line_number number,

line_raw varchar2(4000),

c001 varchar2(4000),

c002 varchar2(4000),

c003 varchar2(4000),

c004 varchar2(4000),

c005 varchar2(4000),

c006 varchar2(4000),

c007 varchar2(4000),

c008 varchar2(4000),

c009 varchar2(4000),

c010 varchar2(4000),

c011 varchar2(4000),

c012 varchar2(4000),

c013 varchar2(4000),

c014 varchar2(4000),

c015 varchar2(4000),

c016 varchar2(4000),

c017 varchar2(4000),

c018 varchar2(4000),

c019 varchar2(4000),

c020 varchar2(4000)

);

/

create type t_csv_tab as table of t_csv_line;

/

With these types in place, we can install the package from the Alexandria GitHub repo.
The files we’re looking for are under the ora directory:

csv_util_pkg.pks
csv_util_pkg.pkb

Download them and run them in the order they are listed here ( the .pks is the header and the .pkb is the body).

Now, we can take some of the commas out of our code…

select csv_util_pkg.array_to_csv(

t_str_array(

dept.department_id, dept.department_name,

loc.city, coun.country_id))

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas';

By default, the results are output using a comma as the separator. However, we can change this easily enough:

select csv_util_pkg.array_to_csv(

t_str_array(dept.department_id, dept.department_name, loc.city, coun.country_id),

chr(124))

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas';

The output is pretty much what we’re looking for :

60|IT|Southlake|US

50|Shipping|South San Francisco|US

10|Administration|Seattle|US

30|Purchasing|Seattle|US

90|Executive|Seattle|US

100|Finance|Seattle|US

110|Accounting|Seattle|US

120|Treasury|Seattle|US

130|Corporate Tax|Seattle|US

140|Control And Credit|Seattle|US

150|Shareholder Services|Seattle|US

160|Benefits|Seattle|US

170|Manufacturing|Seattle|US

180|Construction|Seattle|US

190|Contracting|Seattle|US

200|Operations|Seattle|US

210|IT Support|Seattle|US

220|NOC|Seattle|US

230|IT Helpdesk|Seattle|US

240|Government Sales|Seattle|US

250|Retail Sales|Seattle|US

260|Recruiting|Seattle|US

270|Payroll|Seattle|US

20|Marketing|Toronto|CA

Now all you need to do is put this into your PL/SQL procedure and write some code to save it to a file ( possibly using the Alexandria FILE_UTIL_PKG package).
If that seems like too much effort though…

William Robertson’s CSV package

Whilst the Alexandria package is a bit of an all-rounder, William Robertson has produced something that’s rather more tailored to producing CSV files.

The package – simply called CSV – has a function that produces delimited format from a refcursor. It also contains a procedure to write csv data to a file.
Let’s take a closer look…

Installation of the package simply involves downloading the code from here, and running it. Both the package header and body are in a single file – csv.pkg.

The package’s REPORT function takes a slightly different approach in that it takes a ref cursor as an argument :

select column_value

from table(csv.report(cursor(

select dept.department_id, dept.department_name,

loc.city, coun.country_id

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas')))

/

Once again, the delimiter character is comma by default. Once again, we can override this :

select column_value

from table(csv.report(cursor(

select dept.department_id, dept.department_name,

loc.city, coun.country_id

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas'),

chr(124)))

/

We can go even further. You can add common things that you may need to include in a csv that’s serving as a feed file for another system. These include :

a header record with a delimited list of attributes
a label attribute for each row of data to make these rows easier to identify for the program loading the csv
a rowcount as a trailer record

All of which can be accomplished thus :

select column_value

from table(csv.report(cursor(

select dept.department_id, dept.department_name,

loc.city, coun.country_id

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = 'Americas'),

p_separator => chr(124),

p_label => 'DATA',

p_heading => 'Y',

p_rowcount => 'Y'))

/

[show output]

Not only that, but the package also facilitates creating the file itself…

Generating a file from the CSV package

First of all, we need access to a directory. So, the DBA needs to do something like :

grant read, write on directory app_files to hr;

Now, connected as HR, we can create a PL/SQL procedure to generate our csv files like this :

create or replace procedure depts_by_region( i_region regions.region_name%type)

as

l_fname varchar2(100);

rc sys_refcursor;

begin

l_fname := i_region||'_depts.csv';

open rc for

select dept.department_id, dept.department_name,

loc.city, coun.country_id

from departments dept

inner join locations loc

on dept.location_id = loc.location_id

inner join countries coun

on loc.country_id = coun.country_id

inner join regions reg

on reg.region_id = coun.region_id

where reg.region_name = i_region;

csv.write_file(

p_dataset => rc,

p_separator => chr(124),

p_label => 'DATA',

p_heading => 'Y',

p_rowcount => 'Y',

p_directory => 'APP_FILES',

p_filename => l_fname);

end;

/

…and run it like this :

begin

depts_by_region('Americas');

end;

/

Sure enough, when we look on the server, we can see :

[mike@frea ~]$ cd /opt/oracle/app_files/

[mike@frea app_files]$ ls -l *.csv

-rw-r--r-- 1 oracle oinstall 840 Feb 11 20:56 Americas_depts.csv

…and the file looks like this :

DATA|60|IT|Southlake|US

DATA|50|Shipping|South San Francisco|US

DATA|10|Administration|Seattle|US

DATA|30|Purchasing|Seattle|US

DATA|90|Executive|Seattle|US

DATA|100|Finance|Seattle|US

DATA|110|Accounting|Seattle|US

DATA|120|Treasury|Seattle|US

DATA|130|Corporate Tax|Seattle|US

DATA|140|Control And Credit|Seattle|US

DATA|150|Shareholder Services|Seattle|US

DATA|160|Benefits|Seattle|US

DATA|170|Manufacturing|Seattle|US

DATA|180|Construction|Seattle|US

DATA|190|Contracting|Seattle|US

DATA|200|Operations|Seattle|US

DATA|210|IT Support|Seattle|US

DATA|220|NOC|Seattle|US

DATA|230|IT Helpdesk|Seattle|US

DATA|240|Government Sales|Seattle|US

DATA|250|Retail Sales|Seattle|US

DATA|260|Recruiting|Seattle|US

DATA|270|Payroll|Seattle|US

DATA|20|Marketing|Toronto|CA

ROW_COUNT|DATA|24

There are some limitations. For example, William points out that any row that’s longer than 4000 characters will cause the package to break. However, unless you’re generating fairly “wide” lines, he’s pretty much written your code for you.

Generating CSV files from PL/SQL the Very Easy Way.

Generating CSV files from PL/SQL the Very Easy Way.

Hand-cranking delimited code

The Alexandria CSV_UTIL_PKG package

William Robertson’s CSV package

Generating a file from the CSV package

Recommend

Attending and speaking at conferences

CALISDO: Threat Modeling for Distributed Designs

Is there a CAP theorem for Durability?

Why Companies Should Pay For Their Employees To Attend Conferences

Heuristic Traps for Systems Operators

Safely Creating And Using Temporary Files

Why Must Systems Be Operated?

Patching is hard. Knowing what to patch is harder still.

Reading an OS file from inside an Oracle Database using an External Table

Is the Mean Really Useless?

About Joyk