How To Use SQL to Query an XML Document Using Oracle External Tables
Oracle’s external table feature allows you to treat structured files on the database server as tables for query and DML purposes. To use an example from the linked documentation, a tab-delimited file like this:
56 november, 15, 1980 baker mary alice 09/01/2004
87 december, 20, 1970 roper lisa marie 01/01/1999
…could be queried like this:
SELECT employee_number,
employee_first_name,
substr(employee_middle_name, 1, 1),
employee_last_name,
employee_hire_date,
to_date(employee_dob,'month, dd, yyyy')
FROM employees;
After defining the external table like this:
CREATE TABLE employees
(employee_number CHAR(5),
employee_dob CHAR(20),
employee_last_name CHAR(20),
employee_first_name CHAR(15),
employee_middle_name CHAR(15),
employee_hire_date DATE)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
DEFAULT DIRECTORY def_dir1
ACCESS PARAMETERS
(RECORDS DELIMITED BY NEWLINE
FIELDS (employee_number CHAR(2),
employee_dob CHAR(20),
employee_last_name CHAR(18),
employee_first_name CHAR(11),
employee_middle_name CHAR(11),
employee_hire_date CHAR(10) date_format DATE mask "mm/dd/yyyy"
)
)
LOCATION ('info.dat')
);
Assuming of course that the file’s name is info.dat
, and its path on the filesystem has already been created as a directory object DEF_DIR1
in the database.
A similar technique exists for querying simple XML files, taking advantage of Oracle’s XML DB features. HT to odie_63 on the XML DB forums for introducing me to this method.
Suppose you have a simple XML file (elements, but no attributes; no complex column types - it could work for the preceding, but I’m not sure) like so.
<Employees>
<Employee>
<id>1234</id>
<lastname>Jetson</lastname>
<firstname>George</firstname>
<department>10</department>
<salary>50000.00</salary>
</Employee>
<Employee>
<id>1235</id>
<lastname>Crackorn</lastname>
<firstname>James</firstname>
<department>10</department>
<salary>40000.00</salary>
</Employee>
<Employee>
<id>1236</id>
<lastname>Hoffman</lastname>
<firstname>Burl</firstname>
<department>20</department>
<salary>75000.00</salary>
</Employee>
</Employees>
Place it on the database server in a readable directory, e.g. /home/oracle
. Name it employees.xml
.
Now execute the following in SQL*Plus:
SQL> CREATE OR REPLACE DIRECTORY XML_DIR AS '/home/oracle';
SQL> CREATE OR REPLACE VIEW employees_v AS
SELECT *
FROM XMLTable('/Employees/Employee'
passing xmltype(
bfilename('XML_DIR','employees.xml')
, nls_charset_id('AL32UTF8')
)
columns id number path 'id'
, lastname varchar2(30) path 'lastname'
, firstname varchar2(30) path 'firstname'
, department number path 'department'
, salary number path 'salary'
);
SQL> SELECT * FROM employees_v;
ID LASTNAME FIRSTNAME DEPARTMENT SALARY
---- -------- --------- ---------- ------
1234 Jetson George 10 50000
1235 Crackorn James 10 40000
1236 Hoffman Burl 20 75000
You can modify construction of the view to read from multiple files by using a simple UNION
operator, or you can make the source filename dynamic so you can switch datasets at runtime. Here’s how you might do something like that:
SQL> CREATE OR REPLACE VIEW employees_v AS
SELECT *
FROM XMLTable('/Employees/Employee'
passing xmltype(
bfilename('XML_DIR', userenv('CLIENT_INFO'))
, nls_charset_id('AL32UTF8')
)
columns id number path 'id'
, lastname varchar2(30) path 'lastname'
, firstname varchar2(30) path 'firstname'
, department number path 'department'
, salary number path 'salary'
);
Then at runtime, specify the file you want to use like this:
SQL> exec dbms_application_info.set_client_info('employees_01.xml');
SQL> SELECT * FROM employees_v;
ID LASTNAME FIRSTNAME DEPARTMENT SALARY
---- -------- --------- ---------- ------
1234 Jetson George 10 50000
1235 Crackorn James 10 40000
1236 Hoffman Burl 20 75000
SQL> exec dbms_application_info.set_client_info('employees_02.xml');
SQL> SELECT * FROM employees_v;
ID LASTNAME FIRSTNAME DEPARTMENT SALARY
---- -------- --------- ---------- ------
1234 Ringwald Malory 30 63000
1235 Dulles Bob 10 100000
1236 Stein Frank 50 55000
Why might this be useful? Suppose you have an arbitrary number of identically structured XML files in a directory that you want to import into a relational database table. Using a shell script you can, for each file in the directory,
- log in to SQL*Plus
- set the
CLIENT_INFO
variable to the name of the current file - perform a simple insert into your table reading all records from the view
You might even create a cron job to monitor a particular directory for new XML files, processing them into tables and then deleting or archiving the files when done.