XML - Managing Data Exchange/Database and XML



Previous Chapter Next Chapter
XMLHTTP SyncML



Learning objectives
  • Learn about Native XML Databases
  • Learn about the conversion technology available
  • Create Table and retrieve information


Native XML Database edit

The term Native XML database: has become popular since 1999, after the company Software AG released the first version of its native XML server Tamino, which included a native XML database. A definition of a native databases is that it:

"[d]efines a (logical) model for an XML document and stores and retrieves documents according to that model." (Bourret, 2002)"

To model data in XML, two principle approaches are used: Data-centric documents and Document-centric documents.

  • Data-centric documents (for data transport) have fairly regular structure, order typically does not matter, and little or no mixed content.
  • Document-centric documents (usually for human consumption) have less regular or irregular structure, significant order of the elements, and lots of mixed content.


Examples of Native databases


Product Developer License DB Type
Tamino Software AG Commercial Proprietary. Relational through ODBC.
XediX Multimedia Solution XediX Tera Solution Commercial Proprietary
eXist Wolfgang Meier Open Source Relational
dbXML dbXML Group Open Source Proprietary
Xindice Apache Software Foundation Open Source Proprietary (Model-based)


eXist edit

eXist is an Open Source effort to develop a native XML database system, tightly integrated with existing XML development tools like Apache's Cocoon. The database may be easily deployed, running either standalone, inside a servlet engine, or directly embedded in an application.

Some features that are available in eXist and that can be found in most Native XML databases are :


  • Schema-less storage - Documents do not have to be associated to schema or document type, meaning they are allowed to be well formed only.
  • Collections - A collection plays a similar role to a directory in a file system. When submitting a query the user can choose a distinct part of the collection hierarchy or even all the documents contained in the database.
  • Query languages - The most popular query languages supported by Native XML databases are XPath (with extensions for queries over multiple documents) and XQuery.

Relational Databases edit

Database vendors such as IBM, Microsoft, Oracle, and Sybase have developed tools to assist in converting XML documents into relational tables.

Let us look at IBM and Oracle:


IBM Technology edit

DB2 XML Extender provides access, storage and transformation for XML data through user-defined functions and stored procedure. It offers 2 key storage models: XML Colums and XML Collections.


1. XML Column: stores and retrieves entire XML documents as DB2 column data. Use of XML Columns is recommended when XML documents already exist and/or when there is a need to store XML documents in their entity.

2. XML Collection: composes XML Documents from a collection of relational tables.


A data access definition (DAD) file is used for both XML Column and XML Collection approaches to define the "mapping" between the database tables and the structure of the XML document.

<Xcollection> Specifies that the XML data is either to be decomposed from XML documents into a collection of relational tables, or to be composed into XML documents from a collection of relational tables.

The DAD file defines the XML document tree structure, using the following kinds of nodes:

  • root_node - Specifies the root element of the document.
  • element_node - Identifies an element, which can be the root element or a child element.
  • text_node - Represents the CDATA text of an element.
  • attribute_node - Represents an attribute of an element.


<?xml version="1.0"?> 
<!DOCTYPE DAD SYSTEM ""c:\dxx\samples\db2xml\dtd\dad.dtd"> 
<DAD> 
  ...
<Xcollection> 
<SQL_stmt> 
       ...
</SQL_stmt> 
<prolog>?xml version="1.0"?</prolog> 
<doctype>!DOCTYPE Order SYSTEM
                  ""c:\dxx\samples\db2xml\dtd\getstart.dtd""</doctype>
<root_node> 
 <element_node name="Order">      --> Identifies the element <Order>
  <attribute_node name="key">     --> Identifies the attribute "key" 
   <column name="order_key"/>     --> Defines the name of the column, 
                                      "order_key", to which the
                                      element and attribute are
                                      mapped
  </attribute_node> 
  <element_node name="Customer">  --> Identifies a child element of 
                                      <Order> as <Customer>
   <text_node>                    --> Specifies the CDATA text for
                                      the element <Customer>
    <column name="customer">      --> Defines the name of the column,
                                      "customer", to which the child
                                      element is mapped
   </text_node> 
  </element_node> 
        ...
 </element_node>

      ...
</root_node> 
</Xcollection>
</DAD>



Oracle edit

Oracle's XML SQL Utility (XSU) uses a schematic mapping that defines how to map tables and views, including object-relational features, to XML documents. Oracle translates the chain of object references from the database into the hierarchical structure of XML elements.


CREATE TABLE Customers 
{
     FIRSTNAME      VARCHAR,
     LASTNAME       VARCHAR,
     PHONENO      INT,
     ADDRESS        AddressType, // object reference
     
}

CREATE TYPE AddressType as OBJECT
{  
     ZIP       VARCHAR (100),
     CITY      VARCHAR (100),
     STREET    VARCHAR (100),
}


A corresponding XML document generated from the given object-relational model looks like:

<?xml version="1.0"?>
<ROWSET>
     <ROW num="1"> 

        <FIRSTNAME>JOHN</FIRSTNAME>
        <LASTNAME>SMITH</LASTNAME>
        <PHONENO>7061234567</PHONENO>
       
        <ADDRESS>
   <ZIP>30601</ZIP>
   <CITY>ATHENS</CITY>
   <STREET>123 MAIN STREEET</STREET>
        </ADDRESS>

    </ROW>

    <!-- additional rows ... -->

 </ROWSET>


XSU can be used for executing queries in a Java environment and retrieve XML from the database.

import oracle.jdbc.driver.*;
import oracle.xml.sql.query.OracleXMLQuery;
import java.lang.*;
import java.sql.*;

// class to test XML document generation as String
class testXMLSQL {

   public static void main(String[] args)
   {
     try {
      // Create the connection
      Connection conn  = getConnection("root","");

      // Create the query class
      OracleXMLQuery qry = new OracleXMLQuery(conn,
         "SELECT  * FROM Customers");

      // Get the XML string
      String str = qry.getXMLString();

      // Print the XML output
      System.out.println("The XML output is:\n"+str);

      // Always close the query to get rid of any resources..
      qry.close();
     } catch(SQLException e) {
      System.out.println(e.toString());
     }
   }

   // Get the connection given the user name and password.!
   private static Connection getConnection(String username,
        String password)
        throws SQLException
   {
      // register the JDBC driver..
       DriverManager.registerDriver(new 
          oracle.jdbc.driver.OracleDriver());

      // Create the connection using the OCI8 driver
       Connection conn =
        DriverManager.getConnection(
           "jdbc:oracle:thin:@dlsun489:1521:ORCL",username,password);

      return conn;
   }
}

Query Languages edit

XPath edit

XPath is a language for addressing parts of an XML document, and is the common locator used by both XSLT and XPointer. An XPath expression is a series of location steps separated by " / ". Each step selects a set of nodes that become the current node(s) for the next step. The set of nodes selected by the expression are the nodes remaining after processing each step in order.


XQuery edit

XQuery is a query language under development by the World Wide Web Consortium (W3C). The ambitious task is to develop the first world standard for querying Web documents. XQuery is a versatile markup language, capable of labeling the information content of diverse data sources including structured and semi-structured documents, relational databases, and object repositories.

MySQL 5.1 edit

MySQL has a command line utility for executing queries against a MySQL database; it has an option for using XML as their output format. MySQL also allows convertion to XML; more information can be found in Converting MySQL to XML MySQL allows users to execute any SQL query. mysqldump allows users to specify which tables to dump and to specify a where clause to restrict the rows that are dumped. In its Beta release of MySQL 5.1, several features have been added including new XML functions.

In order to understand these New functions, we will use the following table:

CREATE TABLE Customers (doc VARCHAR(150));

INSERT INTO Customers VALUES
('
<person id="1">
      <firstname>John</firstname>
      <lastname>Smith</lastname>
      <phoneno>123-5678</phoneno>
</person>
');

INSERT INTO Customers VALUES
('
<person id="2">
      <firstname>Aminata</firstname>
      <lastname>Cisse</lastname>
      <phoneno>123-5679</phoneno>
</person>
');

INSERT INTO Customers VALUES
('
<person id="3">
      <firstname>Lamine</firstname>
      <lastname>Smith</lastname>
      <phoneno>123-5680</phoneno>
</person>
');


XML Functions edit

MySQL version 5.1 has functions for searching and changing XML documents: ExtractValue() and UpdateXML().


  • EXTRACTVALUE (XML_document, XPath_string);

This function takes 2 string arguments: The first parameter correspond to the XML_document string, and the 2nd Parameter XPath_string (XPath expression / locator). This will result in the return of the string containing a value from the document.

mysql> SELECT EXTRACTVALUE(doc,'//firstname') FROM Customers;

+------------------------------------------+
| EXTRACTVALUE(doc,'//firstname')          |
+------------------------------------------+
| John                                     | 
| Aminata                                  | 
| Lamine                                   | 
+------------------------------------------+
3 rows in set (0.01 sec)


mysql> SELECT ExtractValue(doc,'/person[@id="3"]/firstname') as fname FROM Customers;

+---------+
| fname   |
+---------+
|         | 
|         | 
| Lamine  | 
+---------+
3 rows in set (0.02 sec)



  • UPDATEXML (XML_document, XPath_string, new_value);

This function takes 3 string arguments: The first two paramaters are similar to the ones used with extractValue(), XML_document and XPath_string. The third parameter is the new value that will replace the one found. This function will then returns the changed XML.

mysql> SELECT UpdateXML(doc,'/person[@id="3"]/phoneno', '<phoneno>111-2233<phoneno>') FROM Customers;


+-------------------------------------------------------------------------------
----------------------------------------------------+
| UpdateXML(doc,'/person[@id="3"]/phoneno','<phoneno>111-2233<phoneno>')
                                                    |
+-------------------------------------------------------------------------------
----------------------------------------------------+
|
<person id="1">
      <firstname>John</firstname>
      <lastname>Smith</lastname>
      <phoneno>123-5678</phoneno>
</person>
         |
|
<person id="2">
      <firstname>Aminata</firstname>
      <lastname>Cisse</lastname>
      <phoneno>123-5679</phoneno>
</person>
      |
|
<person id="3">
      <firstname>Lamine</firstname>
      <lastname>Smith</lastname>
      <phoneno>111-2233<phoneno>
</person>
 |
+-------------------------------------------------------------------------------
----------------------------------------------------+
3 rows in set (0.00 sec)



Installation edit

Currently (04/05/06) MySQL 5.1 does not come with the installer (Beta Version).


Quick Windows installation.

Details information can be found in the online Manual:

Summary edit

Exercises edit

Answers edit