PHP and MySQL Programming/XML and PHP

Introduction

XML (eXtensible Markup Language) is widely used in mainstream development. It may have started as a web standard, but it is now also used in more traditional applications as a document format. For example, the OpenDocument Format used by Sun in their StarOffice and OpenOffice suites is based on XML.

Because of its widespread use in the IT industry, it is fitting that we as PHP developers know how to work with XML files in our PHP applications.

XML Structure

Since XML documents are extensible, there is no fixed set of tags: you can define whatever tags you need to describe your data. Here is an example of a simple XML document:

<?xml version="1.0"?>
<document>
   <title>Isn't this simple!</title>
   <body>XML is as simple as pie. :-)</body>
</document>

The reason it looks so simple is that it is so simple! Just as in HTML, elements are enclosed in angle brackets, "<" and ">", and an end tag differs from its start tag only by the forward slash, "/", that follows the opening bracket.

Creating an XML Parser in PHP

Defining the XML Parser

In PHP, you define an XML Parser by using the xml_parser_create() function as shown below.

<?php
$parser = xml_parser_create(ENCODING);
?>

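You can think of the $parser variable as the parsing engine for the XML document. ENCODING is optional. Since PHP 5 the encoding of the input is detected automatically, and ENCODING only selects the encoding in which data is passed to your handlers. It can be one of:

1. ISO-8859-1

2. US-ASCII

3. UTF-8 (the default since PHP 5.0.2; earlier versions defaulted to ISO-8859-1)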
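As a minimal sketch of the encoding argument, the snippet below creates one parser per supported encoding and reads the setting back with xml_parser_get_option(). The variable names $encodings and $targets are only illustrative.

```php
<?php
// Create one parser per supported encoding and read the setting back.
// XML_OPTION_TARGET_ENCODING is the encoding in which handlers receive data.
$encodings = array("ISO-8859-1", "US-ASCII", "UTF-8");
$targets = array();

foreach ($encodings as $encoding) {
    $parser = xml_parser_create($encoding);
    $targets[] = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);
}

echo implode(", ", $targets), "\n";
```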

Defining the Element Handlers

Element handlers are defined by means of the xml_set_element_handler() function as follows:

<?php
xml_set_element_handler(XML_PARSER, START_FUNCTION, END_FUNCTION);
?>

The three arguments accepted by the xml_set_element_handler() function are:

1. XML_PARSER - The variable that you created when you called the xml_parser_create() function.

2. START_FUNCTION - The name of the function to call when the parser encounters a start element.

3. END_FUNCTION - The name of the function to call when the parser encounters an end element.

e.g.:

 <?php
 $parser = xml_parser_create();
 xml_set_element_handler($parser, "startElement", "endElement");
 ?>
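The following sketch shows what the two handlers receive. It assumes a small inline document and a hypothetical $events array that simply records each call. Note that element names arrive in upper case, because case folding is enabled by default.

```php
<?php
// Hypothetical handlers: record each start and end event in $events.
$events = array();

function startElement($parser, $name, $attributes) {
    global $events;
    // $attributes is an associative array of attribute name => value.
    $events[] = "start " . $name . " (" . count($attributes) . " attributes)";
}

function endElement($parser, $name) {
    global $events;
    $events[] = "end " . $name;
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "startElement", "endElement");
// The third argument marks this string as the final piece of the document.
xml_parse($parser, '<document lang="en"><title>Hi</title></document>', true);

print_r($events);
```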

Defining Character Handlers

Character handlers are created by means of the xml_set_character_data_handler() function as follows:

<?php
xml_set_character_data_handler(XML_PARSER, CHARACTER_FUNCTION);
?>

The two arguments accepted by the xml_set_character_data_handler() function are:

1. XML_PARSER - The variable that you created when you called the xml_parser_create() function.

2. CHARACTER_FUNCTION - The name of the function to call when the parser encounters character data.
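A character handler may be called several times for a single run of text, so a handler usually appends to a buffer instead of overwriting it. The sketch below assumes a hypothetical global $text buffer and a one-element inline document.

```php
<?php
// Hypothetical buffer that collects all character data seen by the parser.
$text = "";

function characterData($parser, $data) {
    global $text;
    // Append rather than assign: one text node may arrive in several pieces.
    $text .= $data;
}

$parser = xml_parser_create();
xml_set_character_data_handler($parser, "characterData");
xml_parse($parser, "<body>Fish &amp; chips</body>", true);

echo $text, "\n";
```

By the time the handler sees the data, entities such as &amp;amp; have already been decoded by the parser.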

Starting the Parser

Finally, to start the parser, we call the xml_parse() function as follows:

<?php
xml_parse(XML_PARSER, XML);
?>

The two arguments accepted by the xml_parse() function are:

1. XML_PARSER - The variable that you created when you called the xml_parser_create() function.

2. XML - A string containing the XML that is to be parsed.

e.g.:

 <?php
 $f = fopen("simple.xml", 'r');
 $data = fread($f, filesize("simple.xml"));
 fclose($f);   # close the file once its contents are in memory
 xml_parse($parser, $data);
 ?>
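xml_parse() returns a false value when the data is not well formed, so it is worth checking the result. The sketch below wraps the check in a hypothetical helper, parse_or_error(), and uses inline strings instead of simple.xml so that it runs on its own. The exact wording of the error message depends on the underlying XML library.

```php
<?php
// Hypothetical helper: parse a string and return null on success,
// or a readable error message on failure.
function parse_or_error($xml) {
    $parser = xml_parser_create();
    // The third argument tells the parser that no more data will follow.
    if (xml_parse($parser, $xml, true)) {
        return null;
    }
    return sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser));
}

$good = "<document><title>ok</title></document>";
$bad  = "<document><title>oops</document>";   // <title> is never closed

var_dump(parse_or_error($good));
echo parse_or_error($bad), "\n";
```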

Cleaning Up

After parsing an XML document, it is considered good practice to free the memory held by the parser. This is done by calling the xml_parser_free() function as follows (since PHP 8.0 the parser is an object that is released automatically, and the call has no effect):

<?php
xml_parser_free(XML_PARSER);
?>

Example

 <?php
 # --- Element Functions ---
 
 function startElement($parser, $name, $attributes){
    # ... some code
 }
 
 function endElement ($parser, $name){
    # ... some code
 }
 
 function characterData ($parser, $data){
    # ... some code
 }
 
 function load_data($file){
    # Read the whole file into a string and close the handle again.
    $f = fopen ($file, 'r');
    $data = fread($f, filesize($file));
    fclose($f);
    return $data;
 }
 
 # --- Main Program Body ---
 $file = "simple.xml";
 $parser = xml_parser_create();
 xml_set_element_handler($parser, "startElement", "endElement");
 xml_set_character_data_handler($parser, "characterData");
 xml_parse ($parser, load_data($file));
 xml_parser_free($parser);
 ?>

Parsing XML Documents

We have seen the steps needed to parse an XML document with PHP. Let's take a moment to reflect on how these steps are interconnected.

When xml_parse() is called, PHP reads through the XML data from start to finish. Each time it finds a start tag, it calls the start-element function that you, the programmer, registered. The same thing happens, with the corresponding handlers, when PHP encounters the text between tags and the end tags.
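To make the order of these calls visible, here is a small sketch that records every handler call for the simple document shown earlier. The $trace array and the inline copy of the document are assumptions made for illustration.

```php
<?php
// Inline copy of the simple document, so the sketch runs without a file.
$xml = <<<XML
<?xml version="1.0"?>
<document>
   <title>Isn't this simple!</title>
   <body>XML is as simple as pie. :-)</body>
</document>
XML;

// Hypothetical log of handler calls, in the order the parser makes them.
$trace = array();

function startElement($parser, $name, $attributes) {
    global $trace;
    $trace[] = "start:" . $name;
}

function endElement($parser, $name) {
    global $trace;
    $trace[] = "end:" . $name;
}

function characterData($parser, $data) {
    global $trace;
    // Skip whitespace-only text such as the indentation between tags.
    if (trim($data) !== "") {
        $trace[] = "text:" . trim($data);
    }
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "characterData");
xml_parse($parser, $xml, true);

echo implode("\n", $trace), "\n";
```

The trace begins with the start of DOCUMENT and ends with its end, with the events for TITLE and BODY nested in between.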

Here is a complete example of parsing XML documents. This example is an RSS reader that can be used to display news articles from any RSS feed that conforms to the RSS 1.0 standard.

Example

 <html>
 <head>
 <title> Google Articles </title>
 </head>
 <body>
 <h2>Google Articles</h2>
 <dl>
 <?php
 
 # Parser state shared between the handlers below.
 $insideitem = false;
 $tag = "";
 $title = "";
 $description = "";
 $link = "";
 
 # Called for every start tag. Names arrive in upper case (case folding).
 function startElement($parser, $name, $attrs) {
         global $insideitem, $tag, $title, $description, $link;
         if ($insideitem) {
                 $tag = $name;
         }
         elseif ($name == "ITEM") {
                 $insideitem = true;
         }
 }
 
 # Called for every end tag. At the end of an item, print it and reset the state.
 function endElement($parser, $name) {
         global $insideitem, $tag, $title, $description, $link;
         if ($name == "ITEM") {
                 # Escape the link and title before placing them in the page.
                 printf("<dt><b><a href='%s'>%s</a></b></dt>",
                 htmlspecialchars(trim($link), ENT_QUOTES),
                 htmlspecialchars(trim($title), ENT_QUOTES));
                 printf("<dd>%s</dd>", trim($description));
                 $title = "";
                 $description = "";
                 $link = "";
                 $insideitem = false;
         }
 }
 
 # Called for text between tags. Append it to the field named by the current tag.
 function characterData($parser, $data) {
         global $insideitem, $tag, $title, $description, $link;
         if ($insideitem) {
                 switch ($tag) {
                         case "TITLE":
                                 $title .= $data;
                                 break;
                         case "DESCRIPTION":
                                 $description .= $data;
                                 break;
                         case "LINK":
                                 $link .= $data;
                                 break;
                 }
         }
 }
 
 $xml_parser = xml_parser_create();
 xml_set_element_handler($xml_parser, "startElement", "endElement");
 xml_set_character_data_handler($xml_parser, "characterData");
 # $fp = fopen("http://www.newsforge.com/index.rss", 'r')
 $fp = fopen("http://news.google.co.za/nwshp?hl=en&tab=wn&q=&output=rss", 'r')
         or die("Error reading RSS data.");
 # Feed the parser 4096 bytes at a time; feof() flags the final chunk.
 while ($data = fread($fp, 4096)) {
         xml_parse($xml_parser, $data, feof($fp))
         or die(sprintf("XML error: %s at line %d",
         xml_error_string(xml_get_error_code($xml_parser)),
         xml_get_current_line_number($xml_parser)));
 }
 fclose($fp);
 xml_parser_free($xml_parser);
 ?>
 </dl>
 </body>
 </html>
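The reader above feeds the parser one chunk at a time and passes feof($fp) as the third argument of xml_parse() to mark the last chunk. The sketch below shows the same technique without a network connection. The inline $feed string, the 16-byte chunk size and the $titles array are assumptions chosen for illustration.

```php
<?php
// Hypothetical state: collected titles, text of the current title, and a flag.
$titles = array();
$current = "";
$inTitle = false;

function startElement($parser, $name, $attrs) {
    global $inTitle, $current;
    if ($name == "TITLE") {
        $inTitle = true;
        $current = "";
    }
}

function endElement($parser, $name) {
    global $inTitle, $current, $titles;
    if ($name == "TITLE") {
        $titles[] = trim($current);
        $inTitle = false;
    }
}

function characterData($parser, $data) {
    global $inTitle, $current;
    if ($inTitle) {
        // A chunk boundary can split a title, so append each piece.
        $current .= $data;
    }
}

// A tiny stand-in for an RSS feed.
$feed = "<rss><channel><item><title>First story</title></item>"
      . "<item><title>Second story</title></item></channel></rss>";

$parser = xml_parser_create();
xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "characterData");

// Feed the document in 16-byte pieces, much as the fread() loop feeds 4096-byte ones.
$chunks = str_split($feed, 16);
$last = count($chunks) - 1;
foreach ($chunks as $i => $chunk) {
    // The third argument is true only for the final chunk.
    if (!xml_parse($parser, $chunk, $i === $last)) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)));
    }
}

print_r($titles);
```

Parsing in chunks keeps memory use low for large feeds, because the whole document never has to be held in a single string.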

Dumping Database Contents into an XML File