XML - Managing Data Exchange/XQuery






1. Definition of XQuery


XQuery is a query language standardized by the World Wide Web Consortium (W3C). It makes it possible to efficiently and easily extract information from native XML databases and from relational databases that store XML data.

Every query consists of an introduction (called the prolog in the W3C specification) and a body. The introduction establishes the compile-time environment: schema and module imports, namespace declarations, global variables, and user-defined functions. The body generates the value of the entire query. Figure 1 shows the structure of a query.

Figure 1. Structure of XQuery

Introduction

Comment:
 (: Sample version 1.0 :)

Namespace declaration:
 declare namespace my = "urn:foo";

Function declaration:
 declare function my:fact($n) {
   if ($n < 2)
   then 1
   else $n * my:fact($n - 1)
 };

Global variable:
 declare variable $my:ten := my:fact(10);

Body

Constructed XML:
 <table>{

FLWOR expression:
   for $i in 1 to 10
   return
     <tr>

Enclosed expression:
       <td>10!/{$i}! = {$my:ten div my:fact($i)}</td>

     </tr>
 }</table>


2. XQuery versus Other Query Languages


2.1 XQuery versus XPath and XSLT

XQuery, XPath, XSLT, and SQL are all capable query languages. Each has its own advantages in particular situations, so XQuery cannot replace the others for every task. XQuery is built on XPath expressions: XQuery 1.0 and XPath 2.0 share the same data model, the same functions and operators, and the same path syntax. Table 1 shows the advantages and the drawbacks of each query language.

Table 1. XQuery versus XPath and XSLT

 

XQuery

Advantages:

1. expressing joins and sorts
2. manipulating sequences of values and nodes in arbitrary order
3. easy to write user-defined functions, including recursive ones
4. allows users to construct temporary XML results in the middle of a query and then navigate into them

Drawbacks:

1. XQuery implementations are less mature than XSLT ones

XPath 1.0

Advantages:

1. convenient syntax for addressing parts of an XML document
2. selecting a node out of an existing XML document or database

Drawbacks:

1. cannot create new XML
2. cannot select only part of an XML node
3. cannot introduce variables or namespace bindings
4. cannot work with date values, calculate the maximum of a set of numbers, or sort a list of strings

XSLT 1.0

Advantages:

1. recursively processing an XML document or translating XML into HTML and text
2. creating new XML or part of existing nodes
3. introducing variables and namespaces

Drawbacks:

1. its limitations cannot be addressed without effectively creating a language like XQuery
2. cannot work with sequences of values
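
The difference between selecting nodes and constructing new XML can be seen in a small sketch. The path expression inside the braces is plain XPath; the surrounding element constructor is what XQuery adds. The bib data and the expensive element name are made up for illustration.

```xquery
(: Hypothetical inline data, so the query needs no external document. :)
let $bib :=
  <bib>
    <book><title>Sample Book A</title><price>30</price></book>
    <book><title>Sample Book B</title><price>10</price></book>
  </bib>
return
  (: The path expression selects existing title nodes;
     the constructor wraps copies of them in a new element. :)
  <expensive>{ $bib/book[price > 20]/title }</expensive>
```

With this data the result should be a single expensive element containing the title of Sample Book A.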


2.2 XQuery versus SQL

XQuery has similarities to SQL in both style and syntax. The main difference between XQuery and SQL is that SQL focuses on unordered sets of “flat” rows, while XQuery focuses on ordered sequences of values and hierarchical nodes.

3. XQuery Expressions


3.1 FLWOR expressions

FLWOR expressions are an important part of XQuery. FLWOR is pronounced "flower". The name comes from the FOR, LET, WHERE, ORDER BY, and RETURN clauses that make up the expression. The FOR and LET clauses can appear any number of times in any order, but at least one of them must be present. The WHERE and ORDER BY clauses are optional; if they are used, they must appear in the order given. The RETURN clause is required.
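
As a minimal sketch of the five clauses working together, the following query uses a literal sequence of numbers rather than a document. The square element name is chosen only for illustration.

```xquery
(: Squares of the numbers greater than 2, in ascending order of the number. :)
for $n in (5, 3, 8, 1)        (: FOR: iterate over a sequence :)
let $sq := $n * $n            (: LET: bind a helper variable :)
where $n > 2                  (: WHERE: drop the tuple for 1 :)
order by $n                   (: ORDER BY: 3, 5, 8 :)
return <square n="{$n}">{$sq}</square>
```

The result is three square elements, for 3, 5, and 8, containing 9, 25, and 64.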

XQuery lets you write join queries in a way similar to SQL. Example 1 shows a join between the videos and the actors in a document.

Example 1.

 let $doc := .
 for $v in $doc//video,
     $a in $doc//actors/actor
 where ends-with($a, 'Lisa')
   and $v/actorRef = $a/@id
 order by $v/year
 return $v/title

The LET clause declares a variable and binds it to a value. Here $doc is bound to the context item (.); it could equally be set to doc('videos.xml') or to the result of a query that locates a document in a database. The FOR clause defines the iteration: one variable ranges over all the videos, and another ranges over all the actors, so the query processes every pair of a video and an actor. The WHERE clause selects the pairs you are interested in: here, those in which the actor appears in the video and the actor's name ends with "Lisa". The ORDER BY clause sorts the results, here by the year of each video. The RETURN clause at the end of the expression tells the system what information to return, here the video's title.
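
To try the join without a videos document, the data can be supplied inline. The sketch below assumes a made-up library structure (video elements with title, year, and actorRef children, and actor elements with an id attribute); the real videos.xml may be laid out differently.

```xquery
(: Hypothetical data standing in for the videos document. :)
let $doc :=
  <library>
    <videos>
      <video><title>Second Film</title><year>2002</year><actorRef>a1</actorRef></video>
      <video><title>First Film</title><year>1999</year><actorRef>a1</actorRef><actorRef>a2</actorRef></video>
      <video><title>Other Film</title><year>2001</year><actorRef>a2</actorRef></video>
    </videos>
    <actors>
      <actor id="a1">Doe, Lisa</actor>
      <actor id="a2">Roe, Tom</actor>
    </actors>
  </library>
(: Every (video, actor) pair is considered; WHERE keeps the matching ones. :)
for $v in $doc//video,
    $a in $doc//actors/actor
where ends-with($a, 'Lisa')
  and $v/actorRef = $a/@id
order by $v/year
return $v/title
```

With this data the query returns the titles First Film and Second Film, in that order. The = comparison is true if any actorRef of the video matches the actor's id, which is why a video with several actors still joins correctly.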


3.2 Conditional expression

XQuery provides a conditional expression built from IF, THEN, and ELSE clauses. The ELSE clause is mandatory, because every expression in XQuery must return a value. Example 2 shows a query that retrieves all books and their authors, replacing any authors after the first two with an et-al element.

Example 2.

 for $b in doc("books.xml")/bib/book
 return
   if (count($b/author) <= 2) then $b
   else <book> { $b/@*, $b/title, $b/author[position() <= 2], <et-al/>,
                 $b/publisher, $b/price } </book>

This query reads book data from books.xml. If a book has two or fewer authors, the query returns the book unchanged. Otherwise the query constructs a new book element containing all the original data, except that it keeps only the first two authors and appends an et-al element. The predicate [position() <= 2] selects only the first two authors. The XPath expression $b/@* refers to all the attributes of $b.
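
The same conditional can be tried without a books.xml file by binding sample data to a variable. The titles, authors, and other values below are invented for illustration.

```xquery
(: Hypothetical data: one book with a single author, one with three. :)
let $bib :=
  <bib>
    <book year="2000">
      <title>Sample Book A</title>
      <author>Author One</author>
      <publisher>Example Press</publisher>
      <price>10.00</price>
    </book>
    <book year="2001">
      <title>Sample Book B</title>
      <author>Author One</author>
      <author>Author Two</author>
      <author>Author Three</author>
      <publisher>Example Press</publisher>
      <price>20.00</price>
    </book>
  </bib>
for $b in $bib/book
return
  (: Two or fewer authors: return the book as is. :)
  if (count($b/author) <= 2) then $b
  (: Otherwise rebuild it with the first two authors and an et-al marker. :)
  else <book>{ $b/@*, $b/title, $b/author[position() <= 2], <et-al/>,
               $b/publisher, $b/price }</book>
```

The first book comes back unchanged; the second comes back with Author One, Author Two, and an empty et-al element in place of Author Three. When no alternative value is wanted, the ELSE branch can return the empty sequence, written as else ().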


3.3 XQuery functions and operators

XQuery provides a large set of functions and operators. Table 2 lists frequently used built-in ones. You can also define your own functions, and many engines provide custom extensions as well.

Table 2. Commonly used built-in functions

Function

Commentary

Math:

+, -, *, div, idiv, mod, =, !=, <, >, <=, >=, floor(), ceiling(), round(), count(), min(), max(), avg(), sum()

Division is done using div rather than a slash because a slash indicates an XPath step expression. idiv is a special operator for integer-only division that returns an integer and ignores any remainder.

Strings and Regular Expressions:

compare(), concat(), starts-with(), ends-with(), contains(), substring(), string-length(), substring-before(), substring-after(), normalize-space(), upper-case(), lower-case(), translate(), matches(), replace(), tokenize()

compare() determines how two strings are ordered relative to each other. translate() maps individual characters to replacement characters. matches(), replace(), and tokenize() use regular expressions to find, manipulate, and split string values.

Date and Time:

current-date(), current-time(), current-dateTime(), +, -, div, eq, ne, lt, gt, le, ge

XQuery has several special types for date and time values, such as duration, dateTime, date, and time. Most of them support arithmetic and comparison operators as if they were numeric. The two-letter abbreviations stand for equal, not equal, less than, greater than, less than or equal, and greater than or equal.

XML node and QNames:

node-name(), base-uri(), eq, ne, is, local-name-from-QName(), namespace-uri-from-QName(), deep-equal(), >>, <<

node-name() returns the QName of the node, if it has one. base-uri() returns the base URI of the node. local-name-from-QName() and namespace-uri-from-QName() return the two parts of a QName value.

Nodes and QName values can be compared using eq and ne (for value comparison), and nodes can be compared using is (for identity comparison). deep-equal() compares two nodes based on their full recursive content. The node-kind() function and the isnot operator found in early drafts are not part of XQuery 1.0.

The << operator returns true if the left operand precedes the right operand in document order. The >> operator returns true if the left operand follows the right operand in document order.

Sequences:

index-of(), empty(), exists(), distinct-values(), insert-before(), remove(), subsequence(), unordered(), position(), last()

index-of() returns the positions at which a given item occurs in a sequence. To fetch the item at a given position, use a positional predicate such as $seq[3]; the item-at() function of early drafts is not part of XQuery 1.0. empty() returns true if the sequence is empty and exists() returns true if it is not. distinct-values() returns a sequence with any duplicate atomic values removed; the distinct-nodes() function of early drafts was also dropped. insert-before() and remove() return a new sequence with items added or removed, and subsequence() returns a contiguous part of a sequence. unordered() allows the query engine to optimize without preserving order. position() returns the position of the context item currently being processed. last() returns the index of the last item.

Type Conversion:

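string(), data(), xs:decimal(), boolean()

These functions return their argument as the given type, where possible. data() returns the "typed value" of the node. xs:decimal() is a constructor function; similar constructors exist for the other built-in atomic types.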

Booleans:

true(), false(), not()

There are no "true" or "false" keywords in XQuery; use the true() and false() functions instead. not() returns the boolean negation of its argument.

Input:

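doc(), collection()

doc() returns the document node of the document identified by a URI parameter; early drafts named this function document(). collection() returns a sequence of nodes, perhaps from multiple documents, identified by a string parameter. Early drafts also defined input(), which returned a general engine-provided set of input nodes; it is not part of XQuery 1.0.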
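
A few of the functions and operators from Table 2 can be tried in a single query, sketched below. The list element and its items are made up for illustration; each comment gives the value the expression is expected to produce under XQuery 1.0.

```xquery
(: Hypothetical nodes used for the node comparisons further down. :)
let $list := <list><item>a</item><item>a</item></list>
let $first := $list/item[1]
let $second := $list/item[2]
return (
  7 div 2,                                         (: 3.5 :)
  7 idiv 2,                                        (: 3 :)
  upper-case("xquery"),                            (: "XQUERY" :)
  substring-before("name=value", "="),             (: "name" :)
  tokenize("a,b,c", ","),                          (: "a", "b", "c" :)
  replace("2004-05-01", "-", "/"),                 (: "2004/05/01" :)
  xs:date("2004-05-31") - xs:date("2004-05-01"),   (: a duration of 30 days :)
  xs:date("2004-05-01") lt xs:date("2004-05-31"),  (: true :)
  index-of((10, 20, 30), 20),                      (: 2 :)
  (10, 20, 30)[3],                                 (: 30 :)
  empty(()),                                       (: true :)
  $first is $second,                               (: false: two distinct nodes :)
  $first << $second,                               (: true: document order :)
  deep-equal($first, $second),                     (: true: same name and content :)
  not(true())                                      (: false :)
)
```

The two item elements illustrate the difference between identity and content: is reports them as different nodes, while deep-equal() reports them as equal.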

4. References


The contents of this chapter draw on the following sources.

- X Is for XQuery, Jason Hunter: http://www.oracle.com/technology/oramag/oracle/03-may/o33devxml.html

- An Introduction to the XQuery FLWOR Expression, Michael Kay: http://www.stylusstudio.com/xquery_flwor.html

- Learn XQuery in 10 Minutes, Michael Kay: http://www.stylusstudio.com/xquery_primer.html

- XQuery: The XML Query Language, Michael Brundage, Addison-Wesley 2004


- W3C XML Query (XQuery): http://www.w3.org/XML/Query

- XQuery Latest version: http://www.w3.org/TR/xquery/

- XQuery 1.0 and XPath 2.0 Functions and Operators: http://www.w3.org/TR/xpath-functions/

- XQuery 1.0 and XPath 2.0 Data Model (XDM): http://www.w3.org/TR/xpath-datamodel/

- XSLT 2.0 and XQuery 1.0 Serialization: http://www.w3.org/TR/xslt-xquery-serialization/

- XML Query Use Cases: http://www.w3.org/TR/xquery-use-cases/

- XML Query (XQuery) Requirements: http://www.w3.org/TR/xquery-requirements/

