Benefits of XQuery

The principal benefits of XQuery are:

  • Expressiveness - XQuery can query many different data structures and its recursive nature makes it ideal for querying tree and graph structures
  • Brevity - XQuery queries are often shorter than equivalent SQL or XSLT programs
  • Flexibility - XQuery can query both hierarchical and tabular data
  • Consistency - XQuery has a consistent syntax and can be used with other XML standards such as XML Schema datatypes

XQuery is frequently compared with two other languages, SQL and XSLT, but has a number of advantages over these.

Advantages over SQL

Unlike SQL, XQuery returns not just tables but arbitrary tree structures. This allows XQuery to directly create XHTML structures that can be used in web pages. XQuery is the query language of XML-based (native XML) databases, which can be more flexible than databases that store data in a purely tabular format, particularly when the data is nested or irregular.
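As a minimal sketch of returning a tree rather than a table, the query below builds an XHTML-style list directly from XML data. The `books` document, its element names, and the variable names are hypothetical and defined inline purely for illustration; a real query would more likely read its input from a database or with `doc()`.

```xquery
xquery version "1.0";

(: Hypothetical sample data, defined inline so the query is self-contained. :)
let $books :=
  <books>
    <book><title>XQuery Basics</title><price>30</price></book>
    <book><title>Advanced XSLT</title><price>45</price></book>
  </books>
return
  (: The result is an element tree, not a set of rows. :)
  <ul>{
    for $b in $books/book
    return <li>{ string($b/title) } ({ string($b/price) })</li>
  }</ul>
```

The output is a `ul` element containing one `li` per book, which could be placed into a web page without an intermediate formatting step.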

Unlike XSLT, XQuery can be learned quickly by anyone familiar with SQL. Many of its constructs are very similar, such as:

  • Ordering Results: Both XQuery and SQL add an order by clause to the query.
  • Selecting Distinct Values: Both XQuery and SQL have easy ways to select distinct values from a result set.
  • Restricting Rows: Both XQuery and SQL have a where clause (for example, where X = Y) that restricts the results of a query.
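The three constructs in the list above can be sketched in a single query. The `orders` data and all element and variable names are assumptions made up for this example; the comments note the SQL clause each XQuery construct resembles.

```xquery
xquery version "1.0";

(: Hypothetical sample data, defined inline for illustration. :)
let $orders :=
  <orders>
    <order><customer>Ann</customer><total>120</total></order>
    <order><customer>Bob</customer><total>40</total></order>
    <order><customer>Ann</customer><total>75</total></order>
  </orders>
return
  <report>
    <large-orders>{
      for $o in $orders/order
      (: Comparable to SQL's WHERE clause. :)
      where xs:decimal($o/total) > 50
      (: Comparable to SQL's ORDER BY ... DESC. :)
      order by xs:decimal($o/total) descending
      return <order customer="{ $o/customer }" total="{ $o/total }"/>
    }</large-orders>
    <customers>{
      (: Comparable to SQL's SELECT DISTINCT. :)
      for $c in distinct-values($orders/order/customer)
      order by $c
      return <customer>{ $c }</customer>
    }</customers>
  </report>
```

Note that, unlike a SQL result set, the result here is a single nested `report` element combining both answers.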

Another big advantage is that XQuery is essentially the native query language of the World Wide Web. One can query actual web pages with XQuery (provided they are well-formed XML, such as XHTML), but not with SQL. Even if one uses SQL-based databases to store HTML/XHTML pages or fragments of such pages, one will miss many of the advantages of XQuery's simple tag/attribute search (which is akin to searching for column names within column names).

Advantages over XSLT

Unlike XSLT, XQuery can be quickly learned by anyone familiar with SQL. XSLT has many patterns that are unfamiliar to many procedural software developers. Also, whereas XSLT is good as a static means of converting one type of document to another, for example RSS to HTML, XQuery is a much more dynamic querying tool, useful for pulling out sections of data from large documents or from large numbers of documents.

The Debate about XQuery vs. XSLT for Document Transformation

There has been a debate of sorts about the relative merits of XSLT and XQuery for transforming XML. A common misconception is that "XQuery is best for querying or selecting XML, and XSLT is best for transforming it." In reality, both languages are capable of transforming XML. Despite XSLT's longer history and larger install base, the "XQuery typeswitch" method of transforming XML provides numerous advantages.

Most people who need to transform XML hear that they need to learn a language called XSLT. XSLT, whose first version was published by the W3C in 1999, was a huge innovation for its time and, indeed, remains dominant. It was one of the very first languages dedicated to transforming XML documents, and it was the first domain-specific language (DSL) to use advanced theories from the world of functional programming to create very reliable, side-effect-free transformations. Many XML developers still feel strongly indebted to this groundbreaking language, since it helped them see a new model of software development: one focused on the transformation of models, which empowered them to fuse both the requirements and the documentation of a transformation routine into a single, modular program.

On the other hand, XSLT has a steep learning curve. Its difficulty is due, in part, to one of the key design decisions by its architects: to express the transformation rules using XML itself, rather than creating a brand new syntax and grammar for them. XSLT's unique, template-based approach to transformation rules also contributes to the difficulty. The learning curve can be overcome, but it is fair to say that it has created an opening for an alternative approach.

XQuery has filled this demand for an alternative among a growing community of users: they find that XQuery has a gentler learning curve and meets their needs for transforming XML and that, together with its other advantages, it has become a compelling "all-in-one" language. Like XSLT, XQuery was created by the W3C to handle XML. But instead of expressing the language in XML syntax, the architects of XQuery chose a new syntax that would be more familiar to users of server-side scripting languages such as PHP, Perl, or Python. XQuery was also designed to feel familiar to users of relational database query languages such as SQL, while still remaining true to functional programming practices. Despite its relative youth (XQuery 1.0 became a W3C Recommendation only in January 2007, alongside XSLT 2.0), XQuery was born remarkably mature. XML servers like eXist-db and MarkLogic were already using XQuery as their language for querying XML and performing web server operations (obviating the need to learn PHP, Perl, or Python).

So, in the face of the XSLT community's contention that "XSLT is best for transforming documents and XQuery is best for querying databases", this community of users was surprised to find that XQuery has entirely replaced their need for XSLT. They have come to argue unabashedly that they prefer XQuery for this purpose.

How does XQuery accomplish the task of transforming XML? The primary technique in XQuery for transforming XML is a little-known expression added by the authors of XQuery, called "typeswitch." Although it is quite simple, typeswitch enables XQuery to perform nearly the full set of transformations that XSLT does. A typeswitch expression examines the type of its operand, typically a node, and evaluates the first branch whose type test matches, for example element(title) or text(). What this means is that each distinct element of a document can have its own rule, and these rules can be stored in modular XQuery functions. This humble addition to the XQuery language allows developers to transform documents with complex content and unpredictable order, something commonly believed to be best reserved for the domain of XSLT. Despite the differences in syntax and approach to transformation, a growing community has come to see the XQuery typeswitch expression as a valid, even superior, way to store their document transformation logic.
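To make the mechanics concrete, here is a small sketch of a typeswitch expression that only reports which kind of node it was given. The function name `local:describe` and the sample `article` document are hypothetical, chosen for illustration.

```xquery
xquery version "1.0";

(: Hypothetical helper: returns a description of the node it receives. :)
declare function local:describe($node as node()) as xs:string {
  typeswitch ($node)
    (: Cases are tried in order; the first matching one is used. :)
    case element(title) return "a title element"
    case element()      return concat("an element named ", local-name($node))
    case text()         return "a text node"
    default             return "some other kind of node"
};

let $doc := <article><title>Hello</title><para>World</para></article>
for $n in $doc/node()
return local:describe($n)
```

Because the cases are tested in order, the specific `element(title)` test must come before the general `element()` test; otherwise the general case would match every element first.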

By structuring a set of XQuery functions around the typeswitch expression, you can achieve the same result as XSLT-style transforms while retaining the benefits of XQuery: ease of learning and integration with native XML databases. Even more important for those users of native XML databases, the availability of typeswitch means that they only need to learn a single language for their database queries, web server operations, and document transformations. These XQuery typeswitch routines have proven easy to build, test, and maintain - some believe easier than XSLT. XQuery typeswitch has given these users a high degree of agility, allowing them to master XQuery fully rather than splitting their time and attention between XQuery and XSLT.
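One common way to structure such functions is a central dispatch function that recurses through the document and hands each element to its own rule. The sketch below assumes a hypothetical source vocabulary (`article`, `title`, `para`, `emph`) and hypothetical function names; it illustrates the pattern and is not a definitive implementation.

```xquery
xquery version "1.0";

(: Central dispatcher: chooses a rule for each node based on its type. :)
declare function local:dispatch($nodes as node()*) as item()* {
  for $node in $nodes
  return
    typeswitch ($node)
      case text()           return $node
      case element(article) return local:article($node)
      case element(title)   return local:title($node)
      case element(para)    return local:para($node)
      case element(emph)    return local:emph($node)
      (: Unknown nodes: skip the node itself but process its children. :)
      default               return local:dispatch($node/node())
};

(: One small function per element, each recursing into its children. :)
declare function local:article($node as element(article)) as element() {
  <div class="article">{ local:dispatch($node/node()) }</div>
};

declare function local:title($node as element(title)) as element() {
  <h1>{ local:dispatch($node/node()) }</h1>
};

declare function local:para($node as element(para)) as element() {
  <p>{ local:dispatch($node/node()) }</p>
};

declare function local:emph($node as element(emph)) as element() {
  <em>{ local:dispatch($node/node()) }</em>
};

local:dispatch(
  <article><title>Hello</title><para>Some <emph>mixed</emph> content.</para></article>
)
```

Each per-element function plays a role similar to an XSLT template, and the recursive call to `local:dispatch` is comparable to `xsl:apply-templates`. Supporting a new element means adding one case and one function.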

That said, there is still a large body of legacy XSLT transforms that work well, and there are XSLT developers who see little benefit from transitioning to a typeswitch-style XQuery. Both are valid approaches to document transformation. A natural tension has arisen between the proponents of XQuery typeswitch and XSLT, each promoting what they are most comfortable with and believe to be superior. In practice you might be best served by trying both techniques and determining what style is right for you and your organization. Without presuming a background or interest in XSLT, this article and its companion article help you to understand the key patterns for using XQuery typeswitch for your XML transformation needs.