XPath/Basic Syntax

Basic XPath Syntax

Expressions that start with a forward slash "/" are called absolute expressions. They start at the root of the document. All other expressions are relative to the current position within an XML document.

An expression is built from a sequence of steps, separated by forward slashes, of the form

  step[predicate]/step[predicate]/step[predicate]

You can think of the predicate as a filter or conditional expression that serves the same purpose as a WHERE clause in SQL.
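The WHERE-clause analogy can be tried out with Python's standard xml.etree.ElementTree module. This is only a minimal sketch: ElementTree implements a small subset of XPath (relative paths, child steps, simple equality and positional predicates), and the inline document, including its second book, is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, modelled on the books structure used in this chapter.
XML = """<books>
  <book><title>XQuery</title><format>wikibook</format></book>
  <book><title>XPath Notes</title><format>print</format></book>
</books>"""

root = ET.fromstring(XML)  # root is the <books> element

# Two steps, book and title. The predicate [format='wikibook'] filters the
# book step much like "WHERE format = 'wikibook'" filters rows in SQL.
titles = [t.text for t in root.findall("book[format='wikibook']/title")]
print(titles)
```

Because the path is evaluated relative to the books element, it corresponds roughly to /books/book[format='wikibook']/title in a full XPath processor.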

Sample XML file

Many of the examples use a "books" example such as the following:

http://raw.github.com/dmccreary/learn-xquery/master/data/books.xml

In general the books file has the following structure:

<books>
  <book>
    <title>XQuery</title>
    <format>wikibook</format>
  </book>
</books>

Basic XPath Expressions

The document root node:

 /

Note that the forward slash on its own returns the document node, which is the parent of the books element; it does not return the books element itself.

The root element, which contains all the books:

 /books

All book elements:

 /books/book
 //book

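Both expressions are absolute paths, because both start at the root. The first names every step from the root explicitly. The second uses //, the abbreviation for /descendant-or-self::node()/, and so selects book elements at any level of the file.

Note that performance depends on the processor: on unindexed XML the first expression is generally faster, because it does not have to scan every descendant, whereas an indexed native XML database can often answer the second faster by looking up book elements in its element index.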
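The difference between the two forms only shows up when book elements occur at more than one level. The following sketch uses Python's standard xml.etree.ElementTree module, which supports only relative paths, so "book" and ".//book" evaluated from the books element stand in for /books/book and //book. The appendix element and its nested book are hypothetical and do not appear in the sample file.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: one <book> is nested inside an <appendix> element,
# so it is not a direct child of <books>.
XML = """<books>
  <book><title>XQuery</title></book>
  <appendix>
    <book><title>Nested Example</title></book>
  </appendix>
</books>"""

root = ET.fromstring(XML)  # the <books> element

# Rough equivalent of /books/book: book children of the root element only.
children = root.findall("book")

# Rough equivalent of //book: book elements at any depth, in document order.
descendants = root.findall(".//book")

print(len(children), len(descendants))
```

For a flat file such as the sample books document, both forms return the same elements.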

A count of the number of books:

  count(//book)

All the book titles:

  //book/title

The second book in the collection:

  //book[2]

The title of the second book:

  //book[2]/title

The third author of the second book:

  //book[2]/author[3]
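Positional predicates count from 1 and are applied to the step they follow, so author[3] means the third author child of each selected book. A minimal sketch with Python's standard xml.etree.ElementTree module follows; the titles and author names are invented for illustration, and ".//" is used because ElementTree only accepts relative paths.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the second book has three authors, the first has one.
XML = """<books>
  <book><title>First</title><author>A. One</author></book>
  <book>
    <title>Second</title>
    <author>B. One</author><author>B. Two</author><author>B. Three</author>
  </book>
</books>"""

root = ET.fromstring(XML)

# //book[2]/title : title of the second book
second_title = root.findtext(".//book[2]/title")

# //book[2]/author[3] : third author of the second book
third_author = root.findtext(".//book[2]/author[3]")

# A position that does not exist selects nothing rather than raising an error.
missing = root.find(".//book[1]/author[3]")

print(second_title, third_author, missing)
```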

All books with the format "wikibook":

  //book[format='wikibook']

Get a list of all the publishers:

  //publisher

Get a distinct list of the publishers (duplicates removed; distinct-values() requires XPath 2.0 or later):

  distinct-values(//publisher)
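Processors limited to XPath 1.0 have no distinct-values() function, and neither does Python's standard xml.etree.ElementTree module. The sketch below emulates the effect in host code: it selects every publisher element and removes duplicate string values. The publisher names are hypothetical, and keeping first-occurrence order is a choice made here, not something XPath guarantees.

```python
import xml.etree.ElementTree as ET

# Hypothetical data with one repeated publisher.
XML = """<books>
  <book><title>A</title><publisher>Alpha Press</publisher></book>
  <book><title>B</title><publisher>Beta Books</publisher></book>
  <book><title>C</title><publisher>Alpha Press</publisher></book>
</books>"""

root = ET.fromstring(XML)

# //publisher : every publisher element, duplicates included
publishers = [p.text for p in root.findall(".//publisher")]

# Emulates distinct-values(//publisher): dict keys are unique and keep
# insertion order, so duplicates are dropped in first-seen order.
distinct = list(dict.fromkeys(publishers))

print(distinct)
```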

Books that have at least one list-price over 30 (the comparison is true if any list-price child satisfies it):

  //book[list-price > 30]
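The "at least one" wording matters when a book carries several list-price elements. Python's standard xml.etree.ElementTree module does not support numeric comparisons in predicates, so the sketch below spells out the same existential test in host code. The titles, prices, and the helper name has_price_over are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the first book has two prices, only one of them over 30.
XML = """<books>
  <book>
    <title>Two Prices</title>
    <list-price>25.00</list-price>
    <list-price>35.00</list-price>
  </book>
  <book><title>Cheap</title><list-price>20.00</list-price></book>
</books>"""

root = ET.fromstring(XML)


def has_price_over(book, limit):
    """Return True if any list-price child of book exceeds limit."""
    return any(float(p.text) > limit for p in book.findall("list-price"))


# Emulates //book[list-price > 30]
expensive = [b.findtext("title") for b in root.findall(".//book")
             if has_price_over(b, 30)]

print(expensive)
```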

XPath abbreviations

. represents the current node

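.. represents the parent of the current node

@ abbreviates the attribute axis, so @id selects the id attribute

$ marks a variable reference, as in $myvar

[n] selects the n-th node among those matched by the step it follows, so book[2] is the second book child

ancestor::div selects all div elements that are ancestors of the current node (parent, grandparent, and so on)

normalize-space(firstname)="Paul" matches Paul regardless of leading, trailing, or repeated whitespace

boolean(string($myvar)) returns false when the string value of $myvar is empty and true otherwise

/ on its own selects the document root node; at the start of a path it makes the path absolute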

@* represents all attributes of the current node
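A few of these abbreviations can be tried with Python's standard xml.etree.ElementTree module, which understands ., .. and [@name='value'] but not the full language. The sketch below is illustrative only: the id attributes are hypothetical and are not part of the sample books file, and the attrib dictionary is used as a rough stand-in for @*.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: id attributes added so that @ has something to select.
XML = """<books>
  <book id="b1"><title>XQuery</title></book>
  <book id="b2"><title>XPath</title></book>
</books>"""

root = ET.fromstring(XML)

# . is the current (context) node, here the <books> element itself.
current = root.find(".")

# .. steps from each title back up to its parent book.
parents = root.findall("book/title/..")

# @id inside a predicate tests an attribute value.
by_attr = root.find("book[@id='b2']")

# Roughly what @* selects: all attributes of the chosen element.
all_attrs = dict(by_attr.attrib)

print(current.tag, [b.get("id") for b in parents], all_attrs)
```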

-Return all attributes and all child nodes of the current node using a union (node() already matches text nodes, so text() is redundant and listed only for emphasis):

@*|node()|text()


-Return all of a node's siblings using a union of the preceding-sibling and following-sibling axes:

preceding-sibling::node() | following-sibling::node()


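-Return the following siblings of a specific type, here every h3 that follows a div under the same parent (following-sibling::h3[1] would keep only the nearest one):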

//div/following-sibling::h3


-Check the string value of the current node (used as a predicate):

[. = "Matthew Bob"]


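-Node identity can be checked with the count() function. A union removes duplicate nodes, so if the union of two single-node expressions contains exactly one node, both expressions selected the same node. More generally, two node-sets of the same size are identical if the size of their union equals the size of either set. For example, the following query returns true because both operands select the same books element (the bk prefix is assumed to be bound to the document's namespace):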

count(/bk:books | /bk:books/bk:book[1]/parent::*) = 1
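The same union-and-count test can be imitated in host code. The sketch below uses Python's standard xml.etree.ElementTree module and makes several assumptions: the namespace is omitted, relative paths stand in for the absolute ones, and the union is emulated with a dictionary keyed on object identity, since ElementTree has no | operator or count() function.

```python
import xml.etree.ElementTree as ET

# Minimal sample without a namespace (the bk: prefix is left out here).
XML = "<books><book><title>XQuery</title></book></books>"

root = ET.fromstring(XML)

left = root.findall(".")            # stands in for /books
right = root.findall("book[1]/..")  # stands in for /books/book[1]/parent::*

# Emulated union: keying on id() keeps one entry per distinct node.
union = {id(node): node for node in left + right}
same_node = len(union) == 1

# Control case: /books and /books/book[1] are different nodes.
other = root.findall("book[1]")
different = len({id(node): node for node in left + other}) == 2

print(same_node, different)
```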