Basic XPath Syntax
Expressions that start with a forward slash "/" are called absolute expressions. They start at the root of the document. All other expressions are relative to the current position within an XML document.
Expressions are built as a sequence of steps, each optionally followed by a predicate, of the form:
step[predicate]/step[predicate]/step[predicate]
You can think of the predicate as a filter or conditional expression that serves the same purpose as a WHERE clause in SQL.
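As a rough illustration of a predicate acting as a filter, the sketch below runs a two-step path against a small inline document. It assumes Python's standard xml.etree.ElementTree module, which supports only a subset of XPath and evaluates paths relative to an element; the second book is made-up data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data with books/book/title/format elements.
BOOKS_XML = """
<books>
  <book><title>XQuery</title><format>wikibook</format></book>
  <book><title>XPath Notes</title><format>print</format></book>
</books>
"""

root = ET.fromstring(BOOKS_XML)  # root is the <books> element

# Two steps: book, then title. The predicate [format='wikibook'] filters
# the book step, much like WHERE format = 'wikibook' filters rows in SQL.
titles = [t.text for t in root.findall("./book[format='wikibook']/title")]
print(titles)  # ['XQuery']
```

Dropping the predicate returns the titles of both books, just as dropping a WHERE clause returns every row.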
Sample XML file
Many of the examples use a "books" example such as the following:
http://raw.github.com/dmccreary/learn-xquery/master/data/books.xml
In general the books file has the following structure:
<books>
<book>
<title>XQuery</title>
<format>wikibook</format>
</book>
</books>
Basic XPath Expressions
The root document node:
/
Note that the forward slash returns the document root node, not the books element.
The root element that contains all the books:
/books
All book elements:
/books/book
//book
The first version spells out every step from the root. The second uses "//", short for /descendant-or-self::node()/, to match book elements at any level of the file.
Note that the first expression is typically faster on unindexed XML, while indexed native XML databases can often evaluate the second faster.
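The difference between the two forms only shows up when book elements appear at more than one level. The sketch below assumes Python's xml.etree.ElementTree, where paths are relative to an element, so "./book" evaluated on the books element stands in for /books/book and ".//book" stands in for //book. The series wrapper element is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: one book directly under <books>, one nested deeper.
BOOKS_XML = """
<books>
  <book><title>XQuery</title></book>
  <series>
    <book><title>XPath</title></book>
  </series>
</books>
"""

root = ET.fromstring(BOOKS_XML)

direct = root.findall("./book")     # children of <books> only
anywhere = root.findall(".//book")  # book elements at any depth

print(len(direct), len(anywhere))  # 1 2
```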
A count of the number of books:
count(//book)
All the book titles:
//book/title
The second book in the collection:
//book[2]
The title of the second book:
//book[2]/title
The third author of the second book:
//book[2]/author[3]
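The positional predicates above can be tried with a short sketch, again assuming Python's xml.etree.ElementTree and made-up author names. Positions count from 1, and a position applies among the siblings matched by the step, so //book[2] means "every book that is the second book child of its parent".

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the second book has three authors.
BOOKS_XML = """
<books>
  <book><title>XQuery</title><author>A. One</author></book>
  <book>
    <title>XPath</title>
    <author>B. One</author><author>B. Two</author><author>B. Three</author>
  </book>
</books>
"""

root = ET.fromstring(BOOKS_XML)

second_title = root.find(".//book[2]/title").text       # title of the second book
third_author = root.find(".//book[2]/author[3]").text   # its third author
missing = root.find(".//book[1]/author[3]")             # first book has one author

print(second_title, "|", third_author, "|", missing)  # XPath | B. Three | None
```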
All books with the format "wikibook":
//book[format='wikibook']
A list of all the publishers:
//publisher
A distinct list of the publishers (duplicates removed; distinct-values() requires XPath 2.0 or later):
distinct-values(//publisher)
Books that have at least one list price over 30:
//book[list-price > 30]
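Python's xml.etree.ElementTree has no distinct-values() function and no numeric comparisons in predicates, so the sketch below only imitates the last two expressions in plain Python to show what they compute. The publisher names and prices are invented. Note the "at least one" behaviour: a book matches as soon as any one of its list-price children exceeds 30.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: book A has two list prices, one of them above 30.
BOOKS_XML = """
<books>
  <book><title>A</title><publisher>Acme</publisher>
        <list-price>25</list-price><list-price>35</list-price></book>
  <book><title>B</title><publisher>Acme</publisher>
        <list-price>20</list-price></book>
  <book><title>C</title><publisher>Zenith</publisher>
        <list-price>45</list-price></book>
</books>
"""

root = ET.fromstring(BOOKS_XML)

# Imitates distinct-values(//publisher): drop duplicates, keep first-seen order.
publishers = list(dict.fromkeys(p.text for p in root.iter("publisher")))

# Imitates //book[list-price > 30]: keep a book if ANY of its prices is over 30.
pricey = [
    book.findtext("title")
    for book in root.iter("book")
    if any(float(price.text) > 30 for price in book.findall("list-price"))
]

print(publishers)  # ['Acme', 'Zenith']
print(pricey)      # ['A', 'C']
```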
XPath abbreviations
. represents the current node
.. represents the parent of the current node
@ abbreviates the attribute axis (attribute::)
$ prefixes a variable reference
[n] is short for [position() = n]; it selects the n-th node matched by the step it follows
ancestor::div represents all div ancestors of the current node, not only its parent
normalize-space(firstname)="Paul" matches Paul regardless of leading or trailing whitespace
boolean(string($myvar)) returns false when the string value of $myvar is empty
/ represents the document root node
@* represents all attributes of the current node
-Return all attributes and child nodes using a union (text() is already covered by node(), so it is redundant here):
@*|node()|text()
-Return all of a node's siblings using a union of the preceding-sibling and following-sibling axes:
preceding-sibling::node() | following-sibling::node()
-Return the following siblings of a specific type (append [1] to keep only the nearest one):
//div/following-sibling::h3
-Check the string value of the current node:
[. = "Matthew Bob"]
-Node identity can be checked with the count() function: two node-sets of the same size hold the same nodes if the size of their union equals the size of either set (or equals 1 when each set holds a single node). For example, the following query returns true because both paths select the same node:
count(/bk:books | /bk:books/bk:book[1]/parent::*) = 1
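The idea behind the union-count test can be sketched in Python. xml.etree.ElementTree supports neither count() nor the union operator, so the hypothetical helper same_single_node below imitates count(A | B) = 1 by merging two result lists and deduplicating by object identity. Namespaces are left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """
<books>
  <book><title>XQuery</title><format>wikibook</format></book>
</books>
"""

root = ET.fromstring(BOOKS_XML)


def same_single_node(a, b):
    """Imitate count(A | B) = 1: the union, deduplicated by identity, has one node."""
    union = {id(node): node for node in list(a) + list(b)}
    return len(union) == 1


books = [root]                               # stands in for /books
parent_of_first = root.findall("./book[1]/..")  # parent of the first book
first_book = root.findall("./book[1]")

print(same_single_node(books, parent_of_first))  # True: both are the books element
print(same_single_node(books, first_book))       # False: two different nodes
```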