XQuery/Compare with XQuery
Motivation
You want to use XQuery to compare two lists of items and find out how the lists differ.
Methods
We will use several expressions that iterate through the first list. In each example, every item of the first list is compared against the second list.
Simple iteration and test for missing elements
In this first example, we use a simple for loop to go through each item in a linear list. We then check whether that item is anywhere in the second list (regardless of order). If it is in the second list, we display the item from the first list. If not, we output a missing element containing the item's text. This is useful if you want to find out whether the items of one collection, such as its files, are all present in another collection.
xquery version "1.0";
(: Compare of two linear lists :)
let $list1 :=
<list1>
<item>a</item>
<item>b</item>
<item>c</item>
<item>d</item>
</list1>
let $list2 :=
<list2>
<item>a</item>
<item>c</item>
<item>e</item>
</list2>
return
<missing>{
for $item1 in $list1/item
let $item-text := $item1/text()
return
<test item="{$item-text}">
{if ($list2/item/text()=$item-text)
then ($item1)
else <missing>{$item-text}</missing>
}
</test>
}</missing>
Note that the conditional expression:
if ($list2/item/text() = $item-text)
tests whether $item-text occurs anywhere in list2. Because = is a general comparison, the expression returns true() if at least one item in list2 has the same text.
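The behaviour of the general comparison can be seen in isolation with a small sketch. The variable $values and the element names found and not-found are made up for this illustration:

```xquery
xquery version "1.0";
(: A general comparison (=) is true if ANY item in the sequence matches :)
let $values := ("a", "c", "e")
return
  <comparison>
    <found>{ $values = "c" }</found>
    <not-found>{ $values = "z" }</not-found>
  </comparison>
```

The first test yields true and the second false. The value comparison operator eq would not work here, because it raises a type error when an operand is a sequence of more than one item.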
Sample Results
<missing>
<test item="a">
<item>a</item>
</test>
<test item="b">
<missing>b</missing>
</test>
<test item="c">
<item>c</item>
</test>
<test item="d">
<missing>d</missing>
</test>
</missing>
Note that this will not report any items on the second list that are missing from the first list.
Using Quantified Expressions
This can be rewritten using XQuery quantified expressions. There are two reasons to do so: the XQuery optimizer can frequently run quantified expressions faster, and some people find them easier to read. See XQuery/Quantified Expressions for more details.
In this second example the list assignments are the same, but we only display the items from list 1 that are missing from list 2.
<missing>{
for $item1 in $list1/item
return
if (some $item2 in $list2/item satisfies $item2/text() = $item1/text())
then ()
else $item1
}</missing>
This returns:
<missing>
<item>b</item>
<item>d</item>
</missing>
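As a possible alternative, the same filtering can be sketched with a path predicate instead of an explicit loop. This is only an illustration that relies on the general comparison described earlier; the lists are repeated so that the query can be run on its own:

```xquery
xquery version "1.0";
(: Keep only the items of list 1 whose value matches no item in list 2 :)
let $list1 := <list><item>a</item><item>b</item><item>c</item><item>d</item></list>
let $list2 := <list><item>a</item><item>c</item><item>e</item></list>
return
  <missing>{ $list1/item[not(. = $list2/item)] }</missing>
```

This should return the same two items, b and d. Which form runs faster depends on the XQuery processor, so it is worth measuring on your own data.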
We are now ready to modularize this missing function so that we can pass any two lists to find missing elements.
Creating a Missing XQuery Function
Our next step is to create an XQuery function that compares any two lists and returns the items in the first list that are not in the second list.
declare function local:missing($list1 as node()*, $list2 as node()*) as node()* {
for $item1 in $list1/item
let $item-text := $item1/text()
return
if (some $item2 in $list2/item satisfies $item2/text() = $item1/text())
then ()
else $item1
};
We can rewrite the output expression to use this function:
<results>
<missing-from-2>{local:missing($list1, $list2)}</missing-from-2>
<missing-from-1>{local:missing($list2, $list1)}</missing-from-1>
</results>
Note that the order of the lists has been reversed in the second call to the missing() function. The second pass looks for items on list2 that are not on list1.
Running this query generates the following output:
<results>
<missing-from-2>
<item>b</item>
<item>d</item>
</missing-from-2>
<missing-from-1>
<item>e</item>
</missing-from-1>
</results>
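For reference, the pieces of this section can be assembled into one complete query, as sketched below. A function declaration belongs in the query prolog, so it must appear before the let clauses of the query body:

```xquery
xquery version "1.0";

(: Returns the items of $list1 that have no match in $list2 :)
declare function local:missing($list1 as node()*, $list2 as node()*) as node()* {
  for $item1 in $list1/item
  return
    if (some $item2 in $list2/item satisfies $item2/text() = $item1/text())
    then ()
    else $item1
};

let $list1 := <list><item>a</item><item>b</item><item>c</item><item>d</item></list>
let $list2 := <list><item>a</item><item>c</item><item>e</item></list>
return
  <results>
    <missing-from-2>{ local:missing($list1, $list2) }</missing-from-2>
    <missing-from-1>{ local:missing($list2, $list1) }</missing-from-1>
  </results>
```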
Creating HTML Difference Lists
We can use CSS to style the output of these reports.
Sample Data
This example uses full words as items to show the text highlighting:
let $list1 :=
<list>
<item>apples</item>
<item>bananas</item>
<item>carrots</item>
<item>kiwi</item>
</list>
let $list2 :=
<list>
<item>apples</item>
<item>carrots</item>
<item>grapes</item>
</list>
The following function wraps each item of the first list in an HTML div element and adds class="missing" to the div of every item that is missing from the second list. The CSS then highlights the background of those divs.
declare function local:missing($list1 as node()*, $list2 as node()*) as node()* {
for $item1 in $list1/item
return
if (some $item2 in $list2/item satisfies $item2/text() = $item1/text())
then <div>{$item1/text()}</div>
else
<div>
{attribute {'class'} {'missing'}}
{$item1/text()}
</div>
};
We then use the following CSS file to highlight the differences. Each missing element must have a class="missing" attribute to be highlighted in this report.
body {font-family: Arial,Helvetica,sans-serif; font-size: large;}
h2 {padding: 3px; margin: 0px; text-align: center; font-size: large; background-color: silver;}
.left, .right {border: solid black 1px; padding: 5px;}
.missing {background-color: pink;}
.left {float: left; width: 190px}
.right {margin-left: 210px; width: 190px}
The body of the report shows both lists and then calls the function once in each direction:
<body>
<h1>Missing Items Report</h1>
<div class="left">
<h2>List 1</h2>
{for $item in $list1/item return <div>{$item/text()}</div>}
</div>
<div class="right">
<h2>List 2</h2>
{for $item in $list2/item return <div>{$item/text()}</div>}
</div>
<br/>
<div class="left">
<h2>List 1 Missing from 2</h2>
{local:missing($list1, $list2)}
</div>
<div class="right">
<h2>List 2 Missing from 1</h2>
{local:missing($list2, $list1)}
</div>
</body>
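If you prefer to embed the style rules in the page rather than link to a separate CSS file, note that curly braces inside an element constructor start an enclosed expression, so literal CSS braces have to be doubled. The following is a minimal sketch under that assumption. The inline style element and the single hard-coded div are illustrative only and are not part of the report above:

```xquery
xquery version "1.0";
(: Literal { and } in element content are written as {{ and }} :)
<html>
  <head>
    <title>Missing Items Report</title>
    <style type="text/css">
      .missing {{background-color: pink;}}
      .left {{float: left; width: 190px}}
    </style>
  </head>
  <body>
    <div class="left">
      <div class="missing">bananas</div>
    </div>
  </body>
</html>
```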
Collation
If the lists are in sorted order, or can be sorted into order, an alternative approach is to recursively collate the two lists. The core algorithm looks like:
declare function local:merge($a as item()*, $b as item()*) as item()* {
if (empty($a) and empty($b))
then ()
else if (empty ($b) or $a[1] lt $b[1])
then ($a[1], local:merge(subsequence($a, 2), $b))
else if (empty($a) or $a[1] gt $b[1])
then ($b[1],local:merge($a, subsequence($b,2)))
else (: a and b matched :)
($a[1], $b[1], local:merge(subsequence($a,2), subsequence($b,2)))
};
Using the sample data from the previous section, we can merge the two lists:
let $list1 :=
<list>
<item>apples</item>
<item>bananas</item>
<item>carrots</item>
<item>kiwi</item>
</list>
let $list2 :=
<list>
<item>apples</item>
<item>carrots</item>
<item>grapes</item>
</list>
return
<result>
{local:merge($list1/item,$list2/item) }
</result>
The actions taken during the merge depend on the application. The algorithm can be modified to output only the mismatched items from one list or the other, and to handle matching items as required. For example, to display the merged list as HTML, we might modify the algorithm to:
declare function local:merge($a as item()*, $b as item()*) as item()* {
if (empty($a) and empty ($b))
then ()
else if (empty ($b) or $a[1] lt $b[1])
then (<div class="left">{$a[1]/text()}</div>, local:merge(subsequence($a, 2), $b))
else if (empty ($a) or $a[1] gt $b[1])
then (<div class="right">{$b[1]/text()}</div>,local:merge($a, subsequence($b,2)))
else (<div class="match">{$a[1]/text()}</div>, local:merge(subsequence($a,2), subsequence($b,2)))
};
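The other modification mentioned above, reporting only the mismatched items of one list, could be sketched as follows. The function name local:only-in-first is hypothetical, and the sketch assumes that neither list contains duplicates. Because the collation approach needs sorted input, the unsorted sample lists are first sorted with an order by clause:

```xquery
xquery version "1.0";

(: Hypothetical variant of local:merge: returns only the items of $a
   that have no match in $b. Both sequences must already be sorted. :)
declare function local:only-in-first($a as item()*, $b as item()*) as item()* {
  if (empty($a))
  then ()
  else if (empty($b) or $a[1] lt $b[1])
  then ($a[1], local:only-in-first(subsequence($a, 2), $b))   (: only in a :)
  else if ($a[1] gt $b[1])
  then local:only-in-first($a, subsequence($b, 2))            (: only in b, skip it :)
  else local:only-in-first(subsequence($a, 2), subsequence($b, 2))  (: matched :)
};

let $list1 := <list><item>kiwi</item><item>apples</item><item>carrots</item><item>bananas</item></list>
let $list2 := <list><item>grapes</item><item>apples</item><item>carrots</item></list>
(: Sort both lists by their string value before collating :)
let $sorted1 := for $i in $list1/item order by string($i) return $i
let $sorted2 := for $i in $list2/item order by string($i) return $i
return
  <only-in-1>{ local:only-in-first($sorted1, $sorted2) }</only-in-1>
```

With this data the result should contain the items bananas and kiwi. Swapping the two arguments gives the items that appear only in the second list.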