XQuery/Advanced Search

Motivation

edit

You have multiple fields that you would like to search on. You want to allow users to optionally search on specific fields and perform a boolean "AND" when multiple fields are used.

For example you may have a database of people. Each person has a first name, last name, e-mail and phone. You want to allow users to search on any single field or multiple fields together. If two fields are entered only records that match both fields will be returned.

Method

edit

We will use a standard HTML form with multiple input and selection fields. We will check each incoming search request for each parameter and if the parameter is not null we will concatenate a single query with many predicates and then evaluate it using the util:eval() function.

Example XML Data Set

edit

In the following example we will use an XML file that contains list of people. The format will be the following:

<people>
   <person>
      <id>123</id>
      <firstname>John</firstname>
      <lastname>Smith</lastname>
      <phone>(123) 456-7890</phone>
      <email>john.smith@example.com</email>
      <type>faculty</type>
   </person>
   <person>
      <id>456</id>
      <firstname>Sue</firstname>
      <lastname>Jones</lastname>
      <phone>(123) 654-0123</phone>
      <email>sue.jones@example.com</email>
      <type>staff</type>
   </person>
</people>

Background on Predicates

edit

If you have a single "where clause" (called a predicate) you can always place this predicate to the end of an XPath expression. For example the following FLWOR expression will return all person records in the system:

  for $person in collection('/db/apps/directory')//person
  return
     $person

You can now restrict this to only include faculty by adding a predicate:

  for $person in collection('/db/apps/directory')//person[type='faculty']
  return
     $person

You can now search for all faculty with a first name of "mark" buy just adding an additional predicate:

  for $person in collection('/db/apps/directory')//person[type='faculty'][firstName='mark']
  return
     $person

Sample Search Form

edit

 

Sample HTML code for advanced search form

edit

The following is an HTML form section for this form.

<form method="get" action="advanced-search.xq">            
    <label>First Name: </label>
    <input type="text" name="firstname"/>
    <br/>
    
    <label>Last Name: </label>
    <input type="text" name="lastname"/>
    <br/>
    
    <label>E-Mail: </label>
    <input type="text" name="email" size="40"/>
    <br/>
    
    <label>Phone: </label>
    <input type="text" name="phone"/>
    <br/>
    
    <label>Primary Type: </label>
    <select name="type">
        <option value="">- Select -</option>
        <option value="staff">Staff</option>
        <option value="faculty">Faculty</option>
        <option value="student">Students</option>
    </select>
    <br/>
    
    <input type="submit" name="Submit"/>
</form>

When the user adds a name of "John" to the first name field and selects a type of "staff" and then the submit query button is pressed the following is an example of the URL created by this form:

  advanced-search.xq?firstname=John&lastname=&email=&phone=&type=staff&Submit=Submit+Query

Note that most of the fields are null. Only firstname and type have a value to the right of the equal sign.

Sample Search Service

edit

The search service will have the following code sections.

Getting the URL parameters

edit

The following code fragment will get the URL parameters from the incoming URL request and assign them to XQuery variables.

let $firstname := lower-case(request:get-parameter('firstname', ''))
let $lastname := lower-case(request:get-parameter('lastname', ''))
let $email := lower-case(request:get-parameter('email', ''))
let $phone := lower-case(request:get-parameter('phone', ''))
let $type := lower-case(request:get-parameter('type', ''))

Note that each of the incoming parameters is first converted to lowercase before any comparisons are done.

Building the Predicate Strings

edit

We are now ready to start building our predicates. Since many of the fields will be empty we will only construct a predicate if the variable exists.

let $firstname-predicate := if ($firstname) then concat('[lower-case(firstname)', " = '", $firstname, "']") else ()
let $lastname-predicate := if ($lastname) then concat('[lower-case(lastname)', " = '", $lastname, "']") else ()
let $email-predicate := if ($email) then concat('[lower-case(email)', " = '", $email, "']") else ()
let $phone-predicate := if ($phone) then concat("[contains(phone, '", $phone, "')]") else ()
let $type-predicate := if ($type) then concat('[type', " = '", $type, "']") else ()

For the firstname, lastname, and email we are comparing the incoming parameter with the lowercase string in the XML file. With the phone number we are using the contains() function to return all records that have a string somewhere in the phone number. The type is using an exact match since both the case of the data and keyword are known precisely.

The most challenging aspect of this program is learning how to get the order of the quotes correct. In general I use single quotes for enclosing static strings unless that string itself must contain a single quote. Then we use double quotes. The most difficult part is to assemble a string such as [type = 'staff'] and to remember to put the single quotes around the word staff. If you can figure this out the rest will be easy.

If you are having trouble you can also break the concat into multiple lines:

  concat(
     '[type',
     " = '",
     $type,
     "']"
  )

Where each line clearly must start and end with the same type of quote.

Concatenating the Query

edit

To create an eval string we just need to create a single long string starting with the collection and add each of the predicates. If there was no argument the predicate strings will be null.

let $eval-string := concat
   ("collection('/db/apps/directory/data')//person", 
    $firstname-predicate,
    $lastname-predicate,
    $email-predicate,
    $phone-predicate,
    $type-predicate
   )

The query with just a lastname and type would then look like this:

collection('/db/apps/directory/data')//person[lower-case(firstname) = 'John'][lower-case(type) = 'faculty']

Note that some advanced system will modify the order of the predicates based on the most likely to narrow the search. Since there are fewer records with the first name John than there are faculty it is always more efficient to put the first name before the type. This means that fewer nodes need to be moved from hard disk into RAM and the query will execute much faster.

Executing the Query

edit

The execution of the query is done by passing the eval string to the util:eval function.

  let $personsĀ := util:eval($eval-string) 

Displaying the Results

edit

We are now ready to display all of the results. We do this by creating a FLWOR statement for each person and returning a <div> element for each hit. The <div> elements each have single link with the lastname, firstname and type as the link content. When the user clicks on each link an item viewer is used and the ID of the person is passed to the item viewer.

for $person in $persons
   let $id := $person/id
   let $lastname := $person/lastname
   let $firstname := $person/firstname
   order by $lastname, $firstname
   return
      <div class="hit">
         <a href="../views/view-item.xq?id={$id}">
           {$lastname},
           {$person//firstname/string()} {' '}
           {$person/type/string()}
         </a>
      </div>

NGram Searching

edit

In your conf.xml module make sure the following line is uncommented:

<module uri="http://exist-db.org/xquery/ngram"  class="org.exist.xquery.modules.ngram.NGramModule" />

Here is the page on NGram elements in your collection.xconf file:

NGram Configuration File

After you edit then reindex.

You can now use any of the following functions:

NGram Functions

Acknowledgments

edit

This example has been provided by Eric Palmer and his staff at the University of Richmond, USA.