XRX/Subset Generator

Motivation

You have multiple XML documents and you want the semantics and business rules for all the shared data elements to be driven from a single model.

Method

We will use two types of XML Schemas. A semantic schema, or subset schema, lists all of the elements used in a document exchange, including the meaning of each element's enumerated values. A constraint schema, or exchange schema, imports the semantic schema and adds the document constraints, such as the order, grouping, and cardinality of each element in the document.

We will leverage many of the concepts of the ISO/IEC 11179 metadata registry standard. We will use a single XQuery that takes in a wantlist and generates a semantic subschema from this registry. This ensures that all leaf-level elements in a family of XML Schemas have consistent datatypes and precise semantic definitions that all refer back to the metadata registry. This automatic generation of structures from a central data model is known as model-driven design.

Background on Subset Generation

Subset generation techniques were first developed by the Georgia Institute of Technology in support of the GJXDM and NIEM standards. The tools took the form of a web site that allowed a user to shop for data elements in the registry and keep a list of those data elements in a file called a wantlist. This wantlist was then used to generate a subset of the GJXDM that is guaranteed to be consistent with the full GJXDM. These tools were carried forward for use in the National Information Exchange Model as well as the Minnesota Department of Revenue subschema generators.

Method

A subset generator is step two of a three-step process.

 
  • Step one: Use a metadata shopping cart to shop for data elements in your registry and to create a wantlist. A shopping cart tool with an integrated search function allows non-programmers to take an active role in schema creation.
  • Step two: Generate a Subset Schema that conforms to the Liskov Substitution Principle, using the wantlist created in step one as the input.
  • Step three: Import that subset XML Schema into a Constraint Schema. This can be done by referencing the generator's REST URL, with the wantlist as a parameter, directly in the import, or by saving the generated subset to a separate file.
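
Step three can be sketched as a small constraint schema. This is an illustration only: the namespaces, the host name, the REST path prefix, and the wantlist name `address` are assumptions, not values from a real registry. The element names come from the sample wantlist on this page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical constraint (exchange) schema; all URIs are placeholders -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:p="http://metadata.example.com"
            targetNamespace="http://exchange.example.com/address"
            elementFormDefault="qualified">

   <!-- Import the generated subset schema straight from the generator URL.
        To use a saved copy instead, point schemaLocation at that file. -->
   <xsd:import namespace="http://metadata.example.com"
               schemaLocation="http://localhost:8080/exist/rest/db/wantlists/subset.xq?wantlist=address"/>

   <!-- The constraint schema adds order, grouping, and cardinality -->
   <xsd:element name="Address">
      <xsd:complexType>
         <xsd:sequence>
            <xsd:element ref="p:StreetFullText"/>
            <xsd:element ref="p:LocationCityName"/>
            <xsd:element ref="p:LocationStateUSPostalServiceCode"/>
            <xsd:element ref="p:LocationPostalCode" minOccurs="0"/>
         </xsd:sequence>
      </xsd:complexType>
   </xsd:element>
</xsd:schema>
```

Datatypes and definitions stay in the imported subset. The constraint schema only states which leaf elements appear, in what order, and how often.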

Wantlist

A wantlist is a simple enumeration of all the data elements you will use as leaf elements in your XML Schema. Wantlists do not contain any cardinality (number of repetitions) or complex branch structures. They are much like a grocery list in which the quantities of items are not specified.

GTRI Wantlist XML Schema

Sample Wantlist

The following is a sample wantlist generated from the US NIEM metadata shopping cart tool. Each element name carries the three-part structure (ObjectClass, Property, Representation Term), and an isReference indicator is included for each element.

<?xml version="1.0" encoding="UTF-8"?>
<w:WantList xmlns:w="http://niem.gov/niem/wantlist/1" w:release="2.0" w:product="NIEM">
    <w:Element w:prefix="nc" w:name="LocationCityName" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="LocationCounty" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="LocationCountyCode" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="LocationPostalCode" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="LocationStateFIPS5-2AlphaCode" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="LocationStateName" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="LocationStateUSPostalServiceCode" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="StreetCategoryText" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="StreetExtensionText" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="StreetFullText" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="StreetName" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="StreetNumberText" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="StreetPostdirectionalText" w:isReference="false"/>
    <w:Element w:prefix="nc" w:name="StreetPredirectionalText" w:isReference="false"/>
</w:WantList>

This wantlist can be saved from the web-shopping-cart tool to your local file system and then reloaded at a future date. In our example, any wantlist can be saved on the eXist server and passed as a parameter to a subset generator. This allows developers to reuse and version wantlists in addition to other XML Schema components.

XQuery Based Subschema Generator

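This XQuery takes a single argument: the name of a wantlist file stored in a wantlist collection. It iterates through each element in the wantlist and generates the appropriate XML Schema types. Note that the query as written reads un-namespaced Element and NamespaceURI elements and takes each element's name from its text content. This is a simpler wantlist format than the NIEM sample above, which puts names in w:name attributes.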

The format of the URL for the query is:

[WEBSERVER]/db/wantlists/subset.xq?wantlist=WANTLISTNAME

Omit the .xml file extension from WANTLISTNAME; the query appends it when it builds the file path.


xquery version "1.0";
import module namespace mdr = "http://mdr.example.com" at "/db/mdr/modules/mdr.xq";
declare namespace exist = "http://exist.sourceforge.net/NS/exist"; 
declare namespace xsd = "http://www.w3.org/2001/XMLSchema";
declare namespace system="http://exist-db.org/xquery/system";
declare namespace request="http://exist-db.org/xquery/request";
(: The result is an XML Schema, so serialize it with the xml method :)
declare option exist:serialize "method=xml media-type=text/xml indent=yes";

let $wantlist := request:get-parameter('wantlist', '')
return
if (string-length($wantlist) < 1)
   then (
        <results>
           <message>Error: "wantlist" is a required parameter.</message>
        </results>
    )
    else (
        let $wantlist-collection := '/db/mdr/wantlists/data'
        let $file-path := concat($wantlist-collection, '/', $wantlist, '.xml')
        let $doc := doc($file-path)
        return
            <xsd:schema 
            xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
            xmlns:e="http://metadata.example.com" 
            targetNamespace="{$doc//NamespaceURI/text()}">
               <xsd:annotation>
                   <xsd:documentation>
                      <wantlist>{$wantlist}</wantlist>
                      <path>{$file-path}</path>
                   </xsd:documentation>
               </xsd:annotation>
               {for $element in $doc//Element
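                   (: Note: the 'p' prefix must be bound to the schema's target namespace for
                      the type references below to resolve; the xsd:schema constructor above
                      declares only the xsd and e prefixes. :)
                   let $prefix := 'p:'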
                   let $simpleType := 
                      if (ends-with($element, 'Code'))
                              then
                                 (<xsd:simpleType name="{concat($element/text(), 'Type')}">
                                     <xsd:annotation>
                                            <xsd:documentation>{mdr:get-definition-for-element($element)}</xsd:documentation>
                                     </xsd:annotation>
                                     {mdr:get-restrictions-for-element($element/text())}
                               </xsd:simpleType>
                               )
                      else ()                    
                    let $elementDef := <xsd:element name="{$element/text()}" 
                    type="{if (ends-with($element, 'Code')) 
                       then (concat($prefix, $element/text(), 'Type'))
                       else (mdr:get-xml-schema-datatype-for-element($element))}" nillable="true">
                         <xsd:annotation>
                                <xsd:documentation>{mdr:get-definition-for-element($element)}</xsd:documentation>
                         </xsd:annotation>
                    </xsd:element>

                    return ($simpleType, $elementDef)
               }
            </xsd:schema>
)
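
The general shape of the generated schema can be sketched as follows, for one element whose name ends in `Code` and one that does not. This is an illustration under stated assumptions, not captured output. The wantlist name `address`, the target namespace, the definition text, the enumeration values, and the `xsd:string` datatype are placeholders for whatever the registry and the `mdr` functions return. The sketch also adds an `xmlns:p` declaration, which the query as written does not emit.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:p="http://metadata.example.com"
            targetNamespace="http://metadata.example.com">
   <xsd:annotation>
      <xsd:documentation>
         <wantlist>address</wantlist>
         <path>/db/mdr/wantlists/data/address.xml</path>
      </xsd:documentation>
   </xsd:annotation>

   <!-- Name ends in 'Code': a named simpleType plus an element that uses it -->
   <xsd:simpleType name="LocationCountyCodeType">
      <xsd:annotation>
         <xsd:documentation>Placeholder definition from the registry.</xsd:documentation>
      </xsd:annotation>
      <!-- Assumed shape of the mdr:get-restrictions-for-element result -->
      <xsd:restriction base="xsd:string">
         <xsd:enumeration value="001"/>
         <xsd:enumeration value="003"/>
      </xsd:restriction>
   </xsd:simpleType>
   <xsd:element name="LocationCountyCode" type="p:LocationCountyCodeType" nillable="true">
      <xsd:annotation>
         <xsd:documentation>Placeholder definition from the registry.</xsd:documentation>
      </xsd:annotation>
   </xsd:element>

   <!-- Any other name: the element takes its datatype directly from the registry -->
   <xsd:element name="LocationCityName" type="xsd:string" nillable="true">
      <xsd:annotation>
         <xsd:documentation>Placeholder definition from the registry.</xsd:documentation>
      </xsd:annotation>
   </xsd:element>
</xsd:schema>
```

Every element is a global declaration with no cardinality, so a constraint schema can reference it by name and add its own structure.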

Using an XML Schema to validate your XForms

Some XForms systems allow you to use an XML Schema to validate your form. For example, with Orbeon Forms you can reference the schema from the model:

<xf:model schema="MyTypeLibrary.xsd">
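
A fuller model might look like the sketch below. The instance structure, the `p` namespace URI, and the type name `LocationCountyCodeType` are assumptions for illustration. They presume that MyTypeLibrary.xsd is a generated subset schema that defines that type.

```xml
<xf:model xmlns:xf="http://www.w3.org/2002/xforms"
          xmlns:p="http://metadata.example.com"
          schema="MyTypeLibrary.xsd">
   <!-- Hypothetical instance data edited by the form -->
   <xf:instance>
      <Address xmlns="">
         <LocationCountyCode/>
         <LocationCityName/>
      </Address>
   </xf:instance>
   <!-- Tie a form field to a type defined in the schema; values outside
        the type's enumeration are then flagged as invalid -->
   <xf:bind nodeset="LocationCountyCode" type="p:LocationCountyCodeType"/>
</xf:model>
```

Binding by type keeps the form's validation rules in the schema rather than repeating the enumerated values in the form itself.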
