Apache Ant/XMLvalidate

Motivation

You want a command-line interface to validate one or more XML files.

Instructor's Note: This file is used as a lab exercise for an Apache Ant class that includes extensive use of XML.

Method

You can use Apache Ant to check a file or a group of files for validity. This is done with the <xmlvalidate> task, which takes a standard Ant <fileset> and checks each file in turn. In the example below, a property names a directory called "in". The fileset then finds all XML files in that directory and all of its subdirectories, and each file is validated against an XML Schema.

Sample Ant Task to Validate All XML Files in a Folder

<project default="ValidateXML">

   <!-- Root directory that holds the XML files to check -->
   <property name="MYROOTDIR" value="in"/>

   <target name="ValidateXML" description="Checks that all XML files at or below MYROOTDIR are valid against their XML Schema">
      <xmlvalidate>
         <fileset dir="${MYROOTDIR}" includes="**/*.xml"/>
         <!-- Parser features: turn on validation, XML Schema support, and namespace awareness -->
         <attribute name="http://xml.org/sax/features/validation" value="true"/>
         <attribute name="http://apache.org/xml/features/validation/schema" value="true"/>
         <attribute name="http://xml.org/sax/features/namespaces" value="true"/>
      </xmlvalidate>
   </target>

</project>

In the above example, we assume that each XML file carries an attribute on its root element (xsi:noNamespaceSchemaLocation or xsi:schemaLocation) that tells the parser where to find its XML Schema.

This target runs the default XML parser that comes with Ant (usually Xerces) and reports any file that is not well-formed or not valid against its schema.
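
Well-formedness and validity are separate checks. If only the well-formedness check is wanted, the <xmlvalidate> task has a lenient attribute for that purpose. The following is a minimal sketch, assuming the same "in" folder as above; the target name CheckWellFormed is made up for this illustration.

```xml
<project default="CheckWellFormed">

   <property name="MYROOTDIR" value="in"/>

   <!-- CheckWellFormed is a hypothetical target name used only for this sketch -->
   <target name="CheckWellFormed" description="Checks that all XML files at or below MYROOTDIR are well-formed">
      <!-- lenient="true" asks the task to check well-formedness only, without validating -->
      <xmlvalidate lenient="true">
         <fileset dir="${MYROOTDIR}" includes="**/*.xml"/>
      </xmlvalidate>
   </target>

</project>
```

No schema is needed for this variant, so it can be a quick first pass before the full validation target is run.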

Sample XML Schema MyMessages.xsd

To test this you will need a small XML Schema file. The following schema describes a file of one to three messages:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
    <xs:element name="MyMessage" type="xs:string"/>
    <xs:element name="MyMessages">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="MyMessage" maxOccurs="3"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Sample Valid Data File

Here is a sample message file:

<?xml version="1.0" encoding="UTF-8"?>
<MyMessages xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="MyMessages.xsd">
   <MyMessage>Hello World!</MyMessage>
   <MyMessage>ANT AND XML Schema ROCK</MyMessage>
</MyMessages>

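Note that the xsi:noNamespaceSchemaLocation attribute of the root element tells the parser where to find the XML Schema file. Because MyMessages.xsd is given as a relative location, it is looked up relative to the data file, so the schema must sit in the same folder as the data file.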
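
Documents that use a namespace carry the schema hint in xsi:schemaLocation instead, whose value pairs a namespace URI with a schema location. The sketch below assumes a hypothetical namespace, http://example.org/messages. The schema would also have to declare that namespace as its targetNamespace, which the MyMessages.xsd shown earlier does not do.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- http://example.org/messages is a made-up namespace used only for this illustration -->
<MyMessages xmlns="http://example.org/messages"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://example.org/messages MyMessages.xsd">
   <MyMessage>Hello World!</MyMessage>
   <MyMessage>ANT AND XML Schema ROCK</MyMessage>
</MyMessages>
```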

Sample Invalid Data File

If you add a fourth message, the file should fail validation according to the rules in the XML Schema above:

<?xml version="1.0" encoding="UTF-8"?>
<MyMessages xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  
xsi:noNamespaceSchemaLocation="MyMessages.xsd">
   <MyMessage>Hello From XSLT</MyMessage>
   <MyMessage>From input: Hi</MyMessage>
   <MyMessage>ANT AND XSLT ROCK</MyMessage>
   <MyMessage>I am the fourth message.</MyMessage>
</MyMessages>

To test this example, add a folder called "in" and put several XML files that are not valid in it. In this case we created an invalid file called MyInputBad.xml. When we typed "build" at the command line, the following was the output:

Sample Output

 ValidateXML:
 [xmlvalidate] C:\XMLClass\Ant\validate\in\MyInput.xml:6:15: cvc-complex-type.2.4.d:
 Invalid content was found starting with element 'MyMessage'. No child element is expected at this point.

This is sample output. Note that the error message does not say that the limit of three MyMessage elements was exceeded; it only reports that no further child element is expected at that point.
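
By default the task fails the build when it hits an invalid file. If you would rather have every file checked and the problems only reported, a sketch along the following lines may help. It assumes the same "in" folder as before, and the target name ReportInvalidXML is invented for this example.

```xml
<project default="ReportInvalidXML">

   <property name="MYROOTDIR" value="in"/>

   <!-- ReportInvalidXML is a hypothetical target name used only for this sketch -->
   <target name="ReportInvalidXML" description="Reports invalid XML files without failing the build">
      <!-- failonerror="false" logs each validation error and moves on to the next file -->
      <xmlvalidate failonerror="false" warn="true">
         <fileset dir="${MYROOTDIR}" includes="**/*.xml"/>
         <attribute name="http://xml.org/sax/features/validation" value="true"/>
         <attribute name="http://apache.org/xml/features/validation/schema" value="true"/>
         <attribute name="http://xml.org/sax/features/namespaces" value="true"/>
      </xmlvalidate>
   </target>

</project>
```

This is convenient in a lab setting, where one run should list the problems in all of the deliberately broken files.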

Supplying an XML Schema definition file

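If your documents are not in a namespace, add the following nested element to <xmlvalidate>. The schema location is a parser property rather than a boolean feature, so it is set with <property>, not <attribute>:

  <property name="http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation" value="${xsd.file}"/>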

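If your documents have a namespace, use the following instead. The property value has the same syntax as the xsi:schemaLocation attribute: a namespace URI, whitespace, then the schema location. Here ${xsd.namespace} is a property name chosen for illustration:

  <attribute name="http://xml.org/sax/features/namespaces" value="true"/>
  <property name="http://apache.org/xml/properties/schema/external-schemaLocation" value="${xsd.namespace} ${xsd.file}"/>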

If the XML files do not name their own schema, the Ant target can tell the parser where to find it. This is done with a special parser property, set with a <property> element nested inside <xmlvalidate>:

 <xmlvalidate file="xml/endpiece-noSchema.xml" lenient="false" failonerror="true" warn="true">
    <attribute name="http://apache.org/xml/features/validation/schema" value="true"/>
    <attribute name="http://xml.org/sax/features/namespaces" value="true"/>
    <property
      name="http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation"
      value="${xsd.file}"/>
 </xmlvalidate>
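
The fragments in this section refer to ${xsd.file} without defining it. Putting the pieces together, a complete build file might look like the sketch below. It assumes that the schema sits next to the build file as MyMessages.xsd and that the data files are in the "in" folder. The target name ValidateWithSchema is a choice made for this illustration, and the use of the makeurl task (available since Ant 1.7) to turn the schema path into a file: URL is an assumption about what the parser handles most reliably.

```xml
<project default="ValidateWithSchema">

   <property name="MYROOTDIR" value="in"/>

   <!-- ValidateWithSchema is a hypothetical target name used only for this sketch -->
   <target name="ValidateWithSchema" description="Validates all XML files against one external schema">
      <!-- Assumed schema location: MyMessages.xsd next to this build file, stored as a file: URL -->
      <makeurl file="MyMessages.xsd" property="xsd.file"/>

      <xmlvalidate failonerror="true" warn="true">
         <fileset dir="${MYROOTDIR}" includes="**/*.xml"/>
         <!-- Boolean parser features go in <attribute> -->
         <attribute name="http://xml.org/sax/features/validation" value="true"/>
         <attribute name="http://apache.org/xml/features/validation/schema" value="true"/>
         <attribute name="http://xml.org/sax/features/namespaces" value="true"/>
         <!-- String-valued parser properties go in a nested <property> -->
         <property
            name="http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation"
            value="${xsd.file}"/>
      </xmlvalidate>
   </target>

</project>
```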

Schematron Validate

Ant can also validate against a Schematron rules file. The task is not built into Ant; it comes from a separate library that is loaded with <taskdef>:

<taskdef name="schematron"
         classname="com.schematron.ant.SchematronTask"
         classpath="lib/ant-schematron.jar"/>

<schematron schema="rules.sch" failonerror="false">
   <!-- A fileset needs a base directory; "." is assumed here -->
   <fileset dir="." includes="schematron-input.xml"/>
</schematron>

See http://www.schematron.com/resource/Using_Schematron_for_Ant.pdf for more details.
