Last modified on 14 March 2012, at 19:25

XQuery/DocBook to PDF

Motivation

You want to convert your DocBook 5 files to PDF. PDF is a standardized page-layout format that allows you to print books using standards-based tools.

Method

We will create an XQuery module built around one main typeswitch expression, with a case for each of the main elements of DocBook. This will create an XSL-FO file that can then be converted directly to PDF using the Apache FOP 1.0 processor.

This will be done entirely using XQuery. No XSLT will be required.
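
As a rough illustration of the typeswitch approach, the following minimal sketch maps a handful of DocBook elements to XSL-FO blocks. The function names (local:dispatch, local:recurse, local:book), the page dimensions, and the font settings are assumptions made for this example and are not taken from a finished module. The tiny inline book is also hypothetical, included only so that the query is self-contained.

```xquery
xquery version "1.0";

declare namespace db = "http://docbook.org/ns/docbook";
declare namespace fo = "http://www.w3.org/1999/XSL/Format";

(: Run the dispatcher over every child node and return the combined result. :)
declare function local:recurse($node as node()) as node()* {
    for $child in $node/node()
    return local:dispatch($child)
};

(: Wrap the whole book in a single page sequence. The layout values are placeholders. :)
declare function local:book($book as element(db:book)) as element(fo:root) {
    <fo:root>
        <fo:layout-master-set>
            <fo:simple-page-master master-name="page"
                page-height="11in" page-width="8.5in"
                margin-top="1in" margin-bottom="1in"
                margin-left="1in" margin-right="1in">
                <fo:region-body/>
            </fo:simple-page-master>
        </fo:layout-master-set>
        <fo:page-sequence master-reference="page">
            <fo:flow flow-name="xsl-region-body">
                {local:recurse($book)}
            </fo:flow>
        </fo:page-sequence>
    </fo:root>
};

(: The main typeswitch: one case per DocBook element handled by this sketch. :)
declare function local:dispatch($node as node()) as node()* {
    typeswitch ($node)
        (: Drop whitespace-only text so no stray text lands directly in fo:flow. :)
        case text() return
            if (normalize-space($node) = "") then () else $node
        case $b as element(db:book) return local:book($b)
        (: From info, emit only the title; author details are skipped here. :)
        case element(db:info) return
            for $t in $node/db:title return local:dispatch($t)
        case element(db:title) return
            <fo:block font-size="16pt" font-weight="bold"
                space-before="12pt">{local:recurse($node)}</fo:block>
        case element(db:subtitle) return
            <fo:block font-size="12pt" font-style="italic">{local:recurse($node)}</fo:block>
        case element(db:para) return
            <fo:block space-after="6pt">{local:recurse($node)}</fo:block>
        (: part, chapter, sect1 and anything else: just descend into the children. :)
        default return local:recurse($node)
};

(: Hypothetical inline input, used only to keep the example self-contained. :)
let $book :=
    <book xmlns="http://docbook.org/ns/docbook" version="5.0">
        <info><title>Sample Book</title></info>
        <chapter>
            <title>First Chapter</title>
            <para>Some text.</para>
        </chapter>
    </book>
return local:dispatch($book)
```

The result is a single fo:root element that can be saved as an XSL-FO file and handed to the FO processor. Adding support for another DocBook element means adding one more case to local:dispatch, and anything without its own case falls through to the default branch.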

Sample Input Document

We will start with a simple DocBook 5 document. This document uses the DocBook namespace and also declares the xlink namespace. A very small sample of the document might have the following structure:

<book xmlns="http://docbook.org/ns/docbook"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0">
    <info>
        <title>Converting DocBook to PDF using XQuery</title>
        <author>
            <orgname>Kelly McCreary &amp; Associates</orgname>
            <address>
                <city>Minneapolis</city>
                <country>USA</country>
            </address>
            <email>user@example.com</email>
        </author>
    </info>
    <part>
        <title>Introduction</title>
        <subtitle>Why DocBook and PDF</subtitle>
        <chapter>
            <title>Introduction to DocBook</title>
            <subtitle>Getting Started with DocBook Version</subtitle>
            <sect1>
                <title>Page Layout vs. Scrolling HTML</title>
                <subtitle>Why PDF is Used For Printing</subtitle>
                <para>Printing and pagination will still be important till ePub becomes standardized.</para>
            </sect1>
        </chapter>
        <chapter>
            <title>XQuery Typeswitch Transforms</title>
            <subtitle>How To Get Comfortable with Recursive Programs</subtitle>
            <sect1>
                <title>Why XQuery Can Replace XSLT</title>
                <subtitle>One language for the server</subtitle>
                <para>Text</para>
            </sect1>
            <sect1>
                <title>XSL-FO</title>
                <subtitle>A language for paginated layout</subtitle>
                <para>Text</para>
            </sect1>
        </chapter>
    </part>
</book>
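
Because every element in the sample is in the DocBook namespace, an unprefixed path such as //chapter will not match anything. The short sketch below shows one way to declare the namespace in a query and list the chapters. The document path is a hypothetical location, and the names of the output elements and attributes are chosen only for illustration.

```xquery
xquery version "1.0";

(: Bind a prefix to the DocBook 5 namespace used by the sample document. :)
declare namespace db = "http://docbook.org/ns/docbook";

(: Hypothetical path: adjust it to wherever the sample is actually stored. :)
let $book := doc("/db/docbook/sample.xml")/db:book
return
    <chapters>{
        for $chapter in $book//db:chapter
        return
            <chapter title="{string($chapter/db:title)}"
                sections="{count($chapter/db:sect1)}"/>
    }</chapters>
```

Run against the sample above, this should list two chapters, the first with one sect1 and the second with two.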