Last modified on 18 June 2009, at 04:14

ETD Guide/Technical Issues/Conversions from LaTeX to SGML\XML

At first glance, converting LaTeX into SGML- or XML-compatible documents looks easy, because writing in LaTeX is itself a kind of structured writing.

Software commonly used for the conversion includes OmniMark, Balise, and simple Perl scripts.
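As a rough illustration of what such a simple conversion script does, here is a minimal sketch in Python (the guide itself mentions Perl). The XML element names are assumptions made for this example and do not come from any particular DTD.

```python
import re
from xml.sax.saxutils import escape

# Assumed mapping from LaTeX sectioning commands to XML element names.
HEADING_TAGS = {
    "chapter": "chapter-title",
    "section": "section-title",
    "subsection": "subsection-title",
}

# Matches \chapter{...}, \section{...}, \subsection{...} (optionally starred)
# whose argument contains no nested braces.
HEADING_RE = re.compile(r"\\(chapter|section|subsection)\*?\{([^{}]*)\}")

def convert_headings(latex: str) -> str:
    """Replace simple sectioning commands with XML title elements."""
    def repl(match: re.Match) -> str:
        tag = HEADING_TAGS[match.group(1)]
        return f"<{tag}>{escape(match.group(2))}</{tag}>"
    return HEADING_RE.sub(repl, latex)
```

A heading whose argument itself contains a macro with braces, such as \section{The \emph{x} case}, is left untouched by this pattern, which hints at why pattern-based scripts become fragile on real documents.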


Problems

Problem 1 Because of its structural approach, the LaTeX format in principle makes conversion into SGML/XML easier. In practice, however, pure LaTeX often goes hand in hand with sophisticated macro programming, so that in many cases macros obscure the structure of a LaTeX document or make the conversion much harder to perform. In addition, LaTeX has no parser that checks the correct use of structural elements such as chapter, section, and subsection, which complicates the conversion further.
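The kind of structural check that LaTeX lacks can be sketched as follows. This is a minimal illustration in Python, assuming only the three standard sectioning commands named above; the function name and the rule that a heading may not skip a level are choices made for this example.

```python
import re

# Nesting depth of the standard sectioning commands.
LEVELS = {"chapter": 1, "section": 2, "subsection": 3}
CMD_RE = re.compile(r"\\(chapter|section|subsection)\*?\{")

def check_nesting(latex: str) -> list[str]:
    """Report headings that skip a level, e.g. a subsection directly under a chapter."""
    problems = []
    previous = 0  # 0 means no heading has been seen yet
    for match in CMD_RE.finditer(latex):
        name = match.group(1)
        level = LEVELS[name]
        # The first heading may start at any level; later ones may go at most one level deeper.
        if previous and level > previous + 1:
            problems.append(f"\\{name} at offset {match.start()} skips a level")
        previous = level
    return problems
```

A heading produced by a user-defined macro (say, a hypothetical \mychapter) is invisible to this check, which is exactly the difficulty described above: once structure is hidden behind macros, a converter can no longer rely on the standard commands.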

Problem 2 An author who wants to include a mathematical formula in a LaTeX document has two basic options:

  • Producing the mathematical formula as a picture
  • Defining it as a text formula with the appropriate mathematical LaTeX features.
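The two options might look like this in a LaTeX source file. This is only a sketch: the file name formula.png and the formula itself are placeholders chosen for illustration, and \includegraphics assumes the graphicx package is loaded.

```latex
% Option 1: the formula is an image; its content is opaque to any converter.
\includegraphics{formula.png}

% Option 2: the formula is typed in math mode and stays available as text.
\[ a^2 + b^2 = c^2 \]
```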


First and Second Versions

The first version prevents any secondary use of the formula. The second version allows a formula to be reused in different contexts. For example, it makes it easy to verify the correctness of a statement by importing the LaTeX formula into a mathematical software package such as Maple or Mathematica. Formulas coded in LaTeX can also be displayed in rendered form in a browser by software or plug-ins such as IBM Techexplorer or Math Viewer. LaTeX-coded formulas still have the disadvantage that they are not encoded with so-called semantic tags, so the use of MathML is highly advised. MathML is an XML document type definition for mathematics developed by the W3C.


The Letter e

In LaTeX, authors often do not distinguish between the letter 'e' standing for a variable and the letter 'e' standing for Euler's number. In MathML there is a huge difference between encoding something as the variable e and encoding it as the Euler e (2.718...). Therefore, the use of layout-oriented definitions for mathematics in LaTeX complicates the conversion into MathML, and hence into any SGML/XML format.
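To make the difference concrete, the following Python sketch builds the content MathML for "e to the power x" in both readings. It relies on the content MathML elements ci (identifier) and exponentialE (the constant); the function name is made up for this example, and the MathML namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET

def e_to_the_x(e_is_constant: bool) -> str:
    """Content MathML for e^x, with e as Euler's number or as a plain identifier."""
    apply = ET.Element("apply")
    ET.SubElement(apply, "power")
    if e_is_constant:
        ET.SubElement(apply, "exponentialE")   # the constant 2.718...
    else:
        ET.SubElement(apply, "ci").text = "e"  # a variable that happens to be named e
    ET.SubElement(apply, "ci").text = "x"
    return ET.tostring(apply, encoding="unicode")
```

Both calls correspond to the same LaTeX source, e^{x}, so a converter cannot decide between the two outputs from the LaTeX markup alone.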


Possible Solutions

To prepare mathematical formulas in LaTeX for conversion, many universities and the TeX user groups around the world are working on the definition of macros that can be transformed into the appropriate MathML definitions.
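The idea behind such macros can be sketched as a lookup table. The macro names below (\EulerE, \ImagI, \PiConst) are hypothetical and are not taken from any of the projects mentioned here; the targets are standard content MathML constants. The sketch shows only the substitution step, so its output is a mix of LaTeX and MathML rather than a complete MathML expression.

```python
import re

# Hypothetical semantic macros an author would type instead of a bare letter.
# Assumed LaTeX definition for the first one: \newcommand{\EulerE}{\mathrm{e}}
SEMANTIC_MACROS = {
    "EulerE": "<exponentialE/>",
    "ImagI": "<imaginaryi/>",
    "PiConst": "<pi/>",
}

# A LaTeX control word: a backslash followed by letters.
TOKEN_RE = re.compile(r"\\([A-Za-z]+)")

def translate_macros(formula: str) -> str:
    """Replace known semantic macros with content MathML; leave everything else unchanged."""
    def repl(match: re.Match) -> str:
        return SEMANTIC_MACROS.get(match.group(1), match.group(0))
    return TOKEN_RE.sub(repl, formula)
```

Because the author states the meaning explicitly through the macro name, the converter no longer has to guess whether a letter denotes a constant or a variable.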

See, for example, the University of Montréal at www.theses.umontreal.ca or the University of the Bundeswehr in Munich.



Next Section: Rendering-style sheets