## The Problem

If ETDs are to be archived for the next 20–50 years and remain readable and usable, we need predefined style sheets for LaTeX, just as for MS Word. Only by standardising the usage of LaTeX can a quick and sustainable solution for conversion into XML be designed. As LaTeX is mostly used within the natural sciences and mathematics, the encoding of complex mathematical symbols, formulas and expressions is one of the major problems for such a conversion. An XML document type definition and schema for mathematics exists in MathML (see http://www.w3.org/math), and most mathematics software, such as Maple and Mathematica, supports export to MathML, so this standard should be used as the output from LaTeX as well.

In principle, LaTeX's structured approach to text processing should make conversion to XML easier. In most cases, however, LaTeX users tend to program complex macro packages to achieve a sophisticated print layout, which makes it much harder to obtain homogeneously structured documents. Conversion is further complicated because LaTeX documents cannot be parsed to check their structural and syntactic correctness.

Mathematical expressions can be converted into XML using three different strategies:

- Convert them into graphics that are easily interpreted and presented by common Internet browsers. This rules out searching within formulas or reusing them.
- Convert them into MathML.
- Leave them in LaTeX encoding within the XML file. Plugins such as IBM Techexplorer or Math Viewer can then interpret the LaTeX code and render formulas and mathematical expressions on the fly.
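The three strategies can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree`. This is only an illustration: the `<formula>` and `<graphic>` elements, the `encoding` attribute and the file name are hypothetical and do not come from any actual ETD document type definition.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Assumption: <formula>, <graphic> and the "encoding" attribute are
# hypothetical names for this sketch, not taken from an actual ETD DTD.


def as_graphic(image_path: str, alt_text: str) -> ET.Element:
    """Strategy 1: point to a pre-rendered image of the formula."""
    formula = ET.Element("formula", {"encoding": "graphic"})
    ET.SubElement(formula, "graphic", {"src": image_path, "alt": alt_text})
    return formula


def as_mathml(mathml_source: str) -> ET.Element:
    """Strategy 2: embed the formula as parsed MathML markup."""
    formula = ET.Element("formula", {"encoding": "mathml"})
    formula.append(ET.fromstring(mathml_source))
    return formula


def as_latex(latex_source: str) -> ET.Element:
    """Strategy 3: keep the LaTeX source as text for a rendering plugin."""
    formula = ET.Element("formula", {"encoding": "latex"})
    formula.text = latex_source
    return formula


PI_MATHML = (
    f'<math xmlns="{MATHML_NS}"><apply><approx/><pi/>'
    '<cn type="rational">22<sep/>7</cn></apply></math>'
)

if __name__ == "__main__":
    for element in (
        as_graphic("pi-approx.png", "pi is approximately 22/7"),
        as_mathml(PI_MATHML),
        as_latex(r"\pi \approx \frac{22}{7}"),
    ):
        print(ET.tostring(element, encoding="unicode"))
```

Of the three results, only the MathML variant exposes the structure of the formula to XML tools. The graphic is opaque, and the LaTeX variant is plain text that needs a LaTeX-aware plugin to interpret it.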

LaTeX offers semantically different ways of encoding formulas. Authors therefore have to be aware of the difference between LaTeX commands that work on a semantic level and those that work on a layout level.
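One way to make authors aware of this difference is to scan a LaTeX source for layout-level commands before conversion. The following is a minimal sketch in Python. The command list is an illustrative assumption and by no means complete, and the function name `find_layout_commands` is hypothetical.

```python
import re

# Assumption: a small, illustrative selection of layout-level commands.
# A real style sheet would define its own list.
LAYOUT_COMMANDS = {
    "bf", "it", "textbf", "textit",
    "vspace", "hspace", "bigskip", "newpage",
}

COMMAND_RE = re.compile(r"\\([A-Za-z]+)")
COMMENT_RE = re.compile(r"(?<!\\)%.*")  # text after an unescaped %


def find_layout_commands(latex_source: str) -> list[tuple[int, str]]:
    """Return (line number, command name) for each layout-level command."""
    hits = []
    for line_number, line in enumerate(latex_source.splitlines(), start=1):
        code = COMMENT_RE.sub("", line)  # ignore commented-out text
        for match in COMMAND_RE.finditer(code):
            if match.group(1) in LAYOUT_COMMANDS:
                hits.append((line_number, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = (
        "\\section{Results}\n"
        "{\\bf Important:} see \\emph{below}.\n"
        "\\vspace{2cm} % \\newpage\n"
    )
    for line_number, command in find_layout_commands(sample):
        print(f"line {line_number}: \\{command} is a layout-level command")
```

In the sample, `\section` and `\emph` carry structural meaning and are not reported, whereas `\bf` and `\vspace` only describe appearance and are flagged.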

*Example:*

Pi represents the mathematical constant, which is the ratio of a circle's circumference to its diameter, approximately 3.141592653.

Encoding this in MathML with the `<pi/>` element:

    <apply>
      <approx/>
      <pi/>
      <cn type="rational">22<sep/>7</cn>
    </apply>

This will be rendered as follows: π ≈ 22/7

The alternative is to code it simply as the letters pi, which could just as well be the name of a variable:

    <apply>
      <approx/>
      <ci>pi</ci>
      <cn type="rational">22<sep/>7</cn>
    </apply>

This would be rendered as: pi ≈ 22 / 7
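The practical value of the semantic encoding is that software can tell the constant apart from an identifier that merely happens to be named pi. The sketch below illustrates this with Python's standard `xml.etree.ElementTree`. It assumes that a plain identifier is written with the MathML content element `<ci>`, and the function name `describe_operand` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The constant pi, encoded semantically with <pi/>.
CONSTANT = (
    '<apply><approx/><pi/>'
    '<cn type="rational">22<sep/>7</cn></apply>'
)

# Assumption: a plain identifier named "pi" is written with <ci>.
IDENTIFIER = (
    '<apply><approx/><ci>pi</ci>'
    '<cn type="rational">22<sep/>7</cn></apply>'
)


def describe_operand(mathml: str) -> str:
    """Describe the first argument of an <apply> expression."""
    root = ET.fromstring(mathml)
    operand = root[1]  # root[0] is the operator, here <approx/>
    if operand.tag == "pi":
        return "constant pi"
    if operand.tag == "ci":
        return f"identifier {operand.text.strip()!r}"
    return f"other element <{operand.tag}>"


if __name__ == "__main__":
    print(describe_operand(CONSTANT))
    print(describe_operand(IDENTIFIER))
```

A search tool or computer algebra system reading the first encoding knows it is dealing with the mathematical constant. With the second it only sees a name.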

## Software and Tools

There are several ways to produce an XML document from a LaTeX document:

- TeX4ht is a highly configurable TeX-based authoring system for producing hypertext. It interacts with TeX-based applications through style files and postprocessors, leaving the processing of the source files to the native TeX compiler. Consequently, TeX4ht can handle the features of TeX-based systems in general, and of the LaTeX and AMS style files in particular. (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)
- WebEQ is a Java-based collection of tools for authoring and rendering MathML, including a visual editor, a WebTeX to MathML translator, and a rendering applet for interactive mathematics on Web pages. WebEQ also provides Java programmers with API documentation and libraries for other MathML-aware applications. (http://www.dessci.com/de/features/win/default.stm#TeX or http://www.dessci.com/features/win/default.stm#TeX)

For further information on the usage of different tools, see:

Michel Goossens; Sebastian Rahtz: The LaTeX Web Companion, Addison-Wesley, 1999, ISBN 0-201-43311-7
