LaTeX/Export To Other Formats
Strictly speaking, LaTeX source can be used to directly generate two formats:
- DVI using latex, the first one to be supported;
- PDF using pdflatex, more recent.
Using other software freely available on the Internet, you can easily convert DVI and PDF to other document formats. In particular, you can obtain a PostScript version using software included in your LaTeX distribution. Some LaTeX IDEs let you generate the PostScript version directly (even if they internally use a DVI intermediate step, e.g. LaTeX → DVI → PS). It is also possible to create PDF from DVI and vice versa. Creating a file in two steps when you can create it directly may not seem logical, but some users might need it because, as you remember from the first chapters, the format you can generate depends on the formats of the images you want to include (EPS for DVI, PNG and JPG for PDF). The sections below cover the different formats and describe how to obtain each of them.
Other formats can be produced, such as RTF (which can be used in Microsoft Word) and HTML. However, these documents are produced by software that parses and interprets the LaTeX files and does not implement all the features available for the primary DVI and PDF outputs. Nonetheless, these converters do work, and can be crucial tools for collaboration with colleagues who do not edit documents with LaTeX.
This chapter features a lot of third-party tools; most of them are installed independently of your TeX distribution.
Some tools are Unix-specific (*BSD, GNU/Linux and Mac OS X), but it may be possible to make them work on Windows. If you have the choice, command-line tools are often easier to use on Unix systems.
Some tools may already be installed. For instance, you can check if dvipng is installed and ready to use (Unix only):
which dvipng
If it is installed, the command prints the path to the executable. Most of these tools are installable using your package manager or portage tree (Unix only).
This section describes how to generate a screenshot of a LaTeX page, or of a specific part of the page, using the LaTeX package preview. Screenshots are useful, for example, if you want to include a LaTeX-generated formula in a presentation using your favorite slideware like PowerPoint, Keynote or LibreOffice Impress. First, make sure you have preview installed. See Installing Extra Packages.
Say you want to take a screenshot of a formula. Load the preview package with the active option and put the formula's code inside a preview environment. Without either of these two, you won't get any output.
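A minimal sketch of such a document is given here. The formula is only a placeholder chosen for illustration, and the tightpage option is an addition not required by the text above; it is assumed here so that the page is cropped to the formula.

```latex
% Compile with pdflatex (or latex, to get a DVI file).
\documentclass{article}
% "active" switches the extraction on; "tightpage" (optional, an
% assumption of this sketch) crops the page to the previewed box.
\usepackage[active,tightpage]{preview}
\begin{document}
This sentence is outside the preview environment and is discarded.
\begin{preview}
% Placeholder formula, used only for illustration
\[
  e^{i\pi} + 1 = 0
\]
\end{preview}
\end{document}
```

The output should contain one page per preview environment, which can then be converted to an image with the tools described later in this chapter.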
This package is also very useful for exporting specific parts of a document to other formats, or for producing graphics (e.g. using PGF/TikZ) and then including them in other documents. You can also automate the previewing of specific environments, such as code listings, by declaring them in the preamble. The resulting PDF then contains only the listing content; the page layout depends on the shape of the source code.
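As a sketch of this automation, the following document assumes the listings package and its lstlisting environment; the listed code is a placeholder. The \PreviewEnvironment command of the preview package marks every occurrence of the named environment for extraction, so no explicit preview environment is needed.

```latex
% Compile with pdflatex.
\documentclass{article}
\usepackage{listings}
\usepackage[active,tightpage]{preview}
% Treat every lstlisting environment as a preview
\PreviewEnvironment{lstlisting}
\begin{document}
This paragraph is not inside a previewed environment, so it is dropped.
\begin{lstlisting}
int main(void) { return 0; }
\end{lstlisting}
\end{document}
```

The same pattern should apply to other environments, for instance tikzpicture when producing standalone graphics.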
Convert to PDF
DVI to PDF
dvipdfm my_file.dvi
will create my_file.pdf. Another way is to pass through PS generation:
dvips my_file.dvi -o my_file.ps
ps2pdf my_file.ps
This also leaves an intermediate file called my_file.ps, which you can delete.
If you have created several PDF documents and want to merge them into a single PDF file, you can use Ghostscript from the command line. On Windows:
gswin32 -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=Merged.pdf -dBATCH 1.pdf 2.pdf 3.pdf
On Unix:
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=Merged.pdf -dBATCH 1.pdf 2.pdf 3.pdf
Alternatively, PDF-Shuffler is a small Python-GTK application which helps the user to merge or split PDF documents and to rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. This program may be available in your Linux distribution's repository.
Another option to check out is pdftk (or PDF toolkit), which is a command-line tool that can manipulate PDFs in many ways. To merge two or more files, use:
pdftk 1.pdf 2.pdf 3.pdf cat output 123.pdf
Note: If you are merging external PDF documents into a LaTeX document which is compiled with pdflatex, a much simpler option is to use the pdfpages package.
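A minimal sketch of the pdfpages approach, reusing the file names 1.pdf, 2.pdf and 3.pdf from the commands above (they are assumed to sit next to the LaTeX source):

```latex
% Compile with pdflatex.
\documentclass{article}
\usepackage{pdfpages}
\begin{document}
% pages=- inserts every page of the external file;
% a selection such as pages={1,3-5} is also accepted.
\includepdf[pages=-]{1.pdf}
\includepdf[pages=-]{2.pdf}
\includepdf[pages=-]{3.pdf}
\end{document}
```

Ordinary LaTeX content can be placed between the \includepdf commands, which is what makes this approach convenient inside a larger document.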
Three simple shell scripts using the pdfpages package are provided in the pdfjam bundle by D. Firth. They include options to merge several PDF files (pdfjoin), put several pages on one physical sheet (pdfnup) and rotate pages (pdf90).
See also Modular Documents
You can also use XeTeX (or, more precisely, XeLaTeX), which works in the same way as pdflatex: it creates a PDF file directly from LaTeX source. One advantage of XeTeX over standard LaTeX is support for Unicode and modern typography. See its Wikipedia entry for more details.
Customization of PDF output in XeTeX (setting document title, author, keywords etc.) is done through the options of the hyperref package.
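As an illustration, the sketch below sets the document metadata with \hypersetup. The title, author and keyword values are placeholders.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{hyperref}
\hypersetup{
  pdftitle={My Document},          % placeholder title
  pdfauthor={Jane Doe},            % placeholder author
  pdfkeywords={LaTeX, XeTeX, PDF}  % placeholder keywords
}
\begin{document}
Hello, world.
\end{document}
```

The values should then show up in the document properties dialog of most PDF viewers.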
Convert to RTF
LaTeX can be converted into an RTF file, which in turn can be opened by a word processor such as OpenOffice.org Writer or Microsoft Word. This conversion is done through latex2rtf, which can run on any computer platform. The program operates by reading the LaTeX source, and mimicking the behaviour of the LaTeX program.
latex2rtf supports most of the standard features of LaTeX, such as standard formatting, some math typesetting, inclusion of EPS, PNG or JPG graphics, and tables. It also has limited support for some packages, such as varioref and natbib. However, many other packages are not supported.
latex2rtf is simple to use. The Windows version has a straightforward GUI (l2rshell.exe). The command-line version is offered for all platforms, and can be used on an example mypaper.tex file:
latex mypaper
bibtex mypaper # if you use bibtex
latex2rtf mypaper
The latex and (if needed) bibtex commands need to be run before latex2rtf, because the .aux and .bbl files are needed to produce the proper output. The conversion creates mypaper.rtf, which you may open in many word processors such as Microsoft Word or LibreOffice.
Convert to HTML
There are many converters to HTML.
latex2html -html_version 4.0,latin1,unicode -split 1 -nonavigation -noinfo -title "MyDocument" MyDocument.tex
TeX4ht is a very powerful conversion program, but its configuration is not straightforward. Basically a configuration file has to be prepared, and then the program is called.
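To give an idea of what such a configuration file looks like, here is a minimal sketch. The file name myconfig.cfg and the CSS rule are hypothetical; the overall skeleton (\Preamble, \begin{document}, \EndPreamble) is the usual shape of a TeX4ht configuration file.

```latex
% myconfig.cfg (hypothetical file name)
\Preamble{xhtml}
% Add a rule to the generated stylesheet (example rule, an assumption)
\Css{body { max-width: 40em; margin: auto; }}
\begin{document}
\EndPreamble
```

The configuration would then typically be passed as the first option when calling the converter, for instance htlatex MyDocument.tex "myconfig".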
Convert to image formats
It is sometimes useful to convert LaTeX output to image formats for use in systems that support neither DVI nor PDF files, such as Wikipedia.
There are two families of graphics:
- Vector graphics can be scaled to any size, thus do not suffer from quality loss. SVG is a vector format.
- Raster graphics define every pixel explicitly. PNG is a raster format.
So vector graphics are usually preferred. There are still some cases where raster graphics are used:
- The target system does not handle vector graphics; only raster graphics are supported.
- SVG cannot embed fonts. So either the text will be rendered using a local .ttf or .otf font (which will most likely change the output), or all characters must be turned into vector graphics. This last method makes the SVG big and slow. If the input LaTeX file contains a lot of text whose formatting must be preserved, SVG is not that great.
So SVG is great for drawings and a small amount of text. JPG is a well-known raster format; however, it is usually not as good as PNG for text.
In some cases it may be sufficient to simply copy a region of a PDF (or PS) file using the tools available in a PDF viewer (for example using LaTeX to typeset a formula for pasting into a presentation). This however will not generally have sufficient resolution for whole pages or large areas.
The poppler toolset features pdftocairo, which can convert PDF to SVG:
pdftocairo -svg latexdoc.pdf output.svg
pdftocairo also supports various raster graphic formats.
Direct conversion from PDF to SVG can be done using the command line tool pdf2svg.
pdf2svg file.pdf file.svg
Alternatively, DVI or PDF can be converted to PS as described before, and then the bash script ps2svg.sh can be used. As all the software used by this script is multiplatform, this should also be possible on Windows.
One can also use dvisvgm, an open source utility that converts from DVI to SVG.
Inkscape is able to convert to SVG, PDF, EPS, and other vector graphic formats.
inkscape --export-area-drawing --export-ps=OUTPUT INPUT
inkscape --export-area-page --export-plain-svg=OUTPUT INPUT
Open your file with GIMP. It will ask you which page you want to convert and whether you want to use anti-aliasing (choose strong if you want something similar to what you see on the screen). Try different resolutions to fit your needs, but 100 dpi should be enough. Once you have the image within GIMP, you can post-process it as you like and save it in any format supported by GIMP, such as PNG.
A method for DVI files is dvipng. Usage is the same as dvipdfm.
Run latex as usual to generate the DVI file. Suppose we want the formula rendered at a font size of X, where X is measured in pixels. You need to convert this to dots per inch (dpi). The formula is: <dpi> = <font_px>*72.27/10. If you want, for instance, X = 32, then the resolution is 231.26 dpi. This value is passed to dvipng using the flag -D. To generate the desired PNG file, run the command as follows:
dvipng -T tight -D 231.26 -o foo.png foo.dvi
The flag -T sets the size of the image. The option tight will only include all ink put on the page. The option -o sends the output to the file name foo.png.
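The dpi conversion used above can be written out as a worked equation. It relies on a TeX point being 1/72.27 inch; the factor 10 presumably corresponds to a 10 pt document font, which is an assumption, since the text does not state the font size. The second line, for a 16-pixel font, is an extra example not taken from the text.

```latex
% Requires amsmath for \text and align*.
\begin{align*}
  \text{dpi} &= \text{font\_px} \times \frac{72.27}{10} \\
  32 \times \frac{72.27}{10} &= 231.264 \approx 231.26 \\
  16 \times \frac{72.27}{10} &= 115.632 \approx 115.63
\end{align*}
```

Under that assumption, a document typeset at a different base font size would replace the 10 in the denominator by its own size in points.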
The convert command from the ImageMagick suite can convert both DVI and PDF files to PNG.
convert input.pdf output.png
You can optimize the resulting image using optipng so that it will take up less space.
Convert to plain text
If you are thinking of converting to plain text for spell-checking or to count words, there may be an easier way -- read Tips and Tricks first.
Most LaTeX distributions come with the detex program, which strips LaTeX commands. It can handle multi-file projects, so all you need is one command:
detex yourfile
(note the omission of the .tex extension). This prints the result to standard output. If you want the plain text to go to a file, use
detex yourfile > yourfile.txt
If the output from detex does not satisfy you, you can try a newer version available on Google Code, or use HTML conversion first and then copy text from your browser.
If you want to keep the formatting, you can use a DVI-to-plain-text converter such as catdvi. Example:
catdvi yourfile.dvi | fmt -u
The use of fmt -u (available on most Unices) will remove the justification.