LaTeX/Export To Other Formats
Strictly speaking, LaTeX source can be used to directly generate two formats:
- DVI using latex, the first one to be supported;
- PDF using pdflatex, more recent.
Using other software freely available on the Internet, you can easily convert DVI and PDF to other document formats. In particular, you can obtain a PostScript version using software included in your LaTeX distribution. Some LaTeX IDEs let you generate the PostScript version directly, even though they go through a DVI intermediate step internally (LaTeX → DVI → PS). It is also possible to create PDF from DVI and vice versa. Creating a file in two steps when you could create it directly may seem illogical, but some users need to because, as explained in the first chapters, the output format you can generate depends on the formats of the images you want to include (EPS for DVI, PNG and JPG for PDF). The sections below cover the different formats and describe how to produce each one.
Other formats can be produced, such as RTF (which can be opened in Microsoft Word) and HTML. However, these are produced by software that parses and interprets the LaTeX files itself, and such software does not implement all the features available for the primary DVI and PDF outputs. Nonetheless, these converters do work, and they can be crucial tools for collaborating with colleagues who do not edit documents with LaTeX.
Tools installation
This chapter features a lot of third-party tools; most of them are installed independently of your TeX distribution.
Some tools are Unix-specific (*BSD, GNU/Linux and Mac OS X), but it may be possible to make them work on Windows. If you have the choice, command-line tools are often easier to use on Unix systems.
Some tools may already be installed. For instance, you can check if dvipng is installed and ready to use (Unix only):
type dvipng
Most of these tools are installable using your package manager or portage tree (Unix only).
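To check several tools at once, a small helper can be wrapped around the shell's standard command -v lookup. This is only a sketch: the function name check_tools and the tool list in the example are illustrative and not part of any distribution.

```shell
# check_tools: print the name of every listed command that is not on the PATH.
# Returns 0 when all commands are found and 1 otherwise.
# (check_tools is a hypothetical helper name.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (the tool list is illustrative):
# check_tools dvipng pdftk latex2rtf hevea
```

Anything printed by the helper still has to be installed, for instance through your package manager.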
Preview mode
This section describes how to generate a screenshot of a LaTeX page, or of a specific part of the page, using the LaTeX package preview. Screenshots are useful, for example, if you want to include a LaTeX-generated formula in a presentation using your favorite slideware, such as PowerPoint, Keynote or LibreOffice Impress. First, make sure you have preview installed. See Installing Extra Packages.
Say you want to take a screenshot of a formula, such as the series for π used in the example below. Write this formula in the preview environment:
\documentclass{article}
\usepackage[active]{preview}
\begin{document}
\begin{preview}
\[
\pi = \sqrt{12}\sum^\infty_{k=0} \frac{ (-3)^{-k} }{ 2k+1 }
\]
\end{preview}
\end{document}
Note the active option in the package declaration and the preview environment around the equation's code. If either one is missing, you won't get the expected output.
This package is also very useful for exporting specific parts of a document to other formats, or for producing graphics (e.g. using PGF/TikZ) and then including them in other documents. You can also automate the previewing of specific environments:
\usepackage[active,tightpage]{preview}
\PreviewEnvironment{lstlisting}
\setlength{\PreviewBorder}{10pt}%
% ...
\begin{lstlisting}
int main()
{
/* ... */
}
\end{lstlisting}
This will produce a PDF containing only the listing content. The page layout will depend on the shape of the source code.
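If you often need formula screenshots, the wrapper document can be generated from the command line. The following is a minimal sketch, assuming a POSIX shell; the function name preview_doc and the file name formula.tex are hypothetical. It only prints the LaTeX source, using the active and tightpage options shown above, and leaves compilation to you.

```shell
# preview_doc: print a standalone LaTeX document that wraps one display
# formula in a preview environment. The formula is the first argument.
# (preview_doc is a hypothetical helper name.)
preview_doc() {
    cat <<'EOF'
\documentclass{article}
\usepackage[active,tightpage]{preview}
\begin{document}
\begin{preview}
\[
EOF
    printf '%s\n' "$1"
    cat <<'EOF'
\]
\end{preview}
\end{document}
EOF
}

# Example (formula.tex is a hypothetical file name):
# preview_doc '\pi = \sqrt{12}\sum^\infty_{k=0} \frac{ (-3)^{-k} }{ 2k+1 }' > formula.tex
# pdflatex formula.tex
```

Quoting the formula with single quotes keeps the shell from interpreting the backslashes.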
Convert to PDF
Directly
pdflatex my_file
DVI to PDF
dvipdfm my_file.dvi
will create my_file.pdf. Another way is to go through PostScript:
dvips my_file.dvi
ps2pdf my_file.ps
This also leaves a file called my_file.ps, which you can delete.
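The two-step route can be wrapped in a small function that also removes the intermediate file. This is a sketch under stated assumptions: it uses dvips (shipped with most TeX distributions) with its -o option to name the PostScript file explicitly, and the function name dvi_to_pdf is hypothetical.

```shell
# dvi_to_pdf: convert NAME.dvi to NAME.pdf through an intermediate NAME.ps,
# then delete the intermediate file. Stops at the first failing step.
# (dvi_to_pdf is a hypothetical helper name; dvips is assumed to be installed.)
dvi_to_pdf() {
    base=${1%.dvi}                          # strip the extension if present
    dvips -o "$base.ps" "$base.dvi" &&
        ps2pdf "$base.ps" "$base.pdf" &&
        rm -f "$base.ps"
}

# Example:
# dvi_to_pdf my_file.dvi
```

Because the steps are chained with &&, the PostScript file is kept for inspection if the PDF conversion fails.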
Merging PDF
If you have created several PDF documents and want to merge them into a single PDF file, you can use the following commands. You need to have Ghostscript installed:
Using Windows
gswin32 -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=Merged.pdf -dBATCH 1.pdf 2.pdf 3.pdf
Using Linux
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=Merged.pdf -dBATCH 1.pdf 2.pdf 3.pdf
Alternatively, PDF-Shuffler is a small python-gtk application, which helps the user to merge or split pdf documents and rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. This program may be available in your Linux distribution's repository.
Another option to check out is pdftk (or PDF toolkit), which is a command-line tool that can manipulate PDFs in many ways. To merge two or more files, use:
pdftk 1.pdf 2.pdf 3.pdf cat output 123.pdf
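If you merge files regularly, the Ghostscript invocation from this section can be wrapped in a function that takes the output name first and the inputs after it. This is a sketch only: the name merge_pdfs is hypothetical, and it assumes the gs executable (on Windows the command name differs, as shown above).

```shell
# merge_pdfs: merge two or more PDF files into one with Ghostscript.
# Usage: merge_pdfs OUTPUT.pdf INPUT1.pdf INPUT2.pdf [...]
# (merge_pdfs is a hypothetical helper name.)
merge_pdfs() {
    if [ "$#" -lt 3 ]; then
        echo "usage: merge_pdfs OUTPUT.pdf INPUT1.pdf INPUT2.pdf [...]" >&2
        return 2
    fi
    out=$1
    shift                                   # the remaining arguments are inputs
    gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$out" "$@"
}

# Example:
# merge_pdfs Merged.pdf 1.pdf 2.pdf 3.pdf
```

The inputs are merged in the order given, and quoting "$@" keeps file names with spaces intact.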
Using pdfLaTeX
Note: If you are merging external PDF documents into a LaTeX document that is compiled with pdflatex, a much simpler option is to use the pdfpages package, e.g.:
\usepackage{pdfpages}
...
\includepdf[pages=-]{Document1.pdf}
\includepdf[pages=-]{Document2.pdf}
...
Three simple shell scripts using the pdfpages package are provided in the pdfjam bundle by D. Firth. They include options to merge several pdf files (pdfjoin), put several pages in one physical sheet (pdfnup) and rotate pages (pdf90).
See also Modular Documents
XeTeX
You can also use XeTeX (or, more precisely, XeLaTeX), which works in the same way as pdflatex: it creates a PDF file directly from LaTeX source. One advantage of XeTeX over standard LaTeX is support for Unicode and modern typographic technologies such as TrueType/OpenType fonts. See its Wikipedia entry for more details.
Customization of the PDF output in XeTeX (setting the document title, author, keywords, etc.) is done through the options of the hyperref package.
Convert to PostScript
- from PDF
pdf2ps my_file.pdf
- from DVI
dvips my_file.dvi
Convert to RTF
LaTeX can be converted into an RTF file, which in turn can be opened by a word processor such as LibreOffice Writer or Microsoft Word. This conversion is done with latex2rtf. It may run on any computer platform, but it is only actively supported on Windows, Linux and BSD; the last Mac update dates from 2001 (a recent version for OS X is available via MacPorts). The program works by reading the LaTeX source and mimicking the behaviour of the LaTeX program. latex2rtf supports most of the standard features of LaTeX, such as standard formatting, some math typesetting, inclusion of EPS, PNG or JPG graphics, and tables. It also has limited support for some packages, such as varioref and natbib. However, many other packages are not supported.
latex2rtf is simple to use. The Windows version has a straightforward GUI (l2rshell.exe). The command-line version is offered for all platforms, and can be used on an example mypaper.tex file as follows:
latex mypaper
bibtex mypaper # if you use bibtex
latex2rtf mypaper
Both the latex and (if needed) bibtex commands need to be run before latex2rtf, because the .aux and .bbl files are needed to produce the proper output. The conversion creates mypaper.rtf, which you can open in many word processors such as Microsoft Word or LibreOffice.
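The "if you use bibtex" decision can be automated by looking at the .aux file: when a document declares a bibliography database, LaTeX writes a \bibdata line there. The sketch below relies on that; the function names needs_bibtex and tex_to_rtf are hypothetical, and the sequence of commands is the one given above.

```shell
# needs_bibtex: succeed if AUXFILE records a bibliography database,
# i.e. it contains a \bibdata{...} line and BibTeX should be run.
# (needs_bibtex and tex_to_rtf are hypothetical helper names.)
needs_bibtex() {
    grep -q '^\\bibdata{' "$1"
}

# tex_to_rtf: run latex, then bibtex only when needed, then latex2rtf.
# The argument is the file name without the .tex extension.
tex_to_rtf() {
    latex "$1" || return 1
    if needs_bibtex "$1.aux"; then
        bibtex "$1" || return 1
    fi
    latex2rtf "$1"
}

# Example:
# tex_to_rtf mypaper
```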
Convert to HTML
There are many converters to HTML. Some of them use an intermediate file, which is then converted to the destination format.
- HeVeA
hevea mylatexfile
- latex2html
latex2html -html_version 4.0,latin1,unicode -split 1 -no_navigation -noinfo -title "MyDocument" MyDocument.tex
- LaTeXML
latexmlc paper.tex --destination=paper.html
- pdf2htmlEX
pdf2htmlEX [options] <input.pdf> [<output.html>]
pdf2htmlEX can convert PDF to HTML without losing text or format. It is designed as a general PDF to HTML converter, not restricted to PDFs generated from LaTeX source. LaTeX users can compile the LaTeX source to PDF and then convert the PDF to HTML via pdf2htmlEX. An introduction to pdf2htmlEX can be found on its own wiki page. More technical details can be found in the paper published in TUGboat, "Online publishing via pdf2htmlEX". Figure 3 of the paper shows different workflows for publishing HTML online.
- TeX4ht
TeX4ht has many options and possible configurations, but for a basic conversion,
htlatex myfile.tex
will usually result in a reasonable HTML approximation. An introduction by the original author was published in TUGboat [1].
- bibtex2html
To export only a BibTeX file:
bibtex2html mybibtexfile
Convert to image formats
It is sometimes useful to convert LaTeX output to image formats for use in systems that support neither DVI nor PDF files, such as Wikipedia.
There are two families of graphics:
- Vector graphics can be scaled to any size, thus do not suffer from quality loss. SVG is a vector format.
- Raster graphics define every pixel explicitly. PNG is a raster format.
So vector graphics are usually preferred. There are still some cases where raster graphics are used:
- The target system does not handle vector graphics; only raster graphics are supported.
- SVG cannot embed fonts. So either the text will be rendered using a local .ttf or .otf font (which will usually change the output), or all characters must be turned into vector graphics. This last method makes the SVG big and slow. If the input LaTeX file contains a lot of text whose formatting must be preserved, SVG is not that great.
So SVG is great for drawings and a small amount of text. JPG is a well-known raster format; however, it is usually not as good as PNG for text.
In some cases it may be sufficient to simply copy a region of a PDF (or PS) file using the tools available in a PDF viewer, for example when using LaTeX to typeset a formula for pasting into a presentation. However, the copy will generally not have sufficient resolution for whole pages or large areas.
Multiple formats
- pdftocairo
pdftocairo is part of the poppler toolset.
pdftocairo -svg latexdoc.pdf output.svg
pdftocairo also supports various raster graphic formats.
Vector graphics
- pdf2svg
Direct conversion from PDF to SVG can be done using the command line tool pdf2svg.
pdf2svg file.pdf file.svg
- ps2svg
Alternatively, DVI or PDF can be converted to PS as described before, and then the bash script ps2svg.sh can be used. All the software used by this script is multiplatform, so this is also possible on Windows.
- dvisvgm
One can also use dvisvgm, an open source utility that converts from DVI to SVG.
dvisvgm -n file.dvi
- Inkscape
Inkscape is able to convert to SVG, PDF, EPS, and other vector graphic formats.
inkscape --export-area-drawing --export-ps=OUTPUT INPUT
inkscape --export-area-page --export-plain-svg=OUTPUT INPUT
Raster graphics
- JPEG
- Run ghostscript on the PostScript file created by pdf2ps as follows:
echo "quit" | gs -sDEVICE=jpeg -sOutputFile=document.jpg -r300 document.ps
- macOS: the MacTeX distribution comes with a handy command-line tool for "printing", pdftoppm:
pdftoppm -progress -jpeg yourpdf.pdf yourpdf
pdftoppm is flexible: you can set the quality, dimensions and so on, which fits most needs of a typical user. The last argument is a prefix for the output files; one JPEG is written per page, with the page number appended to the prefix. It can also render PDF into PNG and PPM files; read more in the manual for the utility. It works best for non-interactive batch jobs.
- GIMP
- Open your file with GIMP. It will ask you which page you want to convert and whether you want to use anti-aliasing (choose strong if you want to get something similar to what you see on the screen). Try different resolutions to fit your needs, but 100 dpi should be enough. Once you have the image in GIMP, you can post-process it as you like and save it in any format supported by GIMP, such as PNG.
- dvipng
- A method for DVI files is dvipng. Usage is the same as dvipdfm. Run latex as usual to generate the DVI file. Suppose you want the formula rendered at a font size of X pixels. You need to convert this size to dots per inch (dpi) with the formula <dpi> = <font_px>*72.27/10. If you want, for instance, X = 32, then the resolution is 231.26 dpi. This value is passed to dvipng using the flag -D. To generate the desired PNG file, run the command as follows:
dvipng -T tight -D 231.26 -o foo.png foo.dvi
The flag -T sets the size of the image. The option tight includes only the ink put on the page. The option -o sends the output to the file foo.png.
- ImageMagick
- The convert command from the ImageMagick suite can convert both DVI and PDF files to PNG.
convert input.pdf output.png
- optipng
- You can optimize the resulting image using optipng so that it will take up less space.
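The dpi conversion given for dvipng above is easy to script. The following sketch implements exactly that formula, <dpi> = <font_px>*72.27/10, with awk; the function name px_to_dpi is hypothetical.

```shell
# px_to_dpi: compute the value for dvipng's -D flag from a font size in
# pixels, using dpi = font_px * 72.27 / 10 and rounding to two decimals.
# (px_to_dpi is a hypothetical helper name.)
px_to_dpi() {
    awk -v px="$1" 'BEGIN { printf "%.2f\n", px * 72.27 / 10 }'
}

# Example:
# dvipng -T tight -D "$(px_to_dpi 32)" -o foo.png foo.dvi
```

For 32 pixels the helper prints 231.26, matching the worked value in the dvipng item.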
Convert to plain text
If you are thinking of converting to plain text for spell-checking or to count words, there may be an easier way; read Tips and Tricks first. The following tools are available:
- detex
Most LaTeX distributions come with the detex program, which strips LaTeX commands. It can handle multi-file projects, so a single command is all you need:
detex yourfile
(note the omission of the .tex extension). The result is written to standard output. To redirect it to a file, use
detex yourfile > yourfile.txt
The output of detex can contain undesired elements, and the tool does not claim a perfect conversion. Make sure you use the latest version of opendetex, or convert to HTML first and then copy the text from your browser.
- catdvi
If you want to keep the formatting, you can use a DVI-to-plain text converter, like catdvi. Example:
catdvi yourfile.dvi | fmt -u
The use of fmt -u (available on most Unices) will remove the justification.
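For a rough word count, the plain-text output of either tool can be piped into a counter. This is a minimal sketch, assuming whitespace-separated words are a good enough approximation; the function name count_words is hypothetical.

```shell
# count_words: count whitespace-separated words read from standard input.
# (count_words is a hypothetical helper name.)
count_words() {
    awk '{ n += NF } END { print n + 0 }'
}

# Example:
# detex yourfile | count_words
```

Since detex output can contain undesired elements, treat the number as an estimate.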