R Programming/Publication quality output

Formatting numbers

You can use the format() function to control the number of digits and other characteristics of a displayed object.

> df <- data.frame(x = rnorm(10), y = rnorm(10))
> print(df)
            x          y
1  -0.4350953 -0.6426477
2  -0.5947293 -0.2389625
3  -0.7061850 -2.4382016
4  -0.3384038 -0.6322842
5   0.2713353  0.5396409
6  -1.1144711 -2.0321274
7  -1.0356184  1.7217443
8  -2.6665278 -0.3621377
9   0.2975570  0.1598905
10  1.4631458 -0.7995652
> print(format(df, digits = 3, scientific = TRUE))
           x         y
1  -4.35e-01 -6.43e-01
2  -5.95e-01 -2.39e-01
3  -7.06e-01 -2.44e+00
4  -3.38e-01 -6.32e-01
5   2.71e-01  5.40e-01
6  -1.11e+00 -2.03e+00
7  -1.04e+00  1.72e+00
8  -2.67e+00 -3.62e-01
9   2.98e-01  1.60e-01
10  1.46e+00 -8.00e-01
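
format() also works on single values, and base R offers formatC() and sprintf() for finer control. The following lines are a minimal sketch with made-up input values; they are not taken from the example above.

```r
# Illustrative values only; all functions are from base R.
format(pi, digits = 3)                              # "3.14"
format(2, nsmall = 2)                               # "2.00" (at least 2 decimals)
format(1234567.891, big.mark = ",")                 # "1,234,568"
format(0.000012345, digits = 3, scientific = TRUE)  # "1.23e-05"
formatC(pi, digits = 2, format = "f")               # "3.14" (fixed notation)
sprintf("%.1f%%", 12.345)                           # "12.3%"
```

All of these return character strings, so they are meant for display and should be applied after the computations are done.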

Sweave

Sweave[1] is a literate programming tool which integrates LaTeX and R code. From a Sweave file you generate a LaTeX file and an R file, which can in turn be compiled and run. Roger Koenker[2], Meredith and Racine (2009)[3] and Charles Geyer[4] argue that Sweave facilitates reproducible econometric/statistical research.

There are some alternatives to Sweave for literate programming. One of them is Babel, which is included in Emacs Org mode[5]. This tool allows export to LaTeX and HTML. It is also possible to include code chunks for various programming languages (R, Ruby, etc.).

Syntax

The main idea is that you write a single file which includes both LaTeX and R code. A LaTeX chunk begins with @ and an R code chunk begins with <<>>= (options can be included between << and >>).

@
% Some LaTeX code
\section{Results}
I show that ...
<<>>=
# Some R code
qnorm(.975)
@
% Some LaTeX code
$$
\Phi^{-1}(.975) = 1.96 
$$

The file is stored with the extension .Rnw or .rnw. You then extract an R file from it using Stangle() and a LaTeX file using Sweave(). Here is an example with a file called file.Rnw, which generates file.tex and file.R:

> Sweave("file.Rnw")
Writing to file file.tex
Processing code chunks ...
 1 : echo keep.source term verbatim pdf
 2 : echo keep.source term verbatim pdf
> Stangle("file.Rnw")
Writing to file file.R
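
The whole cycle can be tried without any existing file. The sketch below writes a small Sweave file to a temporary directory and weaves and tangles it; the file name example.Rnw and its contents are assumptions made for illustration.

```r
# Work in a temporary directory so nothing is written to the project folder
old <- setwd(tempdir())

# A minimal Sweave document: LaTeX text, one R chunk and one \Sexpr{}
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "\\section{Results}",
  "<<>>=",
  "qnorm(.975)",
  "@",
  "The quantile is \\Sexpr{round(qnorm(.975), 2)}.",
  "\\end{document}"
), "example.Rnw")

Sweave("example.Rnw")    # writes example.tex
Stangle("example.Rnw")   # writes example.R

tex <- readLines("example.tex")   # the generated LaTeX source
setwd(old)
```

No LaTeX installation is needed for this step, since Sweave() only produces the .tex source.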

Then you can run LaTeX on your file.tex. This can be done using the system() function or texi2dvi().

# Example under Windows:
system("pdflatex.exe -shell-escape file.tex") # runs pdflatex
shell.exec("file.pdf") # opens the pdf (shell.exec() exists on Windows only)
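
The texi2dvi() route is more portable than calling pdflatex through system(). A minimal sketch follows, assuming a working LaTeX installation; compile_tex() is a hypothetical helper name, not a function from any package.

```r
# Hypothetical helper: compile a .tex file to PDF with tools::texi2dvi()
compile_tex <- function(tex_file) {
  stopifnot(file.exists(tex_file))
  # pdf = TRUE asks for a PDF instead of a DVI file;
  # clean = TRUE removes auxiliary files (.aux, .log, ...) afterwards
  tools::texi2dvi(tex_file, pdf = TRUE, clean = TRUE)
  # The PDF is normally written to the current working directory
  sub("\\.tex$", ".pdf", basename(tex_file))
}

# compile_tex("file.tex")   # uncomment once file.tex exists
```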

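Note that Sweave.sty is not part of the standard MiKTeX distribution. It ships with R (in the share/texmf/tex/latex directory of the R installation), so you may need to copy it next to your .tex file or add that directory to your TeX search path.

You can also insert results directly in your text using the \Sexpr{} command.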

$
\Phi^{-1}(.975) = \Sexpr{qnorm(.975)} 
$

Options

Sweave has several options. They can be set for each code chunk or in the call to Sweave().

  • For figures, you can either include them in the tex file using fig=T or not include them using fig=F.

If you only want one figure format, suppress the other one with the pdf=F or eps=F option. In older versions of R, figures were exported as both pdf and eps files by default; recent versions produce only pdf unless eps=T is set.

  • The R code can be displayed in the tex file using echo=T. If you don't want to include it in the tex file, use echo=F.
  • The R code can be evaluated using eval=T. If you don't want to evaluate the R code, use eval=F.
  • The results option controls how the output is included:
    • results=tex treats the output as LaTeX code
    • results=verbatim treats the output as Verbatim (the default)
    • results=hide does not include the results in the LaTeX output

These options can be passed to the Sweave() function.

Sweave("file.Rnw", pdf = TRUE, eps = FALSE, echo = FALSE, results = "verbatim")

They can also be passed to each code chunk.

<<fig=T,pdf=T,eps=F>>=
plot(rnorm(100), col = "red")
@
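
The effect of echo and results can be checked by weaving a small file and looking at the generated LaTeX. This is only a sketch: the file name options.Rnw and the chunk contents are made up for illustration.

```r
old <- setwd(tempdir())

writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<echo=FALSE>>=",            # code hidden, output shown
  "cat('visible output\\n')",
  "@",
  "<<results=hide>>=",          # code shown, output hidden
  "x <- sqrt(16)",
  "print(x)",
  "@",
  "\\end{document}"
), "options.Rnw")

Sweave("options.Rnw", quiet = TRUE)   # quiet = TRUE suppresses progress messages
tex <- readLines("options.tex")
setwd(old)

# The first chunk contributes only its output, the second only its code
any(grepl("visible output", tex, fixed = TRUE))
any(grepl("print(x)", tex, fixed = TRUE))
```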

Text editor for Sweave

The main issue with Sweave is that few text editors include syntax highlighting for it. Here are some exceptions:

  • RStudio is a very good solution. It is easy to install and use, and it includes buttons to run Sweave files.
  • Vim provides syntax highlighting for Sweave files (R noweb syntax).
  • Emacs + ESS (Emacs Speaks Statistics) provides full support for Sweave files. It includes a keyboard shortcut to run Sweave files and syntax highlighting that switches between LaTeX and R.
  • Eclipse StatET plugin provides support for Sweave (LaTeX/R) documents with all basic features (syntax highlighting, bracket matching, toggle comment, ...) and with detection of R chunks.

See also

Some examples of Sweave documents:

  • Charles Geyer foo.Rnw example
  • Julien Barnier's introduction to R (document in French)
  • Tip: search for filetype:Rnw or filetype:Snw in Google to find Sweave files.
  • You can also find lots of examples by browsing the R library folder. Package documentation is often written using Sweave, and the Sweave file is often included in the package. See, for instance, the doc folder of the np package.

Some handouts:

  • "Literate Programming with Sweave and DOCSTRIP" (pdf) by Michael Lundholm
  • Charles Geyer 2008 "An Sweave Demo" (pdf) (short)
  • Learning To Sweave in APA Style[6]

Some packages:

  • pgfSweave package
  • ascii package
  • cacheSweave
  • exams: automatic generation of exams

Some alternative literate programming packages:

  • odfWeave package, which applies the Sweave approach to OpenOffice documents.
  • knitr package
  • decumar, a literate programming interface for R by Hadley Wickham[7]
  • relax package
  • wikirobot[8] is similar to Sweave but works with MediaWiki.

Pubprint

Pubprint is a small utility that transforms the output of statistical tests into publication-ready output. Pubprint can export to several formats (HTML, LaTeX, Markdown and plain text), but unfortunately supports only the APA style (the publication style of the American Psychological Association). However, this style is widely used and may be appropriate in many cases.

Example

> library("pubprint")
> pprint(t.test(rnorm(30), rnorm(30)))
[1] "(\\ensuremath{M\\ifmmode_{x}\\else\\textsubscript{x}\\fi=-0.05,M\\ifmmode_{y}\\else\\textsubscript{y}\\fi=0.09,t[57.74]=-0.49,p=.628})"

By default pubprint prints a LaTeX-formatted string, but the output format can be changed (according to the manual, pubprint is intended to be used with knitr and detects the output format automatically in that case):

> pp_opts_out$set(pp_init_out("plain"))
> pprint(t.test(rnorm(30), rnorm(30)))
[1] "(M_x=-0.14,M_y=-0.24,t[57.4]=0.41,p=.682)"
> pprint(cor.test(rnorm(30), rnorm(30)))
[1] "(r=-.08,p=.693)"

The output can be pasted into a document or included in a knitr/Sweave \Sexpr{} statement.

Export to LaTeX

R has lots of functions which allow it to export results to LaTeX[9].

General functions

toLatex() in the utils package.

  • Note that toLatex() does not handle matrices.
  • toLatex() has been adapted to handle matrices and ftables in the memisc package.
> toLatex(sessionInfo())
\begin{itemize}
  \item R version 2.2.0, 2005-10-06, \verb|powerpc-apple-darwin7.9.0|
  \item Base packages: base, datasets, grDevices,
    graphics, methods, stats, utils
\end{itemize}
  • mat2tex() (sfsmisc) exports a matrix to LaTeX.
  • tex.table() (cwhmisc) exports a data frame into a LaTeX table.
> tex.table(mydat)
\begin{table}[ht]
\begin{center}
\begin{footnotesize}
\begin{tabular}{r|rrr}
\hline
 & y & x1 & x2\\ \hline
1 & -0.09 & -0.37 & -1.04\\ 
2 & 0.31 & 0.19 & -0.09\\ 
3 & 3.78 & 0.58 & 0.62\\ 
4 & 2.09 & 1.40 & -0.95\\ 
5 & -0.18 & -0.73 & -0.54\\ 
6 & 3.16 & 1.30 & 0.58\\ 
7 & 2.78 & 0.34 & 0.77\\ 
8 & 2.59 & 1.04 & 0.46\\ 
9 & -1.96 & 0.92 & -0.89\\ 
10 & 0.91 & 0.72 & -1.1\\ 
\hline
\end{tabular}
\end{footnotesize}
\end{center}
\end{table}


  • xtable() (xtable) exports various objects, including tables, data frames, lm, aov, and anova, to LaTeX.
> # lm example
> library(xtable)
> x <- rnorm(100)
> y <- 2*x + rnorm(100)
> lin <- lm(y~x)
> xtable(lin)
% latex table generated in R 2.15.1 by xtable 1.7-0 package
% Sun Sep 23 21:54:04 2012
\begin{table}[ht]
\begin{center}
\begin{tabular}{rrrrr}
  \hline
 & Estimate & Std. Error & t value & Pr($>$$|$t$|$) \\ 
  \hline
(Intercept) & -0.0407 & 0.0984 & -0.41 & 0.6803 \\ 
  x & 2.0466 & 0.1043 & 19.63 & 0.0000 \\ 
   \hline
\end{tabular}
\end{center}
\end{table}

> # table example
> x <- sample(1:10, 30, replace = TRUE)
> tab <- table(x)
> tab <- cbind(tab, prop.table(tab))
> colnames(tab) <- c("N.", "Prop.")
> xtable(tab, digits = c(0, 0, 2))
% latex table generated in R 2.15.1 by xtable 1.7-0 package
% Sun Sep 23 22:06:36 2012
\begin{table}[ht]
\begin{center}
\begin{tabular}{rrr}
  \hline
 & N. & Prop. \\ 
  \hline
1 & 5 & 0.17 \\ 
  3 & 1 & 0.03 \\ 
  4 & 3 & 0.10 \\ 
  5 & 6 & 0.20 \\ 
  6 & 5 & 0.17 \\ 
  7 & 3 & 0.10 \\ 
  8 & 2 & 0.07 \\ 
  9 & 2 & 0.07 \\ 
  10 & 3 & 0.10 \\ 
   \hline
\end{tabular}
\end{center}
\end{table}
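
To see what such exporters do in essence, here is a minimal base R sketch that turns a data frame into a tabular environment. The helper df_to_tabular() is hypothetical and much cruder than xtable() or tex.table(): it does not escape special LaTeX characters and always right-aligns columns.

```r
# Hypothetical helper: data frame -> lines of a LaTeX tabular environment
df_to_tabular <- function(df, digits = 2) {
  # Numeric columns get a fixed number of decimals; others are used as text
  cells <- lapply(df, function(col) {
    if (is.numeric(col)) formatC(col, digits = digits, format = "f")
    else as.character(col)
  })
  # Join the cells of each row with the LaTeX column separator
  body <- do.call(paste, c(unname(cells), sep = " & "))
  c(
    paste0("\\begin{tabular}{", strrep("r", ncol(df)), "}"),
    "\\hline",
    paste0(paste(names(df), collapse = " & "), " \\\\"),
    "\\hline",
    paste0(body, " \\\\"),
    "\\hline",
    "\\end{tabular}"
  )
}

mydat <- data.frame(y = c(-0.09, 0.31), x1 = c(-0.37, 0.19))
out <- df_to_tabular(mydat)
writeLines(out)   # print the LaTeX code, ready to paste into a document
```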

See also:

  • The highlight package by Romain François exports R code to LaTeX and HTML.
  • format.df() and latex() in the Hmisc package.
  • The memisc and quantreg packages include other latex() functions.

Descriptive statistics

  • estout package.
  • The reporttools package includes some functions for tables of descriptive statistics[10].

Estimation results

  • The stargazer package provides an easy way to export the results of regressions to LaTeX[11].
  • texreg provides the same kind of features[12].
  • The estout package provides functions similar to Stata's esttab and estout utilities[13]. Estimates are stored using eststo() and printed using esttab(). They can be exported to CSV and LaTeX. These functions support lm, glm and plm objects (see the plm package).
  • apsrtable() (apsrtable) exports the results of multiple regressions to LaTeX in a way similar to the American Political Science Review publication standard.
  • xtable() (xtable package) exports data frames, matrices and estimation results[14]. xtable() can also be used to export the results to an HTML file.
  • The outreg() function[15] developed by Paul Johnson is similar to the Stata outreg[16] function. See the "R you ready ?" post on this topic.
  • mtable() and toLatex() in the memisc package.
N <- 10^3
u <- rnorm(N)
x1 <- rnorm(N)
x2 <- x1 + rnorm(N)
y <- 1 + x1 + x2 + u
lm1 <- lm(y ~ x1 + x2 )
lm2 <- lm(y ~ x1 + x2 + I(x1*x2))

library(estout)
estclear() # clear all the eststo objects
eststo(lm1) 
eststo(lm2)
esttab() # print it

library("apsrtable")
apsrtable(lm1,lm2)

library(xtable)
xtable(lm1)
tab <- xtable(lm1)
print(tab,type="html")

source("http://pj.freefaculty.org/R/WorkingExamples/outreg-worked.R")
outreg(list(lm1,lm2))

library("memisc")
toLatex(mtable(lm1,lm2))
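
All of these exporters start from the coefficient table of the fitted model, which can be inspected directly in base R before choosing a package. A small sketch with simulated data (the seed and sample size are arbitrary choices):

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
N <- 100
x1 <- rnorm(N)
x2 <- x1 + rnorm(N)
y <- 1 + x1 + x2 + rnorm(N)
lm1 <- lm(y ~ x1 + x2)

# Matrix with one row per coefficient and the columns
# Estimate, Std. Error, t value and Pr(>|t|)
ctab <- coef(summary(lm1))
round(ctab, 3)
```

Because ctab is an ordinary numeric matrix, it can be passed to any of the general matrix exporters as well.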

Export to HTML

rpublisher[17] is a literate programming tool which publishes results in HTML (it is based on Python and was last updated in 2008).


See also the R2HTML, xtable, hwriter, prettyR, highlight and HTMLUtils packages.


wiki.table() in the hacks package exports a matrix or a data frame into MediaWiki table markup (as used on this wiki and many others).

> wiki.table(matrix(1:16,4),caption="Test")
{|  
|+ Test 
| 1 || 5 || 9 || 13 
|-
| 2 || 6 || 10 || 14 
|-
| 3 || 7 || 11 || 15 
|-
| 4 || 8 || 12 || 16 
|}
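
If the hacks package is not available, the same markup is easy to produce in base R. The function below is a hypothetical sketch that reproduces the layout shown above; it is not the implementation of wiki.table().

```r
# Hypothetical helper: matrix -> lines of MediaWiki table markup
matrix_to_wiki <- function(m, caption = NULL) {
  # One markup line per matrix row, cells separated by "||"
  rows <- apply(m, 1, function(r) paste0("| ", paste(r, collapse = " || ")))
  # Put the row separator "|-" between rows (but not after the last one)
  body <- head(as.vector(rbind(rows, "|-")), -1)
  c("{|", if (!is.null(caption)) paste("|+", caption), body, "|}")
}

wiki <- matrix_to_wiki(matrix(1:16, 4), caption = "Test")
writeLines(wiki)
```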

References
