Data Science: An Introduction/250 R Commands

Data Science: An Introduction

Appendix 2: 250 R Commands

Chapter Summary

This is copied verbatim from Jeromy Anglim's Blog^[1]

Discussion

Dr. Anglim writes:

The R programming language includes many abbreviations. Abbreviations exist in function names, argument names, and allowed values for arguments. This post expands on over 150 R abbreviations with the aim of making it easier for users new to R who are trying to memorise R commands.

Abbreviations save time when typing and can make for less cumbersome code. However, abbreviations often make it more difficult to remember a command. This is especially true when the user does not know what the abbreviation stands for.

R has been developed by a group of technical experts with backgrounds in Linux and Unix, mathematics, statistics, and statistical computing. As its popularity grows, R is now being used by people with little or none of this background. Abbreviations which are intuitive to the experts are not necessarily intuitive to this broader audience.

The R help system does a reasonable job of explaining the abbreviations in R. However, I thought it would be useful to write a post listing some of the common abbreviations along with the expansion of the abbreviation. Whereas R sometimes errs on the side of assuming expertise, I thought I'd err on the side of assuming naivety. Thus, the table includes many abbreviations which are probably obvious to most readers.

I thank Tom Short for his R Reference Card^[2], which provided some inspiration for a starting list of R commands. Feel free to reproduce or adapt this table elsewhere. For example, perhaps it could be included in an R Wiki with additional entries. If you spot an error in the table, let me know in the comments of this post.

I might expand the table in the future. At the moment, it's mainly function names with not many arguments or values of arguments. I also haven't put much time into grouping and ordering the functions.

Table of R Commands

R Command	Abbreviation Expanded	Comments
`ls`	[L]i[S]t objects	common command in Unix-like operating systems
`rm`	[R]e[M]ove objects	common command in Unix-like operating systems
`str`	[STR]ucture of an object
`unz`	[UNZ]ip
`getwd`	[GET] [W]orking [D]irectory
`dir`	[DIR]ectory
`sprintf`	[S]tring [PRINT] [F]ormatted
`c`	[C]ombine values
`regexpr`	[REG]ular [EXPR]ession	Why "regular"? See regular sets, regular language
`diag`	[DIAG]onal values of a matrix
`col`	[COL]umn
`lapply`	[L]ist [APPLY]	Apply function to each element and return a list
`sapply`	[S]implify [APPLY]	Apply function to each element and attempt to return a vector (i.e., a vector is "simpler" than a list)
`mapply`	[M]ultivariate [APPLY]	Multivariate version of sapply
`tapply`	[T]able [APPLY]	Apply function to sets of values as defined by an index
`apply`	[APPLY] function to the margins (rows or columns) of an array or matrix
`MARGIN = 1 or 2 in apply`	rows [1] come before columns [2]	e.g., a 2 x 3 matrix has 2 rows and 3 columns (note: row count is stated first)
`rmvnorm`	[R]andom number generator for [M]ulti[V]ariate [NORM]al data
`rle`	[R]un [L]ength [E]ncoding
`ftable`	[F]lat [TABLE]	Flat contingency table
`xtabs`	Cross (i.e., [X]) [TAB]ulation	[X] is the symbol of a cross; [X] is sometimes spoken as "by". Cross-tabulating means to cross one variable with another
`xtable`	[TABLE] of the object [X]
`formatC`	[FORMAT] using [C] style formats	i.e., [C] the programming language
`Sweave`	[S] [WEAVE]	The R Programming language is a dialect of S. Weaving involves combining code and documentation
`cor`	[COR]relation
`ancova`	[AN]alysis of [COVA]riance
`manova`	[M]ultivariate [AN]alysis [O]f [VA]riance
`aov`	[A]nalysis [O]f [V]ariance
`TukeyHSD`	[T]ukey's [H]onestly [S]ignificant [D]ifference
`hclust`	[H]ierarchical [CLUST]er analysis
`cmdscale`	[C]lassical metric [M]ulti[D]imensional [SCAL]ing
`factanal`	[FACT]or [ANAL]ysis
`princomp`	[PRIN]cipal [COMP]onents analysis
`prcomp`	[PR]incipal [COMP]onents analysis
`lme`	[L]inear [M]ixed [E]ffects model
`resid`	[RESID]uals
`ranef`	[RAN]dom [EF]fects
`anova`	[AN]alysis [O]f [VA]riance
`fixef`	[FIX]ed [EF]fects
`vcov`	[V]ariance-[COV]ariance matrix
`logLik`	[LOG] [LIK]elihood
`BIC`	[B]ayesian [I]nformation [C]riterion
`mcmcsamp`	[M]arkov [C]hain [M]onte [C]arlo [SAMP]ling
`eval`	[EVAL]uate an R expression
`cat`	con[CAT]enate	standard Unix command
`apropos`	Search documentation for a purpose or on a topic (i.e., [APROPOS])	Unix command for searching documentation
`read.csv`	[READ] a file in [C]omma [S]eparated [V]alues format	i.e., in each row of the data commas separate values for each variable
`read.fwf`	[READ] a file in [F]ixed [W]idth [F]ormat
`seq`	Generate [SEQ]uence
`rep`	[REP]licate values of x	perhaps also [REP]eat
`dim`	[DIM]ension of an object	Typically, number of rows and columns in a matrix
`gl`	[G]enerate factor [L]evels
`rbind`	[R]ows [BIND]
`cbind`	[C]olumns [BIND]
`is.na`	[IS] [N]ot [A]vailable
`nrow`	[N]umber of [ROW]s
`ncol`	[N]umber of [COL]umns
`attr`	[ATTR]ibute
`rev`	[REV]erse
`diff`	[DIFF]erence between x and a lag of x
`prod`	[PROD]uct
`var`	[VAR]iance
`sd`	[S]tandard [D]eviation
`cumsum`	[CUM]ulative [SUM]
`cumprod`	[CUM]ulative [PROD]uct
`setdiff`	[SET] [DIFF]erence
`intersect`	[INTERSECT]ion
`Re`	[RE]al part of a number
`Im`	[IM]aginary part of a number
`Mod`	[MOD]ulus of a complex number	i.e., its absolute value; the modulo (remainder) operation is the separate operator %%
`t`	[T]ranspose of a vector or matrix
`substr`	[SUBSTR]ing
`strsplit`	[STR]ing [SPLIT]
`grep`	[G]lobal / [R]egular [E]xpression / [P]rint	Etymology based on text editor instructions in programs such as ed
`sub`	[SUB]stitute identified pattern found in string
`gsub`	[G]lobal [SUB]stitute identified pattern found in string
`pmatch`	[P]artial string [MATCH]ing
`nchar`	[N]umber of [CHAR]acters in a string
`ps.options`	[P]ost-[S]cript [OPTIONS]
`win.metafile`	[WIN]dows [METAFILE] graphic
`dev.off`	[DEV]ice [OFF]
`dev.cur`	[CUR]rent [DEV]ice
`dev.set`	[SET] the current [DEV]ice
`hist`	[HIST]ogram
`pie`	[PIE] Chart
`coplot`	[CO]nditioning [PLOT]
`matplot`	[PLOT] columns of [MAT]rices
`assocplot`	[ASSOC]iation [PLOT]
`plot.ts`	[PLOT] [T]ime [S]eries
`qqnorm`	[Q]uantile-[Q]uantile plot based on [NORM]al distribution
`persp`	[PERSP]ective plot
`xlim`	[LIM]it of the [X] axis
`ylim`	[LIM]it of the [Y] axis
`xlab`	[LAB]el for the [X] axis
`ylab`	[LAB]el for the [Y] axis
`main`	[MAIN] title for the plot
`sub`	[SUB]title for the plot
`mtext`	[M]argin [TEXT]
`abline`	[LINE] on plot often of the form y = [A] + [B] x
`h argument in abline`	[H]orizontal line
`v argument in abline`	[V]ertical line
`par`	Graphics [PAR]ameter
`adj as par`	[ADJ]ust text [J]ustification
`bg as par`	[B]ack[G]round colour
`bty as par`	[B]ox [TY]pe
`cex as par`	[C]haracter [EX]tension or [EX]pansion of plotting objects
`cex.sub as par`	[C]haracter [EX]tension or [EX]pansion of [SUB]title
`cex.axis as par`	[C]haracter [EX]tension or [EX]pansion of [AXIS] annotation
`cex.lab as par`	[C]haracter [EX]tension or [EX]pansion X and Y [LAB]els
`cex.main as par`	[C]haracter [EX]tension or [EX]pansion of [MAIN] title
`col as par`	Default plotting [COL]our
`las as par`	[L]abel of [A]xis [S]tyle
`lty as par`	[L]ine [TY]pe
`lwd as par`	[L]ine [W]i[D]th
`mar as par`	[MAR]gin width in lines
`mfg as par`	Next [G]raph for [M]atrix of [F]igures
`mfcol as par`	[M]atrix of [F]igures entered [COL]umn-wise
`mfrow as par`	[M]atrix of [F]igures entered [ROW]-wise
`pch as par`	[P]lotting [CH]aracter
`ps as par`	[P]oint [S]ize of text	Point is a printing measurement
`pty as par`	[P]lot region [TY]pe
`tck as par`	[T]i[CK] mark length
`tcl as par`	[T]i[C]k mark [L]ength
`xaxs as par`	[X] [AX]is [S]tyle
`yaxs as par`	[Y] [AX]is [S]tyle
`xaxt as par`	[X] [AX]is [T]ype
`yaxt as par`	[Y] [AX]is [T]ype
`asp as par`	[ASP]ect ratio
`xlog as par`	[X] axis as [LOG]arithm scale
`ylog as par`	[Y] axis as [LOG]arithm scale
`omi as par`	[O]uter [M]argin width in [I]nches
`mai as par`	[MA]rgin width in [I]nches
`pin as par`	[P]lot size in [IN]ches
`xpd as par`		Perhaps: [X = Cut] [P]lot ? Perhaps D for device
`xyplot`	[X] [Y] [PLOT]	[X] for horizontal axis; [Y] for vertical axis
`bwplot`	[B]ox and [W]hisker plot
`qq`	[Q]uantile-[Q]uantile plot
`splom`	[S]catter[PLO]t [M]atrix
`optim`	[OPTIM]isation
`lm`	[L]inear [M]odel
`glm`	[G]eneralised [L]inear [M]odel
`nls`	[N]onlinear [L]east [S]quares parameter estimation
`loess`	[LO]cally [E]stimated [S]catterplot [S]moothing
`prop.test`	[TEST] null hypothesis that [PROP]ortions in several groups are the same
`rnorm`	[R]andom number drawn from [NORM]al distribution
`dnorm`	[D]ensity of a given quantile in a [NORM]al distribution
`pnorm`	Distribution function for [NORM]al distribution returning cumulative [P]robability
`qnorm`	[Q]uantile function based on [NORM]al distribution
`rexp`	[R]andom number generation from [EXP]onential distribution
`rgamma`	[R]andom number generation from [GAMMA] distribution
`rpois`	[R]andom number generation from [POIS]son distribution
`rweibull`	[R]andom number generation from [WEIBULL] distribution
`rcauchy`	[R]andom number generation from [CAUCHY] distribution
`rbeta`	[R]andom number generation from [BETA] distribution
`rt`	[R]andom number generation from [T] distribution
`rf`	[R]andom number generation from [F] distribution	F for Ronald [F]isher
`rchisq`	[R]andom number generation from [CHI] [SQ]uare distribution
`rbinom`	[R]andom number generation from [BINOM]ial distribution
`rgeom`	[R]andom number generation from [GEOM]etric distribution
`rhyper`	[R]andom number generation from [HYPER]geometric distribution
`rlogis`	[R]andom number generation from [LOGIS]tic distribution
`rlnorm`	[R]andom number generation from [L]og [NORM]al distribution
`rnbinom`	[R]andom number generation from [N]egative [BINOM]ial distribution
`runif`	[R]andom number generation from [UNIF]orm distribution
`rwilcox`	[R]andom number generation from [WILCOX]on distribution
`ggplot in ggplot2`	[G]rammar of [G]raphics [PLOT]	See Leland Wilkinson (1999)
`aes in ggplot2`	[AES]thetic mapping
`geom_ in ggplot2`	[GEOM]etric object
`stat_ in ggplot2`	[STAT]istical summary
`coord_ in ggplot2`	[COORD]inate system
`qplot in ggplot2`	[Q]uick [PLOT]
`x as argument`	[X] is common letter for unknown variable in math
`FUN as argument`	[FUN]ction
`pos as argument`	[POS]ition
`lib.loc in library`	[LIB]rary folder [LOC]ation
`sep as argument`	[SEP]arator character
`comment.char in read.table`	[COMMENT] [CHAR]acter(s)
`I`	[I]nhibit [I]nterpretation or [I]nsulate
`T value`	[T]rue
`F value`	[F]alse
`na.rm as argument`	[N]ot [A]vailable [R]e[M]oved
`fivenum`	[FIVE] [NUM]ber summary
`IQR`	[I]nter [Q]uartile [R]ange
`coef`	Model [COEF]ficients
`dist`	[DIST]ance matrix
`df as argument`	[D]egrees of [F]reedom
`mad`	[M]edian [A]bsolute [D]eviation
`sink`		Divert R output to a connection (i.e., like connecting a pipe to a [SINK])
`eol in write.table`	[E]nd [O]f [L]ine character(s)
`R as software`	[R]oss Ihaka and [R]obert Gentleman or [R] is letter before S
`CRAN as word`	[C]omprehensive [R] [A]rchive [N]etwork	As I understand it: Inspired by CTAN (Comprehensive TeX Archive Network); pronunciation of CRAN rhymes with CTAN (i.e., "See" ran as in Iran; "See tan")
`Sexpr`	[S] [EXPR]ession
`ls.str`	Show [STR]ucture of [L]i[S]ted objects
`browseEnv`	[BROWSE] [ENV]ironment
`envir as argument`	[ENVIR]onment
`q`	[Q]uit
`cancor`	[CAN]onical [COR]relation
`ave`	[AVE]rage
`min`	[MIN]imum
`max`	[MAX]imum
`sqrt`	[SQ]uare [R]oo[T]
`%o%`	[O]uter product
`&`		& is ampersand meaning [AND]
`|`		| often used to represent OR in computing (http://en.wikipedia.org/wiki/Logical_disjunction)
`:`		sequence generator; also used in MATLAB
`nlevels`	[N]umber of [LEVELS] in a factor
`det`	[DET]erminant of a matrix
`crossprod`	Matrix [CROSSPROD]uct
`gls`	[G]eneralised [L]east [S]quares
`dwtest in lmtest`	[D]urbin-[W]atson Test
`sem in sem`	[S]tructural [E]quation [M]odel
`betareg in betareg`	[BETA] [REG]ression
`log`	Natural [LOG]arithm	Default base is e, consistent with most mathematics (http://en.wikipedia.org/wiki/Logarithm#Implicit_bases)
`log10`	[LOG]arithm base 10
`fft`	[F]ast [F]ourier [T]ransform
`exp`	[EXP]onential function	i.e., e^x
`df.residual`	[D]egrees of [F]reedom of the [RESIDUAL]
`sin`	[SIN]e function
`cos`	[COS]ine function
`tan`	[TAN]gent function
`asin`	[A]rc[SIN]e function
`acos`	[A]rc[COS]ine function
`atan`	[A]rc[TAN]gent function
`deriv`	[DERIV]ative
`chol`	[CHOL]eski decomposition
`chol2inv`	[CHOL]eski [2=TO] [INV]erse
`svd`	[S]ingular [V]alue [D]ecomposition
`eigen`	[EIGEN]value or [EIGEN]vector
`lower.tri`	[LOWER] [TRI]angle of a matrix
`upper.tri`	[UPPER] [TRI]angle of a matrix
`acf`	[A]uto [C]orrelation or [C]ovariance [F]unction
`pacf`	[P]artial [A]uto [C]orrelation or [C]ovariance [F]unction
`ccf`	[C]ross [C]orrelation or [C]ovariance [F]unction
`Rattle as software`	[R] [A]nalytical [T]ool [T]o [L]earn [E]asily	Perhaps, easy like a baby's rattle
`StatET as software`		Anyone know? Statistics Eclipse?
`JGR as software`	[J]ava [G]UI for [R]	pronounced "Jaguar" like the cat
`ESS as software`	[E]macs [S]peaks [S]tatistics
`Rcmdr package`	[R] [C]o[M]man[D]e[R] GUI
`prettyNum`	[PRETTY] [NUM]ber
`Inf value`	[INF]inite
`NaN value`	[N]ot [A] [N]umber
`is.nan`	[IS] [N]ot [A] [N]umber
`S3`		R is a dialect of [S]; 3 is the version number
`S4`		R is a dialect of [S]; 4 is the version number
`Rterm as program`	[R] [TERM]inal
`R CMD as program`		I think: [R] [C]o[M]man[D] prompt
`repos as option`	[REPOS]itory locations
`bin folder`	[BIN]aries	Common Unix folder for "essential command binaries"
`etc folder`	[ET] [C]etera	Common Unix folder for "host-specific system-wide configuration files"
`src folder`	[S]ou[RC]e code	Common Unix folder
`doc folder`	[DOC]umentation
`RGUI program`	[R] [G]raphical [U]ser [I]nterface
`.site file extension`	[SITE] specific file	e.g., RProfile.site
`Hmisc package`	Frank [H]arrell's package of [MISC]ellaneous functions
`n in debug`	[N]ext step
`c in debug`	[C]ontinue
`Q in debug`	[Q]uit
`MASS package`	[M]odern [A]pplied [S]tatistics with [S]	Based on book of same name by Venables and Ripley
`plyr package`	PL[Y=ie][R]	Double play on words: (1) package manipulates data like pliers manipulate materials; (2) last letter is R as in the program
`aaply`	input [A]rray output [A]rray using [PLY]r package
`daply`	input [D]ata frame output [A]rray using [PLY]r package
`laply`	input [L]ist output [A]rray using [PLY]r package
`adply`	input [A]rray output [D]ata frame using [PLY]r package
`alply`	input [A]rray output [L]ist using [PLY]r package
`a_ply`	input [A]rray output Discarded (i.e., _ is blank) using [PLY]r package
`RODBC package`	[R] [O]pen [D]ata[B]ase [C]onnectivity
`psych package`	[PSYCH]ology related functions
`zelig package`		"Zelig is named after a Woody Allen movie about a man who had the strange ability to become the physical and psychological reflection of anyone he met and thus to fit perfectly in any situation." - http://gking.harvard.edu/zelig/
`strucchange package`	[STRUC]tural [CHANGE]
`relaimpo package`	[RELA]tive [IMPO]rtance
`car package`	[C]ompanion to [A]pplied [R]egression	Named after book by John Fox
`OpenMx package`	[OPEN] Source [M]atri[X] algebra interpreter	Need confirmation that [Mx] means matrix
`df in write.foreign`	[D]ata [F]rame
`GNU S word`	[G]NU's [N]ot [U]nix [S]
`R FAQ word`	R [F]requently [A]sked [Q]uestions
`DVI format`	[D]e[V]ice [I]ndependent file format
`devel word`	[DEVEL]opment	as in code under development
`GPL word`	[G]eneral [P]ublic [L]icense
`utils package`	[UTIL]itie[S]
`mle`	[M]aximum [L]ikelihood [E]stimation
`rpart package`	[R]ecursive [PART]itioning
`sna package`	[S]ocial [N]etwork [A]nalysis
`ergm package`	[E]xponential [R]andom [G]raph [M]odels
`rbugs package`	[R] interface to program [B]ayesian inference [U]sing [G]ibbs [S]ampling
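
Two of the naming patterns in the table are easier to remember once seen in action: the prefix letter of the apply family, and the d/p/q/r prefixes of the distribution functions. The following is a minimal sketch using only base R; the variable names (`scores`, `heights`, `group`, `m`) are made up for illustration and are not part of the original post.

```r
# Apply family: the prefix letter hints at the shape of the result.
scores <- list(a = c(1, 2, 3), b = c(4, 5, 6))   # hypothetical example data

as_list   <- lapply(scores, mean)   # [L]ist apply: always returns a list
as_vector <- sapply(scores, mean)   # [S]implify apply: a named vector, a = 2, b = 5

# tapply: apply a function to sets of values defined by an index
heights  <- c(150, 160, 170, 180)
group    <- c("x", "x", "y", "y")
by_group <- tapply(heights, group, mean)   # x = 155, y = 175

# apply with MARGIN: rows (1) come before columns (2)
m <- matrix(1:6, nrow = 2, ncol = 3)   # filled column by column
row_sums <- apply(m, 1, sum)           # 9 12
col_sums <- apply(m, 2, sum)           # 3 7 11

# Distribution prefixes, shown for the normal distribution:
dens  <- dnorm(0)        # [D]ensity at 0
prob  <- pnorm(1.96)     # cumulative [P]robability up to 1.96
quant <- qnorm(0.975)    # [Q]uantile for probability 0.975
set.seed(1)              # fix the seed so the draws are reproducible
draws <- rnorm(5)        # five [R]andom draws
```

The same four prefixes combine with the other distribution stems listed in the table, for example `dexp`, `pexp`, `qexp`, and `rexp` for the exponential distribution.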

References

[1] Jeromy Anglim (10 May 2010). "Abbreviations of R Commands Explained: 250+ R Abbreviations". Jeromy Anglim's Blog: Psychology and Statistics. Retrieved 8 August 2012.
[2] Tom Short (11 July 2004). "R Reference Card" (PDF). The Comprehensive R Archive Network (CRAN). Retrieved 8 August 2012.

Copyright Notice

You are free:

to Share — to copy, distribute, display, and perform the work (pages from this wiki)
to Remix — to adapt or make derivative works

Under the following conditions:

Attribution — You must attribute this work to Wikibooks. You may not suggest that Wikibooks, in any way, endorses you or your use of this work.
Share Alike — If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
Waiver — Any of the above conditions can be waived if you get permission from the copyright holder.
Public Domain — Where the work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.
Other Rights — In no way are any of the following rights affected by the license:

Your fair dealing or fair use rights, or other applicable copyright exceptions and limitations;
The author's moral rights;
Rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights.

Notice — For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to the following web page.

http://creativecommons.org/licenses/by-nc-sa/3.0/
