Statistical Analysis: an Introduction using R/R/Graphics

As argued throughout this book, an extremely important part of an analysis is visualising data. Fortunately, R has extensive data visualisation capabilities: indeed all the graphics in the book have been produced in R, often in only a few lines[1].

There are two major methods of producing plots in R:

1. Traditional R graphics. This basic graphical framework is what we will describe in this topic. We will use it to produce similar plots to those in Figures 1.1 and 1.2.
2. "Trellis" graphics. This is a more complicated framework, useful for producing multiple similar plots on a page. In R, this functionality is provided by the "lattice" package (type `help("Lattice", package=lattice)` for details).
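As a rough illustration of what "multiple similar plots on a page" means, the following sketch draws the `cars` data as a trellis plot with one panel per speed band. The `lattice` package is normally installed alongside R. The three speed bands, and the names `cars_banded` and `trellis_plot`, are assumptions made purely for this example.

```r
# Sketch only: a trellis-style plot using the lattice package
library(lattice)

# Assumption for illustration: split the 'cars' data into three speed bands
# so that there is something to condition on.
cars_banded <- transform(
  cars,
  band = cut(speed, breaks = 3, labels = c("slow", "medium", "fast"))
)

# One panel per speed band, laid out together on a single page
trellis_plot <- xyplot(dist ~ speed | band, data = cars_banded,
                       xlab = "Speed (mph)", ylab = "Distance (ft)")
print(trellis_plot)   # lattice plots are objects; printing draws them
```

Unlike the traditional graphics described in the rest of this topic, a lattice plot is built as a single object and drawn in one go, rather than being added to piece by piece.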

Details of how to produce specific types of plot are given in later chapters; this topic introduces only the basic principles, of which there are three main ones to bear in mind:

• Plots in R are produced by typing specific graphical commands. These commands are of two types:
1. Commands which set up an entirely new plot. The most common function of this type is simply called `plot()`. In the simplest case, this is likely to replace any previous plots with the new plot.
2. Commands which add graphics (lines, text, points etc) to an existing plot. A number of functions do this: the most useful are `lines()`, `abline()`, `points()`, and `text()`.
• R always outputs graphics to a device. Usually this is a window on the screen, but it can be to a pdf or other graphics file (a full list can be found by `?device`). This is one way in which to save plots for incorporation into documents etc. To save graphics in (say) a pdf file, you need to activate a new pdf device using `pdf()`, run your normal graphics commands, then close the device using `dev.off()`. This is illustrated in the last example below.
• Different functions are triggered depending on the first argument to `plot()`. By default, these are intended to produce sensible output. For example, if it is given a function, say the `sqrt` function, `plot()` will produce a graph of `sqrt(x)` against `x`; if it is given a dataset it will attempt to plot the data points in a sensible manner (see `?plot.function` and `?plot.data.frame` for more details). Graphical niceties such as the colour, style, and size of items, as well as axis labels, titles, etc., can mostly be controlled by further arguments to `plot()`[2].
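The three principles can be seen together in a short sketch. It is only an illustration: the temporary file and the variable names `plot_file` and `device_name` are assumptions for this example, not part of any required workflow.

```r
# Third principle: plot() picks its behaviour from the class of its first argument
class(sqrt)    # "function"   -> handled by plot.function()
class(cars)    # "data.frame" -> handled by plot.data.frame()

# Second principle: graphics always go to a device.
# Assumption for illustration: send the output to a temporary pdf file.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file, width = 6, height = 6)     # open a pdf device
device_name <- names(dev.cur())           # name of the currently active device

# First principle: plot() starts a new plot, points() adds to the existing one
plot(sqrt, from = 0, to = 25)
points(c(4, 9, 16), sqrt(c(4, 9, 16)), col = "red", pch = 19)

dev.off()                                 # close the device, completing the file
```

Calling `dev.list()` at any time shows which devices are currently open, which can help if plots do not appear where expected.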

The examples below start simply, but become more detailed. Beginners are advised to paste each line into R one at a time, to see the effect of each command. Experimenting is also recommended!

Input:
```
plot(sqrt)                              #Here we use plot() to plot a function

plot(cars)                              #Here a dataset (axis names are taken from column names)

###Adding to an existing plot usually requires us to specify where to add
abline(a=-17.6, b=3.9, col="red")       #abline() adds a straight line (a:intercept, b:slope)
lines(lowess(cars), col="blue")         #lines() adds a sequence of joined-up lines
text(15, 34, "Smoothed (lowess) line", srt=30, col="blue")  #text() adds text at the given location
text(15, 45, "Straight line (slope 3.9, intercept -17.6)", srt=32, col="red") #(srt rotates)
title("1920s car stopping distances (from the 'cars' dataset)")

###plot() takes lots of additional arguments (e.g. we can change to log axes), some examples here
plot(cars, main="Cars data", xlab="Speed (mph)", ylab="Distance (ft)", pch=4, col="blue", log="xy")
grid()                                  #Add dotted lines to the plot to form a background grid
lines(lowess(cars), col="red")          #Add a smoothed (lowess) line to the plot

###to plot to a pdf file, simply switch to a pdf device first, then issue the same commands
pdf("car_plot.pdf", width=8, height=8)  #Open a pdf device (creates a file)
plot(cars, main="Cars data", xlab="Speed (mph)", ylab="Distance (ft)", pch=4, col="blue", log="xy")
grid()                                  #Add dotted lines to the pdf to form a background grid
lines(lowess(cars), col="red")          #Add a smoothed (lowess) line to the plot
dev.off()                               #Close the pdf device
```
Result:
The plots produced should look something like the following.
[Figures: the plots produced by the commands above]
Note that the cars dataset gives stopping distance in feet and speed in mph, so the plots produced here differ from those in Figures 1.1 and 1.2, where the data have been converted to metric units.
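For readers who would like output closer to the metric figures, the following is a minimal sketch of the conversion. It assumes that the metric versions use km/h and metres, which is not stated here, and uses the standard factors 1 mph = 1.609344 km/h and 1 ft = 0.3048 m. The name `cars_metric` is chosen only for this example.

```r
# Assumed target units: km/h for speed, metres for stopping distance
cars_metric <- data.frame(
  speed = cars$speed * 1.609344,   # mph -> km/h
  dist  = cars$dist  * 0.3048      # feet -> metres
)

plot(cars_metric, main = "Cars data (metric units)",
     xlab = "Speed (km/h)", ylab = "Distance (m)", pch = 4, col = "blue")
lines(lowess(cars_metric), col = "red")   # smoothed (lowess) line, as before
```

Only the axis scales change: the pattern of points is the same as in the plots of the original data.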

Notes

1. Not enough of R has yet been introduced to explain fully the commands used for the plots in this chapter. Nevertheless, for those who are interested, for any plot, the commands used to generate it are listed in the image summaries (which can be seen by clicking on the image).
2. Unfortunately, details of the bewildering array of arguments available (many of which are common to other graphics-producing routines) are scattered around a number of help files. For example, to see the options for `plot()` when called on a dataset, see `?plot`, `?plot.default`, and `?par`. To see the options for `plot()` when called on a function, see `?plot.function`. The numbers given to the `pch` argument, specifying various plotting symbols, are listed in the help file for `points()` (the function for adding points to a plot): they can be seen via `example(points)`.
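As a quick alternative to `example(points)`, the plotting symbols can also be laid out by hand. This is only a sketch: it assumes the symbols of interest are those numbered 0 to 25, and the name `symbol_codes` is chosen for this example.

```r
# Draw plotting symbols 0 to 25 in a row, each labelled with its pch number
symbol_codes <- 0:25
plot(symbol_codes, rep(1, length(symbol_codes)), pch = symbol_codes,
     ylim = c(0.5, 1.5), yaxt = "n", xlab = "pch value", ylab = "",
     main = "Plotting symbols selected by pch")
text(symbol_codes, 0.8, labels = symbol_codes, cex = 0.7)  # numbers below the symbols
```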