Statistical Analysis: an Introduction using R/R/Vectors
TRUE or FALSE[1]). In this topic we will use some example vectors provided by the "datasets" package, containing data on States of the USA (see ?state).
R is an inherently vector-based program; in fact, the numbers we have been using in previous calculations are just treated as vectors with a single element. This means that most basic functions in R will behave sensibly when given a vector as an argument, as shown below.

state.area #a NUMERIC vector giving the area of US states, in square miles
state.name #a CHARACTER vector (note the quote marks) of state names
sq.km <- state.area*2.59 #Arithmetic works on numeric vectors, e.g. convert sq miles to sq km
sq.km #... the new vector has the calculation applied to each element in turn
sqrt(sq.km) #Many mathematical functions also apply to each element in turn
range(state.area) #But some functions return different length vectors (here, just the max & min).
length(state.area) #and some, like this useful one, just return a single value.

> state.area #a NUMERIC vector giving the area of US states, in square miles
 [1] 51609 589757 113909 53104 158693 104247 5009 2057 58560 58876 6450 83557 56400
[14] 36291 56290 82264 40395 48523 33215 10577 8257 58216 84068 47716 69686 147138
[27] 77227 110540 9304 7836 121666 49576 52586 70665 41222 69919 96981 45333 1214
[40] 31055 77047 42244 267339 84916 9609 40815 68192 24181 56154 97914
> state.name #a CHARACTER vector (note the quote marks) of state names
 [1] "Alabama" "Alaska" "Arizona" "Arkansas"
 [5] "California" "Colorado" "Connecticut" "Delaware"
 [9] "Florida" "Georgia" "Hawaii" "Idaho"
[13] "Illinois" "Indiana" "Iowa" "Kansas"
[17] "Kentucky" "Louisiana" "Maine" "Maryland"
[21] "Massachusetts" "Michigan" "Minnesota" "Mississippi"
[25] "Missouri" "Montana" "Nebraska" "Nevada"
[29] "New Hampshire" "New Jersey" "New Mexico" "New York"
[33] "North Carolina" "North Dakota" "Ohio" "Oklahoma"
[37] "Oregon" "Pennsylvania" "Rhode Island" "South Carolina"
[41] "South Dakota" "Tennessee" "Texas" "Utah"
[45] "Vermont" "Virginia" "Washington" "West Virginia"
[49] "Wisconsin" "Wyoming"
> sq.km <- state.area*2.59 #Arithmetic works on numeric vectors, e.g. convert sq miles to sq km
> sq.km #... the new vector has the calculation applied to each element in turn
 [1] 133667.31 1527470.63 295024.31 137539.36 411014.87 269999.73 12973.31 5327.63
 [9] 151670.40 152488.84 16705.50 216412.63 146076.00 93993.69 145791.10 213063.76
[17] 104623.05 125674.57 86026.85 27394.43 21385.63 150779.44 217736.12 123584.44
[25] 180486.74 381087.42 200017.93 286298.60 24097.36 20295.24 315114.94 128401.84
[33] 136197.74 183022.35 106764.98 181090.21 251180.79 117412.47 3144.26 80432.45
[41] 199551.73 109411.96 692408.01 219932.44 24887.31 105710.85 176617.28 62628.79
[49] 145438.86 253597.26
> sqrt(sq.km) #Many mathematical functions also apply to each element in turn
 [1] 365.60540 1235.90883 543.16140 370.86299 641.10441 519.61498 113.90044 72.99062
 [9] 389.44884 390.49819 129.24976 465.20171 382.19890 306.58390 381.82601 461.58830
[17] 323.45487 354.50609 293.30334 165.51263 146.23826 388.30328 466.62203 351.54579
[25] 424.83731 617.32278 447.23364 535.06878 155.23324 142.46136 561.35100 358.33202
[33] 369.04978 427.81111 326.74911 425.54695 501.17940 342.65503 56.07370 283.60615
[41] 446.71213 330.77479 832.11058 468.96955 157.75712 325.13205 420.25859 250.25745
[49] 381.36447 503.58441
> range(state.area) #But some functions return different length vectors (here, just the max & min).
[1] 1214 589757
> length(state.area) #and some, like this useful one, just return a single value.
[1] 50

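The same behaviour can be tried on a much smaller vector, where the results are easy to check by eye. The following is only a minimal sketch: the object names `areas`, `areas.km` and `single`, and the values used, are made up for illustration and are not part of the "datasets" package.

```r
# Hypothetical example vector (values chosen purely for illustration)
areas <- c(100, 400, 900)

# Arithmetic is applied to each element in turn, giving a vector of the same length
areas.km <- areas * 2.59
length(areas.km)

# Many mathematical functions also work element by element
sqrt(areas)

# Summary functions can return a shorter vector (here, the min and max)
range(areas)

# A single number is itself a vector with one element
single <- 5
length(single)
```

Comparing `length()` of the input and the result is a quick way to tell whether a function works element by element or summarises the whole vector.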
Vectors can be created with the function c(), so named because it concatenates objects together. However, if you wish to create vectors consisting of regular sequences of numbers (e.g. 2,4,6,8,10,12, or 1,1,2,2,1,1,2,2) there are several alternative functions you can use, including seq(), rep(), and the : operator.

c("one", "two", "three", "pi") #Make a character vector
c(1,2,3,pi) #Make a numeric vector
seq(1,3) #Create a sequence of numbers
1:3 #A shortcut for the same thing (but less flexible)
i <- 1:3 #You can store a vector
i
i <- c(i,pi) #To add more elements, you must assign again, e.g. using c()
i
i <- c(i, "text") #A vector cannot contain different data types, so ...
i #... R converts all elements to the same type
i+1 #The numbers are now strings of text: arithmetic is impossible
rep(1, 10) #The "rep" function repeats its first argument
rep(3:1,10) #The first argument can also be a vector
huge.vector <- 0:(10^7) #R can easily cope with very big vectors
#huge.vector #VERY BAD IDEA TO UNCOMMENT THIS, unless you want to print out 10 million numbers
rm(huge.vector) #"rm" removes objects. Deleting huge unused objects is sensible

> c("one", "two", "three", "pi") #Make a character vector
[1] "one" "two" "three" "pi"
> c(1,2,3,pi) #Make a numeric vector
[1] 1.000000 2.000000 3.000000 3.141593
> seq(1,3) #Create a sequence of numbers
[1] 1 2 3
> 1:3 #A shortcut for the same thing (but less flexible)
[1] 1 2 3
> i <- 1:3 #You can store a vector
> i
[1] 1 2 3
> i <- c(i,pi) #To add more elements, you must assign again, e.g. using c()
> i
[1] 1.000000 2.000000 3.000000 3.141593
> i <- c(i, "text") #A vector cannot contain different data types, so ...
> i #... R converts all elements to the same type
[1] "1" "2" "3" "3.14159265358979" "text"
> i+1 #The numbers are now strings of text: arithmetic is impossible
Error in i + 1 : non-numeric argument to binary operator
> rep(1, 10) #The "rep" function repeats its first argument
[1] 1 1 1 1 1 1 1 1 1 1
> rep(3:1,10) #The first argument can also be a vector
[1] 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1 3 2 1
> huge.vector <- 0:(10^7) #R can easily cope with very big vectors
> #huge.vector #VERY BAD IDEA TO UNCOMMENT THIS, unless you want to print out 10 million numbers
> rm(huge.vector) #"rm" removes objects. Deleting huge unused objects is sensible

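The two regular sequences mentioned in the text (2,4,6,8,10,12 and 1,1,2,2,1,1,2,2) are not produced by the commands above. One possible way of generating them is sketched here, using the `by` and `length.out` arguments of seq() and the `each` and `times` arguments of rep(). The object names are chosen only for illustration.

```r
# seq() with a step size: the even numbers from 2 to 12
evens <- seq(2, 12, by = 2)
evens                                # 2 4 6 8 10 12

# seq() can instead be given the number of evenly spaced values wanted
quarters <- seq(0, 1, length.out = 5)
length(quarters)                     # 5

# rep() with "each" repeats every element before moving to the next ...
pairs <- rep(1:2, each = 2)
pairs                                # 1 1 2 2

# ... and "times" then repeats the whole vector
pattern <- rep(pairs, times = 2)
pattern                              # 1 1 2 2 1 1 2 2
```

The : operator only steps by 1, which is why seq() is the more flexible choice when a different step size or a fixed number of values is needed.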
Notes
[1] These are special words in R, and cannot be used as names for objects. The objects T and F are temporary shortcuts for TRUE and FALSE, but if you use them, watch out: since T and F are just normal object names you can change their meaning by overwriting them.
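The pitfall described in this note can be seen in a short session. This is a minimal sketch; the names `before`, `during` and `after` are introduced here only to record the results.

```r
# T starts out as a shortcut for TRUE
before <- isTRUE(T)    # TRUE

# T is an ordinary object name, so nothing stops us overwriting it
T <- FALSE
during <- isTRUE(T)    # FALSE: any code relying on T now misbehaves

# The reserved word TRUE cannot be reassigned; uncommenting the next line gives an error
# TRUE <- FALSE

# Removing our own T makes the built-in shortcut visible again
rm(T)
after <- isTRUE(T)     # TRUE
```

Writing TRUE and FALSE in full avoids the problem altogether.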