R Programming/Control Structures

Conditional execution

  • Help on control structures:
> ?Control

if accepts a single logical condition (a vector of length one).

> if (condition){
+     statement
+     } else {
+     alternative
+     }

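The condition must evaluate to a single TRUE or FALSE. It may be a logical constant (TRUE or FALSE, or the abbreviations T and F), a number (0 is treated as FALSE, any other number as TRUE) or an expression using the comparison operators: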

  • x == y "x is equal to y"
  • x != y "x is not equal to y"
  • x > y "x is greater than y"
  • x < y "x is less than y"
  • x <= y "x is less than or equal to y"
  • x >= y "x is greater than or equal to y"

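Conditions can be combined with AND (& or &&) and OR (| or ||). The single forms & and | work element by element on vectors; the double forms && and || evaluate a single value from left to right and stop as soon as the result is known, so they are the ones to use inside if.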

> if(TRUE){
+     print("This is true")
+     }
  [1] "This is true"
> x <- 2  # x gets the value 2
> if(x==3){
+     print("This is true")
+     } else {
+     print("This is false")
+     }
 [1] "This is false"
> y <- 4 # y gets the value 4
> if(x==2 && y>2){
+     print("x equals 2 and y is greater than 2")
+     }
 [1] "x equals 2 and y is greater than 2"
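
The difference between the single and double operators can be illustrated with a minimal sketch. The variable names (x, v, elementwise, safe) are chosen here for illustration only:

```r
x <- c(1, 5, 10)

# Element-wise AND: one result per element of x
elementwise <- x > 2 & x < 8
print(elementwise)

# Short-circuit AND on single values: the right-hand side is
# only evaluated when the left-hand side is TRUE
v <- NULL
safe <- !is.null(v) && v[1] > 0
print(safe)

# To use a vector condition in if(), reduce it with any() or all()
if (any(elementwise)) {
  print("at least one element lies between 2 and 8")
}
```

Reducing a vector condition with any() or all() makes the intent explicit before it reaches if.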

The ifelse() function takes the condition as its first argument, the value to return where the condition is true as its second argument and the value to return where it is false as its third argument. Unlike if, the condition can be a vector. For instance, we generate a sequence from 1 to 10 and keep the values that are lower than 5 or greater than 8, replacing the others with 0.

> x <- 1:10 
> ifelse(x<5 | x>8, x, 0)
 [1]  1  2  3  4  0  0  0  0  9 10
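
Calls to ifelse() can also be nested to choose between more than two outcomes. The following is a small sketch; the labels "low", "middle" and "high" are arbitrary choices made for this example:

```r
x <- 1:10

# ifelse() is evaluated element by element, so the result has
# the same length as the condition
size <- ifelse(x < 5, "low", ifelse(x > 8, "high", "middle"))
print(size)
```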

Sets


R makes it easy to select a subset of a vector with a logical condition. The comparison returns a logical vector, which can then be used as an index:

> x = runif(10)
> x<.5
 [1]  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE
> x
 [1] 0.32664759 0.57826623 0.98171138 0.01718607 0.24564238 0.62190808 0.74839301 
 [8] 0.32957783 0.19302650 0.06013694
> x[x<.5]
[1] 0.32664759 0.01718607 0.24564238 0.32957783 0.19302650 0.06013694

Negative indices exclude a subset of a vector:

> x = 1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> x[-1:-5]
[1]  6  7  8  9 10
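
Beyond indexing, base R also offers a few helpers for treating vectors as sets. The sketch below is one possible illustration with made-up values; which(), %in%, intersect() and setdiff() are base R functions, while the variable names are hypothetical:

```r
x <- c(3, 8, 1, 9, 4)

# which() returns the positions where the condition is TRUE
pos <- which(x > 3)

# %in% tests membership in a set of values
member <- x %in% c(1, 9)

# Base R set operations on vectors
common <- intersect(x, c(9, 4, 7))   # values present in both
only_x <- setdiff(x, c(9, 4, 7))     # values of x not in the second set

print(pos)
print(member)
print(common)
print(only_x)
```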

Loops


Implicit loops

[Figure: example of fast code using vectorisation]

R supports implicit loops, a feature known as vectorization. It is built into many functions and standard operators. For example, the + operator can add two arrays of numbers without the need for an explicit loop.

Explicit loops are generally slow in R, and it is better to avoid them when a vectorized alternative or one of the following functions can be used instead.
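
The addition example mentioned above can be sketched as follows, comparing a loop written out by hand with the vectorized + operator. The vectors a and b are made-up values for illustration:

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)

# Explicit loop: add the vectors one element at a time
total_loop <- numeric(length(a))
for (i in seq_along(a)) {
  total_loop[i] <- a[i] + b[i]
}

# Vectorized form: the + operator works on whole vectors
total_vec <- a + b

# Both approaches give the same result
print(identical(total_loop, total_vec))
```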

  • apply() applies a function over the margins of a matrix or an array: the rows (1) or the columns (2) of a matrix.
  • lapply() applies a function to each element of a list (for a data frame, to each column) and returns a list.
  • sapply() is similar but the output is simplified. It may be a vector or a matrix depending on the function.
  • tapply() applies a function to the subsets of a vector defined by the levels of a factor.

> N <- 10
> x1 <- rnorm(N)
> x2 <- rnorm(N) + x1 + 1
> male <- rbinom(N,1,.48)
> y <- 1 + x1 + x2 + male + rnorm(N)
> mydat <- data.frame(y,x1,x2,male)
> lapply(mydat,mean) # returns a list
$y
[1] 3.247

$x1
[1] 0.1415

$x2
[1] 1.29

$male
[1] 0.5

> sapply(mydat,mean) # returns a vector
     y     x1     x2   male 
3.2468 0.1415 1.2900 0.5000 
> apply(mydat,1,mean) # applies the function to each row
 [1]  1.1654  2.8347 -0.9728  0.6512 -0.0696  3.9206 -0.2492  3.1060  2.0478  0.5116
> apply(mydat,2,mean) # applies the function to each column
     y     x1     x2   male 
3.2468 0.1415 1.2900 0.5000 
> tapply(mydat$y,mydat$male,mean) # applies the function to each level of the factor
    0     1 
1.040 5.454

  • See also aggregate(), which is similar to tapply() but is applied to a data frame instead of a vector.
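
As a rough sketch of how aggregate() compares with tapply(), the example below computes group means on a small hypothetical data set. The data frame dat and its values are invented for this illustration:

```r
# Small hypothetical data set (values chosen for illustration)
dat <- data.frame(
  y    = c(1, 2, 3, 4, 5, 6),
  x1   = c(0, 1, 0, 1, 0, 1),
  male = c(0, 0, 0, 1, 1, 1)
)

# Mean of y and x1 within each level of male
by_group <- aggregate(cbind(y, x1) ~ male, data = dat, FUN = mean)
print(by_group)

# The same group means for a single variable with tapply()
group_means <- tapply(dat$y, dat$male, mean)
print(group_means)
```

aggregate() returns a data frame with one row per group, which is often more convenient for further processing than the named vector returned by tapply().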

Explicit loops


R provides three ways to write loops: for, repeat and while. The for statement is the simplest: define an index (here k) and a vector to iterate over (in the example below, 1:5), and put the action to perform between braces.

> for (k in 1:5){
+ print(k)
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
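
A for loop can iterate over any vector, not only over a sequence of integers. The sketch below, using made-up vectors, also shows seq_along(), which is a common way to loop over positions safely:

```r
fruits <- c("apple", "pear", "plum")

# Loop over the values directly
for (f in fruits) {
  print(f)
}

# Loop over positions; seq_along() gives an empty sequence for an
# empty vector, whereas 1:length(v) would give c(1, 0)
empty <- character(0)
count <- 0
for (i in seq_along(empty)) {
  count <- count + 1
}
print(count)  # the loop body never runs, so count stays at 0
```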

When a for loop is not suitable, you can use repeat or while and specify a stopping rule. Be careful with these loops: if the stopping rule is misspecified, the loop will never end. In the two examples below, values are drawn from the standard normal distribution for as long as the drawn value is lower than 1. The cat() function is used to display the current value on screen. Note that the repeat loop stops before printing the value that exceeds 1, whereas the while loop prints it before the condition is tested again.

> repeat { 
+ 	g <- rnorm(1) 
+ 	if (g > 1.0) break 
+ 	cat(g,"\n")
+ 	} 
-1.214395 
0.6393124 
0.05505484 
-1.217408 
> g <- 0
> while (g < 1){
+ 	g <- rnorm(1) 
+ 	cat(g,"\n")
+ 	}
-0.08111594 
0.1732847 
-0.2428368 
0.3359238 
-0.2080000 
0.05458533 
0.2627001 
1.009195
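
One way to guard against a loop that never ends is to cap the number of iterations. This is only a sketch: the cap max_iter, the counter iter and the seed are assumptions introduced for this example.

```r
set.seed(1)        # arbitrary seed so the run is reproducible
max_iter <- 1000   # hypothetical safety cap
iter <- 0
g <- 0

# Stop when a draw reaches 1 or when the cap is hit
while (g < 1 && iter < max_iter) {
  g <- rnorm(1)
  iter <- iter + 1
}

if (g >= 1) {
  cat("stopped after", iter, "draws\n")
} else {
  cat("gave up after", max_iter, "draws\n")
}
```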

The next statement skips the rest of the current iteration and moves on to the next one.

> for (k in 1:10) { 
+   if(k==8) {
+     print("skipped")
+     next
+   }
+   print(k)
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] "skipped"
[1] 9
[1] 10
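
To contrast next with break, the following sketch uses both in one loop: next skips a single iteration, while break leaves the loop entirely. The variable odd_values is a name chosen for this example:

```r
odd_values <- c()
for (k in 1:10) {
  if (k %% 2 == 0) next   # skip even numbers
  if (k > 7) break        # leave the loop entirely
  odd_values <- c(odd_values, k)
}
print(odd_values)
```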

Iterators
