Biomedical Engineering Theory And Practice/R Language

Data Types

R has various objects for holding data, including scalars, vectors, matrices, arrays, data frames, and lists.

Scalar and Constant

A "scalar" in R is simply a vector of length one. A constant is a literal value, such as 2 or 3, that never changes. Constants can be thought of as zero-dimensional values (a single point).

  • Scalar
> x<-3
> y<-6
> z<-x+y
> z
[1] 9
  • Constant
> 2+3
[1] 5
> 5-4
[1] 1
> 6*4
[1] 24

Vector

Vectors are one-dimensional arrays that can hold numeric data, character data, or logical data. The combine function c() is used to form the vector. Here are examples of each type of vector:

> a<-c(1,2,5,-3,-6,5) #numeric vector
> b<-c("one","two","three") #character vector
> d<-c(TRUE,FALSE,TRUE,FALSE,TRUE,TRUE) #logical vector
> a[c(2,4)]
[1]  2 -3
> a[4]
[1] -3
> a[2:4]
[1]  2  5 -3
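
Because a vector holds a single mode, mixing types inside c() makes R coerce every element to a common type. The following is a minimal sketch of that behaviour and of two further indexing styles; the names mixed and nums are illustrative and do not come from the examples above.

```r
# Mixing modes in c() coerces every element to the most general type.
mixed <- c(1, "two", TRUE)
class(mixed)              # "character"

# Logical values become 0/1 when combined with numbers.
nums <- c(TRUE, FALSE, 3)
class(nums)               # "numeric"

# Negative indices drop elements; logical indices keep the TRUE positions.
a <- c(1, 2, 5, -3, -6, 5)
a[-1]                     # every element except the first
a[a > 0]                  # only the positive elements
```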

Matrix

A matrix is a two-dimensional array where each element has the same mode (numeric, character, or logical). Matrices are created with the matrix() function. The general format is as follows:

> mymatrix <- matrix(vector, nrow=number_of_rows, ncol=number_of_columns, byrow=logical_value, dimnames=list(vector_of_rownames, vector_of_colnames))
> A<-matrix(1:9,nrow=3)
> A
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> A[2,1]
[1] 2
> A<-matrix(1:9,nrow=3,byrow=T)
> A
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
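
The general format mentions dimnames, which the examples above do not use. As a small sketch, the matrix below is filled row by row and given row and column names; the names "r1", "c1" and so on are placeholders chosen for this illustration.

```r
# Build a 2 x 3 matrix filled row by row, with named rows and columns.
B <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
B["r2", "c3"]   # element selected by name
B[, "c1"]       # whole column as a named vector
dim(B)          # number of rows and columns
t(B)            # transpose
```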

Array

> myarray <- array(vector, dimensions, dimnames)
> dim1 <- c("A1","A2","A3")
> dim2 <- c("B1","B2","B3","B4")
> dim3 <- c("C1","C2")
> x<-array(1:24,c(3,4,2),dimnames=list(dim1,dim2,dim3))
> x
, , C1

   B1 B2 B3 B4
A1  1  4  7 10
A2  2  5  8 11
A3  3  6  9 12

, , C2

   B1 B2 B3 B4
A1 13 16 19 22
A2 14 17 20 23
A3 15 18 21 24

Data Frame

> mydata <- data.frame(col1, col2, col3, ...)
> patientID<-LETTERS[1:4]
> age<-c(24,35,28,52)
> diabetes<-c("Type1","Type2","Type1","Type2")
> stats<-c("Poor","Improved","Poor","Excellent")
> patientDATA<-data.frame(patientID,age,diabetes,stats,row.names=letters[1:4])
> patientDATA
  patientID age diabetes     stats
a         A  24    Type1      Poor
b         B  35    Type2  Improved
c         C  28    Type1      Poor
d         D  52    Type2 Excellent
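
Once a data frame exists, its columns and rows can be pulled out in several ways. The sketch below rebuilds the same patient data (without the custom row names) and shows column access with $ and row filtering with a logical condition; the name older is introduced only for this example.

```r
patientID <- LETTERS[1:4]
age <- c(24, 35, 28, 52)
diabetes <- c("Type1", "Type2", "Type1", "Type2")
stats <- c("Poor", "Improved", "Poor", "Excellent")
patientDATA <- data.frame(patientID, age, diabetes, stats)

patientDATA$age                        # one column as a vector
patientDATA[2, ]                       # second row, all columns
patientDATA[patientDATA$age > 30, ]    # rows where age exceeds 30

# Keep only two columns for the patients older than 30.
older <- patientDATA[patientDATA$age > 30, c("patientID", "age")]
nrow(older)
```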

Factors

> patientID<-LETTERS[1:4]
> age<-c(24,35,28,52)
> diabetes<-c("Type1","Type2","Type1","Type2")
> stats<-c("Poor","Improved","Poor","Excellent")
> status <- factor(stats, order=TRUE)
> patientdata <- data.frame(patientID, age, diabetes, status)
> str(patientdata)
'data.frame':	4 obs. of  4 variables:
 $ patientID: Factor w/ 4 levels "A","B","C","D": 1 2 3 4
 $ age      : num  24 35 28 52
 $ diabetes : Factor w/ 2 levels "Type1","Type2": 1 2 1 2
 $ status   : Ord.factor w/ 3 levels "Excellent"<"Improved"<..: 3 2 3 1
> summary(patientdata)
 patientID      age         diabetes       status 
 A:1       Min.   :24.00   Type1:2   Excellent:1  
 B:1       1st Qu.:27.00   Type2:2   Improved :1  
 C:1       Median :31.50             Poor     :2  
 D:1       Mean   :34.75                          
           3rd Qu.:39.25                          
           Max.   :52.00
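
By default factor() sorts the levels alphabetically, which is why the ordered factor above reads "Excellent" < "Improved" < "Poor". If the intended order is different, the levels can be given explicitly. The sketch below assumes the clinical ordering Poor < Improved < Excellent, which is an assumption made for this illustration.

```r
stats <- c("Poor", "Improved", "Poor", "Excellent")

# Assumed ordering for this sketch: Poor < Improved < Excellent.
status <- factor(stats, levels = c("Poor", "Improved", "Excellent"),
                 ordered = TRUE)
levels(status)
status < "Excellent"     # comparisons now follow the declared order
as.integer(status)       # underlying integer codes
table(status)            # counts are listed in level order
```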

Lists

> mylist <- list(name1=object1, name2=object2, ...)
> x<-"TheList"
> y<-c(25,19,20)
> z<-matrix(1:10,nrow=2,byrow=TRUE)
> theta<-LETTERS[1:10]
> delta<-c(2+3i,4-6i)
> mylist<-list(title=x,components=y,z,theta,delta)
> mylist
$title
[1] "TheList"

$components
[1] 25 19 20

[[3]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10

[[4]]
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"

[[5]]
[1] 2+3i 4-6i

> mylist[[2]]
[1] 25 19 20
> mylist[["components"]]
[1] 25 19 20

> lapply(mylist,length)
$title
[1] 1

$components
[1] 3

[[3]]
[1] 10

[[4]]
[1] 10

[[5]]
[1] 2

> lapply(mylist,class)
$title
[1] "character"

$components
[1] "numeric"

[[3]]
[1] "matrix"

[[4]]
[1] "character"

[[5]]
[1] "complex"

> lapply(mylist,mean)
$title
[1] NA

$components
[1] 21.33333

[[3]]
[1] 5.5

[[4]]
[1] NA

[[5]]
[1] 3-1.5i

Warning messages:
1: In mean.default(X[[1L]], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(X[[4L]], ...) :
  argument is not numeric or logical: returning NA
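
The warnings above appear because mean() is applied to character components. One way around this, sketched here with a shortened version of the list, is to select the numeric components first. sapply() is used instead of lapply() so that the results come back as a plain vector; the names lens, is_num and means are illustrative.

```r
mylist <- list(title = "TheList", components = c(25, 19, 20),
               matrix(1:10, nrow = 2, byrow = TRUE))

mylist$components                # named components can also be reached with $
lens <- sapply(mylist, length)   # sapply() simplifies the result to a vector
lens

# Apply mean() only to the numeric components to avoid the NA warnings.
is_num <- sapply(mylist, is.numeric)
means <- sapply(mylist[is_num], mean)
means
```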

Basic Functions

Arithmetic Operators

The arithmetic operators used in R are listed in the table below, each with an example.

Function              R Command   Example
Exponentiation        a^b         > 3^2
                                  [1] 9
Multiplication        a*b         > 22*5
                                  [1] 110
Division              a/b         > 30/3
                                  [1] 10
Addition              a+b         > 10+9
                                  [1] 19
Subtraction           a-b         > 10-3
                                  [1] 7
Integer (Quotient)    a%/%b       > 20%/%3
                                  [1] 6
Modulo (Remainder)    a%%b        > 20%%3
                                  [1] 2

Complex Number

Function R command
Complex number
> x<-5.2-3i
Real part
> Re(x)
[1] 5.2
Imaginary part
> Im(x)
[1] -3
Modulus
> Mod(x)
[1] 6.003332
Argument
> Arg(x)
[1] -0.5232783
Conjugate
> Conj(x)
[1] 5.2+3i
Membership
> is.complex(x)
[1] TRUE
Coercion
> as.complex(19.6)
[1] 19.6+0i
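
The modulus and argument together give the polar form of a complex number. As a hedged sketch, the code below rebuilds x from Mod(x) and Arg(x) and checks the result with all.equal(), which compares with a small tolerance; the name polar is illustrative.

```r
x <- 5.2 - 3i

# Polar form: x equals Mod(x) * exp(1i * Arg(x)).
polar <- Mod(x) * exp(1i * Arg(x))
all.equal(polar, x)

# The product with the conjugate is Mod(x)^2 with a zero imaginary part.
x * Conj(x)

# The square root of a negative number needs a complex input.
sqrt(as.complex(-4))
```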

Rounding

Function R Command
Greatest integer less than or equal to x (floor)
> floor(9.9)
[1] 9
> floor(-9.9)
[1] -10
Smallest integer greater than or equal to x (ceiling)
> ceiling(9.9)
[1] 10
> ceiling(-9.9)
[1] -9
Round to the nearest integer
> round(9.9)
[1] 10
> round(9.2)
[1] 9
Strip off the decimal part (truncate toward zero)
> trunc(8.6)
[1] 8
> trunc(-8.6)
[1] -8
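
round() also accepts a digits argument, and signif() rounds to a number of significant digits. The short sketch below shows both, along with the way R treats exact halves, which go to the even neighbour rather than always upward.

```r
round(3.14159, digits = 2)   # round to two decimal places
signif(123456, digits = 2)   # keep two significant digits
round(-9.9)                  # nearest integer to -9.9

# Exact halves are rounded to the even neighbour (IEC 60559 behaviour).
round(0.5)
round(2.5)
```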

Trigonometric Functions

Function   Trigonometric   Inverse Trigonometric   Hyperbolic   Inverse Hyperbolic
sine       sin(x)          asin(x)                 sinh(x)      asinh(x)
cosine     cos(x)          acos(x)                 cosh(x)      acosh(x)
tangent    tan(x)          atan(x)                 tanh(x)      atanh(x)
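
The trigonometric functions in the table take their argument in radians, so angles in degrees need converting first. A minimal sketch, with deg and rad as illustrative names:

```r
# Convert 30 degrees to radians before calling sin().
deg <- 30
rad <- deg * pi / 180
sin(rad)                 # 0.5 up to floating-point error

asin(0.5) * 180 / pi     # inverse sine, converted back to degrees
cos(pi)                  # cosine of 180 degrees
atan2(1, 1)              # two-argument arctangent, equal to pi/4
```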

Log and Exponential Functions

Function                    R command      Example
Absolute value              abs(x)         > abs(-7.4)
                                           [1] 7.4
Log to the base e           log(x)         > log(10)
                                           [1] 2.302585
Log to the base 10          log10(x)       > log10(100)
                                           [1] 2
Log to the base n of x      log(x,n)       > log(64,4)
                                           [1] 3
Exponential                 exp(x)         > exp(3)
                                           [1] 20.08554
Square root                 sqrt(x)        > sqrt(25)
                                           [1] 5
Factorial                   factorial(x)   > factorial(10)
                                           [1] 3628800
Combinations (n choose r)   choose(n,r)    > choose(5,4)
                                           [1] 5
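
Several of these functions are related to one another. The sketch below checks the change-of-base rule for logarithms and the factorial formula behind choose(); all.equal() is used because floating-point results may differ in the last digits.

```r
# Change of base: log(x, n) is log(x) / log(n).
all.equal(log(64, 4), log(64) / log(4))

log2(8)                     # base-2 logarithm
exp(log(10))                # exp() and log() are inverses

# choose(n, r) counts the ways to pick r items from n.
choose(5, 2)
factorial(5) / (factorial(2) * factorial(3))   # same count via factorials
```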

Relational Operators and Logical Variables

Relational Operators

Relation                   Operator
Equal                      ==
Not equal                  !=
Less than                  <
Greater than               >
Less than or equal         <=
Greater than or equal      >=
  • In arithmetic, logical values act as numbers: TRUE=1, FALSE=0
> x<-c(6,3,4)
> y<-c(5,15,9)
> z<-(x<y)
> z
[1] FALSE  TRUE  TRUE
> z<-(x<y)+5
> z
[1] 5 6 6
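
Since TRUE counts as 1 and FALSE as 0, sum() and mean() can count or average the results of a comparison. The sketch below also shows a common pitfall: comparing doubles with == can fail because of rounding error, so all.equal() is often used instead.

```r
x <- c(6, 3, 4)
y <- c(5, 15, 9)
sum(x < y)      # number of positions where x is smaller than y
mean(x < y)     # proportion of such positions

# == on doubles can fail because of rounding error;
# all.equal() compares with a small tolerance.
0.1 + 0.2 == 0.3
isTRUE(all.equal(0.1 + 0.2, 0.3))
```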

Logical Operators

x          y          !x         x & y      x | y      xor(x,y)
False(0)   False(0)   True(1)    False(0)   False(0)   False(0)
False(0)   True(1)    True(1)    False(0)   True(1)    True(1)
True(1)    False(0)   False(0)   False(0)   True(1)    True(1)
True(1)    True(1)    False(0)   True(1)    True(1)    False(0)
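
The truth table can be reproduced directly in R. This is a small sketch in which the data frame truth and its column names are chosen only for illustration; any() and all() are added to show how a logical vector is reduced to a single value.

```r
x <- c(FALSE, FALSE, TRUE, TRUE)
y <- c(FALSE, TRUE, FALSE, TRUE)

# One row per combination of x and y, one column per operator.
truth <- data.frame(x, y, not_x = !x, and = x & y, or = x | y,
                    xor = xor(x, y))
truth

any(x & y)   # TRUE if at least one element is TRUE
all(x | y)   # TRUE only if every element is TRUE
```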
> x<-c(6,2,8)
> y<-c(14,6,7)
> z<-c(4,5,11)
> z1<-x>y
> z1
[1] FALSE FALSE  TRUE
> z2<-y>z
> z2
[1]  TRUE  TRUE FALSE
> z3<-(x>y) & (y>z)
> z3
[1] FALSE FALSE FALSE

> z1<-xor(x,y)
> z1
[1] FALSE FALSE FALSE

Sequence Generation and Repeats

> x1 <- seq(0,3,by=0.5)
> x1
[1] 0.0 0.5 1.0 1.5 2.0 2.5 3.0
> x2 <- seq(from=0.4,by=0.01,length=15)
> x2
 [1] 0.40 0.41 0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.50 0.51 0.52 0.53 0.54
> x3<-seq(1.4,2.1,0.3)
> x3
[1] 1.4 1.7 2.0
> x4<-rep(15,7)
> x4
[1] 15 15 15 15 15 15 15
> x5<-rep(1:4,3)
> x5
 [1] 1 2 3 4 1 2 3 4 1 2 3 4
> x6<-rep(1:3,each=2,times=3)
> x6
 [1] 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3
> x7<-rep(c("a","b","c"),c(1,2,3))
> x7
[1] "a" "b" "b" "c" "c" "c"
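
A few related helpers are often used alongside seq() and rep(). The sketch below shows seq_len() and seq_along() for building index vectors, and the full argument name length.out; the vector v is illustrative.

```r
seq_len(5)                         # the integers 1 to 5
v <- c("a", "b", "c")
seq_along(v)                       # index vector matching the length of v
seq(0, 1, length.out = 5)          # five evenly spaced points
rev(seq(2, 10, by = 2))            # descending even numbers
rep(c("x", "y"), times = c(2, 1))  # per-element repeat counts
```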

Random Number Generation

> set.seed(100)
> runif(5)
[1] 0.5465586 0.1702621 0.6249965 0.8821655 0.2803538
> runif(5)
[1] 0.3984879 0.7625511 0.6690217 0.2046122 0.3575249
> x<-c(5,10,8,6,9,11,14,16,18)
> sample(x)
[1]  6 11 18  9  8 16 10  5 14
> sample(x)
[1] 16  9 10  8 18 14 11  6  5
> sample(x,4)
[1] 10 11 14  5
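
Setting the seed makes random draws reproducible: resetting the same seed replays the same stream. The sketch below illustrates this, together with sampling with replacement and normal draws. The seed 42 and the names first, second, boot and heights are arbitrary choices for this example, and no particular output values are assumed.

```r
set.seed(42)            # arbitrary seed chosen for this sketch
first <- runif(3)
set.seed(42)            # resetting the seed replays the same stream
second <- runif(3)
identical(first, second)

x <- c(5, 10, 8, 6, 9, 11, 14, 16, 18)
boot <- sample(x, size = length(x), replace = TRUE)  # sampling with replacement
heights <- rnorm(5, mean = 170, sd = 10)             # draws from a normal distribution
```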

Vector Functions

Length and Statistics

> x<-c(6,9,11,14,12,2,33,76,0,90)
Function R command
Length
> length(x)
[1] 10
Mean
> mean(x)
[1] 25.3
Max
> max(x)
[1] 90
Min
> min(x)
[1] 0
Distribution
> quantile(x)
   0%   25%   50%   75%  100% 
 0.00  6.75 11.50 28.25 90.00
Sort
> sort(x)
 [1]  0  2  6  9 11 12 14 33 76 90
Function R command
Reference the 5th element of the vector
> x[5]
[1] 12
Delete the 3rd element from the vector
> x1<-x[-3]
> x1
[1]  6  9 14 12  2 33 76  0 90
Delete the last element from the vector
> x2<-x[-length(x)]
> x2
[1]  6  9 11 14 12  2 33 76  0
Delete the first and last elements from the vector
> x3<-x[c(-1,-length(x))]
> x3
[1]  9 11 14 12  2 33 76  0
Remove the 2 smallest and the 3 largest elements from the vector
> trim <-function(x)sort(x)[-c(1,2,length(x)-2,length(x)-1,length(x))]
> trim(x)
[1]  6  9 11 12 14
Function R command
Sum
> sum(x)
[1] 253
Mean, Median
> mean(x)
[1] 25.3
> median(x)
[1] 11.5
Range
> range(x)
[1]  0 90
Standard deviation, Variance
> sd(x)
[1] 31.87841
> var(x)
[1] 1016.233
Index of the largest and smallest element
> which(x==max(x))
[1] 10
> which(x==min(x))
[1] 9
Sort and reverse sort
> sort(x)
 [1]  0  2  6  9 11 12 14 33 76 90
> rev(sort(x))
 [1] 90 76 33 14 12 11  9  6  2  0
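
Base R also provides shortcuts for some of the idioms above. As a brief sketch, which.max() and which.min() return the index of the first maximum or minimum directly, and cumsum() and diff() give running totals and differences.

```r
x <- c(6, 9, 11, 14, 12, 2, 33, 76, 0, 90)
which.max(x)     # index of the first maximum
which.min(x)     # index of the first minimum
cumsum(x)        # running total
diff(range(x))   # spread between the smallest and largest value
```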
> x<-matrix(rpois(15,1.2),nrow=3)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    2    1    0    3    3
[2,]    0    2    3    1    2
[3,]    2    2    0    1    1
> mean(x[,5])
[1] 2
> var(x[3,])
[1] 0.7
> rowSums(x)
[1] 9 8 6
> colSums(x)
[1] 4 5 3 5 6
> rowMeans(x)
[1] 1.8 1.6 1.2
> colMeans(x)
[1] 1.333333 1.666667 1.000000 1.666667 2.000000

Parallel min and max

> x<-c(2,5,10,-6,29,45)
> y<-c(5,9,15,-22,38,88)
> z<-c(9,10,2,7,55,24)
> q<-c(22,3,5,6,-23,88)
> pmin(x,y,z,q)
[1]   2   3   2 -22 -23  24
> pmax(x,y,z,q)
[1] 22 10 15  7 55 88

'table' and 'tapply'

> data(ChickWeight)
> ChickWeight
    weight Time Chick Diet
1       42    0     1    1
2       51    2     1    1
3       59    4     1    1
4       64    6     1    1
5       76    8     1    1
6       93   10     1    1
.......................
576    234   18    50    4
577    264   20    50    4
578    264   21    50    4
> tapply(ChickWeight$weight,ChickWeight$Time,mean)
        0         2         4         6         8        10        12        14 
 41.06000  49.22000  59.95918  74.30612  91.24490 107.83673 129.24490 143.81250 
       16        18        20        21 
168.08511 190.19149 209.71739 218.68889 
> tapply(ChickWeight$weight,ChickWeight$Diet,median)
    1     2     3     4 
 88.0 104.5 125.5 129.5
> codon1=c("UUU","UUC","UUA","UUG","UUA","UUG","UUC")
> table(codon1)
codon1
UUA UUC UUG UUU 
  2   2   2   1 
> aminoacid=list(Phe=c("UUU","UUC"),Leu=c("UUA","UUG"))
> codon=as.factor(codon1)
> levels(codon)=aminoacid
> codon
[1] Phe Phe Leu Leu Leu Leu Phe
Levels: Phe Leu
> table(codon)
codon
Phe Leu 
  3   4
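
tapply() splits a vector by a grouping variable and applies a function to each group. The sketch below uses made-up heart-rate readings for two groups; the data, the variable names hr and group, and the result name group_means are all hypothetical.

```r
# Hypothetical readings: heart rate (beats per minute) for two groups.
hr <- c(72, 80, 65, 90, 85, 70)
group <- c("control", "treated", "control", "treated", "treated", "control")

group_means <- tapply(hr, group, mean)   # mean heart rate per group
group_means
tapply(hr, group, length)                # group sizes, same counts as table(group)
table(group)
```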

'apply'

> x<-matrix(1:15,nrow=3,byrow=T)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
[3,]   11   12   13   14   15
> apply(x,1,sum)
[1] 15 40 65
> apply(x,2,sum)
[1] 18 21 24 27 30
> apply(x,1,sqrt)
         [,1]     [,2]     [,3]
[1,] 1.000000 2.449490 3.316625
[2,] 1.414214 2.645751 3.464102
[3,] 1.732051 2.828427 3.605551
[4,] 2.000000 3.000000 3.741657
[5,] 2.236068 3.162278 3.872983
> apply(x,2,sqrt)
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 1.000000 1.414214 1.732051 2.000000 2.236068
[2,] 2.449490 2.645751 2.828427 3.000000 3.162278
[3,] 3.316625 3.464102 3.605551 3.741657 3.872983
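
Note that apply(x,1,sqrt) returns a 5 x 3 matrix although x is 3 x 5: each row's result is stored as a column of the output. The sketch below checks this and shows an anonymous function passed to apply(); the name by_row is illustrative.

```r
x <- matrix(1:15, nrow = 3, byrow = TRUE)

# Each row's result becomes a column, so the 3 x 5 input comes back as 5 x 3.
by_row <- apply(x, 1, sqrt)
dim(by_row)
all.equal(t(by_row), sqrt(x))   # transposing restores the original layout

# An anonymous function computes max - min for each column.
apply(x, 2, function(col) max(col) - min(col))
```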

Closest

> x<-c(3,22,15,11,50,85)
> x-10
[1] -7 12  5  1 40 75
> abs(x-10)
[1]  7 12  5  1 40 75
> min(abs(x-10))
[1] 1
> which(abs(x-10)==min(abs(x-10)))
[1] 4
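
The steps above can be wrapped in a small helper. This is only a sketch: closest() is a hypothetical function name, not part of base R, and it uses which.min(), so it returns the first match when several values are equally close.

```r
# closest() is a hypothetical helper name, not a base R function.
closest <- function(values, target) {
  values[which.min(abs(values - target))]
}

x <- c(3, 22, 15, 11, 50, 85)
closest(x, 10)    # value nearest to 10
closest(x, 60)    # value nearest to 60
```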

Sort, Rank, Order

> x<-c(2,5,10,-6,29,45)
> # rank: the rank of each element within the unsorted vector
> rank(x)
[1] 2 3 4 1 5 6
> # order: the indices that would put the vector in sorted order
> order(x)
[1] 4 1 2 3 5 6
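
Indexing a vector by its own order() gives the sorted vector, and the same idea sorts a data frame by one of its columns. A minimal sketch, with the data frame df and its columns invented for illustration:

```r
x <- c(2, 5, 10, -6, 29, 45)
identical(x[order(x)], sort(x))       # indexing by order() sorts the vector
order(x, decreasing = TRUE)           # indices for a descending sort

# Sorting a data frame by one column (names are illustrative).
df <- data.frame(id = c("a", "b", "c"), score = c(7, 3, 9))
df[order(df$score), ]
```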

Unique and Duplicated

> x<-c("a","b","c","a","a","a","b","c")
> table(x)
x
a b c 
4 2 2 
> unique(x)
[1] "a" "b" "c"
> duplicated(x)
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
> x[!duplicated(x)]
[1] "a" "b" "c"

Run length

> x<-rpois(20,0.5)
> x
 [1] 2 0 0 1 0 1 0 0 1 1 0 0 2 0 0 0 0 0 0 0
> rle(x)
Run Length Encoding
  lengths: int [1:10] 1 2 1 1 1 2 2 2 1 7
  values : int [1:10] 2 0 1 0 1 0 1 0 2 0
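
The object returned by rle() has two components, lengths and values, which can be queried directly. The sketch below types in the same 20 values shown above rather than drawing them at random, finds the longest run, and rebuilds the vector with inverse.rle(); the name r is illustrative.

```r
x <- c(2, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0)
r <- rle(x)
max(r$lengths)                        # length of the longest run
r$values[which.max(r$lengths)]        # value repeated in that run
identical(inverse.rle(r), x)          # inverse.rle() rebuilds the vector
```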

Set functions

> setA <-c("I","II","III","IV","V")
> setB <-c("III","IV","V","VI")
> union(setA,setB)
[1] "I"   "II"  "III" "IV"  "V"   "VI" 
> intersect(setA,setB)
[1] "III" "IV"  "V"  
> setdiff(setA,setB)
[1] "I"  "II"
> setdiff(setB,setA)
[1] "VI"
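
Membership tests complement the set operations above. As a short sketch, %in% and is.element() check whether elements belong to a set, and setequal() compares two sets regardless of order.

```r
setA <- c("I", "II", "III", "IV", "V")
setB <- c("III", "IV", "V", "VI")
"III" %in% setA                        # membership test for a single element
setA %in% setB                         # element-wise membership
is.element("VI", setA)                 # same test written as a function call
setequal(c("I", "II"), c("II", "I"))   # order does not matter for sets
```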

Practise

Reference