Biomedical Engineering Theory And Practice/R Language

Data Types

R has various objects for holding data, including scalars, vectors, matrices, arrays, data frames, and lists.

Scalar and Constant

A scalar is a single value; in R it is stored as a vector of length one. A constant is a literal value that never changes, comparable to a zero-dimensional value (a single point).

  • Scalar
> x<-3
> y<-6
> z<-x+y
> z
[1] 9
  • Constant
> 2+3
[1] 5
> 5-4
[1] 1
> 6*4
[1] 24
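
As a minimal sketch of the point that a scalar is just a vector of length one, the following checks can be run at the console. The variable name x simply follows the example above.

```r
# A "scalar" in R is stored as a vector of length one
x <- 3
length(x)           # 1
is.vector(x)        # TRUE
x[1]                # 3: indexing works as for any other vector

# Arithmetic between a scalar and a longer vector recycles the scalar
x + c(10, 20, 30)   # 13 23 33
```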

Vector

Vectors are one-dimensional arrays that hold numeric, character, or logical data. All elements of a vector must have the same type. The combine function c() is used to form a vector. Here are examples of each type of vector:

> a<-c(1,2,5,-3,-6,5) #numeric vector
> b<-c("one","two","three") #character vector
> d<-c(TRUE,FALSE,TRUE,FALSE,TRUE,TRUE) #logical vector
> a[c(2,4)]
[1]  2 -3
> a[4]
[1] -3
> a[2:4]
[1]  2  5 -3
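
Beyond positional indexing, vectors can also be indexed with logical conditions and negative positions. The sketch below reuses the vector a from above; the name mixed is introduced here only for illustration.

```r
a <- c(1, 2, 5, -3, -6, 5)

a[a > 0]    # keep only the positive elements: 1 2 5 5
a[-1]       # drop the first element: 2 5 -3 -6 5

# Mixing types in c() coerces everything to a single type
mixed <- c(1, "two", TRUE)
class(mixed)   # "character"
mixed          # "1" "two" "TRUE"
```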

Matrix

A matrix is a two-dimensional array in which every element has the same mode (numeric, character, or logical). Matrices are created with the matrix() function. The general format is as follows:

> mymatrix <- matrix(vector, nrow=number_of_rows, ncol=number_of_columns, byrow=logical_value, dimnames=list(vector_of_rownames, vector_of_colnames))
> A<-matrix(1:9,nrow=3)
> A
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> A[2,1]
[1] 2
> A<-matrix(1:9,nrow=3,byrow=TRUE)
> A
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
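
The general format mentions dimnames, which the examples above do not use. The following sketch shows one way to name rows and columns; the matrix name scores and the labels r1, r2, c1, c2, c3 are made up for illustration.

```r
# A 2x3 matrix filled by row, with row and column names
scores <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
                 dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

scores["r2", "c3"]   # element in row r2, column c3: 6
dim(scores)          # 2 3
t(scores)            # transpose: a 3x2 matrix
```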

Array

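An array is similar to a matrix but can have more than two dimensions. Arrays are created with the array() function:

> myarray <- array(vector, dimensions, dimnames)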
> dim1 <- c("A1","A2","A3")
> dim2 <- c("B1","B2","B3","B4")
> dim3 <- c("C1","C2")
> x<-array(1:24,c(3,4,2),dimnames=list(dim1,dim2,dim3))
> x
, , C1

   B1 B2 B3 B4
A1  1  4  7 10
A2  2  5  8 11
A3  3  6  9 12

, , C2

   B1 B2 B3 B4
A1 13 16 19 22
A2 14 17 20 23
A3 15 18 21 24

Data Frame

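A data frame is a table whose columns can hold different types of data (numeric, character, and so on). Data frames are created with the data.frame() function:

> mydata <- data.frame(col1, col2, col3, ...)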
> patientID<-LETTERS[1:4]
> age<-c(24,35,28,52)
> diabetes<-c("Type1","Type2","Type1","Type2")
> stats<-c("Poor","Improved","Poor","Excellent")
> patientDATA<-data.frame(patientID,age,diabetes,stats,row.names=letters[1:4])
> patientDATA
  patientID age diabetes     stats
a         A  24    Type1      Poor
b         B  35    Type2  Improved
c         C  28    Type1      Poor
d         D  52    Type2 Excellent
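
Once a data frame exists, columns are usually reached with $ and rows are selected with a logical condition. This is a small sketch that rebuilds the patient data from above; the name older is introduced here for illustration.

```r
patientDATA <- data.frame(
  patientID = LETTERS[1:4],
  age       = c(24, 35, 28, 52),
  diabetes  = c("Type1", "Type2", "Type1", "Type2"),
  stats     = c("Poor", "Improved", "Poor", "Excellent")
)

patientDATA$age          # one column as a vector: 24 35 28 52
mean(patientDATA$age)    # 34.75

# Rows where age is above 30 (patients B and D)
older <- patientDATA[patientDATA$age > 30, ]
nrow(older)              # 2
```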

Factors

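A factor stores categorical data as a set of levels. Passing ordered=TRUE creates an ordered factor; unless the levels argument is given, the levels are sorted alphabetically.

> patientID<-LETTERS[1:4]
> age<-c(24,35,28,52)
> diabetes<-c("Type1","Type2","Type1","Type2")
> stats<-c("Poor","Improved","Poor","Excellent")
> status <- factor(stats, ordered=TRUE)
> patientdata <- data.frame(patientID, age, diabetes, status, stringsAsFactors=TRUE)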
> str(patientdata)
'data.frame':	4 obs. of  4 variables:
 $ patientID: Factor w/ 4 levels "A","B","C","D": 1 2 3 4
 $ age      : num  24 35 28 52
 $ diabetes : Factor w/ 2 levels "Type1","Type2": 1 2 1 2
 $ status   : Ord.factor w/ 3 levels "Excellent"<"Improved"<..: 3 2 3 1
> summary(patientdata)
 patientID      age         diabetes       status 
 A:1       Min.   :24.00   Type1:2   Excellent:1  
 B:1       1st Qu.:27.00   Type2:2   Improved :1  
 C:1       Median :31.50             Poor     :2  
 D:1       Mean   :34.75                          
           3rd Qu.:39.25                          
           Max.   :52.00
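
In the output above the levels are ordered alphabetically (Excellent < Improved < Poor), which is probably not the clinical ordering intended. A minimal sketch of setting the order explicitly with the levels argument follows; the chosen order Poor < Improved < Excellent is an assumption about what the data mean.

```r
stats <- c("Poor", "Improved", "Poor", "Excellent")

# Give the levels in the intended order instead of alphabetical order
status <- factor(stats,
                 levels  = c("Poor", "Improved", "Excellent"),
                 ordered = TRUE)

levels(status)         # "Poor" "Improved" "Excellent"
as.integer(status)     # underlying codes: 1 2 1 3
status < "Excellent"   # TRUE TRUE TRUE FALSE
```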

Lists

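A list is an ordered collection of objects, which may be of different types and sizes. Lists are created with the list() function:

> mylist <- list(name1=object1, name2=object2, ...)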
> x<-"TheList"
> y<-c(25,19,20)
> z<-matrix(1:10,nrow=2,byrow=TRUE)
> theta<-LETTERS[1:10]
> delta<-c(2+3i,4-6i)
> mylist<-list(title=x,components=y,z,theta,delta)
> mylist
$title
[1] "TheList"

$components
[1] 25 19 20

[[3]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10

[[4]]
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"

[[5]]
[1] 2+3i 4-6i

> mylist[[2]]
[1] 25 19 20
> mylist[["components"]]
[1] 25 19 20

> lapply(mylist,length)
$title
[1] 1

$components
[1] 3

[[3]]
[1] 10

[[4]]
[1] 10

[[5]]
[1] 2

> lapply(mylist,class)
$title
[1] "character"

$components
[1] "numeric"

[[3]]
[1] "matrix" "array" 

[[4]]
[1] "character"

[[5]]
[1] "complex"

> lapply(mylist,mean)
$title
[1] NA

$components
[1] 21.33333

[[3]]
[1] 5.5

[[4]]
[1] NA

[[5]]
[1] 3-1.5i

Warning messages:
1: In mean.default(X[[1L]], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(X[[4L]], ...) :
  argument is not numeric or logical: returning NA
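
Named list components can also be reached with $, and sapply() is a common alternative to lapply() when a simple vector result is preferred. The sketch below uses a smaller list than the one above; its contents are illustrative only.

```r
mylist <- list(title = "TheList", components = c(25, 19, 20), 1:4)

mylist$components        # same as mylist[["components"]]: 25 19 20
names(mylist)            # "title" "components" "" (third element is unnamed)

# sapply() simplifies the result to a vector where possible
lens <- sapply(mylist, length)
unname(lens)             # 1 3 4
```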

Basic Functions

Arithmetic Operators

The arithmetic operators available in R are listed in the table below, each with an example.

Function                      R Command   Example   Result
Exponentiation                a^b         3^2       9
Multiplication                a*b         22*5      110
Division                      a/b         30/3      10
Addition                      a+b         10+9      19
Subtraction                   a-b         10-3      7
Integer division (quotient)   a%/%b       20%/%3    6
Modulo (remainder)            a%%b        20%%3     2
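
A short sketch of how these operators interact: the quotient and remainder together reconstruct the original number, and ^ binds more tightly than a leading minus sign. The names a and b follow the table.

```r
a <- 20
b <- 3

a %/% b                        # 6
a %% b                         # 2
(a %/% b) * b + a %% b == a    # TRUE: quotient * divisor + remainder

# Exponentiation is evaluated before unary minus
-2^2                           # -4
(-2)^2                         # 4

# Integer division rounds toward minus infinity
(-7) %/% 2                     # -4
(-7) %% 2                      # 1

# Operators are vectorised
c(1, 2, 3) * 2                 # 2 4 6
```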

Complex Number

Complex number
> x<-5.2-3i
Real part
> Re(x)
[1] 5.2
Imaginary part
> Im(x)
[1] -3
Modulus
> Mod(x)
[1] 6.003332
Argument
> Arg(x)
[1] -0.5232783
Conjugate
> Conj(x)
[1] 5.2+3i
Membership
> is.complex(x)
[1] TRUE
Coercion
> as.complex(19.6)
[1] 19.6+0i

Rounding

Greatest integer less than or equal to x
> floor(9.9)
[1] 9
> floor(-9.9)
[1] -10
Smallest integer greater than or equal to x
> ceiling(9.9)
[1] 10
> ceiling(-9.9)
[1] -9
Round to the nearest integer
> round(9.9)
[1] 10
> round(9.2)
[1] 9
Strip off the decimal part (truncate toward zero)
> trunc(8.6)
[1] 8
> trunc(-8.6)
[1] -8
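
round() also accepts a digits argument, and signif() rounds to a number of significant digits. The sketch below also shows that R rounds exact halves to the nearest even integer, which can surprise readers expecting halves to round up.

```r
round(3.14159, digits = 2)    # 3.14
signif(123456, digits = 2)    # 120000

# Exact halves go to the nearest even integer
round(0.5)                    # 0
round(1.5)                    # 2
round(2.5)                    # 2
```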

Trigonometric Functions

Function   Trigonometric   Inverse trigonometric   Hyperbolic   Inverse hyperbolic
sine       sin(x)          asin(x)                 sinh(x)      asinh(x)
cosine     cos(x)          acos(x)                 cosh(x)      acosh(x)
tangent    tan(x)          atan(x)                 tanh(x)      atanh(x)
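
R's trigonometric functions work in radians, so angles given in degrees must be converted first. A minimal sketch, with the variable names deg and rad chosen here for illustration:

```r
deg <- 30
rad <- deg * pi / 180          # degrees to radians

sin(rad)                       # 0.5 (up to floating-point error)
asin(0.5) * 180 / pi           # back to degrees: 30

# Compare floating-point results with all.equal(), not ==
all.equal(sin(rad)^2 + cos(rad)^2, 1)   # TRUE
tanh(0)                        # 0
```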

Log and Exponential Functions

Function                    R Command      Example         Result
Absolute value              abs(x)         abs(-7.4)       7.4
Log to the base e           log(x)         log(10)         2.302585
Log to the base 10          log10(x)       log10(100)      2
Log to the base n of x      log(x,n)       log(64,4)       3
Exponential                 exp(x)         exp(3)          20.08554
Square root                 sqrt(x)        sqrt(25)        5
Factorial                   factorial(x)   factorial(10)   3628800
Combinations (n choose r)   choose(n,r)    choose(5,4)     5
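
The relationships between these functions can be checked numerically. This is only a sketch; comparisons use all.equal() because floating-point results are rarely exact.

```r
# Change of base: log(x, n) equals log(x) / log(n)
all.equal(log(64, base = 4), log(64) / log(4))    # TRUE

# exp() and log() are inverses
all.equal(exp(log(10)), 10)                        # TRUE

# choose(n, r) equals n! / (r! * (n - r)!)
factorial(5) / (factorial(4) * factorial(5 - 4))   # 5
choose(5, 4)                                       # 5
```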

Relational Operators and Logical Variables

Relational Operators

Relation                 Operator
Equal                    ==
Not equal                !=
Less than                <
Greater than             >
Less than or equal       <=
Greater than or equal    >=

  • In arithmetic, TRUE is treated as 1 and FALSE as 0
> x<-c(6,3,4)
> y<-c(5,15,9)
> z<-(x<y)
> z
[1] FALSE  TRUE  TRUE
> z<-(x<y)+5
> z
[1] 5 6 6
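
Because TRUE counts as 1 and FALSE as 0, a comparison can be summed or averaged directly. A minimal sketch using the same x and y:

```r
x <- c(6, 3, 4)
y <- c(5, 15, 9)

sum(x < y)     # number of positions where x is smaller: 2
mean(x < y)    # proportion of such positions: 0.6666667
which(x < y)   # the positions themselves: 2 3
```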

Logical Operators

x          y          !x         x & y      x | y      xor(x,y)
False(0)   False(0)   True(1)    False(0)   False(0)   False(0)
False(0)   True(1)    True(1)    False(0)   True(1)    True(1)
True(1)    False(0)   False(0)   False(0)   True(1)    True(1)
True(1)    True(1)    False(0)   True(1)    True(1)    False(0)
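
The truth table can be reproduced in R itself. In this sketch the names p and q are chosen for illustration and stand for the two input columns.

```r
# All four combinations of two logical values
p <- c(FALSE, FALSE, TRUE, TRUE)
q <- c(FALSE, TRUE, FALSE, TRUE)

!p          # negation
p & q       # TRUE only where both are TRUE
p | q       # TRUE where at least one is TRUE
xor(p, q)   # TRUE where exactly one is TRUE

# any() and all() reduce a logical vector to a single value
any(p & q)  # TRUE
all(p | q)  # FALSE
```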
> x<-c(6,2,8)
> y<-c(14,6,7)
> z<-c(4,5,11)
> z1<-x>y
> z1
[1] FALSE FALSE  TRUE
> z2<-y>z
> z2
[1]  TRUE  TRUE FALSE
> z3<-(x>y) & (y>z)
> z3
[1] FALSE FALSE FALSE

> z1<-xor(x,y)
> z1
[1] FALSE FALSE FALSE

Sequence Generation and Repeats

> x1 <- seq(from=0,to=3,by=0.5)
> x1
[1] 0.0 0.5 1.0 1.5 2.0 2.5 3.0
> x2 <- seq(from=0.4,by=0.01,length.out=15)
> x2
 [1] 0.40 0.41 0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.50 0.51 0.52 0.53 0.54
> x3<-seq(1.4,2.1,0.3)
> x3
[1] 1.4 1.7 2.0
> x4<-rep(15,7)
> x4
[1] 15 15 15 15 15 15 15
> x5<-rep(1:4,3)
> x5
 [1] 1 2 3 4 1 2 3 4 1 2 3 4
> x6<-rep(1:3,each=2,times=3)
> x6
 [1] 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3
> x7<-rep(c("a","b","c"),c(1,2,3))
> x7
[1] "a" "b" "b" "c" "c" "c"
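
A few related helpers are often used alongside seq() and rep(). The sketch below is illustrative; the vector v is made up for the example.

```r
# A fixed number of evenly spaced points between two ends
seq(0, 1, length.out = 5)    # 0.00 0.25 0.50 0.75 1.00

# Integer sequences for indexing
seq_len(4)                   # 1 2 3 4
v <- c("a", "b", "c")
seq_along(v)                 # 1 2 3

# Reverse a sequence
rev(seq(2, 10, by = 2))      # 10 8 6 4 2
```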

Random Number Generation

> set.seed(100)
> runif(5)
[1] 0.5465586 0.1702621 0.6249965 0.8821655 0.2803538
> runif(5)
[1] 0.3984879 0.7625511 0.6690217 0.2046122 0.3575249
> x<-c(5,10,8,6,9,11,14,16,18)
> sample(x)
[1]  6 11 18  9  8 16 10  5 14
> sample(x)
[1] 16  9 10  8 18 14 11  6  5
> sample(x,4)
[1] 10 11 14  5
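
The purpose of set.seed() is reproducibility: resetting the same seed gives the same random draws. The sketch below demonstrates this without relying on specific values, since those depend on the R version; the seed 42 and the names first, second and draws are arbitrary choices.

```r
set.seed(42)
first <- runif(3)

set.seed(42)                 # reset to the same seed
second <- runif(3)

identical(first, second)     # TRUE: same seed, same numbers

# sample() draws without replacement by default;
# replace = TRUE allows more draws than there are elements
x <- c(5, 10, 8)
draws <- sample(x, 10, replace = TRUE)
length(draws)                # 10
all(draws %in% x)            # TRUE
```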

Vector Functions

Length and Statistics

> x<-c(6,9,11,14,12,2,33,76,0,90)
Length
> length(x)
[1] 10
Mean
> mean(x)
[1] 25.3
Max
> max(x)
[1] 90
Min
> min(x)
[1] 0
Distribution
> quantile(x)
   0%   25%   50%   75%  100% 
 0.00  6.75 11.50 28.25 90.00
Sort
> sort(x)
 [1]  0  2  6  9 11 12 14 33 76 90
Reference the 5th element of the vector
> x[5]
[1] 12
Delete the 3rd element of the vector
> x1<-x[-3]
> x1
[1]  6  9 14 12  2 33 76  0 90
Delete the last element of the vector
> x2<-x[-length(x)]
> x2
[1]  6  9 11 14 12  2 33 76  0
Delete the first and last elements of the vector
> x3<-x[c(-1,-length(x))]
> x3
[1]  9 11 14 12  2 33 76  0
Remove the 2 smallest and the 3 largest elements from the vector
> trim <-function(x)sort(x)[-c(1,2,length(x)-2,length(x)-1,length(x))]
> trim(x)
[1]  6  9 11 12 14
Sum
> sum(x)
[1] 253
Mean and median
> mean(x)
[1] 25.3
> median(x)
[1] 11.5
Range
> range(x)
[1]  0 90
Standard deviation and variance
> sd(x)
[1] 31.87841
> var(x)
[1] 1016.233
Positions of the largest and smallest values
> which(x==max(x))
[1] 10
> which(x==min(x))
[1] 9
Sort and reverse sort
> sort(x)
 [1]  0  2  6  9 11 12 14 33 76 90
> rev(sort(x))
 [1] 90 76 33 14 12 11  9  6  2  0
Row and column statistics of a matrix
> x<-matrix(rpois(15,1.2),nrow=3)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    2    1    0    3    3
[2,]    0    2    3    1    2
[3,]    2    2    0    1    1
> mean(x[,5])
[1] 2
> var(x[3,])
[1] 0.7
> rowSums(x)
[1] 9 8 6
> colSums(x)
[1] 4 5 3 5 6
> rowMeans(x)
[1] 1.8 1.6 1.2
> colMeans(x)
[1] 1.333333 1.666667 1.000000 1.666667 2.000000

Parallel min and max

> x<-c(2,5,10,-6,29,45)
> y<-c(5,9,15,-22,38,88)
> z<-c(9,10,2,7,55,24)
> q<-c(22,3,5,6,-23,88)
> pmin(x,y,z,q)
[1]   2   3   2 -22 -23  24
> pmax(x,y,z,q)
[1] 22 10 15  7 55 88
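
One common use of pmin() and pmax() is to clamp a vector to a range. The limits 0 and 20 below are arbitrary values chosen for illustration.

```r
x <- c(2, 5, 10, -6, 29, 45)

# Raise anything below 0 to 0, then cap anything above 20 at 20
clamped <- pmin(pmax(x, 0), 20)
clamped          # 2 5 10 0 20 20

# By contrast, min() and max() return a single value
min(x)           # -6
max(x)           # 45
```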

'table' and 'tapply'

> data(ChickWeight)
> ChickWeight
  weight Time Chick Diet
1     42    0     1    1
2     51    2     1    1
3     59    4     1    1
4     64    6     1    1
5     76    8     1    1
6     93   10     1    1
.......................
576    234   18    50    4
577    264   20    50    4
578    264   21    50    4
> tapply(ChickWeight$weight,ChickWeight$Time,mean)
        0         2         4         6         8        10        12        14 
 41.06000  49.22000  59.95918  74.30612  91.24490 107.83673 129.24490 143.81250 
       16        18        20        21 
168.08511 190.19149 209.71739 218.68889 
> tapply(ChickWeight$weight,ChickWeight$Diet,median)
    1     2     3     4 
 88.0 104.5 125.5 129.5
> codon1=c("UUU","UUC","UUA","UUG","UUA","UUG","UUC")
> table(codon1)
codon1
UUA UUC UUG UUU 
  2   2   2   1 
> aminoacid=list(Phe=c("UUU","UUC"),Leu=c("UUA","UUG"))
> codon=as.factor(codon1)
> levels(codon)=aminoacid
> codon
[1] Phe Phe Leu Leu Leu Leu Phe
Levels: Phe Leu
> table(codon)
codon
Phe Leu 
  3   4

'apply'

> x<-matrix(1:15,nrow=3,byrow=TRUE)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
[3,]   11   12   13   14   15
> apply(x,1,sum)
[1] 15 40 65
> apply(x,2,sum)
[1] 18 21 24 27 30
> apply(x,1,sqrt)
         [,1]     [,2]     [,3]
[1,] 1.000000 2.449490 3.316625
[2,] 1.414214 2.645751 3.464102
[3,] 1.732051 2.828427 3.605551
[4,] 2.000000 3.000000 3.741657
[5,] 2.236068 3.162278 3.872983
> apply(x,2,sqrt)
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 1.000000 1.414214 1.732051 2.000000 2.236068
[2,] 2.449490 2.645751 2.828427 3.000000 3.162278
[3,] 3.316625 3.464102 3.605551 3.741657 3.872983
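
apply() also accepts a user-written function. In the sketch below an anonymous function computes the spread (maximum minus minimum) of each row; the names row_spread and col_means are introduced here for illustration.

```r
x <- matrix(1:15, nrow = 3, byrow = TRUE)

# Margin 1 = rows: spread of each row
row_spread <- apply(x, 1, function(r) max(r) - min(r))
row_spread       # 4 4 4

# Margin 2 = columns: the same values as colMeans(x)
col_means <- apply(x, 2, mean)
col_means        # 6 7 8 9 10
```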

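Closest Value

To find the element of a vector closest to a given value (10 in this example), locate the smallest absolute difference:

> x<-c(3,22,15,11,50,85)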
> x-10
[1] -7 12  5  1 40 75
> abs(x-10)
[1]  7 12  5  1 40 75
> min(abs(x-10))
[1] 1
> which(abs(x-10)==min(abs(x-10)))
[1] 4
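
These steps can be wrapped in a small helper. The function name closest is hypothetical and not part of base R; which.min() returns the position of the first minimum, so ties resolve to the earliest element.

```r
# Return the element of x nearest to target
closest <- function(x, target) {
  x[which.min(abs(x - target))]
}

x <- c(3, 22, 15, 11, 50, 85)
closest(x, 10)   # 11
closest(x, 60)   # 50
```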

Sort, Rank, Order

> x<-c(2,5,10,-6,29,45)
> # rank: the position each element would take in the sorted vector
> rank(x)
[1] 2 3 4 1 5 6
> # order: the indices of the original vector, arranged so that x[order(x)] is sorted
> order(x)
[1] 4 1 2 3 5 6
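
The main use of order() is to rearrange one vector according to the values of another. In this sketch the vector lab is made-up data that accompanies x.

```r
x   <- c(2, 5, 10, -6, 29, 45)
lab <- c("a", "b", "c", "d", "e", "f")   # hypothetical labels for x

x[order(x)]                    # same as sort(x): -6 2 5 10 29 45
lab[order(x)]                  # labels in order of x: "d" "a" "b" "c" "e" "f"
order(x, decreasing = TRUE)    # 6 5 3 2 1 4
```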

Unique and Duplicated

> x<-c("a","b","c","a","a","a","b","c")
> table(x)
x
a b c 
4 2 2 
> unique(x)
[1] "a" "b" "c"
> duplicated(x)
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
> x[!duplicated(x)]
[1] "a" "b" "c"

Run length

> x<-rpois(20,0.5)
> x
 [1] 2 0 0 1 0 1 0 0 1 1 0 0 2 0 0 0 0 0 0 0
> rle(x)
Run Length Encoding
  lengths: int [1:10] 1 2 1 1 1 2 2 2 1 7
  values : int [1:10] 2 0 1 0 1 0 1 0 2 0
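
The result of rle() is a list with components lengths and values, which makes questions such as "how long is the longest run of zeros?" easy to answer. The sketch below uses a short fixed vector rather than random data so that the results are reproducible.

```r
x <- c(2, 0, 0, 1, 0, 0, 0)
r <- rle(x)

r$lengths                        # 1 2 1 3
r$values                         # 2 0 1 0

# Longest run of zeros
max(r$lengths[r$values == 0])    # 3

# inverse.rle() rebuilds the original vector
identical(inverse.rle(r), x)     # TRUE
```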

Set functions

> setA <-c("I","II","III","IV","V")
> setB <-c("III","IV","V","VI")
> union(setA,setB)
[1] "I"   "II"  "III" "IV"  "V"   "VI" 
> intersect(setA,setB)
[1] "III" "IV"  "V"  
> setdiff(setA,setB)
[1] "I"  "II"
> setdiff(setB,setA)
[1] "VI"
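
Membership tests complement the set functions above. A minimal sketch using the same two sets:

```r
setA <- c("I", "II", "III", "IV", "V")
setB <- c("III", "IV", "V", "VI")

"III" %in% setA           # TRUE
setA %in% setB            # FALSE FALSE TRUE TRUE TRUE
is.element("VI", setA)    # FALSE

# setequal() ignores order when comparing sets
setequal(c("I", "II"), c("II", "I"))   # TRUE
```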

Practise

Reference