Radiation Oncology/Medical Statistics/Fishers Test

Fisher's Exact Test (2x2)
Overview
 Very common in smaller medical studies
 Appropriate for binary data
 Evaluates association between two binary classifications
 Characteristic
 "Group 1" vs. "Group 2"
 For example: male/female, treatment given/no treatment given, treatment type 1/treatment type 2, high dose/low dose, etc.
 Outcome
 "Success" vs. "Failure"
 For example: alive/dead, response/no response, recurrence/no recurrence, etc.
 This defines a 2x2 matrix
Characteristic   Outcome: "Success"   Outcome: "Failure"
"Group 1"        count                count
"Group 2"        count                count
 Typically used for small sample sizes (fewer than about 50), where approximate tests are less reliable; for larger samples the exact calculation was historically limited by computational power. Please see Χ^{2} Test for more information on approximations
Details
 Hypothesis tested (H0) is that there is no association between the two classifications
 Fisher's test determines the degree to which this hypothesis is consistent with the data
 Probability of Outcome 1 in Group 1 (e.g. male → survive) is p1
 Probability of Outcome 1 in Group 2 (e.g. female → survive) is p2
 The test hypothesis (H0) is that there is no association, i.e. that p1 = p2
 Therefore, under H0 the population "success" rate (Outcome 1) is p_{0} = p1 = p2, and can be estimated as (Group 1 Outcome 1 + Group 2 Outcome 1) / (Total Population)
 Observation: We observe a given total number of Outcome 1 in the study
 The question is whether the Outcome 1 events are divided between Group 1 and Group 2 in a way that is consistent with p1 = p2, or split such that p1 = p2 is unlikely
 To objectively evaluate this question, we set up the tables showing all the possible outcomes that we could have seen, given the fixed number of cases in Group 1 and Group 2, and the fixed number of Outcome 1 and Outcome 2.
 Test statistic (T) is the number of Outcome 1 ("success") in Group 1.
 Because the number of cases in Group 1 and Group 2 is fixed, and the number of Outcome 1 and Outcome 2 events is fixed, knowing T determines all the other cells in the matrix
          Outcome 1   Outcome 2      Total
Group 1   T           R1 - T         R1
Group 2   C1 - T      C2 + T - R1    R2
Total     C1          C2             N
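 As a minimal sketch of how T fixes the remaining cells, the Python function below fills in the 2x2 table from T and the margins R1, R2 and C1. The function name `table_from_t` and the counts in the usage line are illustrative assumptions, not taken from the source.

```python
def table_from_t(t, r1, r2, c1):
    """Return [[a, b], [c, d]] for the 2x2 table with fixed margins.

    t  : number of Outcome 1 in Group 1 (the test statistic T)
    r1 : Group 1 total, r2 : Group 2 total, c1 : Outcome 1 total
    """
    n = r1 + r2
    c2 = n - c1                     # Outcome 2 total
    table = [
        [t, r1 - t],                # Group 1: Outcome 1, Outcome 2
        [c1 - t, c2 + t - r1],      # Group 2: Outcome 1, Outcome 2
    ]
    # Every cell must be a non-negative count for t to be admissible.
    if any(cell < 0 for row in table for cell in row):
        raise ValueError("t is outside the range allowed by the margins")
    return table


# Hypothetical counts: 10 cases per group, 8 with Outcome 1 overall.
print(table_from_t(3, 10, 10, 8))
```

 The range check reflects that only some values of T are compatible with a given set of margins.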
 We then evaluate the probability that each possible outcome (including the actual observed outcome) would occur by chance from a random sampling
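 Binomial coefficient calculations are done to find the probability that a table with T = t would be observed by chance in a random sampling: P(T = t) = \binom{R1}{t} \binom{R2}{C1 - t} / \binom{N}{C1}, where t ranges from max(0, C1 - R2) to min(R1, C1)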
 However, as these are computationally intensive, statistical tables or software is generally used to evaluate probability of T
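 For small tables the binomial coefficient calculation is easy to do in software. The sketch below uses Python's standard `math.comb` and assumes the usual fixed-margin (hypergeometric) form, in which the number of ways to place t Outcome 1 events in Group 1 and the rest in Group 2 is divided by the number of ways to choose all Outcome 1 events from N cases. The function names and the margins in the usage lines are hypothetical.

```python
from math import comb


def table_probability(t, r1, r2, c1):
    """Probability of observing T = t when the margins are fixed and H0 holds."""
    n = r1 + r2
    # Ways to place t of the Outcome 1 events in Group 1 and the remaining
    # c1 - t in Group 2, divided by all ways to choose c1 events from n cases.
    return comb(r1, t) * comb(r2, c1 - t) / comb(n, c1)


def t_range(r1, r2, c1):
    """All values of T that keep every cell of the table non-negative."""
    return range(max(0, c1 - r2), min(r1, c1) + 1)


# Hypothetical margins: 3 cases per group, 3 Outcome 1 events in total.
for t in t_range(3, 3, 3):
    print(t, table_probability(t, 3, 3, 3))
```

 The probabilities over the whole range of T should add up to 1, which is a quick sanity check on any implementation.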
 The significance level (p) is the probability of the actual outcome plus the probabilities of all even less likely outcomes, assuming that random sampling alone is at work
 If this combined probability is below 5% (p < 0.05), the test hypothesis that there is no difference between the populations is typically rejected
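 The summation that defines the significance level can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, outcomes exactly as likely as the observed one are counted along with the less likely ones, and the distribution in the usage lines is made up for demonstration.

```python
def significance_level(probabilities, observed):
    """Sum the probabilities of the observed outcome and of every outcome
    that is no more likely than it.

    probabilities : dict mapping each possible value of T to its probability
    observed      : the value of T actually seen
    """
    p_observed = probabilities[observed]
    # A small relative tolerance keeps ties from being lost to rounding.
    cutoff = p_observed * (1 + 1e-9)
    return sum(p for p in probabilities.values() if p <= cutoff)


# Hypothetical distribution of T under H0 (values chosen for illustration).
dist = {0: 0.05, 1: 0.45, 2: 0.45, 3: 0.05}
p = significance_level(dist, observed=0)
print(round(p, 3), "reject H0" if p < 0.05 else "do not reject H0")
```

 In this made-up case both extreme outcomes are equally unlikely, so both contribute to p and the hypothesis is not rejected at the 5% level.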
 Assumptions:
 Probability of Outcome for each member of a given Group is the same; it does not vary from member to member. Random sampling ensures this
 The Outcome of one member does not affect the outcome of a different member
Example
 Adapted from PMID 6092550 as shown in Using and Understanding Medical Statistics
                 Relapse   No relapse   Total
Small field RT   2         21           23
Large field RT   2         234          236
Total            4         255          259
 In the example above, 4 relapses were observed in total. The question is whether the rate of relapse is related to treatment type (large field RT vs. small field RT)
 If H0 is true, the two failure rates are equal to each other and to the failure rate in the entire population, so p_{0} = 4/259 = 0.015
 The expected number of failures in the Small Field RT group would therefore be 23 patients x 4/259 ≈ 0.4 failures
 However, 2 failures were observed. Is this likely due to chance, or not?
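 The arithmetic behind the pooled rate and the expected count can be checked with a few lines of Python. The variable names are illustrative; the counts are the ones quoted in the example.

```python
total_patients = 259        # all patients in the study
total_relapses = 4          # relapses observed overall
small_field_patients = 23   # patients treated with small field RT

# Pooled relapse rate under H0 (p_0 = p1 = p2).
p0 = total_relapses / total_patients

# Expected relapses in the small field RT group if H0 holds.
expected_small_field = small_field_patients * p0

print(f"p0 = {p0:.3f}")
print(f"expected small field relapses = {expected_small_field:.1f}")
```

 Carrying the unrounded rate through the multiplication avoids compounding the rounding of p_{0}.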
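 There are 5 possibilities of how the 4 failures could have been observed in a random sampling (with Possibility 2 being the actual observation):
                          Possibility 0   Possibility 1   Possibility 2 (Observed)   Possibility 3   Possibility 4
Small field RT relapses   0               1               2                          3               4
Large field RT relapses   4               3               2                          1               0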
 The corresponding relapse rates for the Small Field RT (Group 1) are:
                 Possibility 0   Possibility 1   Possibility 2 (Observed)   Possibility 3   Possibility 4   (Expected)
Small field RT   0.000           0.043           0.087                      0.130           0.174           0.015
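 The possibilities and their relapse rates can be enumerated mechanically. The sketch below assumes the counts from the example (23 small field RT patients and 4 relapses in total); the variable names and dictionary keys are illustrative.

```python
SMALL_FIELD_PATIENTS = 23   # Group 1 size, from the example
TOTAL_RELAPSES = 4          # relapses observed in the whole study

# Each possibility assigns t of the 4 relapses to the small field RT group
# and the remaining 4 - t to the large field RT group.
possibilities = [
    {
        "small_field_relapses": t,
        "large_field_relapses": TOTAL_RELAPSES - t,
        "small_field_rate": t / SMALL_FIELD_PATIENTS,
    }
    for t in range(TOTAL_RELAPSES + 1)
]

for t, row in enumerate(possibilities):
    print(f"Possibility {t}: {row['small_field_relapses']} vs "
          f"{row['large_field_relapses']} relapses, "
          f"small field rate {row['small_field_rate']:.3f}")
```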
 We then need to calculate the probability that each of the possibilities occurs if H0 is true (something typically done by professional statisticians; for more detail see the Details section above)
                 Possibility 0   Possibility 1   Possibility 2 (Observed)   Possibility 3   Possibility 4   Total
Small field RT   0.687           0.271           0.039                      0.002           0.0001          1.000
 Our original hypothesis is that the probability of failure in Small Field RT is the same as in Large Field RT, which is the same as in the entire underlying population treated with RT. Possibility 0 (68.7%) and Possibility 1 (27.1%) would have been reasonably consistent with this hypothesis
 The probability that the actual observation (Possibility 2) occurs as a result of the random sampling process is 0.039 (3.9%), which is not as reasonable. Possibilities 3 and 4 are even less likely than the actual observation (0.2% and 0.01%)
 The significance level is the sum of probability of the observed occurrence (3.9%) and the probabilities of the even less likely possibilities (0.2%, 0.01%) = 4.11%; expressed as p=0.04
 In this example, the hypothesis that there is no difference between the two groups with respect to the two outcomes is rejected at the 5% level. Conclusion: Small Field RT is associated with significantly more failures (p=0.04)
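 The whole example can be reproduced with the standard library alone. This is a sketch under stated assumptions, not the method used in the cited source: it takes the large field RT group to be 236 patients (259 total minus 23), uses the fixed-margin binomial coefficient formula, and counts ties with the observed outcome as "no more likely". Computed values may differ from the tabulated ones in the third decimal place, presumably because of rounding.

```python
from math import comb

SMALL_FIELD = 23     # Group 1 size
LARGE_FIELD = 236    # Group 2 size (259 - 23)
RELAPSES = 4         # total relapses in the study
OBSERVED = 2         # relapses seen in the small field RT group

total = SMALL_FIELD + LARGE_FIELD

# Probability of each possible number of small field relapses under H0.
probabilities = {
    t: comb(SMALL_FIELD, t) * comb(LARGE_FIELD, RELAPSES - t) / comb(total, RELAPSES)
    for t in range(RELAPSES + 1)
}

# Significance level: observed outcome plus all outcomes no more likely than it.
p_observed = probabilities[OBSERVED]
p_value = sum(p for p in probabilities.values() if p <= p_observed * (1 + 1e-9))

for t, p in probabilities.items():
    print(f"Possibility {t}: {p:.4f}")
print(f"significance level: {p_value:.2f}")
```

 Here only Possibilities 2, 3 and 4 enter the sum, because Possibilities 0 and 1 are each more likely than the observed outcome.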
Links
 Wikipedia Fisher's Exact Test