Statistics Ground Zero/Significance

Significance

In statistical testing we deal in probabilities. To ask our research question in a statistically testable way is to ask:

If the null hypothesis is true, how likely is it that I would observe the data that I have collected?

Put slightly more technically:

The p value represents the probability of seeing data at least this extreme if the null hypothesis were true.

We set a confidence level, most commonly 99% or 95%, meaning that we accept that we might be misled into rejecting a true null hypothesis 1% or 5% of the time respectively. The corresponding threshold for p, the significance level, is 0.01 or 0.05 (the complements of 99% and 95% expressed as decimals). p must fall below this threshold for us to reject the null hypothesis.

This value, the p value, is said to determine whether the outcome of a test is significant or not. If the outcome is significant then the null hypothesis is rejected.
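The decision rule above can be sketched with a small worked case. The example below is an illustration only: the coin-flip data (16 heads in 20 flips), the function name binomial_p_value and the choice of an exact two-sided binomial calculation are assumptions made for the sketch, not part of the original text.

```python
from math import comb

def binomial_p_value(heads: int, flips: int) -> float:
    """Exact two-sided p value for the null hypothesis that the coin is fair."""
    # Probability of every possible number of heads if the null hypothesis is true.
    probs = [comb(flips, k) * 0.5 ** flips for k in range(flips + 1)]
    observed = probs[heads]
    # Add up all outcomes that are at least as unlikely as the one observed.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))

alpha = 0.05  # threshold corresponding to the 95% level
p = binomial_p_value(heads=16, flips=20)  # hypothetical data
reject_null = p < alpha
print(f"p = {p:.4f}, reject null hypothesis: {reject_null}")
```

With these hypothetical data p is roughly 0.012, so the result is significant at the 0.05 threshold but would not be at the stricter 0.01 threshold.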

Choosing a test

Very often people find step three above, choosing the correct test, the most difficult, but if we know what we want to do and something about the nature of our data, it is not so very difficult. The following table covers a surprisingly large number of common cases.

Question	Measure of Dependent Variable	Two Variables or Groups	More than Two Variables or Groups	Parametric Test	Non-parametric Test
Is there an association?	Nominal	Two	^[1]		Chi-square
Is there an association?	Ordinal	Two			Spearman's correlation coefficient (with an indication of strength)
Is there an association?	Scalar	Two		Pearson's correlation coefficient (with an indication of strength)	
Are the means or medians the same?	Scalar	Two		Student's t-test	Mann-Whitney U test
Are the means or medians the same?	Scalar		More than two	Analysis of Variance (ANOVA)	Kruskal-Wallis
Can I predict one from another?	Scalar	Two	More than two independent	Regression or multiple regression	
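One way to see how the table is used is to treat it as a lookup keyed on the question, the measure, the number of variables or groups, and whether a parametric test is appropriate. The sketch below is only an illustration: the names TEST_TABLE and choose_test and the short key labels are hypothetical, and reading the last row as "regression for two variables, multiple regression for more than two independent variables" is an assumption.

```python
# Keys: (question, measure of dependent variable, variables or groups, parametric?)
TEST_TABLE = {
    ("association", "nominal", "two", False): "Chi-square",
    ("association", "ordinal", "two", False): "Spearman's correlation coefficient",
    ("association", "scalar", "two", True): "Pearson's correlation coefficient",
    ("difference", "scalar", "two", True): "Student's t-test",
    ("difference", "scalar", "two", False): "Mann-Whitney U test",
    ("difference", "scalar", "more than two", True): "Analysis of Variance (ANOVA)",
    ("difference", "scalar", "more than two", False): "Kruskal-Wallis",
    ("prediction", "scalar", "two", True): "Regression",
    ("prediction", "scalar", "more than two", True): "Multiple regression",
}

def choose_test(question, measure, groups, parametric):
    """Return the test named in the table, or None if the table has no entry."""
    key = (question.lower(), measure.lower(), groups.lower(), parametric)
    return TEST_TABLE.get(key)

# Comparing two groups on a scalar measure without assuming a parametric test.
print(choose_test("difference", "scalar", "two", parametric=False))
```

A result of None simply means the case is not covered by the table, not that no suitable test exists.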

One or two tails?

When we formulate a hypothesis that involves comparing values of a parameter or statistic, we choose between two ways of asking the question. We might simply ask "are the values different?" or we might ask "is one value smaller (or greater) than the other?" In the first case we determine the outcome using a two-tailed test and in the second case using a one-tailed test.
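The difference between the two kinds of test can be illustrated numerically. The sketch below assumes a hypothetical experiment (12 heads in 16 flips of a coin that the null hypothesis says is fair) and a helper named upper_tail, both invented for this example. Because the fair-coin distribution is symmetric, the two-tailed p value here is simply twice the one-tailed value.

```python
from math import comb

def upper_tail(heads: int, flips: int) -> float:
    """P(at least `heads` heads) for a fair coin: one-tailed p value for 'more heads'."""
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips

flips, heads = 16, 12  # hypothetical data
alpha = 0.05

# One-tailed question: does the coin give MORE heads than a fair coin would?
one_tailed = upper_tail(heads, flips)
# Two-tailed question: does the coin differ from a fair coin in EITHER direction?
two_tailed = min(1.0, 2 * one_tailed)

print(f"one-tailed p = {one_tailed:.4f}, significant: {one_tailed < alpha}")
print(f"two-tailed p = {two_tailed:.4f}, significant: {two_tailed < alpha}")
```

With these data the one-tailed p value is about 0.038 and the two-tailed p value about 0.077, so the same observations are significant at 0.05 for the directional question but not for the non-directional one. This is why the direction of the hypothesis should be fixed before looking at the data.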

Notes

↑ This is not strictly true: it is possible to test the association of more than two nominal variables, but the design is complicated.