Statistics Ground Zero
This book is intended for readers who need to deploy standard statistical techniques for data analysis but do not have statistical training. In particular, it may be useful to readers of the wikibook Using SPSS and PASW.
It is possible to get by in applied statistics without any real mathematical understanding of what you are doing, but it cannot be recommended. This book starts from the assumption that even flying by the seat of your pants, it is worth knowing how to measure wind-speed.
The content was determined by listing what an undergraduate social science student might have to learn on a non-specialist course in statistics or applied statistics, then stripping that list to the bare bones, avoiding mathematical detail wherever possible.
There is no coverage of probability, even though it is one of the foundations of modern statistical thinking, because it is the author's belief that, if really necessary, you can get by without it (well, almost). Probability does rear its head, but this book relies only on naive, intuitive ideas of the probable.
I want to stress that I do not think anyone should get by without developing a proper understanding of statistical methods, but if you find yourself having to analyse data or perform a test and do not know where to start, this book might help.
I assume that the reader will use a suitable computer application (SPSS, Minitab and Stata are examples), so although I give brief explanations of some calculations, I do not cover the use of statistical tables to find significance values.
Two uses of statistics
Describing the world
Descriptive statistics is a way of formalising the measurable characteristics of the world. From the standpoint of scientific investigation we often isolate some part of the world of experience and simplify it for the purpose of observation and measurement. In descriptive statistics we concentrate on the quantifiable aspects of what we investigate and try to characterise it in a useful way. Our aim is to distinguish things that are unalike in relevant ways and to make clear similarities between things that may on the surface appear quite different.
Testing theories about the world
Science aims to explain our experience by offering a theory that accounts for it. Theories are explanatory accounts that consist (among other things) of one or more hypotheses about the elements of experience. These hypotheses might be, for example: that x is very like y; that p is very often associated with q; that t is generally half the value of r multiplied by s. If the xs, ys and zs do not help, a concrete example might. Suppose we believe that the amount of sweat a football player produces (measured as the area of patches on clothing) depends upon the amount of time she is in possession of the ball and the environmental temperature. We could try to model this as
sweat patch area = environmental temperature + possession time
Now, as it stands this is a very unlikely model: at the very least, some coefficients would be needed for the equation to work at all. So we can ask whether there exist coefficients a and c such that
sweat patch area = (environmental temperature * a) + (possession time * c) ± noise
and it turns out that many relationships between phenomena in the world are governed by equations something like this. Statistical analysis can tell us whether such an equation exists and how reliable its predictions are.
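The idea can be illustrated with a short sketch. The numbers below are invented for illustration, not data from this book: we generate observations from an assumed model with known coefficients, then ask ordinary least squares whether it can recover them.

```python
# A minimal sketch with invented data: generate observations from the
# hypothetical model  sweat = a*temperature + c*possession + noise,
# then estimate the coefficients a and c by ordinary least squares.
import numpy as np

rng = np.random.default_rng(0)
temperature = rng.uniform(10, 35, size=50)   # degrees Celsius (invented)
possession = rng.uniform(0, 20, size=50)     # minutes in possession (invented)
true_a, true_c = 2.0, 1.5                    # the "true" coefficients we assume
sweat = true_a * temperature + true_c * possession + rng.normal(0, 3, size=50)

# Find the coefficients that best explain the observed data
X = np.column_stack([temperature, possession])
coeffs, _, _, _ = np.linalg.lstsq(X, sweat, rcond=None)
print(coeffs)   # estimates of a and c, close to the assumed 2.0 and 1.5
```

Because the noise term is modest, the estimated coefficients land near the values we built in, which is exactly the sense in which analysis can tell us whether such an equation fits.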
How significance testing works
Imagine that I have a drawer full of socks: six unmatched socks, each a different colour from red, blue, green, orange, yellow and grey. If I pull two socks out at random, what are the chances that I will pull out a red and a blue together? There are fifteen possible combinations for a pull of two.
(If you remember anything about combinations, this is n!/(p!(n-p)!), with n = 6 and p = 2.)
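The count can be checked with Python's standard library, both by listing the pairs and by applying the formula directly:

```python
# Counting the possible two-sock pulls from six differently coloured socks.
from math import comb
from itertools import combinations

colours = ["red", "blue", "green", "orange", "yellow", "grey"]
pulls = list(combinations(colours, 2))
print(len(pulls))    # 15 possible pairs
print(comb(6, 2))    # the same number via n!/(p!(n-p)!)
# Exactly one of the 15 pairs is {red, blue}, so its probability is 1/15.
```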
In short, the chance of getting the red-blue combination is 1/15. In fact you can work out the probability of any mix on a pull of any number of socks: we can know how likely, and conversely how unlikely, a combination is. Now let's imagine that there are ten socks of each colour in the drawer. We can calculate the probability for any pull of any number as before. Next, imagine that we don't know how many socks of each colour are in the drawer, though we know there are six possible colours and sixty socks overall. I make a pull of socks, say five at a time. Suppose that I get
green green blue red orange
We can compute how unlikely this is if in fact there are equal numbers of each colour sock in the drawer. Some combinations will be comfortably likely given an equal distribution of sock colours but some will be very unlikely on that assumption.
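As a sketch of the kind of calculation involved, suppose (as above) that the drawer holds ten socks of each of the six colours. The probability of pulling exactly two greens, one blue, one red and one orange in a pull of five is the number of ways to make that pull divided by the number of possible pulls:

```python
# Probability of the pull {green, green, blue, red, orange} assuming
# ten socks of each of six colours (60 socks) and a pull of five.
from math import comb

favourable = comb(10, 2) * comb(10, 1) * comb(10, 1) * comb(10, 1)
total = comb(60, 5)
print(favourable, total)          # 45000 ways out of 5461512 pulls
print(favourable / total)         # roughly 0.008, i.e. under 1 in 100
```

So on the assumption of equal numbers, this particular pull is quite unlikely, which is exactly the sort of fact a significance test exploits.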
So, if we wanted to bet on whether there were in fact equal numbers of each colour sock in the drawer, we could proceed like this:
- Make the null hypothesis that there are equal numbers of each colour sock in the drawer
- Set a level of confidence we require: we'd like to be wrong (and lose money!) no more than 5% of the time, for example
- Make a pull
- Compute how unlikely this pull is given the null hypothesis
If we get results that are wildly different from what is likely on the assumption of the null hypothesis we will feel confident in rejecting it, otherwise we will be cautious and will say that we have no grounds to reject it.
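The fourth step, computing how unlikely a pull is given the null hypothesis, can also be approximated by simulation. This sketch (invented, not from the book) assumes the null hypothesis of ten socks of each colour and estimates the probability of the pull above by repeating the experiment many times:

```python
# Monte Carlo sketch: under the null hypothesis (ten socks of each of six
# colours), how often does a random pull of five give exactly
# {green, green, blue, red, orange}?
import random
from collections import Counter

random.seed(1)
colours = ["red", "blue", "green", "orange", "yellow", "grey"]
drawer = [c for c in colours for _ in range(10)]   # the null-hypothesis drawer
observed = Counter({"green": 2, "blue": 1, "red": 1, "orange": 1})

trials = 100_000
hits = sum(Counter(random.sample(drawer, 5)) == observed for _ in range(trials))
print(hits / trials)   # an estimate of the exact value 45000/5461512
```

The simulated frequency settles near the exact combinatorial answer of roughly 0.008, small enough that at a 5% level we would reject the null hypothesis of equal numbers.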
Many statistical tests involve the computation of a statistic which is taken to be drawn from a particular distribution of values. This distribution is our sock drawer. A given value for this statistic will be more or less likely on the null hypothesis, and this likelihood is called the p value. We decide what degree of confidence we will be satisfied by (that is, we allow that we may be misled some fraction of the time) and then calculate whether, on the numbers before us, we should reject the null hypothesis. How extreme, we ask ourselves, are the data in this case? If the p value is smaller than the significance level we have set, the data are too extreme to be consistent with the null hypothesis, and we reject it.
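The final decision can be sketched in a few lines. The test statistic below is invented purely for illustration, and the p value is computed for a z statistic under a standard normal null distribution, one common case among many:

```python
# Sketch of the decision rule: compute a p value for an (invented) observed
# z statistic under a standard normal null, then compare it to the level.
from math import erf, sqrt

def p_value_two_sided(z):
    """Two-sided p value for a z statistic under the standard normal null."""
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))   # standard normal CDF at |z|
    return 2 * (1 - phi)

alpha = 0.05          # the level we set in advance
z = 2.3               # an invented observed statistic
p = p_value_two_sided(z)
print(p < alpha)      # True here: the data are too extreme, reject the null
```

A statistical package performs this same comparison for you; the test statistic and the reference distribution change from test to test, but the logic does not.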