1.10: Introduction to Statistics
Measures of Center and Spread
The following three numbers represent three different ways to think about the average value of a data set.
Mean - This is what we usually think of as the "average" of a data set. The mean can be found by summing all the values in the data set and dividing by the size of the data set (that is, the number of elements in the set). In mathematical notation, if the data values are $x_1, x_2, \ldots, x_n$, then
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
For example: Suppose 1, 2, 4, 6, 8, 9 is our data set. Then the sum is 1 + 2 + 4 + 6 + 8 + 9 = 30 and there are 6 elements in the data set, so the mean is 30/6 = 5.
The mean, while a very useful statistic, has its flaws. Notably, its value may be heavily influenced by outliers - numbers in a data set which are significantly higher or lower than the majority of the data. It is often preferable to use the median instead to describe such data sets.
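The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the outlier value 100 are assumptions made for the example and do not come from the text.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [1, 2, 4, 6, 8, 9]
print(mean(data))  # 30 / 6 = 5.0

# Hypothetical outlier (not part of the example above), added only to show
# how a single extreme value pulls the mean away from the bulk of the data.
with_outlier = data + [100]
print(mean(with_outlier))
```

With the extra value included, the mean is larger than every element except the outlier itself, which is the weakness described above.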
Median - This is the middle of our data set. To find the median you must first put your data values in numerical order (say, from smallest to largest). If you have an odd number of elements in your data set, there will be exactly one number in the middle; this number is the median. If you have an even number of elements in your data set, then the median is the mean of the middle two numbers.
For example: Suppose 2, 2, 3, 4, 4, 5, 6, 7, 8, 9, 12, 13, 16, 22 is our data set. Since it has an even number of elements (14), we have to take the mean of the middle two, in this case 6 and 7, so the median is 6.5.
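The two cases (odd and even number of elements) can be sketched in Python as follows. The function name `median` is chosen here for illustration; it is one possible way to write the procedure, not the only one.

```python
def median(values):
    """Return the middle value of the sorted data, or the mean of the two middle values."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)          # step 1: put the data in numerical order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: exactly one middle element
        return ordered[mid]
    # even count: average the two middle elements
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [2, 2, 3, 4, 4, 5, 6, 7, 8, 9, 12, 13, 16, 22]
print(median(data))  # (6 + 7) / 2 = 6.5
```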
Mode - The mode is the value (or values) that occurs most often in a data set. Since mean, median, and mode are often confused with each other, an easy way to remember mode is "most often": the first two letters in mode are "m" and "o", so imagine they stand for "most often". If two or more different values are tied for the greatest number of occurrences, then the data set is said to have multiple modes. If you're asked to find the mode of a data set with multiple modes, then all of the modes should be listed. If no element of the data repeats, then there is no mode.
For example: Suppose 1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 7 is our data set. Then the modes are 2 and 5. They both occur three times, and three is the maximum number of occurrences in our data set.
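One way to find every mode is to count how often each value occurs and keep the values with the highest count. The sketch below uses `collections.Counter` from the Python standard library; the function name `modes` is an assumption for this example, and returning an empty list when nothing repeats is a choice made to match the convention stated above.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values, or [] if no value repeats."""
    counts = Counter(values)          # maps each value to its number of occurrences
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                     # no element repeats, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 7]))  # [2, 5]
print(modes([1, 2, 3]))                          # []
```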
The following quantity tells us how spread out our data set is.
Range - The difference between the largest and smallest numbers in our data set. Notice this means the range is never negative.
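As a small sketch, the range can be computed from the largest and smallest values without sorting. The function name `data_range` and the data set of negative numbers are assumptions made for illustration.

```python
def data_range(values):
    """Return the difference between the largest and smallest values."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


print(data_range([1, 2, 4, 6, 8, 9]))  # 9 - 1 = 8

# Hypothetical data set of negative numbers: the range is still non-negative,
# because the largest value is never smaller than the smallest one.
print(data_range([-5, -3, -1]))        # -1 - (-5) = 4
```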
Examples
Mean
Let's look at the following data set:
Data Values: 10, 13, 4, 7, 9 so n = 5
Now add the values together:
10 + 13 + 4 + 7 + 9 = 43
Then divide the sum by n:
43 / 5 = 8.6
Mean = 8.6
Median
Case 1:
Data Values: 10, 13, 4, 7, 8 so n = 5
Numerical Order: 4, 7, 8, 10, 13
Since 8 is the middle number,
Median = 8
Case 2:
Data Values: 10, 13, 4, 7, 8, 10 so n = 6
Numerical Order: 4, 7, 8, 10, 10, 13
Middle Numbers: 8 and 10
Find Mean: 8 + 10 = 18
18 / 2 = 9
Median = 9
Mode
Data Values: 10, 13, 4, 7, 8, 10
10 is in the data set twice.
Mode = 10
Data Values: 4, 9, 13, 18, 4, 2, 9, 4, 13, 8, 9
4 and 9 each occur three times, more than any other value.
Mode = 4, 9
Range
Data Values: 10, 13, 4, 7, 8
Numerical Order: 4, 7, 8, 10, 13
Difference of last and first: 13 - 4 = 9
Range = 9
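The worked examples above can be checked with Python's built-in `statistics` module. This is offered as a quick cross-check rather than as part of the lesson's method; the variable names are chosen here for illustration, and `statistics.multimode` requires Python 3.8 or later.

```python
import statistics

mean_data = [10, 13, 4, 7, 9]
print(statistics.mean(mean_data))        # 8.6

odd_data = [10, 13, 4, 7, 8]             # n = 5, one middle value
even_data = [10, 13, 4, 7, 8, 10]        # n = 6, two middle values
print(statistics.median(odd_data))       # 8
print(statistics.median(even_data))      # 9.0

mode_data = [4, 9, 13, 18, 4, 2, 9, 4, 13, 8, 9]
print(statistics.multimode(even_data))   # [10]
print(statistics.multimode(mode_data))   # [4, 9]

# The module has no range function, so use max and min directly.
print(max(odd_data) - min(odd_data))     # 13 - 4 = 9
```

Note that `statistics.multimode` returns every value when none repeats, which differs from the convention used here that such a data set has no mode.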
Quiz