Statistics/Summary/Quartiles
Quartiles
editThe quartiles of a data set are formed by the two boundaries on either side of the median, which divide the set into four equal sections. The lowest 25% of the data being found below the first quartile value, also called the lower quartile (Q1). The median, or second quartile divides the set into two equal sections. The lowest 75% of the data set should be found below the third quartile, also called the upper quartile (Q3). These three numbers are measures of the dispersion of the data, while the mean, median and mode are measures of central tendency.
Examples
editGiven the set {1,3,5,8,9,12,24,25,28,30,41,50} we would find the first and third quartiles as follows:
There are 12 elements in the set, so 12/4 gives us three elements in each quarter of the set.
So the first or lowest quartile is: 5, the second quartile is the median 12, and the third or upper quartile is 28.
However some people when faced with a set with an even number of elements (values) still want the true median (or middle value), with an equal number of data values on each side of the median (rather than 12 which has 5 values less than and 6 values greater than. This value is then the average of 12 and 24 resulting in 18 as the true median (which is closer to the mean of 19 2/3. The same process is then applied to the lower and upper quartiles, giving 6.5, 18, and 29. This is only an issue if the data contains an even number of elements with an even number of equally divided sections, or an odd number of elements with an odd number of equally divided sections.
Inter-Quartile Range
editThe inter quartile range is a statistic which provides information about the spread of a data set, and is calculated by subtracting the first quartile from the third quartile), giving the range of the middle half of the data set, trimming off the lowest and highest quarters. Since the IQR is not affected at all by outliers in the data, it is a more robust measure of dispersion than the range
IQR = Q3 - Q1
Another useful quantile is the quintiles which subdivide the data into five equal sections. The advantage of quintiles is that there is a central one with boundaries on either side of the median which can serve as an average group. In a Normal distribution the boundaries of the quintiles have boundaries ±0.253*s and ±0.842*s on either side of the mean (or median),where s is the sample standard deviation. Note that in a Normal distribution the mean, median and mode coincide.
Other frequently used quantiles are the deciles (10 equal sections) and the percentiles (100 equal sections)