Statistical Inference Using the t Statistic

One of the biggest changes in society over the past fifty years has been the expansion of economic rights for women. More women have jobs outside the home than ever before, and women are able to work in a wider variety of jobs than ever before. Women still face many challenges including discrimination in the workplace, but high-paying jobs are no longer reserved just for men. Opportunities are becoming more equal.

It is not so long ago that many careers were closed to women. The women who made up the original survey sample for the US National Longitudinal Survey of Youth (NLSY) were born in the years 1957-1964. These women started their careers in the early 1980s, at a time when most professions were open to women but serious wage discrimination still existed. In many cases these women were also expected to put childcare responsibilities ahead of career development. Today, as these women near retirement, the discrimination they faced over the decades probably means that they are still making less money than they should be given their knowledge and experience. They suffered a lifetime of challenges that helped open opportunities for the women of the future.

The daughters of these women were mainly born in the 1980s. They entered the labor market in the 2000s. These women are experiencing much less wage discrimination than their mothers did. As a result, we might hypothesize that the daughters of the original NLSY women would earn higher wages than their mothers did before them.

A database comparing NLSY daughters' wages to their mothers' wages is illustrated in Figure 6-1. Daughters' wages are taken from the 2008 NLSY and pertain to wages earned in the calendar year 2007. Only daughters who were approximately 23-30 years old at this time are included. Mothers' wages are taken from the 1988 NLSY and pertain to calendar year 1987, when they were also 23-30 years old. The mothers' wages have been adjusted for inflation.
Only daughters and mothers who reported having wage income are included in the database (those with no income or missing income are excluded). The first 30 rows of the database are shown in Figure 6-1, but the full database contains a total of 642 daughter-mother pairs. The database includes 7 columns. The first two are metadata items: the daughters' and mothers' case identification numbers. The five variables included in the database are:

B.YEAR -- the year of the daughter's birth
D.WAGE -- the daughter's wage income in 2007
M.WAGE -- the mother's wage income in 1987
M.ADJ -- the mother's wage income adjusted for inflation to 2007 dollars
DIFF -- the difference between each daughter's wage income and her mother's wage income, or how much more the daughter makes than the mother did

Figure 6-1. Daughters' wages from work in 2007 and their mothers' wages from work in 1987 for Americans aged 24-31 at each time period (NLSY data)

Descriptive statistics for the daughters' wages and their mothers' wages (adjusted for inflation) are reported in Figure 6-2. The observed mean of the daughters' wages ($23,881) is much higher than the observed mean of their mothers' wages ($17,181). This suggests that there has been real generational change in women's employment opportunities.

Figure 6-2. Comparison of daughters' wages with their mothers' wages from work (NLSY data)

Inferential statistics can be used to make conclusions about the intergenerational daughter-mother difference in wages with a high degree of confidence. The observed mean of the daughter-mother difference in wages is $6700. The standard error of this mean difference in wages is just $792. This implies that the true mean difference in wages is somewhere in the vicinity of $5900 to $7500. It may fall somewhere outside this range, but it is extremely unlikely given the observed mean and standard error that the true mean of the difference in wages could be $0. We can confidently conclude that the employed NLSY daughters make more money than their mothers did twenty years earlier.

At first glance, this is a surprising result. The standard deviation of the difference in wages is very large. The observed mean difference in wages is $6700, but the standard deviation is nearly three times as big: $20,071. There are many reasons for the massive amount of error in the difference in wages. First, incomes are notoriously difficult to measure. People often don't know their incomes in detail, and even more often lie about their incomes. Second, there may be sampling error in the data, since only 642 daughter-mother pairs are being used to represent the entire female population of the United States. Third and most important, people's incomes vary widely. Differences in education levels, career choice, ability, personality, and personal connections all create case-specific error in incomes.

The reason we can be so confident about the size of the mean of the difference in wages despite all this error is that the mean difference is calculated based on a large number of cases (N = 642). The use of a large sample drives down the standard error of the mean. In effect, all those sources of error tend to cancel each other out. The result is that the observed mean is probably pretty close to the true mean. How close?
The standard error is a useful guide to the likely value of the true mean, but what we really need is a way to determine the actual probabilities of different values of the true mean. In regression models, we'd like to know the actual probabilities of different values of the slope. In order to judge the importance or significance of the results of a statistical model, we need more detailed information about the distributions of true means and slopes.
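As a rough check on the figures above, the following Python sketch recomputes the standard error of the mean difference in wages from its standard deviation and the number of cases. It assumes the usual formula for the standard error of a mean (the standard deviation divided by the square root of N), which is not restated in this passage; the variable names are illustrative only.

```python
import math

# Figures quoted in the text for the daughter-mother wage difference (DIFF).
sd_diff = 20071      # standard deviation of the difference in wages, in dollars
n_pairs = 642        # number of daughter-mother pairs

# Assumed formula: standard error of a mean = standard deviation / sqrt(N).
standard_error = sd_diff / math.sqrt(n_pairs)
print(f"Standard error with N = {n_pairs}: ${standard_error:,.0f}")

# The same standard deviation with smaller (hypothetical) samples gives a
# much larger standard error, which is why a large N matters so much.
for hypothetical_n in (50, 200):
    se = sd_diff / math.sqrt(hypothetical_n)
    print(f"Standard error with N = {hypothetical_n}: ${se:,.0f}")
```

Under this assumption the result for N = 642 rounds to the $792 reported in the text, even though the standard deviation itself is more than $20,000.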

This chapter shows how formal inferences can be made about the likely ranges of the true values of the parameters of statistical models. First, observed parameters and their standard errors can be combined to calculate a new measure called the "t" statistic (Section 6.1). The t statistic is a measure of how big an observed parameter is relative to its standard error. Second, t statistics can be used to determine whether or not a true mean is significantly different from zero (Section 6.2). This is especially important in the study of paired samples, like daughters and their mothers. Third, t statistics can also be applied to regression slopes (Section 6.3). The t statistic of a regression slope can be used to infer whether or not a regression model adds any explanatory power compared to a simple mean model. An optional section (Section 6.4) demonstrates how the t statistic can be used to make inferences about how true means differ from specific target levels. Finally, this chapter ends with an applied case study of the relationship between poverty and crime for a large selection of US counties (Section 6.5). This case study illustrates how t statistics can be used to make inferences about means and regression slopes. All of this chapter's key concepts are used in this case study. By the end of this chapter, you should be able to make formal inferences about the statistical and substantive significance of the parameters of statistical models.

6.1. The t statistic

Standard errors are a lot like standard deviations. Every variable has a mean and a standard deviation. Standard deviations are meaningful because in the majority of cases the value of a variable falls within one standard deviation either direction of its mean. In nearly all cases the value of a variable falls within two standard deviations either direction of its mean. Any case that falls more than two standard deviations away from the mean is exceptional and likely to be an outlier. On the other hand, parameters like means, slopes, and intercepts have standard errors. Most of the time the true parameter of a model falls within one standard error of the observed parameter. The true parameter is almost always within two standard errors of the observed parameter. Standard errors, however, also have one more very important property. It is possible to prove mathematically that true parameters differ from observed parameters with an exact probability that can be calculated from the standard error of the parameter.

The main difference between a standard deviation and a standard error is that a standard deviation describes a collection of data points (like a variable) while a standard error describes an observed parameter (like a mean, slope, or intercept). Standard errors represent the amount of error in the observed parameters. Standard deviations are used to calculate standard errors, but standard errors are much smaller than standard deviations. The key differences between standard deviations and standard errors and how they are used are summarized in Figure 6-3.

Figure 6-3. Differences between standard deviations and standard errors

In inferential statistics standard errors are used to make inferences about the true values of parameters. For example, we often want to know the value of a true mean. We know the values of observed means of variables, and we know that the true means of variables are probably close to their observed means, but we never know for certain exactly what the true means are. The standard error of the observed mean of a variable helps us make inferences about the probable value of its true mean. In Figure 6-2, it's very important (from a social policy perspective) to know whether or not the true mean of the daughter-mother difference in wages might be zero. If the true mean were zero, that would mean that women were making no progress at all in fighting discrimination in the workplace.

The observed mean difference in wages is $6700 with a standard error of $792. Another way of saying this is that the observed mean is 8.46 standard errors away from 0. If you started at $6700 and subtracted one standard error, you'd get $5908. If you went down a second standard error, you'd get to $5116. If you went down a third standard error, you'd get to $4324. You'd have to go down 8.46 standard errors to get to $0. If the true mean difference in wages really were $0, the observed mean would be off by 8.46 standard errors.

Statisticians call this ratio of an observed parameter to its standard error the "t" statistic. t statistics are measures based on observed parameters that are used to make specific inferences about the probabilities of true parameters. The label "t" doesn't actually stand for anything. By an accident of history, the person who first calculated the t statistic in 1908 had used "s" to represent the standard deviation and "t" just happened to be the next letter of the alphabet. The t statistic is a measure of how big a parameter is relative to its standard error.
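The arithmetic behind this ratio can be reproduced in a few lines of Python. The sketch below uses only the observed mean and standard error quoted in the text; the variable names are illustrative and not part of any statistical package.

```python
# Observed mean difference in wages and its standard error (from the text).
observed_mean = 6700
standard_error = 792

# Stepping down from the observed mean one standard error at a time.
steps_down = [observed_mean - k * standard_error for k in range(1, 4)]
print(steps_down)

# The t statistic is the observed parameter divided by its standard error,
# i.e. the number of standard errors between the observed mean and 0.
t_statistic = observed_mean / standard_error
print(round(t_statistic, 2))
```

The first line of output lists the values reached after one, two, and three standard errors ($5908, $5116, and $4324), and the second gives the t statistic of 8.46.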
The t statistic of a parameter can be calculated very easily by dividing a parameter by its standard error, but it's usually unnecessary to do the calculation yourself. Statistical software programs routinely report t statistics alongside parameters and their standard errors. The t statistic measures the size of a parameter. When an observed parameter has a large t statistic, we can infer that the true parameter is significantly different from 0. For example, a t statistic of 10 means that a parameter is 10 times as large as its standard error and is thus 10 standard errors away from 0. This is a large and significant difference. Statistical significance is when a statistical result is so large that it is unlikely to have occurred just by chance. An observed parameter that has a t statistic larger than 2 is usually statistically significantly different from 0. In the NLSY women's wages example (Figure 6-2), the t statistic for the observed mean of the daughter-mother difference in wages is 8.46. This is much bigger than 2. We can infer that the true mean daughter-mother difference in wages is very significantly different from $0.

Statistical computer programs can tell us exactly how statistically significant this result is. Statistical software can be used to calculate the exact probability of finding a t statistic of any given size. This probability is based on the number of degrees of freedom in a model. For example, the mean model for the daughter-mother wage gap in Figure 6-2 is based on 642 cases. Since the mean model uses 1 parameter, the model has 642 - 1 = 641 degrees of freedom. A computer program can tell us that when the t statistic of an observed mean is 8.46 with 641 degrees of freedom, the probability that the true mean could be something like $0 is 0.000000000000018%. In other words, the true mean is certainly not $0.
Technically, what the probability of the t statistic tells us is that "the probability that the true mean could be 8.46 or more standard errors in either direction away from the observed mean of $6700 is 0.000000000000018%." So technically, this probability is the probability that the true mean could be $0, less than $0, $13400, or greater than $13400. The practice in some disciplines (like psychology) is to break out these probabilities into different "tests" and perform "1-tailed" and "2-tailed" analyses. In the social sciences, the usual practice is much more straightforward. The probability of the t statistic is simply considered to be the probability that the parameter is significantly different from 0. Figure 6-4 gives some idea of how big a t statistic has to be to be considered statistically significant. Figure 6-4 reports the probability associated with a t statistic with 641 degrees of freedom (as in the daughter-mother wage example). The actually observed t statistic of 8.46 is off the chart. Social scientists usually consider a t statistic to be statistically significant if it is associated with a probability of 5% or less. So when an observed mean is so large (relative to its standard error) that the probability that the true mean could be 0 is under 5%, we proclaim the true mean to be statistically significantly different from 0.

Figure 6-4. Probabilities of finding a t statistic with 641 degrees of freedom
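Probabilities like those charted in Figure 6-4 can be approximated without specialized software. The sketch below is one possible approach, not the method used to produce the figure: it integrates the density of Student's t distribution numerically (Simpson's rule) to get the probability in both tails. The function names `t_density` and `two_tailed_probability` are made up for this illustration.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_probability(t, df, steps=20000):
    """Probability of a t statistic at least as far from 0 as t (both tails).

    Uses Simpson's rule on the density between 0 and |t|; steps must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_zero_to_t = total * h / 3
    return max(0.0, 1 - 2 * area_zero_to_t)

# Probabilities for a few t statistics with 641 degrees of freedom.
for t in (1.0, 1.96, 2.0, 3.0):
    print(t, round(two_tailed_probability(t, 641), 4))
```

With 641 degrees of freedom, a t statistic of about 1.96 is associated with a probability very close to 5%. For a t statistic as large as 8.46 the probability is far too small for this simple numerical method to resolve precisely; it can only confirm that the value is effectively 0.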

For a mean model with 641 degrees of freedom, any t statistic greater than about 1.96 indicates that the true mean is significantly different from 0. As with other statistics, there's usually no need to calculate any of this. Statistical software programs can provide all the necessary information. For example, statistical software output reporting the results of a regression of smoking rates on temperature in Canadian provinces (Figure 4-9) would usually look something like the table in Figure 6-5. While it is helpful to be able to understand the standard errors and t statistics, all you really need from the table is the probability. In the example of the Canadian provincial smoking rates, both the intercept (37.041) and the slope (-.443) for the effect of temperature on smoking are statistically significant.

Figure 6-5. Software output reporting the regression of smoking rates on average temperatures across the 13 Canadian provinces and territories, 2008 (after Figure 4-9)

6.2. Inferences using the mean model

In the mean model, the t statistic is used to make inferences about the level of the true mean, but we're usually not very interested in true means. Usually the observed means are good enough. For example, the observed means for daughters' and mothers' wages for the NLSY sample (Figure 6-2) are $23881 (daughters) and $17181 (mothers). The t statistic for daughters' wages is 36.1 and the t statistic for mothers' wages is 33.2 (both with 641 degrees of freedom). The very large t statistics tell us that the true means of both daughters' and mothers' wages are significantly different from $0. That's correct, but not very interesting. Of course their true wages are different from $0. Why would they work if their employers paid them $0?

The t statistic is only really useful in mean models when there's some reason to demonstrate that the true mean couldn't be 0. This usually happens when pairs of linked cases are being compared, like when daughters' incomes are compared to their mothers' incomes. Paired samples are databases in which each case represents two linked observations. With paired samples, often what we want to know is whether or not the true mean of a variable has changed significantly from one time period to another. We know that the observed means have changed. If we survey 642 women in 1988, then survey their daughters twenty years later in 2008, it's almost impossible for the two means to be exactly the same. Just by chance one or the other will be higher. What we want to know is whether or not the mean twenty years later is significantly higher. With paired sample data, mean models are often used to make inferences about the true mean change over time. As with daughters' and mothers' incomes, there are often important social policy reasons for knowing whether or not change has occurred.
For example, many people believe that schools today suffer because they spend too much time and money on social support services for children and families and not enough on education. There is a general impression in society today that schools now spend much more of their limited resources on student services instead of direct classroom instruction. Have schools really moved from a focus on education to a focus on social work? This is an ideal question for a mean model based on paired sample data. Figure 6-6 contains a database of school spending figures for the 50 US states plus the District of Columbia for 1988 and 2008. In addition to the usual metadata items there are six variables:

EXPENDxxxx -- Total school expenditures in 1988 and 2008
SUPPORTxxxx -- Student support services expenditures in 1988 and 2008
Sup%xxxx -- Student support as a proportion of total expenditures in 1988 and 2008

Student support services expenditures include spending on school nursing, school psychology, counseling, and social work services. The 1988 figures have been adjusted for inflation to 2008 dollars. Note that all expenditures have risen in part because all state populations have increased since 1988 and in part because all states spend more on education per student than they did in 1988.

Figure 6-6. Database of US state educational and student support expenditures, 1988 and 2008 (NCES data)

Descriptive and inferential statistics for the student support spending as a percentage of total educational spending are summarized in Figure 6-7. Descriptive statistics are reported for student support spending in 1988 and 2008 and for the change in student support spending. Inferential statistics are only reported for the change in student support spending over time. Inferential statistics about the level of student support spending in 1988 and 2008 would be meaningless, since the true means for both variables are clearly far greater than 0%.

Figure 6-7. Mean model of the change in student support spending as a percentage of total educational spending, 1988 - 2008 (NCES data)

The observed mean change in student support spending was 1.36%. Observed state spending on student support services has in fact gone up. Does this mean that states are truly focusing more on student support services than they did in the past? Or is this rise more likely just random variation from a true mean of 0% (indicating no change)? The probability of the t statistic can be used to make inferences about the true mean change over time. The t statistic for the change in student support spending is 2.21 with 50 degrees of freedom (there are 51 cases, so the degrees of freedom are 51 - 1 = 50). The probability associated with this t statistic is 0.039, or 3.9%. In other words, there is only a 3.9% chance that the true mean change in student support spending between 1988 and 2008 was 0%. Since the chance that there was no change in the true mean is less than 5%, we can infer that the true mean level of student support spending has changed between 1988 and 2008. States spent a significantly higher proportion of their budgets on student support services in 2008 than they did in 1988.

The mean proportion of education spending that goes to student support services has clearly increased since 1988. The observed mean increase was 1.36% and the t statistic confirms that this increase was statistically significant. Does this mean that it is important? After all, the mean level of student support spending only increased from 34.04% to 35.40% over a twenty-year period. In 23 states it actually declined. These seem like very weak results on which to base social policy. The increase is statistically significant, but it does not seem large enough to be meaningful from a policy standpoint. Substantive significance is when a statistical result is large enough to be meaningful in the view of the researcher and society at large.
The increase in student support spending since 1988 is statistically significant, but it's probably not large enough to be considered substantively significant.
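The steps in a paired-sample mean model can be sketched in Python as follows. The spending shares below are invented for six imaginary states purely for illustration; they are not the NCES figures. The sketch also assumes the sample standard deviation (dividing by N - 1) and the usual standard error formula, neither of which is spelled out in this section.

```python
import math
import statistics

# HYPOTHETICAL data: student support as a percentage of total spending
# for six imaginary states (NOT the NCES data used in Figure 6-7).
share_1988 = [30.0, 35.5, 33.0, 36.2, 31.8, 34.5]
share_2008 = [31.5, 35.0, 35.1, 37.0, 33.9, 34.2]

# Step 1: compute the change for each linked pair of observations.
changes = [later - earlier for earlier, later in zip(share_1988, share_2008)]

# Step 2: fit a mean model to the changes.
mean_change = statistics.mean(changes)
sd_change = statistics.stdev(changes)                 # sample SD (assumed)
se_change = sd_change / math.sqrt(len(changes))       # standard error of the mean

# Step 3: t statistic and degrees of freedom (cases minus 1 parameter).
t_statistic = mean_change / se_change
degrees_of_freedom = len(changes) - 1

print(f"mean change = {mean_change:.2f}, standard error = {se_change:.2f}")
print(f"t = {t_statistic:.2f} with {degrees_of_freedom} degrees of freedom")
```

Note that some of the hypothetical states show a decline, just as 23 real states did. The mean model only asks whether the mean of all the changes is far enough from 0, relative to its standard error, to be statistically significant.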

6.3. Inferences about regression slopes

In regression models, the t statistic is used to make inferences about the true slope. The t statistic can also be used to make inferences about regression intercepts, but this is rarely done in practice. While inferences about true means are only made in special situations (like with paired samples), inferences about true slopes are made all the time. It's so common to use t statistics with regression slopes that statistical software programs usually print out t statistics and their associated probabilities by default whenever you use them to estimate a regression model. In writing up statistical results in the social sciences, almost any regression slope is accompanied by a note reporting its statistical significance.

Returning to the example of official development assistance (ODA) spending as a proportion of national income for 20 rich countries, a mean model for ODA spending is depicted on the left side of Figure 6-8. The observed mean level of ODA spending for the 20 rich countries is 0.52% (as noted on the chart), but individual countries deviate widely from this mean. Sweden (SWE) gives almost twice the mean level (1.01% of national income) while the United States (USA) gives much less than half the mean level (0.19% of national income). What makes some rich countries more generous than other countries in helping poor countries with ODA?

Figure 6-8. Comparison of mean and regression (versus income) models for ODA spending for 20 rich countries, 2008 (OECD data from Figure 4-1)

One theory to account for some of the case-specific error in the mean model might be that richer countries have more money to spend and so can afford to be more generous. There is wide variability among rich countries in just how rich they are. The poorest rich country is New Zealand (NZL) with national income of $27,940 per person, while the richest is Norway (NOR) with national income of $87,070 per person. The theory would suggest that Norway can afford to give much more in aid than New Zealand, and it does. Generalizing from these two cases, we might hypothesize that country ODA spending levels rise with national income levels. The right side of Figure 6-8 shows how ODA spending is related to national income levels. The hypothesis seems to be correct: a large part of the case-specific error in ODA spending can be attributed to differences in national income.

From a descriptive standpoint, differences in national income account for a "large" amount of the case-specific error in ODA spending, but is the effect of national income statistically significant? Model 1 in Figure 6-9 reports the results of a regression of ODA spending on national income in thousands of dollars. The data for this and the other models reported in Figure 6-9 are taken from the database depicted in Figure 4-1. The slope for national income is 0.013, indicating that every $1000 increase in national income is associated with an expected increase of 0.013% in ODA spending. Based on the data we have, the probability that the true slope for national income might really be 0 is tiny. We can infer that the true slope for national income is almost certainly not 0. National income has a highly significant impact on ODA spending.

Figure 6-9. Results for the regression of ODA spending on selected national indicators for 20 rich countries, 2008 (OECD data from Figure 4-1)

What other factors might explain some of the case-specific error in ODA spending levels? The results of two more regression models are reported in Figure 6-9. In Model 2, ODA spending is regressed on European status (as in Figure 4-3). The observed slope is 0.328. This represents the observed difference in mean ODA spending between non-European and European countries. European countries tend to give 0.328% more in ODA than non-European countries. The probability of the t statistic associated with this slope is 0.013 (or 1.3%), indicating that there is only a tiny chance that the true slope for European status is 0. Based on the small probability that the true slope is 0, we can infer that the true slope is not 0. We can infer that European countries are significantly more generous than non-European countries.

Model 3 illustrates a non-significant slope. In Model 3, ODA spending is regressed on administrative efficiency (administrative costs as a percentage of total official development assistance). We might hypothesize that part of a country's case-specific deviation from the mean level of ODA spending could be due to high administrative costs. If administrative costs are high, ODA spending would be high, because the total level of spending equals a country's true "generosity" in giving aid plus its costs in administering its aid budget. The observed regression slope of -.003 indicates that this is not in fact what happens. High administrative costs are actually associated with less ODA spending, not more. The observed effect of administrative costs is very small, but it is definitely negative, not positive. The observed effect of administrative costs on ODA spending may be negative, but the effect is not significantly different from 0. The probability of .927 for the t statistic for the slope of administrative costs indicates that there is a 92.7% chance that the true slope could be 0 (or as far away from the observed slope as 0 is).
We would infer from this that administrative costs have no significant effect on ODA spending. Another way of thinking about a non-significant slope is depicted in Figure 6-10. Figure 6-10 contrasts the mean model for ODA spending (left side) with a regression model for ODA spending based on administrative costs (right side). Though the regression line does slope slightly downward, the scatter plot on the right side of the chart doesn't particularly move with the line. This is very different from the right side of Figure 6-8, where the scatter plot tracks the line much more closely. Figure 6-10 illustrates a situation in which only a small part of the case-specific error in the mean model for ODA spending is explained by the regression model. When a regression model explains very little of the case-specific error in the dependent variable, the slope tends to be small and not statistically significant.
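For readers curious about what statistical software does behind the scenes, the sketch below computes a regression slope, its standard error, and its t statistic from scratch. It assumes the standard least-squares formulas for a one-variable regression and n - 2 degrees of freedom (cases minus two parameters: intercept and slope). The function name `slope_t_statistic` and the seven data points are hypothetical; they are not the OECD data behind Figure 6-9.

```python
import math

def slope_t_statistic(x, y):
    """Return (slope, standard error, t statistic, degrees of freedom)
    for a least-squares regression of y on a single variable x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared regression errors (what the line fails to explain).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2                      # two parameters: intercept and slope
    se_slope = math.sqrt(sse / df / sxx)
    return slope, se_slope, slope / se_slope, df

# HYPOTHETICAL data: national income (thousands of dollars per person)
# and ODA spending (% of national income) for seven imaginary countries.
income = [28, 35, 40, 45, 50, 60, 87]
oda = [0.30, 0.35, 0.45, 0.40, 0.60, 0.70, 0.95]

slope, se_slope, t_stat, df = slope_t_statistic(income, oda)
print(f"slope = {slope:.4f}, standard error = {se_slope:.4f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

When the points track the line closely, as in these invented data, the regression error is small, the standard error of the slope is small, and the t statistic is large. When the scatter barely moves with the line, as in Figure 6-10, the same calculation yields a t statistic close to 0.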

Figure 6-10. Comparison of mean and regression (versus administrative costs) models for ODA spending for 20 rich countries, 2008 (OECD data from Figure 4-1)

6.4. One-sample t statistics (optional/advanced)

The t statistic associated with a parameter is usually used to make inferences about whether or not a true parameter is significantly different from 0. By default, this is how all statistical software programs are programmed to use the t statistic. Nonetheless, sometimes social scientists want to make other inferences about true parameters. It takes some extra work, but it is possible to use t statistics to evaluate whether or not true parameters are significantly different from any number, not just 0. This is most commonly done in mean models. In regression models, we almost always want to know whether or not the line truly slopes, and nothing else. In mean models, on the other hand, we often want to know whether or not the true mean hits some target or threshold.

This can be illustrated once again using the data on ODA spending levels. The observed mean level of ODA spending across the 20 rich countries listed in Figure 4-1 is 0.52%. As discussed in Chapter 5, these countries agreed a target level for ODA spending of 0.70% of national income. The standard error of ODA spending was used in Chapter 5 to argue that it is "very unlikely" that the true mean level of ODA spending met the target of 0.70%. Just how unlikely is it? One way to answer this question would be to construct an artificial paired sample database. Each country's actual level of ODA spending could be paired with the target level of 0.70% and the difference calculated. This is done in Figure 6-11. The t statistic for the mean difference between actual and target ODA spending is 3.03 with 19 degrees of freedom. The probability associated with this t statistic is just 0.007 (0.7%). In other words, there is only a 0.7% chance that countries are truly meeting their target of spending 0.7% of national income on ODA (the fact that both figures are 0.7 is just a coincidence). The true mean spending level falls significantly short of the 0.7% target.

Figure 6-11. Artificial paired sample comparing ODA spending for 20 rich countries to the 0.70% target, 2008 (OECD data from Figure 4-1)

Another, more direct way to answer the question would be to compare the observed mean to 0.70%. The observed mean level of ODA spending is 0.518%. This is 0.182% short of the target level. The standard error of this observed mean is 0.060%. As discussed in Chapter 5, the observed mean is 3 standard errors lower than 0.70 (it's actually 3.03 standard errors, as shown in Figure 6-11). Since the observed mean is 3.03 standard errors away from the target, the t statistic for the difference between the observed mean and the target is 3.03. A t statistic of 3.03 with 19 degrees of freedom has a probability of 0.007. This is the same result as found using the paired sample design. The true mean level of ODA spending is significantly less than 0.7%. This use of the t statistic is called a "one-sample" t statistic. It works because the difference between a mean and its target (used in the one-sample scenario) is the same as the mean of the differences between cases and their targets (used in the paired sample scenario). The one-sample t statistic can be used to evaluate the gap between an observed mean and any arbitrary target level of the true mean. In principle, the same logic can also be used to evaluate the gap between observed regression slopes and intercepts and arbitrary targets, but this rarely occurs in practice. A common use of the one-sample t statistic is to make inferences about sampling error. For example, according to the 2000 US Census the observed mean size of a US household was 2.668 persons. This figure is based on an actual count of the entire population, so while it may have measurement error and case-specific error, it has no sampling error. The observed mean size of a US household in the 2000 Current Population Survey (CPS) was 2.572 persons. The CPS is a sample survey of the US population that uses the same measures as the Census itself. 
The measurement error of the CPS should be identical to the measurement error of the Census, and the case-specific error of the CPS should be equivalent to the case-specific error of the Census. That means that the only difference between the CPS observed household size and the Census observed household size should be sampling error in the CPS. The difference between the Census observed mean of 2.668 and the CPS observed mean of 2.572 is 0.096 persons. The standard error of the CPS mean is 0.0057, giving a t statistic of 16.79. The CPS mean is based on 64,944 households. A t statistic of 16.79 with 64,943 degrees of freedom has a tiny probability that is very close to 0. From this we can infer that the CPS mean is significantly different from the Census mean. In other words, the level of sampling error in the CPS is statistically significant. On the other hand, the mean difference of 0.096 persons indicates that it is probably not substantively significant.
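The one-sample calculation for the ODA target can be checked with a short Python sketch. It uses the observed mean, target, and standard error quoted in this section, and approximates the probability by numerically integrating the density of Student's t distribution (Simpson's rule). This is only one way to obtain the probability, not necessarily the method used by any particular statistical package, and the function names are made up for this illustration.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_probability(t, df, steps=20000):
    """Probability of a t statistic at least as far from 0 as t (both tails),
    using Simpson's rule between 0 and |t|; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)

# Figures from the text: ODA spending as a percentage of national income.
observed_mean = 0.518
target = 0.70
standard_error = 0.060
n_countries = 20

# One-sample t statistic: the gap between the observed mean and the target,
# measured in standard errors (negative because the mean is below target).
t_statistic = (observed_mean - target) / standard_error
degrees_of_freedom = n_countries - 1
probability = two_tailed_probability(t_statistic, degrees_of_freedom)

print(f"t = {t_statistic:.2f} with {degrees_of_freedom} degrees of freedom")
print(f"probability = {probability:.3f}")
```

The size of the t statistic (3.03) and the probability (about 0.007) agree with the values reported for both the paired-sample and the one-sample approaches.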

6.5. Case study: Poverty and crime

Poverty is strongly associated with crime. Poverty is usually measured as a lack of money income. People who live in households with less than a certain income threshold are considered to live in poverty. The exact income threshold depends on the country, household size and composition, and sometimes on the area within the country. In the United States, the overall national poverty rate has been stable at around 12.5% of the population for the past forty years.

Since the whole concept of poverty is so closely tied to the idea of having too little money, one might expect poverty to be more closely related to property crime than to violent crime. If poverty is fundamentally a lack of money, then people in poverty might be expected to commit property crimes in order to gain more money. This is a materialist theory of poverty.

On the other hand, it is possible that poverty is not fundamentally an economic phenomenon. It might be argued that poverty really means much more than just a lack of income. In this theory of poverty, living in poverty means living a life that is lacking in the basic human dignity that comes from having a good job, a decent education, and a safe home. If poverty has more to do with personal dignity than with income, poverty might be more closely related to violent crime than to property crime, as people lash out violently at loved ones and others around them in response to their own lack of self-respect. This is a psychosocial theory of poverty.

Which is correct, the materialist theory of poverty or the psychosocial theory of poverty? In terms of specific hypotheses, is poverty more closely related to property crime or to violent crime? In the United States, poverty rates are available from the US Census Bureau for nearly every county, while crime data are available from the Federal Bureau of Investigation for most (but not all) counties (data are missing for many rural counties).
All in all, both poverty and crime statistics are available for 2209 out of the 3140 US counties for 2008.

In these 2209 US counties, property crime rates are much higher than violent crime rates. Considering the two crime rates for US counties to be a paired sample, the observed mean violent crime rate of 85.9 crimes per 100,000 population is much lower than the observed mean property crime rate of 636.7 crimes per 100,000 population. The mean difference in the two crime rates is 550.8 with a standard error of 18.52. The t statistic associated with this mean difference is t = 13.937 with 2208 degrees of freedom, giving a probability of .000 that the true levels of property and violent crimes are equal across the 2209 counties. Property crimes are indeed significantly more common than violent crimes. The difference is both statistically significant (probability = 0%) and substantively significant (the mean property crime rate is 7.4 times the mean violent crime rate).

The results of regressing both the property and violent crime rates on the poverty rates of the 2209 US counties are reported in Figure 6-12. Both slopes are positive and statistically highly significant. Every 1% rise in the poverty rate is associated with an expected increase of 13.0 per 100,000 in the property crime rate and an expected increase of 4.5 per 100,000 in the violent crime rate. Clearly, crime rates rise with poverty. The relative slopes seem to imply that poverty is more important for property crime than for violent crime, but the two slopes aren't really comparable. Since property crime is so much more common than violent crime, an extra point of poverty would be expected to cause more of a change in property crime rates than in violent crime rates.

Figure 6-12. Regressions of property crime (Model 1) and violent crime (Model 2) on US county poverty rates, 2008

The two t statistics, on the other hand, are comparable. The t statistics represent the statistical significance of each relationship. Put differently, the t statistics are related to how much of the county-specific deviation from the mean crime rate is captured by the regression models for property crime and for violent crime. The t statistic for violent crime is about 2.5 times as large as the t statistic for property crime. This implies that poverty explains much more of the variability in violent crime rates than the variability in property crime rates. In other words, poverty is more important for understanding violent crime than for understanding property crime. This tends to lend more support to the psychosocial theory of poverty than to the materialist theory of poverty. Poverty is not just a matter of money. It is also -- or maybe even more so -- a matter of dignity.
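One way to see why slopes depend on how common a crime is, while t statistics do not, is to regress two versions of the same outcome on the same predictor. The sketch below uses a handful of made-up numbers, not the county data, and the names `slope_and_t`, `poverty`, `rare_crime`, and `common_crime` are hypothetical. The second outcome is simply the first multiplied by 10, so it follows exactly the same pattern at a higher overall level.

```python
import math

def slope_and_t(x, y):
    """Return the least-squares slope of y on x and its t statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (slope and intercept)
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    standard_error = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / standard_error

# Hypothetical illustration only: five made-up cases
poverty = [1, 2, 3, 4, 5]
rare_crime = [2, 4, 5, 4, 5]
common_crime = [10 * rate for rate in rare_crime]  # same pattern, 10 times the level

slope_rare, t_rare = slope_and_t(poverty, rare_crime)
slope_common, t_common = slope_and_t(poverty, common_crime)

print(f"rare crime:   slope = {slope_rare:.2f}, t = {t_rare:.3f}")
print(f"common crime: slope = {slope_common:.2f}, t = {t_common:.3f}")
```

In this sketch the slope for the more common outcome is 10 times larger, but the two t statistics are identical, because the standard error grows by the same factor as the slope. This illustrates only the scale issue; it says nothing about the actual county results in Figure 6-12.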

Chapter 6 Key Terms

Paired samples are databases in which each case represents two linked observations.
Statistical significance is when a statistical result is so large that it is unlikely to have occurred just by chance.
Substantive significance is when a statistical result is large enough to be meaningful in the view of the researcher and society at large.
t statistics are measures based on observed parameters that are used to make specific inferences about the probabilities of true parameters.
