Social Statistics: Key Terms

This is a list of the key terms from each chapter, collected here for the sake of convenience.

Chapter 1
  • Conceptualization is the process of developing a theory about some aspect of the social world.
  • Cases are the individuals or entities about which data have been collected.
  • Databases are arrangements of data into variables and cases.
  • Dependent variables are variables that are thought to depend on other variables in a model.
  • Generalization is the act of turning theories about specific situations into theories that apply to many situations.
  • Independent variables are variables that are thought to cause the dependent variables in a model.
  • Metadata are additional attributes of cases that are not meant to be included in analyses.
  • Operationalization is the process of turning a social theory into specific hypotheses about real data.
  • Scatter plots are very simple statistical models that depict data on a graph.
  • Statistical models are mathematical simplifications of the real world.
  • Variables are analytically meaningful attributes of cases.
Chapter 2
  • Expected values are the values that a dependent variable would be expected to have based solely on values of the independent variable.
  • Linear regression models are statistical models in which expected values of the dependent variable are thought to rise or fall in a straight line according to values of the independent variable.
  • Outliers are data points in a statistical model that are far away from most of the other data points.
  • Regression error is the degree to which an expected value of a dependent variable in a linear regression model differs from its actual value.
  • Robustness is the extent to which statistical models give similar results despite changes in operationalization.
  • Slope is the change in the expected value of the dependent variable divided by the change in the value of the independent variable.
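For illustration, the slope, intercept, expected values, and regression errors of a simple linear regression can be computed by hand. This is a minimal sketch using invented data; the least-squares slope is the sum of cross-products of deviations from the means divided by the sum of squared deviations of the independent variable.

```python
# Minimal sketch of a simple linear regression on made-up data.
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return intercept, slope

x = [1, 2, 3, 4, 5]                                # independent variable
y = [2, 4, 5, 4, 6]                                # dependent variable
intercept, slope = fit_line(x, y)
expected = [intercept + slope * xi for xi in x]    # expected values
errors = [yi - ei for yi, ei in zip(y, expected)]  # regression errors
```

For a least-squares line the regression errors always sum to zero; an outlier would show up here as one error much larger in magnitude than the rest.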
Chapter 3
  • Extrapolation is the process of using a regression model to compute predicted values outside the range of the observed data.
  • Intercepts are the places where regression lines cross the dependent variable axis in a scatter plot.
  • Interpolation is the process of using a regression model to compute predicted values inside the range of the observed data.
  • Predicted values are expected values of a dependent variable that correspond to selected values of the independent variable.
  • Regression coefficients are the slopes and intercepts that define regression lines.
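A short sketch of the distinction, using invented regression coefficients: any chosen value of the independent variable can be plugged into the regression line to get a predicted value, and whether that value lies inside or outside the observed range determines whether the prediction is an interpolation or an extrapolation.

```python
# Sketch: predicted values from regression coefficients (made-up numbers).
intercept, slope = 1.8, 0.8       # illustrative regression coefficients
# Suppose the observed values of x ran from 1 to 5.

def predict(x_value):
    """Predicted value of the dependent variable at a chosen x."""
    return intercept + slope * x_value

inside = predict(3.5)    # interpolation: within the observed range of x
outside = predict(10.0)  # extrapolation: beyond the observed range of x
```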
Chapter 4
  • Conditional means are the expected values of dependent variables for specific groups of cases.
  • Degrees of freedom are the number of errors in a model that are actually free to vary.
  • Mean models are very simple statistical models in which a variable has just one expected value, its mean.
  • Means are the expected values of variables.
  • Parameters are the figures associated with statistical models, like means and regression coefficients.
  • Regression error standard deviation is a measure of the amount of spread in the error in a regression model.
  • Standard deviation is a measure of the amount of spread in a variable, which is the same thing as the amount of spread in the error in a mean model.
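A mean model can be sketched in a few lines with invented data: the variable's one expected value is its mean, and the standard deviation measures the spread of the errors around that mean. The divisor n − 1 reflects the degrees of freedom: once the mean is fixed, only n − 1 of the errors are free to vary.

```python
import statistics

# Sketch of a mean model on made-up data.
data = [4, 8, 6, 5, 7]
n = len(data)
mean = statistics.mean(data)          # the one expected value
errors = [v - mean for v in data]     # errors of the mean model
sd = (sum(e ** 2 for e in errors) / (n - 1)) ** 0.5  # standard deviation
```

The hand-computed `sd` matches `statistics.stdev(data)`, which uses the same n − 1 divisor.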
Chapter 5
  • Case-specific error is error resulting from any of the millions of influences and experiences that may cause a specific case to have a value that is different from its expected value.
  • Descriptive statistics is the use of statistics to describe the data we actually have in hand.
  • Inferential statistics is the use of statistics to make conclusions about characteristics of the real world underlying our data.
  • Measurement error is error resulting from accidents, mistakes, or misunderstandings in the measurement of a variable.
  • Observed parameters are the actually observed values of parameters like means, intercepts, and slopes based on the data we actually have in hand.
  • Sampling error is error resulting from the random chance of which research subjects are included in a sample.
  • Standard error is a measure of the amount of error associated with an observed parameter.
  • True parameters are the true values of parameters like means, intercepts, and slopes based on the real (but unobserved) characteristics of the world.
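The link between an observed parameter and its standard error can be sketched for the simplest case, an observed mean (the data here are invented): the standard error is the standard deviation divided by the square root of the sample size, so it shrinks as the sample grows.

```python
import statistics

# Sketch: standard error of an observed mean, on made-up data.
sample = [4, 8, 6, 5, 7]
n = len(sample)
observed_mean = statistics.mean(sample)           # observed parameter
standard_error = statistics.stdev(sample) / n ** 0.5
```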
Chapter 6
  • Paired samples are databases in which each case represents two linked observations.
  • Statistical significance is when a statistical result is so large that it is unlikely to have occurred just by chance.
  • Substantive significance is when a statistical result is large enough to be meaningful in the view of the researcher and society at large.
  • t statistics are measures based on observed parameters that are used to make specific inferences about the probabilities of true parameters.
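As an illustration (with invented numbers), a one-sample t statistic on a paired sample: each case is the difference between two linked observations, and the t statistic is the observed mean difference divided by its standard error. A large t suggests the result is statistically significant, though substantive significance is a separate judgment.

```python
import statistics

# Sketch: a t statistic on made-up paired data (after minus before).
differences = [1.2, 0.8, 1.5, 0.9, 1.1]   # one difference per case
n = len(differences)
mean_diff = statistics.mean(differences)             # observed parameter
se = statistics.stdev(differences) / n ** 0.5        # its standard error
t = mean_diff / se                                   # t statistic
```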
Chapter 7
  • Complementary controls are control variables that complement an independent variable of interest by unmasking its explanatory power in a multiple regression model.
  • Competing controls are control variables that compete with an independent variable of interest by splitting its explanatory power in a multiple regression model.
  • Control variables are variables that are "held constant" in a multiple regression analysis in order to highlight the effect of a particular independent variable of interest.
  • Multicausal models are statistical models that have one dependent variable but two or more independent variables.
  • Multiple linear regression models are statistical models in which expected values of the dependent variable are thought to rise or fall in straight lines according to values of two or more independent variables.
  • Predictors are the independent variables in regression models.
Chapter 8
  • Correlation (r) is a measure of the strength of the relationship between two variables that runs from r = −1 (perfect negative correlation) through r = 0 (no correlation) to r = +1 (perfect positive correlation).
  • R² is a measure of the proportion of the total variability in the dependent variable that is explained by a regression model.
  • Standardized coefficients are the coefficients of regression models that have been estimated using standardized variables.
  • Standardized variables are variables that have been transformed by subtracting the mean from every observed value and then dividing by the standard deviation.
  • Unstandardized coefficients are the coefficients of regression models that have been estimated using original unstandardized variables.
  • Unstandardized variables are variables that are expressed in their original units.
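Standardization and correlation can be sketched together on invented data: after transforming each variable into z-scores, the correlation r is the sum of the products of paired z-scores divided by n − 1. In a simple regression, the standardized slope coefficient equals this same r.

```python
import statistics

# Sketch: standardized variables and correlation r, on made-up data.
def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    m, s = statistics.mean(values), statistics.stdev(values)
    return [(v - m) / s for v in values]

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
zx, zy = standardize(x), standardize(y)
r = sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)
```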
Chapter 9
  • Base models are initial models that include all of the background independent variables in an analysis that are not of particular theoretical interest for a regression analysis.
  • Confounding variables are variables that might affect both the dependent variable and an independent variable of interest.
  • Explanatory models are regression models that are primarily intended to be used for evaluating different theories for explaining the differences between cases in their values of the dependent variable.
  • Parsimony is the virtue of using simple models that are easy to understand and interpret.
  • Predictive models are regression models that are primarily intended to be used for making predictions about dependent variables as outcomes.
  • Saturated models are final models that include all of the variables used in a series of models in an analysis.
Chapter 10
  • Analysis of variance (ANOVA) is a type of regression model that focuses on the proportion of the total variability in a dependent variable that is explained by a categorical variable.
  • ANOVA variables are the numerical variables in a regression model that together describe the effects of categorical group memberships.
  • Categorical variables are variables that divide cases into two or more groups.
  • Mixed models are regression models that include both ANOVA components and ordinary independent variables.
  • Numerical variables are variables that take numerical values representing a meaningful ordering of the cases from lowest to highest.
  • Reference groups are the groups that are set aside in ANOVA variables and not explicitly included as variables in ANOVA models.
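Building ANOVA variables from a categorical variable can be sketched directly (the group names below are invented): each group other than the reference group gets its own 0/1 variable, and cases in the reference group score 0 on every one of them.

```python
# Sketch: ANOVA (0/1) variables from a made-up categorical variable.
regions = ["north", "south", "west", "south", "north"]
groups = sorted(set(regions))          # ['north', 'south', 'west']
reference = groups[0]                  # 'north' set aside as reference group
anova_vars = {g: [1 if r == g else 0 for r in regions]
              for g in groups if g != reference}
```

The reference group has no variable of its own; its effect is absorbed into the model's intercept.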
Chapter 11
  • Interaction effects are the coefficients of the interaction variables in an interaction model.
  • Interaction models are regression models that allow the slopes of some variables to differ for different categorical groups.
  • Interaction variables are variables created by multiplying an ANOVA variable by an independent variable of interest.
  • Intercept effects are the coefficients of the ANOVA variables in an interaction model.
  • Main effects are the coefficients of the independent variable of interest in an interaction model for the reference group.
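Creating an interaction variable is a one-line operation, sketched here with invented data: multiply an ANOVA variable by the independent variable of interest. In an interaction model, the coefficient of this variable (the interaction effect) lets the slope of x differ for the dummy-coded group versus the reference group.

```python
# Sketch: an interaction variable from made-up data.
x = [10, 12, 15, 11, 14]        # independent variable of interest
female = [1, 0, 1, 0, 1]        # ANOVA variable (reference group: male)
female_x = [d * xi for d, xi in zip(female, x)]   # interaction variable
```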