# Social Statistics: Key Terms

This is a list of the key terms from each chapter, for the sake of convenience.

## Chapter 1
• Conceptualization is the process of developing a theory about some aspect of the social world.
• Cases are the individuals or entities about which data have been collected.
• Databases are arrangements of data into variables and cases.
• Dependent variables are variables that are thought to depend on other variables in a model.
• Generalization is the act of turning theories about specific situations into theories that apply to many situations.
• Independent variables are variables that are thought to cause the dependent variables in a model.
• Metadata are additional attributes of cases that are not meant to be included in analyses.
• Operationalization is the process of turning a social theory into specific hypotheses about real data.
• Scatter plots are very simple statistical models that depict data on a graph.
• Statistical models are mathematical simplifications of the real world.
• Variables are analytically meaningful attributes of cases.
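As a concrete sketch of several of these terms, consider a tiny hypothetical database (the data below are invented for illustration): each case is a row, each variable is an attribute of the cases, and an identifier column serves as metadata.

```python
# A tiny hypothetical database: each case is one row (here, a dict),
# and each variable is an analytically meaningful attribute of the cases.
# "id" is metadata: it identifies cases but is not used in analyses.
cases = [
    {"id": 1, "education_years": 12, "income": 30000},
    {"id": 2, "education_years": 16, "income": 52000},
    {"id": 3, "education_years": 14, "income": 41000},
]

# In a model of how education affects income, "education_years" would be
# the independent variable and "income" the dependent variable.
independent = [case["education_years"] for case in cases]
dependent = [case["income"] for case in cases]
```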
## Chapter 2
• Expected values are the values that a dependent variable would be expected to have based solely on values of the independent variable.
• Linear regression models are statistical models in which expected values of the dependent variable are thought to rise or fall in a straight line according to values of the independent variable.
• Outliers are data points in a statistical model that are far away from most of the other data points.
• Regression error is the degree to which an expected value of a dependent variable in a linear regression model differs from its actual value.
• Robustness is the extent to which statistical models give similar results despite changes in operationalization.
• Slope is the change in the expected value of the dependent variable divided by the change in the value of the independent variable.
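The Chapter 2 terms fit together in one calculation. A minimal sketch, using invented data: fit a line by least squares, compute the expected values it implies, and take the regression errors as actual minus expected values.

```python
# A minimal sketch of a linear regression model (hypothetical data).
# Expected values of the dependent variable y are thought to rise in a
# straight line with the independent variable x.
x = [0, 1, 2, 3]
y = [1, 3, 4, 8]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n

# Slope: change in the expected value of y per one-unit change in x.
slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / \
        sum((xi - x_mean) ** 2 for xi in x)
intercept = y_mean - slope * x_mean

# Expected values, and regression errors (actual minus expected).
expected = [intercept + slope * xi for xi in x]
errors = [yi - ei for yi, ei in zip(y, expected)]
```

A data point whose error is far larger than the rest would be an outlier; refitting the model with and without it is one simple check of robustness.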
## Chapter 3
• Extrapolation is the process of using a regression model to compute predicted values outside the range of the observed data.
• Intercepts are the places where regression lines cross the dependent variable axis in a scatter plot.
• Interpolation is the process of using a regression model to compute predicted values inside the range of the observed data.
• Predicted values are expected values of a dependent variable that correspond to selected values of the independent variable.
• Regression coefficients are the slopes and intercepts that define regression lines.
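A brief sketch of the difference between interpolation and extrapolation, assuming hypothetical regression coefficients (intercept 2.0, slope 0.5) estimated from data observed between x = 10 and x = 40:

```python
# Hypothetical regression coefficients from a fitted line, where the
# independent variable was observed only between 10 and 40.
intercept, slope = 2.0, 0.5
observed_range = (10, 40)

def predicted_value(x):
    """Predicted (expected) value of the dependent variable at x."""
    return intercept + slope * x

# Interpolation: predicting inside the range of the observed data.
inside = predicted_value(25)    # 2.0 + 0.5 * 25 = 14.5

# Extrapolation: predicting outside the observed range -- riskier,
# because the straight-line pattern may not hold out there.
outside = predicted_value(60)   # 2.0 + 0.5 * 60 = 32.0
```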
## Chapter 4
• Conditional means are the expected values of dependent variables for specific groups of cases.
• Degrees of freedom are the number of errors in a model that are actually free to vary.
• Mean models are very simple statistical models in which a variable has just one expected value, its mean.
• Means are the expected values of variables.
• Parameters are the figures associated with statistical models, like means and regression coefficients.
• Regression error standard deviation is a measure of the amount of spread in the error in a regression model.
• Standard deviation is a measure of the amount of spread in a variable, which is the same thing as the amount of spread in the error in a mean model.
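A worked sketch of a mean model (hypothetical data): the variable's one expected value is its mean, the errors are deviations from that mean, and the standard deviation summarizes their spread using n − 1 degrees of freedom.

```python
# A mean model (hypothetical data): the variable has just one expected
# value, its mean, and spread is measured by the standard deviation.
values = [4, 8, 6, 5, 7]
n = len(values)

mean = sum(values) / n                  # the expected value
errors = [v - mean for v in values]     # deviations from the mean

# Because the errors must sum to zero, only n - 1 of them are actually
# free to vary: the degrees of freedom.
df = n - 1
variance = sum(e ** 2 for e in errors) / df
standard_deviation = variance ** 0.5
```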
## Chapter 5
• Case-specific error is error resulting from any of the millions of influences and experiences that may cause a specific case to have a value that is different from its expected value.
• Descriptive statistics is the use of statistics to describe the data we actually have in hand.
• Inferential statistics is the use of statistics to make conclusions about characteristics of the real world underlying our data.
• Measurement error is error resulting from accidents, mistakes, or misunderstandings in the measurement of a variable.
• Observed parameters are the actually observed values of parameters like means, intercepts, and slopes based on the data we actually have in hand.
• Sampling error is error resulting from the random chance of which research subjects are included in a sample.
• Standard error is a measure of the amount of error associated with an observed parameter.
• True parameters are the true values of parameters like means, intercepts, and slopes based on the real (but unobserved) characteristics of the world.
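The standard error of an observed mean can be sketched with invented data: the observed mean describes the sample in hand (descriptive statistics), while the standard error measures how far it is likely to fall from the true mean because of sampling error (inferential statistics).

```python
# Standard error of an observed mean (hypothetical sample).
sample = [12, 15, 11, 14, 13, 16, 12, 15]
n = len(sample)

observed_mean = sum(sample) / n
deviations = [v - observed_mean for v in sample]
sd = (sum(d ** 2 for d in deviations) / (n - 1)) ** 0.5

# Standard error of the mean: the standard deviation shrunk by sqrt(n).
# Larger samples give smaller standard errors.
standard_error = sd / n ** 0.5
```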
## Chapter 6
• Paired samples are databases in which each case represents two linked observations.
• Statistical significance is when a statistical result is so large that it is unlikely to have occurred just by chance.
• Substantive significance is when a statistical result is large enough to be meaningful in the view of the researcher and society at large.
• t statistics are measures based on observed parameters that are used to make specific inferences about the probabilities of true parameters.
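These terms can be combined in a sketch of a paired-samples t statistic (hypothetical before/after measurements on the same cases):

```python
# Paired samples: each case links two observations (hypothetical data).
before = [10, 12, 9, 11, 13, 10]
after = [12, 13, 11, 12, 15, 13]

diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_diff = sum(diffs) / n
sd_diff = (sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)) ** 0.5
standard_error = sd_diff / n ** 0.5

# t statistic: the observed mean difference in units of its standard
# error. Roughly, |t| > 2 is conventionally taken as statistically
# significant, though that alone says nothing about whether the
# difference is substantively significant.
t = mean_diff / standard_error
```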
## Chapter 7
• Complementary controls are control variables that complement an independent variable of interest by unmasking its explanatory power in a multiple regression model.
• Competing controls are control variables that compete with an independent variable of interest by splitting its explanatory power in a multiple regression model.
• Control variables are variables that are "held constant" in a multiple regression analysis in order to highlight the effect of a particular independent variable of interest.
• Multicausal models are statistical models that have one dependent variable but two or more independent variables.
• Multiple linear regression models are statistical models in which expected values of the dependent variable are thought to rise or fall in straight lines according to values of two or more independent variables.
• Predictors are the independent variables in regression models.
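A sketch of a multicausal model with two predictors, solved from the normal equations on centered data. The numbers are invented so that y = 1 + 2·x1 + 3·x2 exactly, which makes the recovered coefficients easy to check.

```python
# Multiple linear regression with two predictors (hypothetical data
# constructed so that y = 1 + 2*x1 + 3*x2 exactly).
x1 = [0, 1, 2, 3]
x2 = [1, 0, 2, 1]
y = [4, 3, 11, 10]
n = len(y)

m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
c1 = [v - m1 for v in x1]
c2 = [v - m2 for v in x2]
cy = [v - my for v in y]

s11 = sum(a * a for a in c1)
s22 = sum(a * a for a in c2)
s12 = sum(a * b for a, b in zip(c1, c2))
s1y = sum(a * b for a, b in zip(c1, cy))
s2y = sum(a * b for a, b in zip(c2, cy))

# Each slope is the effect of its predictor with the other predictor
# "held constant" -- the sense in which one acts as a control variable.
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = my - b1 * m1 - b2 * m2
```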
## Chapter 8
• Correlation (r) is a measure of the strength of the relationship between two variables that runs from r = −1 (perfect negative correlation) through r = 0 (no correlation) to r = +1 (perfect positive correlation).
• R2 is a measure of the proportion of the total variability in the dependent variable that is explained by a regression model.
• Standardized coefficients are the coefficients of regression models that have been estimated using standardized variables.
• Standardized variables are variables that have been transformed by subtracting the mean from every observed value and then dividing by the standard deviation.
• Unstandardized coefficients are the coefficients of regression models that have been estimated using original unstandardized variables.
• Unstandardized variables are variables that are expressed in their original units.
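A sketch of standardization and correlation on invented data: once both variables are standardized, r is their mean cross-product, and in a simple regression the standardized coefficient equals r while R² equals r².

```python
# Standardization and correlation (hypothetical data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 7]
n = len(x)

def standardize(values):
    """Subtract the mean from every value, then divide by the SD."""
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]

zx, zy = standardize(x), standardize(y)

# Correlation r: the mean cross-product of the standardized variables.
r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# In a simple regression, the standardized coefficient equals r, and
# R-squared (the proportion of variability explained) equals r squared.
r_squared = r ** 2
```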
## Chapter 9
• Base models are initial models that include all of the background independent variables in an analysis that are not of particular theoretical interest for a regression analysis.
• Confounding variables are variables that might affect both the dependent variable and an independent variable of interest.
• Explanatory models are regression models that are primarily intended to be used for evaluating different theories for explaining the differences between cases in their values of the dependent variable.
• Parsimony is the virtue of using simple models that are easy to understand and interpret.
• Predictive models are regression models that are primarily intended to be used for making predictions about dependent variables as outcomes.
• Saturated models are final models that include all of the variables used in a series of models in an analysis.
## Chapter 10
• Analysis of variance (ANOVA) is a type of regression model that focuses on the proportion of the total variability in a dependent variable that is explained by a categorical variable.
• ANOVA variables are the numerical variables in a regression model that together describe the effects of categorical group memberships.
• Categorical variables are variables that divide cases into two or more groups.
• Mixed models are regression models that include both ANOVA components and ordinary independent variables.
• Numerical variables are variables that take numerical values representing a meaningful ordering of the cases from lowest to highest.
• Reference groups are the groups that are set aside in ANOVA variables and not explicitly included as variables in ANOVA models.
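A sketch of turning a categorical variable into ANOVA variables, using invented data. One group is set aside as the reference group; each remaining group gets a 0/1 indicator variable.

```python
# Building ANOVA variables from a categorical variable (hypothetical
# data). The reference group is set aside and gets no variable.
region = ["north", "south", "west", "south", "north"]
reference_group = "north"

groups = sorted(set(region) - {reference_group})
anova_vars = {g: [1 if r == g else 0 for r in region] for g in groups}
```

A case in the reference group has 0 on every ANOVA variable, so its expected value is carried by the model's intercept; each ANOVA coefficient is then that group's difference from the reference group.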
## Chapter 11
• Interaction effects are the coefficients of the interaction variables in an interaction model.
• Interaction models are regression models that allow the slopes of some variables to differ for different categorical groups.
• Interaction variables are variables created by multiplying an ANOVA variable by an independent variable of interest.
• Intercept effects are the coefficients of the ANOVA variables in an interaction model.
• Main effects are the coefficients of the independent variable of interest in an interaction model for the reference group.
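A sketch of building an interaction variable from invented data: multiplying an ANOVA (0/1) variable by an independent variable of interest lets the slope of that variable differ across groups.

```python
# An interaction variable (hypothetical data): the product of an ANOVA
# variable and an independent variable of interest.
x = [1, 2, 3, 4]         # independent variable of interest
female = [0, 1, 0, 1]    # ANOVA variable (reference group: male)

# Nonzero only for cases in the indicated group.
female_x = [f * xi for f, xi in zip(female, x)]

# In a model y = b0 + b1*x + b2*female + b3*female_x:
#   b1 (main effect)        = slope of x for the reference group
#   b2 (intercept effect)   = intercept shift for the female group
#   b3 (interaction effect) = how much the slope of x for the female
#                             group differs from the reference group's
```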