Social Statistics: Key Terms
This is a list of the key terms from each chapter, collected here for the sake of convenience.
- Chapter 1
- Conceptualization is the process of developing a theory about some aspect of the social world.
- Cases are the individuals or entities about which data have been collected.
- Databases are arrangements of data into variables and cases.
- Dependent variables are variables that are thought to depend on other variables in a model.
- Generalization is the act of turning theories about specific situations into theories that apply to many situations.
- Independent variables are variables that are thought to cause the dependent variables in a model.
- Metadata are additional attributes of cases that are not meant to be included in analyses.
- Operationalization is the process of turning a social theory into specific hypotheses about real data.
- Scatter plots are very simple statistical models that depict data on a graph.
- Statistical models are mathematical simplifications of the real world.
- Variables are analytically meaningful attributes of cases.
- Chapter 2
- Expected values are the values that a dependent variable would be expected to have based solely on values of the independent variable.
- Linear regression models are statistical models in which expected values of the dependent variable are thought to rise or fall in a straight line according to values of the independent variable.
- Outliers are data points in a statistical model that are far away from most of the other data points.
- Regression error is the degree to which an expected value of a dependent variable in a linear regression model differs from its actual value.
- Robustness is the extent to which statistical models give similar results despite changes in operationalization.
- Slope is the change in the expected value of the dependent variable divided by the change in the value of the independent variable.
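The Chapter 2 terms can be illustrated with a small worked example. The sketch below is only an illustration: the data are made up, the names `fit_line`, `expected`, and `errors` are hypothetical, and it assumes the usual least-squares method for fitting the line and the convention that regression error is the actual value minus the expected value.

```python
# Made-up data: x is the independent variable, y is the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

def fit_line(x, y):
    """Fit a straight line by ordinary least squares (an assumed method)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # How x and y vary together, and how much x varies on its own.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx                      # change in expected y per unit change in x
    intercept = mean_y - slope * mean_x    # expected y when x is zero
    return slope, intercept

slope, intercept = fit_line(x, y)

# Expected values depend only on the independent variable.
expected = [intercept + slope * xi for xi in x]

# Regression error (assumed sign convention: actual minus expected).
errors = [yi - ei for yi, ei in zip(y, expected)]
```

A case whose error is much larger in size than the others would be a candidate outlier.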
- Chapter 3
- Extrapolation is the process of using a regression model to compute predicted values outside the range of the observed data.
- Intercepts are the places where regression lines cross the dependent variable axis in a scatter plot.
- Interpolation is the process of using a regression model to compute predicted values inside the range of the observed data.
- Predicted values are expected values of a dependent variable that correspond to selected values of the independent variable.
- Regression coefficients are the slopes and intercepts that define regression lines.
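As a rough sketch of how the Chapter 3 terms fit together, the code below computes predicted values from a pair of regression coefficients and labels each prediction by whether the chosen value of the independent variable falls inside or outside the observed range. The coefficients, the data, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical regression coefficients and observed values of the independent variable.
intercept = 2.2
slope = 0.6
observed_x = [1.0, 2.0, 3.0, 4.0, 5.0]

def predicted_value(x_value):
    """Expected value of the dependent variable at a selected x value."""
    return intercept + slope * x_value

def kind_of_prediction(x_value):
    """Label a prediction by whether x lies within the observed range."""
    if min(observed_x) <= x_value <= max(observed_x):
        return "interpolation"
    return "extrapolation"

for x_value in (2.5, 8.0):
    print(x_value, predicted_value(x_value), kind_of_prediction(x_value))
```

Note that the intercept is itself the predicted value at x = 0, which in this made-up example lies outside the observed range.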
- Chapter 4
- Conditional means are the expected values of dependent variables for specific groups of cases.
- Degrees of freedom are the number of errors in a model that are actually free to vary.
- Mean models are very simple statistical models in which a variable has just one expected value, its mean.
- Means are the expected values of variables.
- Parameters are the figures associated with statistical models, like means and regression coefficients.
- Regression error standard deviation is a measure of the amount of spread in the error in a regression model.
- Standard deviation is a measure of the amount of spread in a variable, which is the same thing as the amount of spread in the error in a mean model.
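The relationship between means, error, degrees of freedom, and standard deviation can be sketched numerically. The example below uses made-up data and assumes the common convention of dividing the squared errors by the degrees of freedom: the number of cases minus one for a mean model, and the number of cases minus two for a regression with one slope and one intercept. The regression errors are hypothetical values supplied for illustration, not the result of a fit shown here.

```python
import math
import statistics

# Made-up values of a single variable.
values = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(values)

# Mean model: every case has the same expected value, the mean.
mean = sum(values) / n
mean_model_errors = [v - mean for v in values]

# One parameter (the mean) is estimated, so n - 1 errors are free to vary.
mean_model_df = n - 1
standard_deviation = math.sqrt(
    sum(e ** 2 for e in mean_model_errors) / mean_model_df
)

# Hypothetical errors from a regression of the same variable on one predictor.
regression_errors = [-0.8, 0.6, 1.0, -0.6, -0.2]

# Two parameters (slope and intercept) are estimated, so n - 2 errors are free to vary.
regression_df = n - 2
regression_error_sd = math.sqrt(
    sum(e ** 2 for e in regression_errors) / regression_df
)

# The standard library's sample standard deviation uses the same n - 1 convention.
assert abs(standard_deviation - statistics.stdev(values)) < 1e-9
```

In this illustration the regression error standard deviation is smaller than the standard deviation of the variable, which is what happens when the independent variable accounts for part of the spread.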
- Chapter 5
- Case-specific error is error resulting from any of the millions of influences and experiences that may cause a specific case to have a value that is different from its expected value.
- Descriptive statistics is the use of statistics to describe the data we actually have in hand.
- Inferential statistics is the use of statistics to make conclusions about characteristics of the real world underlying our data.
- Measurement error is error resulting from accidents, mistakes, or misunderstandings in the measurement of a variable.
- Observed parameters are the actually observed values of parameters like means, intercepts, and slopes based on the data we actually have in hand.
- Sampling error is error resulting from the random chance of which research subjects are included in a sample.
- Standard error is a measure of the amount of error associated with an observed parameter.
- True parameters are the true values of parameters like means, intercepts, and slopes based on the real (but unobserved) characteristics of the world.
- Chapter 6
- Paired samples are databases in which each case represents two linked observations.
- Statistical significance is when a statistical result is so large that it is unlikely to have occurred just by chance.
- Substantive significance is when a statistical result is large enough to be meaningful in the view of the researcher and society at large.
- t statistics are measures based on observed parameters that are used to make specific inferences about the probabilities of true parameters.
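A minimal sketch of a t statistic for paired samples follows. It assumes the standard approach of taking the difference within each pair, treating the mean difference as the observed parameter, and dividing it by its standard error (the standard deviation of the differences divided by the square root of the number of cases). The data and variable names are made up for illustration.

```python
import math

# Made-up paired sample: each case has two linked observations.
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 14.0, 10.0, 13.0, 14.0]

# One difference per case.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

# Observed parameter: the mean difference.
mean_difference = sum(differences) / n

# Standard deviation of the differences (n - 1 degrees of freedom).
sd_difference = math.sqrt(
    sum((d - mean_difference) ** 2 for d in differences) / (n - 1)
)

# Standard error of the mean difference.
standard_error = sd_difference / math.sqrt(n)

# t statistic: observed parameter relative to its standard error.
t_statistic = mean_difference / standard_error
```

Turning the t statistic into a probability requires a t distribution with n - 1 degrees of freedom, which is not shown here. Whether a mean difference of this size matters in practice is a question of substantive significance, not of the t statistic.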
- Chapter 7
- Complementary controls are control variables that complement an independent variable of interest by unmasking its explanatory power in a multiple regression model.
- Competing controls are control variables that compete with an independent variable of interest by splitting its explanatory power in a multiple regression model.
- Control variables are variables that are "held constant" in a multiple regression analysis in order to highlight the effect of a particular independent variable of interest.
- Multicausal models are statistical models that have one dependent variable but two or more independent variables.
- Multiple linear regression models are statistical models in which expected values of the dependent variable are thought to rise or fall in straight lines according to values of two or more independent variables.
- Predictors are the independent variables in regression models.
- Chapter 8
- Correlation (r) is a measure of the strength of the relationship between two variables that runs from r = −1 (perfect negative correlation) through r = 0 (no correlation) to r = +1 (perfect positive correlation).
- R² is a measure of the proportion of the total variability in the dependent variable that is explained by a regression model.
- Standardized coefficients are the coefficients of regression models that have been estimated using standardized variables.
- Standardized variables are variables that have been transformed by subtracting the mean from every observed value and then dividing by the standard deviation.
- Unstandardized coefficients are the coefficients of regression models that have been estimated using original unstandardized variables.
- Unstandardized variables are variables that are expressed in their original units.
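The Chapter 8 terms are closely connected in a regression with a single independent variable: the standardized slope equals the correlation r, and the proportion of variability explained equals r squared. The sketch below checks this on made-up data. The helper names are hypothetical, and it assumes least-squares fitting and a standard deviation computed with n - 1 in the denominator.

```python
import math

# Made-up unstandardized variables in their original units.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

def mean(values):
    return sum(values) / len(values)

def sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def standardize(values):
    """Subtract the mean from every value, then divide by the standard deviation."""
    m, s = mean(values), sd(values)
    return [(v - m) / s for v in values]

# Sums of squares and cross-products around the means.
sxy = sum((xi - mean(x)) * (yi - mean(y)) for xi, yi in zip(x, y))
sxx = sum((xi - mean(x)) ** 2 for xi in x)
syy = sum((yi - mean(y)) ** 2 for yi in y)

# Correlation between the two variables.
r = sxy / math.sqrt(sxx * syy)

# Unstandardized coefficients (least squares).
slope = sxy / sxx
intercept = mean(y) - slope * mean(x)

# Standardized coefficient: slope of standardized y on standardized x.
# Standardized variables have mean zero, so the intercept drops out.
zx, zy = standardize(x), standardize(y)
standardized_slope = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

# Proportion of total variability explained by the model.
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / syy
```

With two or more independent variables the standardized coefficients generally no longer equal the simple correlations, so this shortcut applies only to the one-predictor case.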
- Chapter 9
- Base models are initial models that include all of the background independent variables in an analysis that are not of particular theoretical interest for a regression analysis.
- Confounding variables are variables that might affect both the dependent variable and an independent variable of interest.
- Explanatory models are regression models that are primarily intended to be used for evaluating different theories for explaining the differences between cases in their values of the dependent variable.
- Parsimony is the virtue of using simple models that are easy to understand and interpret.
- Predictive models are regression models that are primarily intended to be used for making predictions about dependent variables as outcomes.
- Saturated models are final models that include all of the variables used in a series of models in an analysis.
- Chapter 10
- Analysis of variance (ANOVA) is a type of regression model that focuses on the proportion of the total variability in a dependent variable that is explained by a categorical variable.
- ANOVA variables are the numerical variables in a regression model that together describe the effects of categorical group memberships.
- Categorical variables are variables that divide cases into two or more groups.
- Mixed models are regression models that include both ANOVA components and ordinary independent variables.
- Numerical variables are variables that take numerical values that represent meaningful orderings of the cases from lower numbers to higher numbers.
- Reference groups are the groups that are set aside in ANOVA variables and not explicitly included as variables in ANOVA models.
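One common way to build ANOVA variables is to create a 0/1 variable for every group except the reference group. The sketch below assumes that coding scheme; the group labels, outcome values, and names are made up. In a regression that contains only these ANOVA variables, the least-squares intercept equals the mean of the reference group and each coefficient equals the difference between a group's mean and the reference group's mean, so the code computes them directly from the conditional means.

```python
# Made-up categorical variable and numerical dependent variable.
groups = ["north", "south", "west", "north", "south", "west"]
outcome = [10.0, 14.0, 18.0, 12.0, 16.0, 20.0]

# The reference group is set aside and gets no variable of its own.
reference_group = "north"
other_groups = sorted(set(groups) - {reference_group})

# One 0/1 ANOVA variable per non-reference group.
anova_variables = {
    g: [1 if case_group == g else 0 for case_group in groups]
    for g in other_groups
}

def conditional_mean(group):
    """Mean of the dependent variable for the cases in one group."""
    values = [y for g, y in zip(groups, outcome) if g == group]
    return sum(values) / len(values)

# Coefficients of a regression on the ANOVA variables alone.
intercept = conditional_mean(reference_group)
coefficients = {g: conditional_mean(g) - intercept for g in other_groups}
```

Choosing a different reference group changes the intercept and the coefficients but not the expected value for any case.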
- Chapter 11
- Interaction effects are the coefficients of the interaction variables in an interaction model.
- Interaction models are regression models that allow the slopes of some variables to differ for different categorical groups.
- Interaction variables are variables created by multiplying an ANOVA variable by an independent variable of interest.
- Intercept effects are the coefficients of the ANOVA variables in an interaction model.
- Main effects are the coefficients of the independent variable of interest in an interaction model for the reference group.
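The pieces of an interaction model can be seen in a short sketch with two groups. All coefficient values below are hypothetical, as are the names; the point is only to show how the interaction variable is built and how it lets the slope differ between the reference group and the other group.

```python
# Hypothetical coefficients of an interaction model with one 0/1 ANOVA variable.
intercept = 1.0           # expected value for the reference group when x is 0
main_effect = 0.5         # slope of x for the reference group
intercept_effect = 2.0    # coefficient of the ANOVA variable
interaction_effect = 0.3  # coefficient of the interaction variable

def expected_value(x, in_group):
    """Expected value of the dependent variable for one case."""
    anova_variable = 1 if in_group else 0
    # The interaction variable is the ANOVA variable multiplied by x.
    interaction_variable = anova_variable * x
    return (
        intercept
        + main_effect * x
        + intercept_effect * anova_variable
        + interaction_effect * interaction_variable
    )

# Slope for each group: change in expected value per one-unit change in x.
reference_slope = expected_value(1.0, False) - expected_value(0.0, False)
group_slope = expected_value(1.0, True) - expected_value(0.0, True)
```

Here the reference group's slope is the main effect alone, while the other group's slope is the main effect plus the interaction effect.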