*Social Statistics*: Key Terms

This is a list of the key terms from each chapter, collected here for convenience.

- Chapter 1

  - **Conceptualization** is *the process of developing a theory about some aspect of the social world*.
  - **Cases** are *the individuals or entities about which data have been collected*.
  - **Databases** are *arrangements of data into variables and cases*.
  - **Dependent variables** are *variables that are thought to depend on other variables in a model*.
  - **Generalization** is *the act of turning theories about specific situations into theories that apply to many situations*.
  - **Independent variables** are *variables that are thought to cause the dependent variables in a model*.
  - **Metadata** are *additional attributes of cases that are not meant to be included in analyses*.
  - **Operationalization** is *the process of turning a social theory into specific hypotheses about real data*.
  - **Scatter plots** are *very simple statistical models that depict data on a graph*.
  - **Statistical models** are *mathematical simplifications of the real world*.
  - **Variables** are *analytically meaningful attributes of cases*.

- Chapter 2

  - **Expected values** are *the values that a dependent variable would be expected to have based solely on values of the independent variable*.
  - **Linear regression models** are *statistical models in which expected values of the dependent variable are thought to rise or fall in a straight line according to values of the independent variable*.
  - **Outliers** are *data points in a statistical model that are far away from most of the other data points*.
  - **Regression error** is *the degree to which an expected value of a dependent variable in a linear regression model differs from its actual value*.
  - **Robustness** is *the extent to which statistical models give similar results despite changes in operationalization*.
  - **Slope** is *the change in the expected value of the dependent variable divided by the change in the value of the independent variable*.
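
As a rough illustration of the Chapter 2 terms, the following Python sketch fits a line to five made-up data points and computes expected values, regression errors, and the slope. The data and variable names are invented for illustration, and the least-squares formula is an assumption about how the line is fitted; the definitions above do not specify a fitting method.

```python
from statistics import mean

# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

x_bar, y_bar = mean(x), mean(y)

# Least-squares slope (an assumed fitting method): the change in the
# expected value of y for each one-unit change in x.
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
# The line passes through the point (mean of x, mean of y).
intercept = y_bar - slope * x_bar

# Expected values lie on the regression line.
expected = [intercept + slope * xi for xi in x]

# Regression error: how far each actual value is from its expected value.
errors = [yi - ei for yi, ei in zip(y, expected)]
```

Because the expected values lie on a straight line, dividing the change in expected value between any two cases by the corresponding change in `x` recovers `slope`.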

- Chapter 3

  - **Extrapolation** is *the process of using a regression model to compute predicted values outside the range of the observed data*.
  - **Intercepts** are *the places where regression lines cross the dependent variable axis in a scatter plot*.
  - **Interpolation** is *the process of using a regression model to compute predicted values inside the range of the observed data*.
  - **Predicted values** are *expected values of a dependent variable that correspond to selected values of the independent variable*.
  - **Regression coefficients** are *the slopes and intercepts that define regression lines*.
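
The difference between interpolation and extrapolation can be sketched in a few lines of Python. The coefficients, the observed data, and the `predict` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical regression coefficients and observed data, for illustration only.
intercept, slope = 2.2, 0.6
observed_x = [1.0, 2.0, 3.0, 4.0, 5.0]


def predict(x_value):
    """Return the predicted value and whether it is an interpolation or an extrapolation."""
    inside = min(observed_x) <= x_value <= max(observed_x)
    kind = "interpolation" if inside else "extrapolation"
    return intercept + slope * x_value, kind
```

Calling `predict(2.5)` stays inside the observed range of 1 to 5, so it is an interpolation; `predict(8.0)` falls outside that range and is an extrapolation. `predict(0.0)` returns the intercept, the point where the line crosses the dependent variable axis.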

- Chapter 4

  - **Conditional means** are *the expected values of dependent variables for specific groups of cases*.
  - **Degrees of freedom** are *the number of errors in a model that are actually free to vary*.
  - **Mean models** are *very simple statistical models in which a variable has just one expected value, its mean*.
  - **Means** are *the expected values of variables*.
  - **Parameters** are *the figures associated with statistical models, like means and regression coefficients*.
  - **Regression error standard deviation** is *a measure of the amount of spread in the error in a regression model*.
  - **Standard deviation** is *a measure of the amount of spread in a variable, which is the same thing as the amount of spread in the error in a mean model*.
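
A minimal Python sketch of how these terms fit together, using invented numbers. It assumes the usual convention that a mean model uses up one degree of freedom (for the mean) and a one-predictor regression model uses up two (for the slope and the intercept); the regression errors below are hypothetical rather than computed from a fitted model.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical values of a single variable.
values = [2.0, 4.0, 5.0, 4.0, 5.0]

# Mean model: every case has the same expected value, the mean.
m = mean(values)
errors = [v - m for v in values]

# One parameter (the mean) is estimated, so n - 1 errors are free to vary.
df_mean_model = len(values) - 1
sd = sqrt(sum(e ** 2 for e in errors) / df_mean_model)

# Hypothetical errors from a one-predictor regression on the same cases.
regression_errors = [-0.8, 0.6, 1.0, -0.6, -0.2]

# Two parameters (slope and intercept) are estimated, leaving n - 2 degrees of freedom.
df_regression = len(regression_errors) - 2
regression_sd = sqrt(sum(e ** 2 for e in regression_errors) / df_regression)
```

Here `sd` matches `statistics.stdev(values)`, which is the sense in which the standard deviation of a variable is the spread of the error in its mean model.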

- Chapter 5

  - **Case-specific error** is *error resulting from any of the millions of influences and experiences that may cause a specific case to have a value that is different from its expected value*.
  - **Descriptive statistics** is *the use of statistics to describe the data we actually have in hand*.
  - **Inferential statistics** is *the use of statistics to make conclusions about characteristics of the real world underlying our data*.
  - **Measurement error** is *error resulting from accidents, mistakes, or misunderstandings in the measurement of a variable*.
  - **Observed parameters** are *the actually observed values of parameters like means, intercepts, and slopes based on the data we actually have in hand*.
  - **Sampling error** is *error resulting from the random chance of which research subjects are included in a sample*.
  - **Standard error** is *a measure of the amount of error associated with an observed parameter*.
  - **True parameters** are *the true values of parameters like means, intercepts, and slopes based on the real (but unobserved) characteristics of the world*.

- Chapter 6

  - **Paired samples** are *databases in which each case represents two linked observations*.
  - **Statistical significance** is *when a statistical result is so large that it is unlikely to have occurred just by chance*.
  - **Substantive significance** is *when a statistical result is large enough to be meaningful in the view of the researcher and society at large*.
  - **t statistics** are *measures based on observed parameters that are used to make specific inferences about the probabilities of true parameters*.
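
As a hedged example, the sketch below computes a t statistic for a paired sample in Python. The before-and-after scores are invented, and the formula (observed mean difference divided by its standard error, testing against a true difference of zero) is the common textbook form, assumed here rather than taken from the definitions above.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired sample: each case links a "before" and an "after" score.
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 14.0, 10.0, 13.0, 14.0]

# Each case contributes one difference between its two linked observations.
differences = [a - b for a, b in zip(after, before)]

# Observed parameter: the mean difference in the data we have in hand.
mean_diff = mean(differences)

# Standard error of the mean difference.
std_error = stdev(differences) / sqrt(len(differences))

# t statistic: how many standard errors the observed mean difference is from zero.
t_stat = mean_diff / std_error
```

A large `t_stat` suggests the result is unlikely to have occurred just by chance. Whether a mean difference of this size matters in practice is a separate question of substantive significance.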

- Chapter 7

  - **Complementary controls** are *control variables that complement an independent variable of interest by unmasking its explanatory power in a multiple regression model*.
  - **Competing controls** are *control variables that compete with an independent variable of interest by splitting its explanatory power in a multiple regression model*.
  - **Control variables** are *variables that are "held constant" in a multiple regression analysis in order to highlight the effect of a particular independent variable of interest*.
  - **Multicausal models** are *statistical models that have one dependent variable but two or more independent variables*.
  - **Multiple linear regression models** are *statistical models in which expected values of the dependent variable are thought to rise or fall in straight lines according to values of two or more independent variables*.
  - **Predictors** are *the independent variables in regression models*.

- Chapter 8

  - **Correlation (r)** is *a measure of the strength of the relationship between two variables that runs from r = −1 (perfect negative correlation) through r = 0 (no correlation) to r = +1 (perfect positive correlation)*.
  - **R²** is *a measure of the proportion of the total variability in the dependent variable that is explained by a regression model*.
  - **Standardized coefficients** are *the coefficients of regression models that have been estimated using standardized variables*.
  - **Standardized variables** are *variables that have been transformed by subtracting the mean from every observed value and then dividing by the standard deviation*.
  - **Unstandardized coefficients** are *the coefficients of regression models that have been estimated using original unstandardized variables*.
  - **Unstandardized variables** are *variables that are expressed in their original units*.
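
The standardization recipe above translates directly into code. This Python sketch uses invented data and a hypothetical `standardize` helper, and it assumes the sample standard deviation (dividing by n − 1) throughout.

```python
from statistics import mean, stdev

# Hypothetical unstandardized variables, in their original units.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def standardize(values):
    """Subtract the mean from every value, then divide by the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


zx, zy = standardize(x), standardize(y)

# With sample standard deviations, r is the sum of the products of the
# standardized values divided by n - 1.
r = sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

# In a regression with a single independent variable, R² equals r squared.
r_squared = r ** 2
```

After standardization each variable has a mean of 0 and a standard deviation of 1. In the one-predictor case, the slope estimated from `zx` and `zy` (the standardized coefficient) is equal to `r`.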

- Chapter 9

  - **Base models** are *initial models that include all of the background independent variables in an analysis that are not of particular theoretical interest for a regression analysis*.
  - **Confounding variables** are *variables that might affect both the dependent variable and an independent variable of interest*.
  - **Explanatory models** are *regression models that are primarily intended to be used for evaluating different theories for explaining the differences between cases in their values of the dependent variable*.
  - **Parsimony** is *the virtue of using simple models that are easy to understand and interpret*.
  - **Predictive models** are *regression models that are primarily intended to be used for making predictions about dependent variables as outcomes*.
  - **Saturated models** are *final models that include all of the variables used in a series of models in an analysis*.

- Chapter 10

  - **Analysis of variance (ANOVA)** is *a type of regression model that focuses on the proportion of the total variability in a dependent variable that is explained by a categorical variable*.
  - **ANOVA variables** are *the numerical variables in a regression model that together describe the effects of categorical group memberships*.
  - **Categorical variables** are *variables that divide cases into two or more groups*.
  - **Mixed models** are *regression models that include both ANOVA components and ordinary independent variables*.
  - **Numerical variables** are *variables that take numerical values that represent meaningful orderings of the cases from lower numbers to higher numbers*.
  - **Reference groups** are *the groups that are set aside in ANOVA variables and not explicitly included as variables in ANOVA models*.
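
One common way to build ANOVA variables is 0/1 coding, sketched below in Python. The `regions` data and the choice of reference group are invented, and 0/1 coding is an assumption; the definitions above say only that ANOVA variables are numerical.

```python
# Hypothetical categorical variable that divides five cases into three groups.
regions = ["north", "south", "west", "south", "north"]

# The reference group is set aside and gets no variable of its own.
reference_group = "north"

other_groups = sorted(set(regions) - {reference_group})

# One 0/1 ANOVA variable per non-reference group: 1 if the case belongs
# to that group, 0 otherwise.
anova_variables = {
    group: [1 if region == group else 0 for region in regions]
    for group in other_groups
}
```

Cases in the reference group score 0 on every ANOVA variable, so in a model built this way the coefficients of the other groups are read as differences from the reference group.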

- Chapter 11

  - **Interaction effects** are *the coefficients of the interaction variables in an interaction model*.
  - **Interaction models** are *regression models that allow the slopes of some variables to differ for different categorical groups*.
  - **Interaction variables** are *variables created by multiplying an ANOVA variable by an independent variable of interest*.
  - **Intercept effects** are *the coefficients of the ANOVA variables in an interaction model*.
  - **Main effects** are *the coefficients of the independent variable of interest in an interaction model for the reference group*.
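
To show how the pieces of an interaction model combine, here is a small Python sketch with one 0/1 ANOVA variable and one independent variable of interest. All coefficient values and the `expected_value` helper are hypothetical.

```python
# Hypothetical coefficients of an interaction model, for illustration only.
intercept = 1.0
main_effect = 0.5         # slope of x for the reference group
intercept_effect = 2.0    # coefficient of the ANOVA variable
interaction_effect = 0.3  # coefficient of the interaction variable


def expected_value(x, in_group):
    """Expected value for a case; in_group is the 0/1 ANOVA variable."""
    # Interaction variable: the ANOVA variable multiplied by x.
    interaction_variable = in_group * x
    return (
        intercept
        + main_effect * x
        + intercept_effect * in_group
        + interaction_effect * interaction_variable
    )
```

For the reference group (`in_group = 0`) the slope of `x` is the main effect, 0.5. For the other group (`in_group = 1`) the slope is the main effect plus the interaction effect, 0.8, and the line is shifted up by the intercept effect.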