The Null Hypothesis should be an assumption about the difference in the population means for two populations (note that the same quantitative variable must have been measured in each population). The data should consist of two samples of quantitative data (one from each population). The samples must be obtained independently from each other.
The samples must be drawn from populations which have known Standard Deviations (or Variances). Also, the measured variable in each population (generically denoted x1 and x2) should have a Normal Distribution.
Note that if the distributions of the variables in the populations are non-normal (or unknown), the two-sample z-test can still be used for approximate results, provided the combined sample size (sum of sample sizes) is sufficiently large. Historically, a combined sample size of at least 30 has been considered sufficiently large; reality is (of course) much more complicated, but this rule of thumb is still in use in many textbooks.
- The Null Hypothesis:
- H0 : μ1 - μ2 = δ
in which δ is the supposed difference in the expected values under the null hypothesis.
- The Alternate Hypothesis:
- H1 : μ1 - μ2 < δ
- H1 : μ1 - μ2 > δ
- H1 : μ1 - μ2 ≠ δ
For more information about the Null and Alternate Hypotheses, see the page on the z test for a single mean.
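- The Test Statistic:
- z = ((x̄1 - x̄2) - δ) / √(σ1²/n1 + σ2²/n2)
in which x̄1 and x̄2 are the sample means, σ1 and σ2 are the known population standard deviations, and n1 and n2 are the sample sizes.
Usually, the null hypothesis is that the population means are equal (δ = 0); in this case, the formula reduces to
- z = (x̄1 - x̄2) / √(σ1²/n1 + σ2²/n2)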
In the past, the calculations were simpler if the Variances (and thus the Standard Deviations) of the two populations could be assumed equal. This process is called Pooling, and many textbooks still use it, though it is falling out of practice (since computers and calculators have all but removed any computational problems).
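As a minimal sketch, the unpooled test statistic can be computed in Python as follows. It assumes the standard form of the two-sample z statistic: the observed difference in sample means, minus δ, divided by √(σ1²/n1 + σ2²/n2). The function name `two_sample_z`, its parameter names, and the numbers in the call are illustrative only and do not come from the text.

```python
import math

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, delta=0.0):
    """z statistic for H0: mu1 - mu2 = delta, with known population SDs."""
    # Standard error of the difference between the two sample means
    standard_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((mean1 - mean2) - delta) / standard_error

# Hypothetical numbers, chosen only to illustrate the call.
z = two_sample_z(mean1=10.0, mean2=8.0, sigma1=3.0, sigma2=4.0, n1=9, n2=16)
print(round(z, 4))  # 1.4142
```

No pooling is involved here: each population keeps its own variance term under the square root.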
- The Significance (p-value)
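Calculate the probability, under a Standard Normal Distribution, of observing a value at least as extreme as the calculated value of z. The Alternate Hypothesis indicates the direction in which the area under the Probability Density Function is to be calculated: the area to the left of z for "<", the area to the right of z for ">", and the area in both tails (beyond -|z| and +|z|) for "≠". This is the Attained Significance, or p-value.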
Note that some (older) methods first chose a Level Of Significance, which was then translated into a value of z. This made more sense (and was easier!) in the days before computers and graphics calculators.
The Attained Significance represents the probability of obtaining a test statistic as extreme as, or more extreme than, ours, if the null hypothesis is true.
If the Attained Significance (p-value) is sufficiently low, then this indicates that our test statistic is unusual (rare)—we usually take this as evidence that the null hypothesis is in error. In this case, we reject the null hypothesis.
If the p-value is large, then this indicates that the test statistic is usual (common)—we take this as a lack of evidence against the null hypothesis. In this case, we fail to reject the null hypothesis.
It is common to use 5% as the dividing line between the common and the unusual; again, reality is more complicated.
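The p-value calculation and the 5% decision rule described above might be sketched as follows. This uses only Python's standard library; the function name `p_value`, the labels `"less"`, `"greater"`, and `"two-sided"`, and the example value z = 1.8 are assumptions made for illustration.

```python
import math

def p_value(z, alternative):
    """Attained significance for a z statistic under the Standard Normal.

    alternative: "less", "greater", or "two-sided" (illustrative labels).
    """
    if alternative == "less":
        return 0.5 * math.erfc(-z / math.sqrt(2))   # area to the left of z
    if alternative == "greater":
        return 0.5 * math.erfc(z / math.sqrt(2))    # area to the right of z
    if alternative == "two-sided":
        return math.erfc(abs(z) / math.sqrt(2))     # area in both tails
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

alpha = 0.05                      # the conventional 5% dividing line
p = p_value(1.8, "greater")       # hypothetical z, right-tailed test
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"p = {p:.4f}: {decision}")
```

The complementary error function `math.erfc` is used rather than subtracting a cumulative probability from 1, so that very small tail areas are not rounded to exactly zero.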
Worked Examples
Do Professors Make More Money at Larger Universities?
Universities and colleges in the United States of America are categorized by the highest degree offered. Type IIA institutions offer a Master's Degree, and type IIB institutions offer a Baccalaureate degree. A professor, looking for a new position, wonders if the salary difference between type IIA and IIB institutions is really significant.
He finds that a random sample of 200 IIA institutions has a mean salary (for full professors) of $54,218.00, with standard deviation $8,450. A random sample of 200 IIB institutions has a mean salary (for full professors) of $46,550.00, with standard deviation $9,500 (assume that the sample standard deviations are in fact the population standard deviations).
Do these data indicate a significantly higher salary at IIA institutions?
The null hypothesis is that there is no difference; thus
- H0 : μA = μB
(where μA is the true mean full professor salary at IIA institutions, and μB is the mean at IIB institutions)
He is looking for evidence that IIA institutions have a higher mean salary; thus the alternate hypothesis is
- H1 : μA > μB
Since the hypotheses concern means from independent samples (we'll assume that these are independent samples), a two sample test is indicated. The samples are large, and the standard deviations are known (assumed?), so a two sample z-test is appropriate.
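The test statistic is z = (54,218 - 46,550) / √(8,450²/200 + 9,500²/200) = 7,668 / 899.03 ≈ 8.5292.
Now we find the area to the right of z = 8.5292 in the Standard Normal Distribution. This can be done with a table of values or software; I get 0 (to every decimal place that a table or typical calculator displays).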
If the null hypothesis is true, and there is no difference in the salaries between the two types of institutions, then the probability of obtaining samples where the mean for IIA institutions is at least $7,668 higher than the mean for IIB institutions is essentially zero. This occurs far too rarely to attribute to chance variation; it seems quite unusual. I reject the null hypothesis (at any reasonable level of significance!).
It appears that IIA schools have a significantly higher salary than IIB schools.
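The arithmetic in this example can be checked with a short Python sketch. The salary figures are those given above; the variable names are illustrative, and the sample standard deviations are treated as population values, as the example assumes.

```python
import math

# Figures from the worked example (full professor salaries, in dollars)
mean_a, sigma_a, n_a = 54218.0, 8450.0, 200   # type IIA institutions
mean_b, sigma_b, n_b = 46550.0, 9500.0, 200   # type IIB institutions

# Standard error of the difference in sample means (no pooling)
standard_error = math.sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)

# H0: mu_A = mu_B, so the hypothesized difference is 0
z = (mean_a - mean_b) / standard_error

# Right-tailed p-value, since H1 is mu_A > mu_B
p = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.4f}")
print(f"p = {p:.3g}")
```

The printed z should agree with the value 8.5292 used above, and the p-value is positive but far smaller than any table would show, which is why it is reported as 0.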
Example 2
A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10; she scored 30 on a history test with a mean of 25 and a standard deviation of 5. Compare her relative position on the two tests.
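One way to approach this exercise, assuming that "relative position" means the standard score (the number of standard deviations a score lies above its test mean), is sketched below. The helper name `z_score` is illustrative. Note that this compares single scores and is not a two-sample z-test.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mean) / sd

calculus = z_score(65, 50, 10)   # (65 - 50) / 10
history = z_score(30, 25, 5)     # (30 - 25) / 5

print(calculus, history)  # 1.5 1.0
```

Under that reading, her calculus score sits 1.5 standard deviations above the mean and her history score 1.0, so her relative position is higher on the calculus test.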