Thursday, May 14, 2009

Class notes for 5/13: Two population tests for proportions and averages

The first hypothesis tests we studied checked whether an experimental sample produced a value that was significantly different from some known value, one that came either from math or from earlier experiments.

For example, in the lady tasting tea experiment, since she has two choices each time a mixture is given to her, the math would say that her chance of getting it right by just guessing is 50%, or H0: p = .5. In testing psychic abilities, there are five different symbols on the cards, so random guessing should get the right answer 1 out of 5 times, or 20%, so H0: p = .2.

In a test for average human body temperature, the assumption of 98.6 degrees Fahrenheit being the average came from an experiment performed in the 19th Century.

We can also do tests by taking samples from two different populations. The null hypothesis, as always, is an equality: the assumption that the parameters from the two different populations are the same. As always, we need convincing evidence that the difference is significant to reject the null hypothesis, and we can choose just how convincing that evidence must be by setting the confidence level, which is usually 90%, 95% or 99%.
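For a two-tailed z-test, each confidence level turns into a pair of cutoff z-scores, one positive and one negative. Here is a minimal Python sketch of that lookup, assuming the standard library's statistics.NormalDist is available; the function name two_tailed_z_threshold is made up for illustration and is not something from class.

```python
from statistics import NormalDist


def two_tailed_z_threshold(confidence):
    """Positive z cutoff that leaves (1 - confidence) split evenly between the two tails."""
    alpha = 1 - confidence
    # inv_cdf gives the z-score with the requested area to its left
    return NormalDist().inv_cdf(1 - alpha / 2)


# The three confidence levels mentioned in the notes
for level in (0.90, 0.95, 0.99):
    cutoff = two_tailed_z_threshold(level)
    print(f"{level:.0%} confidence: reject H0 if z >= {cutoff:.3f} or z <= -{cutoff:.3f}")
```

The higher the confidence level, the farther out the cutoffs sit, so the more extreme the sample difference has to be before we reject the null hypothesis.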

Two proportions from two populations


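As with the one proportion test, the test statistic is a z-score. We have the proportions from the two samples, p-hat1 = f1/n1 and p-hat2 = f2/n2, but we also need to create the pooled proportion p-bar = (f1 + f2)/(n1 + n2) and its complement q-bar = 1 - p-bar. The test statistic is z = (p-hat1 - p-hat2)/sqrt(p-bar x q-bar/n1 + p-bar x q-bar/n2).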

Here's an example from the polling data from last year.

Question: Was John McCain's popularity in Iowa significantly different from his popularity in Pennsylvania?

Let's assume we don't know either way, so it will be a two-tailed test. Polling data traditionally uses the 95% confidence level, so that means the z-score will have to be either greater than or equal to 1.96 or less than or equal to -1.96 for us to reject the null hypothesis. Here are our numbers, with Iowa as the first data set.

f1 = 263
n1 = 658
p-hat1 = .400

f2 = 283
n2 = 657
p-hat2 = .431

p-bar = (263+283)/(658+657) = .415 (q-bar = .585)

Type this into your calculator.

(.400-.431)/sqrt(.415x.585/658+.415x.585/657)[enter]

The answer is -1.1407..., which rounds to -1.14. This says the difference we see in the two samples is not enough to convince us of a significant difference in popularity for McCain between the two states, so we fail to reject the null hypothesis. In the actual election, McCain had 45.2% of the vote in Pennsylvania and 44.8% of the vote in Iowa, which are fairly close to equal.
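The same calculation can be sketched in Python for anyone who would rather not retype it on a calculator. This is only an illustration: the function name two_proportion_z is made up, and because the code does not round the proportions before dividing, the printed z-score may differ slightly in the second decimal place from the calculator answer.

```python
from math import sqrt


def two_proportion_z(f1, n1, f2, n2):
    """z-score for H0: p1 = p2, using the pooled proportion p-bar."""
    p_hat1 = f1 / n1
    p_hat2 = f2 / n2
    p_bar = (f1 + f2) / (n1 + n2)   # pooled proportion
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p_hat1 - p_hat2) / standard_error


# Iowa is sample 1, Pennsylvania is sample 2 (numbers from the notes)
z = two_proportion_z(263, 658, 283, 657)
Z_THRESHOLD = 1.96  # two-tailed cutoff at the 95% confidence level

print(f"z = {z:.2f}")
if abs(z) >= Z_THRESHOLD:
    print("Reject H0: the two proportions differ significantly.")
else:
    print("Fail to reject H0: the difference is not significant.")
```

Either way the z-score lands well inside the thresholds of +/-1.96, so the decision is the same.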


Two averages from two populations


In the tests to see if the average of some numerical value is significantly different when comparing two populations, we need the averages, standard deviations and sizes of both samples. The score we use is a t-score, t = (x-bar1 - x-bar2)/sqrt(s1^2/n1 + s2^2/n2), and the degrees of freedom is the smaller of the two sample sizes minus 1.

Question: Do female Laney students sleep a different number of hours each night than male Laney students?

We will take our data from the larger of the two class surveys, Data Set #2. Here are the numbers for the students who submitted data, with the females listed as group #1. Again, let's assume a two-tailed test, since we don't have any information going in about which average should be greater, and let's do this test at the 90% level of confidence.

H0: mu1 = mu2 (average hours of sleep are the same for males and females at Laney)

x-bar1 = 7.31
s1 = .94
n1 = 26

x-bar2 = 7.54
s2 = 1.47
n2 = 12

The degrees of freedom will be 12-1=11, and 10% in two tails gives us the thresholds of +/-1.796. Here is what to type into the calculator.

(7.31-7.54)/sqrt(.94^2/26+1.47^2/12)[enter]

-0.4971...

This number is between the thresholds, and so does not impress us enough to make us reject the null hypothesis. It's possible that larger samples would give us numbers that would show a difference, which if true would mean this example produced a Type II error, but we have no proof of that.
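Here is the sleep example as a short Python sketch, again only as an illustration. The function name two_sample_t is made up, the degrees of freedom follow the smaller-sample-minus-1 rule from these notes, and the threshold 1.796 is copied from the t-table rather than computed.

```python
from math import sqrt


def two_sample_t(x_bar1, s1, n1, x_bar2, s2, n2):
    """t-score for H0: mu1 = mu2, with df = smaller sample size minus 1."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    t = (x_bar1 - x_bar2) / standard_error
    df = min(n1, n2) - 1
    return t, df


# Females are group 1, males are group 2 (numbers from Data Set #2)
t, df = two_sample_t(7.31, 0.94, 26, 7.54, 1.47, 12)
T_THRESHOLD = 1.796  # table value: 11 degrees of freedom, 10% in two tails

print(f"t = {t:.4f} with {df} degrees of freedom")
if abs(t) >= T_THRESHOLD:
    print("Reject H0: the average hours of sleep differ significantly.")
else:
    print("Fail to reject H0: the difference is not significant.")
```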
