The first hypothesis tests we studied checked whether an experimental sample produced a value significantly different from some known value, a value established either by math or by earlier experiments. For example, in the lady tasting tea, since she has two choices each time a mixture is given to her, the math says her chance of getting it right by just guessing is 50%, or H0: p = .5. In testing psychic abilities, there are five different symbols on the cards, so random guessing should get the right answer 1 out of 5 times, or 20%, so H0: p = .2. In a test for average human body temperature, the assumption that 98.6 degrees Fahrenheit is the average came from an experiment performed in the 19th century.
We can also do tests by taking samples from two different populations. The null hypothesis, as always, is an equality: the assumption that the parameters of the two populations are the same. As always, we need convincing evidence that the difference is significant before we reject the null hypothesis, and we can choose just how convincing that evidence must be by setting the confidence level, usually 90%, 95%, or 99%.
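If you want to check those cutoffs by machine, here is a minimal Python sketch (assuming scipy is installed; the library choice is mine, not part of the course) that computes the two-tailed z-score thresholds for the usual confidence levels.

# A minimal sketch, assuming Python 3 with scipy available.
from scipy.stats import norm

for confidence in (0.90, 0.95, 0.99):
    # Two-tailed test: split the leftover area evenly between the two tails.
    tail_area = (1 - confidence) / 2
    z_cutoff = norm.ppf(1 - tail_area)  # about 1.645, 1.960, 2.576
    print(f"{confidence:.0%}: reject H0 if |z| >= {z_cutoff:.3f}")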
Two proportions from two populations
As with the one-proportion test, the test statistic is a z-score. We have the proportions from the two samples, p-hat1 = f1/n1 and p-hat2 = f2/n2, but we also need to create the pooled proportion p-bar = (f1 + f2)/(n1 + n2), along with its complement q-bar = 1 - p-bar.
Here's an example from polling data from the 2008 election.
Question: Was John McCain's popularity in Iowa in 2008 significantly different from his popularity in Pennsylvania?
Let's assume we don't know either way, so it will be a two-tailed test. Polling data traditionally uses the 95% confidence level, so the z-score will have to be greater than or equal to 1.96 or less than or equal to -1.96 for us to reject the null hypothesis. Here are our numbers, with Iowa as the first data set.
f1 = 263
n1 = 658
p-hat1 = .400
f2 = 283
n2 = 657
p-hat2 = .430
p-bar = (263+283)/(658+657) = .415 (q-bar = .585)
Type this into your calculator.
(.400-.430)/sqrt(.415x.585/658+.415x.585/657)[enter]
The answer is -1.103..., which rounds to -1.10. This would say the difference we see in the two samples is not enough to convince us of a significant difference in popularity for McCain between the two states, so we would fail to reject the null hypothesis. In the actual election, McCain had 45.2% of the vote in Pennsylvania and 44.8% of the vote in Iowa, which are fairly close to equal.
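If you would rather script this test than punch it into a calculator, here is a minimal Python sketch of the pooled two-proportion z-test using the numbers above. Rounding to three decimal places mirrors the hand calculation; only basic arithmetic is needed here.

from math import sqrt

# Iowa is sample 1, Pennsylvania is sample 2 (polling numbers from the example above).
f1, n1 = 263, 658
f2, n2 = 283, 657

# Round to three places first, matching the hand calculation above.
p_hat1 = round(f1 / n1, 3)               # .400
p_hat2 = round(f2 / n2, 3)               # .430
p_bar = round((f1 + f2) / (n1 + n2), 3)  # .415, the pooled proportion
q_bar = 1 - p_bar                        # .585

# The standard error uses the pooled proportion, since H0 assumes p1 = p2.
se = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
z = (p_hat1 - p_hat2) / se               # about -1.10

# Two-tailed test at 95% confidence: reject only if |z| >= 1.96.
print(f"z = {z:.2f}, reject H0: {abs(z) >= 1.96}")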
Student's t-scores
Confidence intervals are used over and over again in statistics, most especially in trying to find out what value the parameter of the population has, when all we can effectively gather is a statistic from a sample. For numerical data, we aren't allowed to use the normal distribution table for this process, because the standard deviation sx of a sample isn't a very precise estimator of sigmax of the underlying population. To deal with this extra level of uncertainty, a statistician named William Gosset came up with the t-score distribution, also known as Student's t-score, because Gosset published all his work under the pseudonym Student. He used this pen name to get around a ban on publishing in journals imposed by his superiors at the Guinness brewery where he worked.
The critical t-score values are published on table A-3. The values depend on the degrees of freedom, which for a single sample set of data equal n - 1. For every degree of freedom, we could print another positive and negative t-score table two pages long, just like the z-score table, but that would take up far too much room, so statistics textbooks have resorted instead to publishing just the highlights. There are five columns on the table, each labeled with an "Area in One Tail" value and the matching "Area in Two Tails" value. Let's look at the row for 13 degrees of freedom.
1 tail___0.005______0.01_____0.025______0.05______0.10
2 tails__0.01_______0.02_____0.05_______0.10______0.20
13_______3.012_____2.650_____2.160_____1.771_____1.350
What this means is that if we have a sample of size 14, then the degrees of freedom are 13 and we can use these numbers to find the cut-off points for certain percentages. The formula for t-scores looks like the formula for z-scores: z = (x - mux)/sigmax, while t = (x - x-bar)/sx. Because we don't know sigmax, we use the t-score table rather than the z-score table. For example, the second column in row 13 is the number 2.650. This means that in a sample of 14, a Student's t-score of -2.650 is the cutoff for the bottom 1%, a t-score of +2.650 is the cutoff for the top 1%, and the middle 98% lies between t-scores of -2.650 and +2.650.
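If you have Python and scipy handy, you can reproduce that row of table A-3 yourself. This is just a sketch; the last digit may differ from a printed table because of rounding.

from scipy.stats import t

df = 13  # a sample of size 14
for one_tail in (0.005, 0.01, 0.025, 0.05, 0.10):
    # ppf is the inverse CDF, so this is the t-score with area one_tail to its right.
    cutoff = t.ppf(1 - one_tail, df)
    print(f"one tail {one_tail} (two tails {2 * one_tail}): t = {cutoff:.3f}")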
Using the t-score table
Let's say we have a t-score of 2.53 and n = 25, which means the degrees of freedom are 25-1 = 24. Here is the line of the t-score table that corresponds to d.f. = 24.
1 tail___0.005______0.01_____0.025______0.05______0.10
2 tails__0.01_______0.02_____0.05_______0.10______0.20
24_______2.797_____2.492_____2.064_____1.711_____1.318
What does this mean for our t-score of 2.53? If it were a z-score, the look-up table would give us an answer to four digits, .9943, a score that is beyond the 95% confidence threshold for one tail (.9943 > .9500) but not beyond the 99% threshold for one tail, because those cutoffs are .9950 high and .0050 low. On the t-score table, all we can say is that 2.53 is between 2.492 and 2.797, the closest scores on our line. In a two-tailed test, it is beyond the 0.02 threshold (which would be 98% confidence, a level we don't use much) but not beyond the 99% threshold. In a one-tailed (high) test, our t-score falls between the 0.01 and 0.005 cutoffs, which means it passes the 99% threshold. Unlike the z-score table, the t-score table only lists positive values, so if we get a negative t-score in a test, we follow these rules.
1. You have a negative t-score and the test is two-tailed. Take the absolute value of the t-score and work with that.
2. You have a negative t-score and the test is one-tailed low. Again, the absolute value will work.
3. You have a positive t-score and the test is one-tailed low. This is a problem, since only a negative t-score can be significant in a one-tailed low test. You should fail to reject H0.
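Before the next example, here is a hedged Python sketch of that look-up logic. Instead of bracketing 2.53 between two table entries, scipy can report the exact tail area, and the absolute-value rules above are baked into the comparison.

from scipy.stats import t

t_score, df = 2.53, 24

# The exact area in one tail beyond |t|, which the printed table can only bracket.
one_tail_area = t.sf(abs(t_score), df)
two_tail_area = 2 * one_tail_area

print(f"one-tail area = {one_tail_area:.4f}")  # between 0.005 and 0.01
print(f"two-tail area = {two_tail_area:.4f}")  # between 0.01 and 0.02
# One-tailed (high) test at 99% confidence: passes, since the one-tail area < 0.01.
# Two-tailed test at 99% confidence: fails, since the two-tail area > 0.01.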
In the example below, we have yet another choice available: by picking which data set is group #1, we can always turn a one-tailed test into a one-tailed high test.
Two averages from two populations
In the tests to see if the average of some numerical value is significantly different when comparing two populations, we need the averages, standard deviations, and sizes of both samples. The score we use is a t-score, and the degrees of freedom are the smaller of the two sample sizes minus 1.
Question: Do female Laney students sleep a different number of hours each night than male Laney students?
This uses data sets from a previous class. Here are the numbers for the students who submitted data, with the males listed as group #1. Again, let's assume a two-tailed test, since we have no information going in about which group should be greater, and let's do this test to the 90% level of confidence.
With a test like this, we can arbitrarily choose which set is the first and which is the second. Let's do it so x-bar1 > x-bar2. This way, our t-score will be positive, which is what the table expects.
H0: mu1 = mu2 (average hours of sleep are the same for males and females at Laney)
x-bar1 = 7.54
s1 = 1.47
n1 = 12
x-bar2 = 7.31
s2 = .94
n2 = 26
The degrees of freedom will be 12-1 = 11, and 10% in two tails gives us the thresholds of +/-1.796. Here is what to type into the calculator.
(7.54-7.31)/sqrt(1.47^2/12+.94^2/26)[enter]
0.4971...
2 tails__0.01_______0.02_____0.05_______0.10______0.20
11_______3.106_____2.718_____2.201_____1.796_____1.363
This number is less than every threshold, and so does not impress us enough to make us reject the null hypothesis. It's possible that larger samples would give us numbers that would show a difference, which if true would mean this example produced a Type II error, but we have no proof of that.
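For comparison, here is a minimal Python sketch of the same conservative two-sample t-test. It follows the class rule of taking the smaller sample size minus 1 for the degrees of freedom, rather than scipy's built-in two-sample functions, which use a different degrees-of-freedom formula.

from math import sqrt
from scipy.stats import t

# Males are group 1, females are group 2 (sleep data from the example above).
x_bar1, s1, n1 = 7.54, 1.47, 12
x_bar2, s2, n2 = 7.31, 0.94, 26

# Unpooled standard error for the difference of two sample means.
se = sqrt(s1**2 / n1 + s2**2 / n2)
t_score = (x_bar1 - x_bar2) / se  # about 0.497

# Conservative rule: degrees of freedom = smaller sample size minus 1.
df = min(n1, n2) - 1              # 11
cutoff = t.ppf(0.95, df)          # 90% confidence, two-tailed: 5% per tail, about 1.796
print(f"t = {t_score:.4f}, cutoff = {cutoff:.3f}, reject H0: {abs(t_score) >= cutoff}")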