Monday, May 11, 2009

Class notes for 5/11: p-values and Confidence Levels

The test statistics we use for hypothesis testing will differ: sometimes z-scores, other times t-scores, and other times values from the chi-square table. What all these tests have in common is that each test statistic corresponds to a probability, called a p-value.
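As a small illustration of how a test statistic turns into a probability, here is a sketch for the z-score case only. It assumes the p-value in these notes is the cumulative probability you would read from the z table (the area to the left of the z-score). The function name `z_to_cumulative` is made up for this example.

```python
import math


def z_to_cumulative(z):
    """Area under the standard normal curve to the left of z."""
    # Standard identity: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(round(z_to_cumulative(0.0), 4))    # 0.5
print(round(z_to_cumulative(1.645), 4))  # 0.95
```

The t-score and chi-square cases work the same way in spirit, but they need their own tables (or a statistics library), so they are left out here.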

The idea is that if we run a test at 90% confidence, we will only reject H0 if the result we observed would happen 10% of the time or less when H0 is true.

Likewise, 95% confidence means we reject H0 when the probability of the result, assuming H0 is true, is less than or equal to 5%.

99% confidence means we reject H0 only for results that would happen 1% of the time or less.



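This table shows the p-value thresholds at which we reject the null hypothesis, which technically is the same as accepting the alternative hypothesis. The p-value here is the cumulative probability looked up for the test statistic. For a one-tailed high test, the threshold is easy: .90 for 90% confidence, .95 for 95% confidence and .99 for 99% confidence, and we reject H0 when the p-value is at or above the threshold. For a one-tailed low test, the threshold is 100% minus the confidence level (.10, .05 and .01), and we reject H0 when the p-value is at or below it. For a two-tailed test, the two tails together have to add up to 100% minus the confidence level, so each tail gets half. At the 90% confidence level, the thresholds are the p-values 5% (.05) and 95% (.95), and we reject H0 when the p-value is at or below .05 or at or above .95.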
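The threshold rules above can be sketched in a few lines. This is only an illustration under the convention used in these notes, where the p-value is a cumulative probability. The names `rejection_thresholds` and `reject_h0` and the tail labels "high", "low" and "two" are made up for this example.

```python
def rejection_thresholds(confidence, tail):
    """Return (low, high) p-value thresholds; None means that side is unused."""
    # Rounding hides floating-point noise such as 1 - 0.9 = 0.09999999999999998.
    alpha = round(1.0 - confidence, 10)
    if tail == "high":
        return (None, confidence)          # reject when p >= confidence
    if tail == "low":
        return (alpha, None)               # reject when p <= 1 - confidence
    if tail == "two":
        # Split alpha evenly between the two tails.
        return (round(alpha / 2, 10), round(1.0 - alpha / 2, 10))
    raise ValueError("tail must be 'high', 'low' or 'two'")


def reject_h0(p, confidence, tail):
    """True if the cumulative p-value falls in the rejection region."""
    low, high = rejection_thresholds(confidence, tail)
    in_low_tail = low is not None and p <= low
    in_high_tail = high is not None and p >= high
    return in_low_tail or in_high_tail


# Print the thresholds for each confidence level and tail type.
for confidence in (0.90, 0.95, 0.99):
    for tail in ("high", "low", "two"):
        print(confidence, tail, rejection_thresholds(confidence, tail))
```

For example, `reject_h0(0.03, 0.90, "two")` is True because .03 is below the lower threshold of .05, while `reject_h0(0.50, 0.90, "two")` is False.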

Remember, just because a test convinced us to reject the null hypothesis doesn't mean the null hypothesis is false. It could still be true, and we would be making a Type I error.

If we use the 90% confidence level, we should expect to make Type I errors about 10% of the time when H0 is actually true.

At the 95% confidence level, the probability of Type I errors is 5%.

At 99% confidence level, the probability of Type I errors is 1%.
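One way to convince yourself of these error rates is a quick simulation. The sketch below assumes a one-tailed high z-test and generates z-scores from a world where H0 is true, then counts how often we would reject anyway. The function names, the number of trials and the seed are arbitrary choices for this example, not part of the class material.

```python
import math
import random


def z_to_cumulative(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def type_i_error_rate(confidence, trials=20000, seed=0):
    """Fraction of true-H0 trials that a one-tailed high test rejects."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        z = rng.gauss(0.0, 1.0)  # H0 is true: z is standard normal
        if z_to_cumulative(z) >= confidence:
            rejections += 1      # rejecting a true H0 is a Type I error
    return rejections / trials


for confidence in (0.90, 0.95, 0.99):
    print(confidence, type_i_error_rate(confidence))
```

The printed rates should land close to 10%, 5% and 1%, with a little wobble because the trials are random.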

The lower likelihood of errors is why most tests done on medical data are done at the 99% confidence level.

The p-value is often published to show just how strong the result was. Maybe you only set out to show something at 90% confidence on a one-tailed high test, but the p-value is .9978. This shows people who read the findings that the data would be strong enough to reject H0 at even higher confidence levels, such as 95% or 99%.
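To see what a reader can do with a published p-value, here is a small sketch for the one-tailed high case. It checks the p-value against a list of confidence levels and reports the strictest one it still passes. The function name and the list of levels (including .999) are assumptions made for this example.

```python
def highest_confidence_passed(p, levels=(0.90, 0.95, 0.99, 0.999)):
    """Strictest confidence level at which a one-tailed high test rejects H0."""
    # On a one-tailed high test we reject when the p-value is at or above the level.
    passed = [level for level in levels if p >= level]
    return max(passed) if passed else None


print(highest_confidence_passed(0.9978))  # 0.99
```

So a p-value of .9978 rejects H0 at 90%, 95% and 99% confidence, but not at 99.9%.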
