Thursday, October 15, 2015

Notes for October 13 and 15

Testing a hypothesis about the average of an underlying population (mux) based on the average of a sample (x-bar)

Let's assume we have information about an average from a population. For example, it is regularly assumed that human body temperature is 98.6° Fahrenheit or that the average IQ is 100. Another way to get such a baseline is the one in our test of whether 2015 in Oakland is warmer than average, which compares 2015 to the average temperature for 1999 to 2014. Here are the steps to take: create a null hypothesis H0 and an alternate hypothesis HA, choose a confidence level at which we will reject H0, and use a sample's statistics to see whether the data warrants rejecting the null hypothesis or fails to meet that standard.

Setting the two hypotheses: The null hypothesis is always an equality and the alternate hypothesis an inequality. There are three kinds of inequalities.


One tailed high: Consider a drug that is supposed to increase muscle mass. The only kind of result that will impress us is one that shows that increase. This would set the two hypotheses as

H0: mux = constant
HA: mux > constant

One tailed low: Instead let's say we have a drug that is supposed to reduce cholesterol. We want to see results where the average goes down.

H0: mux = constant
HA: mux < constant

Two tailed: In class, we looked at data that seems to indicate human body temperature is not 98.6° Fahrenheit. When this claim was made, it was not made clear if the temperature was now higher or lower and in this case, any significant difference would be a surprising result.

H0: mux = constant
HA: mux != constant (The equal sign with the slash isn't available in this text editor. The symbols '!=' are used in some computer languages to mean inequality.)

Setting the confidence level: We may have some leeway as to whether our confidence level is 90%, 95% or 99%. In most experiments I've seen published about scientific statements, the 99% confidence level is standard.

Using the numbers from the experiment: We will need x-bar, sx and n to produce our test statistic t = (x-bar - mux)/sx * sqrt(n), where mux is the constant from H0. We will also use n to get the degrees of freedom, which in this case is n-1.

Example: In class we had a set where n = 36, so degrees of freedom would be 35.
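As a small sketch of the calculation just described, here is the test statistic and the degrees of freedom in Python. The function name and the sample numbers (an IQ sample averaging 103 with standard deviation 15) are made up for illustration and are not from class; only n = 36 matches the example above.

```python
import math

def t_statistic(x_bar, mu_0, s_x, n):
    """Test statistic t = (x-bar - mux)/sx * sqrt(n), with mux taken from H0."""
    return (x_bar - mu_0) / s_x * math.sqrt(n)

# Hypothetical sample, for illustration only.
n = 36
t = t_statistic(x_bar=103.0, mu_0=100.0, s_x=15.0, n=n)
degrees_of_freedom = n - 1

print(round(t, 3), degrees_of_freedom)
```

With these made-up numbers the statistic is small, so this sample would not come close to any of the usual thresholds.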

Threshold for one-tailed high test in this situation: Our test stat t would have to be greater than 2.438.

Threshold for one-tailed low test in this situation: Our test stat t would have to be less than -2.438.

Threshold for two-tailed test in this situation: Our test stat t would have to be greater than 2.724 -OR- less than -2.724.
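These thresholds normally come from a t table or a statistics package. As a rough check that uses only the Python standard library, the sketch below integrates the t density numerically and searches for the cutoff by bisection. The function names, step count and search range are my own choices, not anything from class. At 99% confidence the one-tailed tests put 1% in a single tail, while the two-tailed test puts 0.5% in each tail.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(upper_tail_area, df):
    """Find c with P(T > c) = upper_tail_area by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail_area:
            lo = mid   # tail still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2

df = 35
one_tailed = t_critical(0.01, df)    # 1% in one tail
two_tailed = t_critical(0.005, df)   # 0.5% in each of two tails
print(round(one_tailed, 3), round(two_tailed, 3))
```

The two numbers printed should land very close to the 2.438 and 2.724 quoted above.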

When we plugged in the values 97.96 for the sample average and 0.69 for the sample standard deviation we got (97.96 - 98.6)/0.69 * sqrt(36) = -5.57. This number is well beyond our low threshold of -2.724 and we would reject the null hypothesis. The technical statement in this case would be

"We are 99% confident from the evidence of our sample that the average human body temperature is not 98.6° Fahrenheit." Notice that we cannot say what the true value is from this, though most samples with fairly large n now place the true number at around 98.2° Fahrenheit. We cannot be certain if the temperature has changed over time or if the means of measurement have become more accurate.
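To tie the steps together, here is a short sketch of the decision rule for all three kinds of test, applied to the body temperature sample. The thresholds are the ones quoted above for 35 degrees of freedom at 99% confidence. The function name and the labels "high", "low" and "two" are just my own shorthand.

```python
import math

# Thresholds from the notes for df = 35 at the 99% confidence level.
ONE_TAILED = 2.438
TWO_TAILED = 2.724

def reject_null(t, tail):
    """Return True if t falls in the rejection region for the chosen test.

    tail is "high" (HA: mux > constant), "low" (HA: mux < constant)
    or "two" (HA: mux != constant).
    """
    if tail == "high":
        return t > ONE_TAILED
    if tail == "low":
        return t < -ONE_TAILED
    if tail == "two":
        return abs(t) > TWO_TAILED
    raise ValueError("tail must be 'high', 'low' or 'two'")

# Body temperature sample from class: x-bar = 97.96, sx = 0.69, n = 36.
t = (97.96 - 98.6) / 0.69 * math.sqrt(36)

for tail in ("high", "low", "two"):
    print(tail, reject_null(t, tail))
```

The same sample rejects H0 under the two-tailed and one-tailed low tests but not under a one-tailed high test, which is why the alternate hypothesis has to be chosen before looking at the data.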

 
