Monday, February 17, 2014

Notes for February 11 and 13

Reverse look-up of z scores: Percentiles to z-scores, z-scores to raw scores

So far, we have used the percentile look-up table only when we know a data set is normally distributed. In these cases, we are given the average μ_x and the standard deviation σ_x, and we find the z-score z(x) for a raw score x using this formula.

z(x) = (x - μ_x)/σ_x

We can then use this z-score to look up a percentile on the orange look-up table.  

Example: For men, the average height is 70.5 inches (5' 10.5") and the standard deviation is 2.8 inches. What is the z-score for a man 6'0" tall (72 inches)?

z(72) = (72 - 70.5)/2.8 = 1.5/2.8 = 0.535714..., which rounded to the nearest hundredth is 0.54. Using our lookup table in row 0.5 and column 0.04, we get .7054, which means a man 6'0" tall is as tall as or taller than 70.54% of the male population. Subtracting from 100%, we can also say that 29.46% of the male population is 6'0" or taller.
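The table look-up above can also be checked by computer. This is a minimal sketch using Python's standard `statistics.NormalDist` in place of the printed table; the function names `z_score` and `percentile_from_z` are made up for illustration and are not part of the class materials.

```python
from statistics import NormalDist

MEAN_HEIGHT = 70.5  # inches, average height of men from the example
SD_HEIGHT = 2.8     # inches, standard deviation from the example


def z_score(x, mean, sd):
    """Return the z-score of the raw score x."""
    return (x - mean) / sd


def percentile_from_z(z):
    """Proportion of a normal population at or below z (what the table lists)."""
    return NormalDist().cdf(z)


# Round z to the nearest hundredth, as we do before using the table.
z = round(z_score(72, MEAN_HEIGHT, SD_HEIGHT), 2)
p = percentile_from_z(z)

print("z-score:", z)
print("proportion at or below:", round(p, 4))
print("proportion at or above:", round(1 - p, 4))
```

Rounded to four decimal places, the proportion should agree with the table entry for the same z-score.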

What if instead we wanted to find the cut-off height for the 75th percentile of men? We have to find where the table has two consecutive entries that look like this

... .74xx .75xx ...

That would mean the cutoff for the 75th percentile lies between those two z-scores. Looking on the positive side of the chart we find

0.6 ... .7486 .7517 ...

in the columns 0.07 and 0.08. This means the 75th percentile lies between z = 0.67 and z = 0.68. Since .7486 is .0014 below .7500 and .7517 is .0017 above, it's fair to say the 75th percentile is about halfway in between, and 0.675 is a good approximation.

Now that we have a z-score, we use this formula to find the raw score x.

x = μ_x + z(x) × σ_x

In this case we get

x = 70.5" + 0.675 × 2.8" = 70.5" + 1.89" = 72.39" = 6' 0.39"

This says the 75th percentile of men's heights is at just below six feet and one half inch tall.
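The reverse look-up can be sketched in code as well. This version assumes Python's standard `statistics.NormalDist`, whose `inv_cdf` method returns the z-score for a given proportion directly instead of estimating between two table entries, so its answer may differ slightly from the hand estimate. The helper name `raw_score_for_percentile` is hypothetical.

```python
from statistics import NormalDist

MEAN_HEIGHT = 70.5  # inches, from the example
SD_HEIGHT = 2.8     # inches, from the example


def raw_score_for_percentile(p, mean, sd):
    """Return (raw score, z-score) at proportion p of a normal population."""
    z = NormalDist().inv_cdf(p)  # reverse look-up: proportion -> z-score
    return mean + z * sd, z      # x = mean + z * sd


height, z = raw_score_for_percentile(0.75, MEAN_HEIGHT, SD_HEIGHT)
feet, inches = divmod(height, 12)

print(f"z = {z:.3f}, height = {height:.2f} in ({int(feet)} ft {inches:.1f} in)")
```

The same function works for any percentile strictly between 0 and 1, for example 0.90 for the 90th percentile.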

What about the percentage of men between 5'11" and 6'2"? Here we find the percentages for both heights, then subtract the smaller percentage from the larger.

6'2" = 74", and z(74) = (74-70.5)/2.8 = 3.5/2.8 = 1.25

z = 1.25 corresponds to 0.8944

5'11" = 71", and z(71) = (71-70.5)/2.8 = 0.5/2.8 = 0.17857..., which rounds to 0.18

z = 0.18 corresponds to 0.5714

0.8944 - 0.5714 = 0.3230, which says about 32.3% of men are between 5'11" and 6'2" tall.
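The subtraction above can be sketched as a small function. This assumes Python's standard `statistics.NormalDist`; the name `proportion_between` is made up for illustration. Because the code does not round the z-scores to the nearest hundredth first, its answer can differ from the table-based answer in the third or fourth decimal place.

```python
from statistics import NormalDist

# Normal model of men's heights in inches, using the example's numbers.
heights = NormalDist(mu=70.5, sigma=2.8)


def proportion_between(dist, low, high):
    """Proportion of the population with a raw score between low and high."""
    return dist.cdf(high) - dist.cdf(low)  # larger proportion minus smaller


share = proportion_between(heights, 71, 74)  # 5'11" = 71 in, 6'2" = 74 in
print(f"about {share:.1%} of men")
```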

The Central Limit Theorem 

If we have a normally distributed population and take a random sample of size n, the z-score of the sample average, z(x-bar), differs from z(x) because the standard deviation changes: the averages of samples of size n have a standard deviation equal to the population standard deviation divided by the square root of n. The simplest way to compute z(x-bar) is to take the z-score the normal way, then multiply by the square root of n. For example, if 10 men averaged a height of 6'0", that average would have a different z-score than a single man 6'0" tall and would correspond to a different percentile.

z(x-bar) = (72 - 70.5)/2.8 × sqrt(10) = 1.694..., which rounds to 1.69. Looking up row 1.6 and column 0.09, we get .9545. This says a sample of 10 men averaging 6'0" tall has a higher average than 95.45% of all samples of ten men, and only 4.55% of all samples of ten men average taller than that.

This rule is called the Central Limit Theorem. We will be using it to try to get estimates of the average of a population using the average of a sample that is assumed to be representative.
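The sample-average calculation can be sketched as follows. This assumes Python's standard `math` and `statistics` modules; the function name `z_score_of_sample_mean` is hypothetical. Dividing by the standard deviation of the sample averages is the same as taking the ordinary z-score and multiplying by the square root of n.

```python
from math import sqrt
from statistics import NormalDist


def z_score_of_sample_mean(x_bar, mean, sd, n):
    """z-score of a sample average x_bar for a sample of size n."""
    standard_error = sd / sqrt(n)  # standard deviation of sample averages
    return (x_bar - mean) / standard_error


# Ten men averaging 72 inches, with the population numbers from the example.
z = z_score_of_sample_mean(72, 70.5, 2.8, 10)
below = NormalDist().cdf(z)  # proportion of samples of ten with a lower average

print(f"z = {z:.2f}")
print(f"proportion of samples averaging shorter: {below:.4f}")
print(f"proportion of samples averaging taller: {1 - below:.4f}")
```

The code uses the unrounded z-score, so its proportions can differ slightly from the table values found with z rounded to 1.69.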
