Tuesday, April 28, 2009

Class Notes for 4/27 and 4/29

Dependence and Independence
The first line of Leo Tolstoy's Anna Karenina is "All happy families are alike; each unhappy family is unhappy in its own way." In statistics, all independent trials are alike, in that the probability of a particular outcome of one trial does not affect the outcome of later trials, nor is it affected by earlier trials. With dependent probability, the outcome of one trial is affected by the outcomes of previous trials, but how it is affected is not always the same.

For example, if we talk about a 70% free throw shooter taking two shots, does missing the first shot affect the probability of missing the second shot? Let's look at this simple problem three different ways.

Predicting the future mathematically by carefully studying the past: Let's say I called this person a 70% free throw shooter because so far in the season, she has made 7 of 10 shots from the line. If she misses, she has now made 7 of 11 shots from the line, and she is now a 63.6% shooter. Should we factor that in to be more precise? Some mathematical models would say yes.

Using this method, missing a shot would make her percentage worse and making one would make it better, no matter how many shots she had taken. But if she had made 70 of 100 so far in the season, one miss would make her 70 of 101, which would lower her percentage to 69.3%, a much smaller effect than a miss has when she is 7 of 10. If instead we were looking at her entire career rather than a single season, perhaps she has made 700 of 1000, and missing one would mean she was 700 of 1001, which changes her percentage to 69.9%. When discussing free throw percentage, announcers on TV usually round to the nearest percent, so the first example of 7 of 10 to 7 of 11 would be a drop from 70% to 64%, the second example would be a drop from 70% to 69%, and in the third example, 69.9% would round to 70% and the change would be too small to notice.
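Here is a minimal Python sketch of this recalculation (my addition, not part of the original notes; the function name is just for illustration).

from fractions import Fraction

def percentage_after(makes, attempts, made_next_shot):
    """Fold one more attempt into a shooter's percentage."""
    return Fraction(makes + (1 if made_next_shot else 0), attempts + 1)

for makes, attempts in [(7, 10), (70, 100), (700, 1000)]:
    before = makes / attempts
    after = float(percentage_after(makes, attempts, made_next_shot=False))
    print(f"{makes}/{attempts}: {before:.1%} -> after a miss -> {after:.1%}")
# 7/10: 70.0% -> after a miss -> 63.6%
# 70/100: 70.0% -> after a miss -> 69.3%
# 700/1000: 70.0% -> after a miss -> 69.9%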

Factors affecting success and failure: The free throw shooter goes to the line for two shots and misses the first. Is there a physical reason? We might treat free throw shooting as we would picking a random number from 1 through 100, where we count any number from 1 through 70 as a success and 71 through 100 as a failure, but shooting free throws includes the human factor. Maybe she missed because she is nervous or distracted. Maybe it's late in the game and she is tired or injured, changing her technique. If any of these are the case, it might make more sense for us to downgrade the probability of making the next shot, though exactly how much it should be downgraded is no longer some simple formula like turning a fraction into a percentage.

Compensating for failure: Again, let's add the human factor into the problem. She misses the first free throw, and her coach notices her technique looks inconsistent. "Elbow up!" the coach shouts from the sidelines, and the shooter hears the coach and readjusts her technique to match the way she shoots in practice. Should this change bring her back to being a 70% shooter, or even upgrade her chance of success? That is uncertain, but the failure on the first shot and the diagnosis of at least one reason for the failure could affect the odds, and that effect means the second shot should not be considered independent of the first.

The dependent probabilities in a 52 card deck

One of the simplest mathematical models of dependency is sampling without replacement, which is the way most card games, lotteries and the game of Bingo work. You have a set of outcomes which are effectively randomized, and a trial is performed: a card is taken from the deck, a ping pong ball is removed from the hopper, or a bingo marker is removed from the spinner. Once it is removed, the number of possible outcomes has been reduced by one, and the probabilities of success and failure of certain outcomes change.

Looking for an ace: There are 52 cards in a standard deck and 4 of them are aces. If I draw a card from a randomized deck, the chances are 4/52 = 1/13 ~= 7.7% that the card will be an ace. What are the chances the second card is an ace?

That depends on the first card.

Probability that the second card is an ace, given the first card is an ace, is 3/51 = 1/17 ~= 5.9%.

Probability that the second card is an ace, given the first card is not an ace, is 4/51 ~= 7.8%.

In the mathematical model of free throw shooting, we re-calculated the probability by folding the most recent make or miss into the percentage, so a miss brings the odds down and a make brings them up. Sampling without replacement works the other way around: not getting an ace makes the odds of an ace a little better next time, and getting an ace makes them worse.
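A quick check of these fractions in Python (my addition, not in the original notes), using exact arithmetic:

from fractions import Fraction

p_first = Fraction(4, 52)            # 4 aces among 52 cards
p_ace_after_ace = Fraction(3, 51)    # an ace is gone, 51 cards remain
p_ace_after_other = Fraction(4, 51)  # all 4 aces among the remaining 51

print(p_first, float(p_first))                      # 1/13 0.0769...
print(p_ace_after_ace, float(p_ace_after_ace))      # 1/17 0.0588...
print(p_ace_after_other, float(p_ace_after_other))  # 4/51 0.0784...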


The formula for the dependent probability model of sampling without replacement is

C(n, r) * (G nPr r) * (B nPr w) / (T nPr n)

The first factor C(n, r) is a binomial coefficient, the number you get when you use nCr on your calculator, which I pronounce "n choose r" in class. The factors written with nPr are the numbers you get from the nPr function on your calculator, which I pronounce "n fall r", referring to the name "the falling factorial"; the convention of writing these as a base with an underlined exponent was developed by Donald Knuth at Stanford. If we think about a deck of cards, the lowercase letters refer to the hand: n is the size of the hand, r is the number of successful trials (r for right), w is the number of unsuccessful trials (w for wrong), and r+w=n. The uppercase letters refer to the deck: T is the size of the deck, G is the number of cards we consider a success if we draw them, and B is the number of cards we consider a failed trial if we draw them. The letter T stands for Total, G for Good and B for Bad. Again, we have an equation, G+B=T.

Example: If we consider drawing a heart a success and anything else a failure, what is the probability of drawing three hearts and two non-hearts in a five card hand from a well-shuffled 52 card deck?

Here are the six numbers we need.
n = 5
r = 3
w = 2
T = 52
G = 13
B = 39

On a TI-30XIIs, here are the keys you would press.

5[prb][right]3×13[prb]3×39[prb]2÷52[prb]5[enter]

The calculator will read as follows.

5 nCr 3*13 nPr 3*39 nPr 2/52 nPr 5
0.081542617

This means the probability of exactly three hearts and two cards of some other suit is about 8.15%.
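For readers who prefer code to calculator keystrokes, here is a sketch of the same computation in Python (my addition; the function name is my own, and it assumes Python 3.8 or later for math.comb and math.perm):

from math import comb, perm

def sampling_without_replacement(n, r, w, T, G, B):
    """C(n,r) * (G nPr r) * (B nPr w) / (T nPr n), the formula from the notes."""
    assert r + w == n and G + B == T
    return comb(n, r) * perm(G, r) * perm(B, w) / perm(T, n)

# Three hearts and two non-hearts in a five card hand
print(sampling_without_replacement(n=5, r=3, w=2, T=52, G=13, B=39))
# 0.0815426..., the same 8.15% as the calculator

# An equivalent form using only binomial coefficients gives the same number
print(comb(13, 3) * comb(39, 2) / comb(52, 5))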

The Expected Value (EV) of a two outcome game

Let us assume we have a game that has only two outcomes, winning and losing. Let us further assume that two players have decided to wager on this game, both putting money into a pooled amount and the winner taking all at the end.


If we look at the game from the point of view of one of the players, we need to know the probability of winning p, how much that player put in, which we call Risk, and how much the opponent put in, called Profit. The expected value EV equals the probability of victory p times the sum of Profit and Risk, divided by Risk: EV = p*(Profit+Risk)/Risk.

Different books use different formulas for this. Some do not divide by Risk. By dividing, the number we get is a rate of return, so a game of flipping coins for $1 a game is equivalent to a game of flipping coins for $100 a game. Some subtract 1 from this formula, which just moves the break-even number from 1 to 0.

If EV = 1, we consider this a "fair game". For every $1 risked on this game, the expected value is that you will have that dollar returned to you, breaking even. Notice if we are flipping coins, that event never happens on any single play. Either the player makes a dollar profit or a dollar loss, but expected value is about the long run.

If EV > 1, the game is advantageous to the player. If EV < 1, the game is disadvantageous to the player.

In the game of roulette, there are 38 slots where the ball can land, and for simplicity's sake we will assume each has an equal chance of showing up, so p = 1/38. For every $1 you risk, you can make a profit of $35 if you correctly guess the exact slot where the ball will land. To find the expected value using the TI-30XIIs, you should type in this.

1÷38×(35+1)÷1[enter]

The calculator will read as follows.

1/38*(35+1)/1
0.947368421

What this number means is that for every dollar risked on the spin of a roulette wheel, you should expect about 94.7 cents returned to you in change. In other words, about 5.3 cents is lost from every dollar you bet on every spin of the wheel.
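As a sketch (my own addition, not from the original post), the EV formula translates directly into Python:

def expected_value(p, profit, risk):
    """EV = p * (Profit + Risk) / Risk, the return per dollar risked."""
    return p * (profit + risk) / risk

# Betting a single number at roulette: p = 1/38, profit $35, risk $1
print(expected_value(1/38, profit=35, risk=1))  # 0.9473684210526315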

Another way to play the game is to bet red or black. Of the 38 compartments, 18 are red, 18 are black and 2 are green. The probability of victory on betting one of the two major colors is 18/38 = 9/19 ~= 0.473684211. The profit and risk are now both $1. Here's what to type on the TI-30XIIs.


18÷38×(1+1)÷1[enter]

The calculator will read as follows.

18/38*(1+1)/1
0.947368421

The game has changed, both in probability and amount of profit compared to risk, but from the player's point of view, the expected value is precisely the same and still in favor of the casino.

No matter what the levels of profit and risk are, we can find a probability p that will make the expected value equal to 1, and that is p = Risk/(Profit+Risk). If the probability is increased with the profit and risk remaining unchanged, the game becomes advantageous. If it is decreased, the game becomes disadvantageous.

Modern and Classic Parimutuel odds

In classic form, profit and risk are listed like 3-1 or 2-7 (or sometimes with colons, 3:1 or 2:7), where profit is the first number and risk is the second.

On online betting sites, the numbers are given with absolute value of at least 100, with either a + or - in front of them. +250 means 250 is the profit and 100 is the risk, while -250 means 100 is the profit and 250 is the risk. The fourth page of the yellow sheet explains this in greater detail and shows how to switch back and forth between the two systems.
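Here is a sketch in Python (my addition; the function names are my own inventions for illustration) of the conversion between modern parimutuel lines and profit/risk pairs, together with the break-even p from the formula above:

def profit_and_risk(line):
    """Convert a modern parimutuel line such as +250 or -250 to (profit, risk)."""
    return (line, 100) if line > 0 else (100, -line)

def break_even_p(profit, risk):
    """The p that makes EV = 1: p = Risk / (Profit + Risk)."""
    return risk / (profit + risk)

for line in (+250, -250):
    profit, risk = profit_and_risk(line)
    print(line, "->", (profit, risk), "break-even p =", round(break_even_p(profit, risk), 3))
# +250 -> (250, 100) break-even p = 0.286
# -250 -> (100, 250) break-even p = 0.714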

Practice problems

1. With a well-shuffled 52 card deck, find the probability of getting exactly r hearts in a five card hand when
a) r = 0
b) r = 1
c) r = 2
d) r = 3 (already solved above)
e) r = 4
f) r = 5

2. Find the break-even p when Profit and Risk are as given. Round to three places after the decimal point.

a) Modern parimutuel = +150
b) Modern parimutuel = -110
c) Classic parimutuel = 5:3
d) Classic parimutuel = 5:11

Answers in the comments.

Thursday, April 23, 2009

Class notes for 4/22

Random and deterministic

It's common for people to use the word random in a casual manner, but in the field of the philosophy of science, discussion of whether something can be considered random or not is a subject of intense debate. The opposite of random is deterministic, which is to say that when we perform a task, we understand the possible outcomes thoroughly. For example, putting a key in a lock is a deterministic act. Will the door open? Not necessarily. It might be the wrong key. The key might be correct, but it might have worn out over time. A brand new key in a lock that has been used might have edges that are too sharp. The lock could malfunction. Maybe the person didn't turn the key in the right direction. So deterministic does not mean, "If you do a, then b will also happen." It can be more complicated than that. But in a completely deterministic act, we have an expected outcome, and even when it fails, we have explanations of why it fails.

The problem of determinism versus randomness is not a yes/no situation. We have some acts we consider random, like flipping a coin or rolling a die or choosing a card from a deck. If done under certain circumstances, even these can be deterministic. If the deck is fresh from the pack and unshuffled, the top card will be whatever the factory sorting put on top, so drawing it is completely deterministic. Some card tricks are done with decks that aren't actually randomly shuffled, or the magician has ways of forcing the participant to pick a certain card, so picking a card is deterministic; or picking the card is random but returning it to the deck is deterministic, so it can easily be found by being out of position. If we drop a coin or a die only a short distance, it might not bounce much, so the result is strongly determined by its original state. Much of the randomness of things like coin flips or dice rolls or lottery balls being removed from a hopper has to do with physics problems that could be considered deterministic if we understood all the variables, but the equations are so difficult that solving them completely is beyond even the most sophisticated computer simulations.

This brings us to random numbers and computers. A computer is completely deterministic. It cannot truly produce a random number, though every computer and even many calculators, including the TI-30XIIs, have random number generators. Here, the computer takes some input unknown to the user, puts it through a function also unknown to the user and produces an output. This is called pseudorandom, and debate over these methods continues to this day. One of the fathers of computer science, the great Hungarian mathematician John von Neumann, was quoted as saying, "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." A computer has nothing but arithmetical methods available. The tests done to see if a method produces output that is sufficiently random have to do with the distribution of a large set of pseudorandom output.


The probability of r successes in n independent trials, where p is the probability of success of any given trial

The simplest kind of problem in multiple trials is to assume independence between the trials, that the probability of success remains constant over all the trials, and we call that probability p. A basic question in these kinds of experiments is to ask what is the probability of getting a specific number of successes, calling that number r, in a specific number of trials, a number we will call n. If you have a TI-83 or TI-84, there is a function in the distribution menu (blue button then DISTR) called binompdf, which takes as its input n, p and r in that order. On the TI-30XIIs, we have to type in a formula.

Example #1: If we flip a fair coin, p = .5, ten times, what is the probability of exactly 6 heads and 4 tails?

TI-83 or TI-84: Go to the distribution menu, select binompdf and type in the following values

binompdf(10,.5,6) = 0.205078125

On the TI-30xIIs, here is what to type in

10[prb][right arrow]6x.5^6x.5^4[enter]

It will read
10 nCr 6 * .5^6 * .5^4
0.205078125
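If you would rather check this in code than on a calculator, here is a minimal Python version of the same formula (my addition, not part of the original notes):

from math import comb

def binom_pmf(n, p, r):
    """Probability of exactly r successes in n independent trials: C(n,r) * p^r * q^(n-r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

print(binom_pmf(10, 0.5, 6))  # 0.205078125, matching binompdf(10,.5,6)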



A different question would be what are the chances of at most r successes in n trials. Again, the TI-83 and TI-84 have a single function solution, this one called binomcdf(n, p, r). On the TI-30XIIs, we either have to calculate all the separate values, from the probability of 0 successes up through the probability of r successes, and then add them up, or, if this is too much work, there is a method using the normal distribution to approximate the binomial distribution.

If you have Java on your computer, you can go to this website to see how the numbers across a row of Pascal's Triangle look like the bell-shaped curve. We can use z-scores and the lookup table to get values that will approximate these probabilities, and the approximations get better as the numbers get bigger.

Example #2: If we flip a fair coin, p = .5, ten times, what is the probability of at most 6 heads?

TI-83 or TI-84: Go to the distribution menu, select binomcdf and type in the following values

binomcdf(10,.5,6) = 0.828125

On the TI-30XIIs:

This entails a lot of work, but not an impossible amount. We need to find
p(exactly 0 heads) + p(exactly 1 head) + p(exactly 2 heads) + p(exactly 3 heads) + p(exactly 4 heads) + p(exactly 5 heads) + p(exactly 6 heads) = 0.828125

We also have the normal approximation method, but n = 10 puts us right at the boundary of when it should be used.

n = 10, p = .5, r = 6

z = (6 + .5 - .5*10)/sqrt(.5*.5*10) = 1.5/sqrt(2.5) ~= .95

Lookup table: .95 -> .8289

Here the approximate value of about 82.9% happens to land close to the true value of about 82.8%, but with numbers this small we should not count on that.

Different books have different cut-off points, but at a minimum, the normal approximation to binomial should only be used when both np > 5 and nq > 5. In this case, both values equal exactly 5, so by this rule it is not a good candidate for the method, even though the estimate came out close this time.

Example #3: If we flip a fair coin, p = .5, one hundred times, what is the probability of at most 60 heads?

TI-83 or TI-84: Go to the distribution menu, select binomcdf and type in the following values

binomcdf(100,.5,60) = 0.982399899891... ~= .9824

TI-30XIIs: It's too many things to add up, so the approximation method is our only hope. Both np > 5 and nq > 5, so we can move forward.

(60 + .5 - .5*100)/sqrt(.5 * .5 * 100) = 2.1

Lookup table: 2.1 -> .9821

When n = 100, the approximation is not perfect, but it is very close, and this time np and nq are comfortably above the cut-off.
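Both the exact sum and the normal approximation can be sketched in Python (my addition; math.erf stands in for the lookup table):

from math import comb, erf, sqrt

def binom_cdf(n, p, r):
    """Exact P(at most r successes): add up the individual probabilities."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r + 1))

def normal_approx(n, p, r):
    """Normal approximation with continuity correction: z = (r + .5 - np)/sqrt(npq)."""
    z = (r + 0.5 - n * p) / sqrt(n * p * (1 - p))
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF, replacing the lookup table

print(binom_cdf(10, 0.5, 6), normal_approx(10, 0.5, 6))      # 0.828125  0.8286...
print(binom_cdf(100, 0.5, 60), normal_approx(100, 0.5, 60))  # 0.9824...  0.9821...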


Another question that could be asked is the probability of getting at least r successes in n trials. "At least r" is the complement of "At most r-1", which means the two probabilities will add up to one.

Example #4: If a 70% free throw shooter takes ten shots, p = .7, what is the probability she makes 8 shots or more?

On the TI-83 or TI-84:

1 - binomcdf(10, .7, 7) = .3827827864..., which rounds to .3828 to four places.

On the TI-30XIIs

10[prb][right arrow]8x.7^8x.3^2[enter]

It will read
10 nCr 8 * .7^8 * .3^2
0.233474441

10[prb][right arrow]9x.7^9x.3^1[enter]

It will read
10 nCr 9 * .7^9 * .3^1
0.121060821

10[prb][right arrow]10x.7^10x.3^0[enter]

It will read
10 nCr 10 * .7^10 * .3^0
0.028247525

If we round these numbers to five places, add them up and round the answer to four places, it should agree with the answer above.

.23347 + .12106 + .02825 = .38278, which rounds to .3828, which agrees with the above answer.
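The complement identity from above can be checked directly in Python (my addition, not part of the original notes):

from math import comb

def binom_pmf(n, p, r):
    """C(n,r) * p^r * q^(n-r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.7
at_least_8 = sum(binom_pmf(n, p, r) for r in (8, 9, 10))
one_minus_at_most_7 = 1 - sum(binom_pmf(n, p, r) for r in range(8))
print(at_least_8, one_minus_at_most_7)  # both 0.3827827864...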

Example #5: If a 70% free throw shooter takes 100 shots, p = .7, what is the probability she makes 75 shots or more?

On the TI-83 or TI-84:

1 - binomcdf(100, .7, 74) = .163130104..., which rounds to .1631 to four places.

On the TI-30XIIs: The problem is adding up too many parts, but np = 70 and nq = 30, both of which are more than 5, so let's go to the approximation method.

z = (75 - .5 - 70)/sqrt(.7*.3*100) = .98198..., which rounds to .98

Lookup table: .98 -> .8365, and 1-.8365 = .1635, which again is pretty close to the actual answer.

Tuesday, April 21, 2009

Class Notes for 4/20

We dealt with probability in a single instance earlier in the class when we had the relative frequencies of the values of categorical variables. Relative frequencies, written for a population as p and for a sample as p-hat, are numbers between 0 and 1. If we take the relative frequencies of all the values of a variable, they will add up to 1, or something very close to 1 depending on rounding error.

We will now talk about probability in multiple event experiments, like flipping ten coins or rolling five dice or drawing a hand of four cards from a 52 card deck. The first important split in the types of multiple event experiments is between independent and dependent events.

Events are independent if the probability of a later event does not change based on the result of an earlier event. For example, if I flip a coin that I can assume is fair, there is a 50% chance of heads and a 50% chance of tails every time I flip it. If by chance, the coin comes up heads ten times in a row, even though earlier testing had shown it to be a 50%-50% chance each time, the eleventh flip is still 50%-50%. Unusually long runs of all heads or all tails are rare, but they are not impossible. Flipping coins and rolling dice are typical examples of independent random events.

Events are dependent if the probability of a later event changes based on the result of an earlier event. The typical example of this is drawing cards from a shuffled deck. If the deck has 52 cards and 4 aces, the probability of drawing an ace from the deck is 4/52 = 1/13 ~= .0769...

Once the first card has been drawn, what is the probability of the second card being an ace? That depends on what the first card is. If the first card is an ace, there are only 3 left in the deck, which now has 51 cards, so the probability is 3/51 = 1/17 ~= .0588..., a lower probability than getting an ace on the first card.

If the first card wasn't an ace, the odds are 4/51 ~= .0784..., a slightly higher probability than drawing an ace the first time.

If I say someone is a 70% free throw shooter, is every free throw attempt independent of what happened before? Often, we set up such an experiment assuming independence just to make our work simpler, but the human factor is involved, so in reality it's very likely to be dependent. Some people get frustrated after a few misses and will do worse. Others will learn from the mistakes of a few misses and figure out what they are doing wrong and make improvements. A player might be having a bad day for some reason, or might instead have excellent concentration or just really good luck that day. But again, these kinds of experiments are often set up as though each free throw trial is independent of what came before.

Let's look at flipping coins. A list of all possible events is called the event space. Here are some examples of event spaces.

Event space for flipping one coin
Heads (H)
Tails (T)

ways to get one head = 1
ways to get no heads = 1

Event space for flipping two coins
HH
HT
TH
TT

ways to get two heads = 1
ways to get one head = 2
ways to get no heads = 1

Event space for flipping three coins
HHH
HHT
HTH
HTT
THH
THT
TTH
TTT

ways to get three heads = 1
ways to get two heads = 3
ways to get one head = 3
ways to get no heads = 1
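These event spaces are small enough to enumerate by brute force; here is a short Python sketch (my addition) that reproduces the counts above:

from itertools import product

# List every sequence of H and T for n coins and count how many heads each has
for n in (1, 2, 3):
    counts = {}
    for outcome in product("HT", repeat=n):
        heads = outcome.count("H")
        counts[heads] = counts.get(heads, 0) + 1
    print(n, "coin(s):", counts)
# 3 coin(s): {3: 1, 2: 3, 1: 3, 0: 1}, matching the list above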


The list of numbers of ways to get r successes in n trials is often written in a triangular pattern, and this pattern is called Pascal's Triangle, at least in most of the world. The Italians call it Tartaglia's Triangle and the Chinese call it Yanghui's Triangle. None of these people actually invented it or claimed to have invented it. It's been around since before the time of Christ, and it has been studied all around the world.

While it is very common to see it presented as an equilateral triangle, it can also be presented where the first numbers in each row are lined up straight, as follows:

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
... etc.

It is standard to start counting the top row as row 0, and the leftmost column as column 0. For example, the 6 we see in the middle of the last row I typed in is in row 4, column 2. Instead of having a copy of Pascal's Triangle around, our calculators have these numbers available. On Texas Instruments calculators, the function is under the probability menu. On the TI-30XIIs, the way to get that 6 is to type

4 [prb][right arrow]2[enter]

The calculator will read

4 nCr 2
6

All scientific calculators should have this function available, but all of them are slightly different. The TI-89 writes it as nCr(4,2) and Casio calculators write it as 4 C 2. I will pronounce it "4 choose 2", and when I type on the blog, I will type C(4,2). When I write it on the board or on tests, I will put a 4 on top of a 2 and surround both numbers with large parentheses. These numbers are called the binomial coefficients.
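In Python (my addition, not part of the original notes), math.comb plays the role of nCr, and a whole row of Pascal's Triangle is just C(n, 0) through C(n, n):

from math import comb

print(comb(4, 2))  # 6, the entry in row 4, column 2

for n in range(5):
    print([comb(n, r) for r in range(n + 1)])
# [1], [1, 1], [1, 2, 1], [1, 3, 3, 1], [1, 4, 6, 4, 1]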


The formula for finding the probability of exactly r successes in n independent trials, where the probability of success on any single trial is p, is C(n,r)*p^r*q^w. In some books, they don't use the letter q, instead replacing it with (1-p). Likewise, sometimes w is replaced with (n-r). I use the extra letters and include the relationships between them: q = 1-p and w = n-r. The letters r and w stand for right and wrong. The letters p and q are standard in probability texts for the probabilities of a success and a failure.

Let's do an example. You are given a four question multiple choice test, each question having five possible answers. The test is given in a language you do not read, so all you can do is guess. Each question is independent from the others, meaning that if C is the right answer to the first question, it can still be the answer to the second. The probability p of a correct guess is 1 chance in 5, or .2. The probability of failure q is 1 - .2 = .8, and of course p + q = 1.

Probability of no correct answers = C(4,0)*.2^0*.8^4 = .4096
Probability of exactly one correct answer = C(4,1)*.2^1*.8^3 = .4096
Probability of exactly two correct answers = C(4,2)*.2^2*.8^2 = .1536
Probability of exactly three correct answers = C(4,3)*.2^3*.8^1 = .0256
Probability of four correct answers = C(4,4)*.2^4*.8^0 = .0016

The expected value of correct answers is n*p, so in this case it's 4*.2 = .8, which isn't possible. You can't get a fraction of correct answers on a multiple choice test. The expected value in this case says that over the long run, a test like this should average .8 right answers out of four. As we can see, the most likely thing to happen is actually a tie for first, where getting either no answers right or one answer right both have a probability of about 41%. If you need to get three answers right to pass the test, the odds are less than 3% to get either three or four right, and the odds of getting everything right by chance are a very slim 16 chances in 10,000.
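Here is the whole distribution for this test reproduced in Python (my addition), using the formula above:

from math import comb

n, p = 4, 0.2
for r in range(n + 1):
    prob = comb(n, r) * p**r * (1 - p)**(n - r)
    print(r, "correct:", round(prob, 4))
# 0 correct: 0.4096, 1 correct: 0.4096, 2 correct: 0.1536,
# 3 correct: 0.0256, 4 correct: 0.0016

print("expected value:", n * p)  # 0.8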

If you have a TI-83 or TI-84, there is a function under the distribution menu called binompdf(n,p,r). All you have to do is enter the function, then the three values in the order given, separated by commas.

The function for three right in four trials with probability .2 at each trial is binompdf(4, .2, 3), which as we see above is .0256.


Practice problem.
The test is changed. There are now five multiple choice questions and four choices for each, but it is still given in a language you do not read.

Round the probabilities to four places after the decimal.

1. What is the expected value?

2. What is the probability of no correct answers?

3. What is the probability of exactly one correct answer?

4. What is the probability of exactly two correct answers?

5. What is the probability of exactly three correct answers?

6. What is the probability of exactly four correct answers?

7. What is the probability of five correct answers?

Answers in the comments.

Tuesday, April 7, 2009

Practice for the second exam

Here are links to practice problems that are among the topics for the second exam. The questions are at the bottom of each post and the answers are in the comments.

Confidence of victory

Confidence intervals for proportions

More confidence intervals for proportions

What percentage of American women are listed as a given height?

Confidence intervals for averages of numerical data and two tests for normally distributed data

Confidence intervals for standard deviation of numerical data

More confidence intervals for averages and tests for normal distribution

z-scores and percentages above and below a given z-score in a normally distributed set

Practice for finding n given a margin of error.

If we have taken p-hat in a previous sample and it was 42%, what is the minimum n we should poll to get a margin of error of no more than 2.3%?

If we haven't taken a previous poll, what is the minimum n we should take to get a margin of error of no more than 2.3%?

Answers in the comments.

Monday, April 6, 2009

Answers to Homework 10

Final answers in bold and red.

Here is the data for the combined small samples of m&m’s in both classes on Monday. Round all answers in this part to the nearest tenth of a percent. The formula for the margin of error for 95% is 1.96*sqrt(p-hat*q-hat/n) and the endpoints of the 95% confidence interval are p-hat-MoE95% and p-hat+MoE95% .

n = 580 f(red) = 72 f(blue) = 134

p-hat(red) = 72/580 = 12.4% p-hat(blue) = 134/580 = 23.1%

MoE95%(red) = 1.96*sqrt(.124*.876/580) = .0268... = 2.7%

MoE95%(blue) = 1.96*sqrt(.231*.769/580) = .0343... = 3.4%

We are 95% confident the true proportion of blue milk chocolate m&m’s currently being produced in New Jersey is between (23.1-3.4)% = 19.7% and (23.1+3.4)% = 26.5%.

The formula for finding the minimum n for a particular margin of error given a 95% confidence interval is n >= 1.96^2*p-hat*q-hat/MoE95%^2 , where n should be rounded up to the next highest integer if the formula produces a non-integer answer.

If we want a MoE95% of 2.5% and we can assume p-hat = .5 , what is the minimum n?

Answer: n >= 1.96^2*.5*.5/.025^2 = 1536.64, so n >= 1537

If we want a MoE95% of 2.5% and we can assume p-hat = .55 , what is the minimum n?

Answer: n >= 1.96^2*.55*.45/.025^2 = 1521.2736, so n >= 1522

In the last poll before the 2008 election, Obama led McCain 51% to 47% in a sample of 600 voters in Virginia. Since 51% + 47% = 98%, we can use the confidence of victory method. Round the proportions and standard deviation to the nearest tenth of a percent. Round the confidence of victory number to the nearest percent. The z-score formula is (p-hat(leader) - .5)/sp-hat.

f(Obama) = .51*600 = 306 f(McCain) = .47*600 = 282 new n = 306 + 282 = 588


new p-hat(Obama) = 306/588 = .5204 = 52.0%

new z(Obama) = (.520 - .5)/sqrt(.520*.480/588) = .97

Finish this sentence about confidence of victory.

Given this sample, if the election were held when the poll was taken, we are 83% confident the underlying population would favor Obama and he would win the election in Virginia.

Sunday, April 5, 2009

Homework 10 - not accepted late

Usually, I let students turn in homework assignments one class period late. Homework 10 is the last homework before the midterm, so I want to post the correct answers on Monday afternoon so everyone can study them. Because of this, homework 10 will not be accepted any later than 1 p.m. on Monday afternoon when my office hours end in room G-201.

Thursday, April 2, 2009

Class notes for 4/1

We have learned how to find a confidence interval for a proportion in a population. Unlike numerical data, where the confidence level multipliers, known in this class as CLMxx%, are taken from the t-score tables, the CLMxx% for proportions are from the z-score table.

CLM90% = 1.645
CLM95% = 1.96
CLM99% = 2.575

The most common use of confidence intervals for proportions is in opinion polls, though when people in the media talk about "margin of error", they rarely say that the margin of error is associated with a confidence level, which in opinion polls is always 95%. For example, in early 2008 the opinion polls before the New Hampshire primary showed Barack Obama with a lead in a multi-candidate race on the Democratic side, but the primary was won by Hillary Clinton. There were four candidates who polled well over 1% of the voters: Clinton, Obama, John Edwards and Bill Richardson. Clinton's final total was not within the 95% confidence interval set for her by the poll results, but the other three were in the 95% confidence intervals set for them. These things happen. The idea of the 95% confidence interval is that it leaves open the possibility that it will get the numbers wrong about 1 time in every 20.

Final opinion poll (true result in parentheses)
n = 500
Obama: 39% +/- 4.3% (36.5%, inside the confidence interval)
Clinton: 34% +/- 4.2% (39.1%, outside the confidence interval)
Edwards: 15% +/- 3.1% (16.9%, inside the confidence interval)
Richardson: 4% +/- 1.7% (4.6%, inside the confidence interval)
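These margins can be reproduced with the MoE95% formula; here is a short Python sketch (my addition, not part of the original post):

from math import sqrt

def moe95(p_hat, n):
    """MoE95% = 1.96 * sqrt(p-hat * q-hat / n)."""
    return 1.96 * sqrt(p_hat * (1 - p_hat) / n)

n = 500
for name, p_hat in [("Obama", .39), ("Clinton", .34), ("Edwards", .15), ("Richardson", .04)]:
    print(name, f"+/- {moe95(p_hat, n):.1%}")
# Obama +/- 4.3%, Clinton +/- 4.2%, Edwards +/- 3.1%, Richardson +/- 1.7%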

Obama did worse than predicted and everyone else did better, especially Clinton, whose result was well outside her 95% confidence interval. Again, remember that the 95% confidence interval is not a promise. It says it will be right about 19 times out of 20, but it never knows when that 1 time in 20 it will be wrong will happen. If Bill Richardson had outperformed expectations and gotten 9% of the vote, it would still have been a mistake by the polls, but no one would have paid much attention, because it wouldn't have changed the outcome of who finished first.

Of all the websites, TV shows and newspapers that report on opinion polls, the only one that consistently explains them correctly is The New York Times, which has a standard sidebar it puts next to opinion poll results.


Finding n when the margin of error is given

Opinion polling companies always report the margin of error, but the public does not completely understand it, largely because the media does not explain it. In general, the lower the margin of error the better, but the simplest way to guarantee a low margin of error, given that you can't change the industry standard confidence level of 95%, is to increase the sample size. While this is simple, it's also expensive. If the polling company in New Hampshire had wanted all the margins of error to be no more than 3.0%, it could have used the minimum sample size formula n >= 1.96^2*p-hat*q-hat/MoE95%^2, with MoE95% = .03 and p-hat = 39%, the best estimate for the leader, who they assumed was Barack Obama.

n >= 1.96^2*.39*.61/.03^2 = 1,015.46...

This says a sample of 1,016 likely voters would have produced a margin of error for each candidate of no more than 3.0%. If we did not have the previous information that the highest percentage expected was about 39%, we would have had to assume someone might be close to 50%, which would increase the needed sample size.

n >= 1.96^2*.5*.5/.03^2 = 1,067.11...

In this case, n would have to be 1,068 to guarantee the margin of error of 3.0% or less.
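The minimum sample size calculation is a few lines of Python (my addition; the ceiling handles the round-up rule):

from math import ceil

def minimum_n(moe, p_hat=0.5):
    """Smallest n with 1.96*sqrt(p_hat*(1-p_hat)/n) <= moe, rounded up."""
    return ceil(1.96**2 * p_hat * (1 - p_hat) / moe**2)

print(minimum_n(0.03, p_hat=0.39))  # 1016, using the leader's previous 39%
print(minimum_n(0.03))              # 1068, assuming the worst case p-hat = .5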


Confidence of victory

The margin of error is the industry standard, but the people who use these numbers, most especially the news media, really don't understand them very well. Here is a different method, developed by this author and called the confidence of victory method, which produces a more useful piece of mathematical information.

Confidence of victory should only be used if the top two vote getters combined are getting 90% or more of the respondents to the opinion poll. So in the New Hampshire primary, we could not use this method. In the final poll taken in New Hampshire before the general election, these were the results.

n = 700
Obama 51%
McCain 44%

Since they add up to 95% of the respondents, we can use the confidence of victory method. What we do is effectively ignore the 5% who are either voting for third party candidates, prefer none of the candidates, or are still telling pollsters they are undecided. We figure out how many people in the poll said they prefer Obama and how many prefer McCain by multiplying the percents by the size of the poll.

f(Obama) = 700 * .51 = 357
f(McCain) = 700 * .44 = 308

new n = 357+308 = 665

p-hat(Obama) = 357/665 ~ 53.7%
p-hat(McCain) = 308/665 ~ 46.3%

sp-hat = sqrt(.537*.463/665) ~ 1.93%

z(Obama) = (53.7 - 50)%/1.93% ~ 1.91

This says Obama's percentage is about 1.91 standard deviations above 50%. The percentage he will get in the actual election may be higher or lower than what we see here. We assume there's about half a chance he will do better than the final opinion poll, and half a chance he will do worse. What the public actually cares about is whether he wins or loses. What the confidence of victory method does is find the percentage that corresponds to the z-score. That number is the confidence level that the underlying population favors the leader in the poll, which is to say that the leader would win the election. In this example, z = 1.91 corresponds to .9719 on our positive z-score table. Because the confidence of victory method is sensitive to small changes, we should round to the nearest percent and use this sentence to describe the results.

If the election were held when the poll was taken, we are 97% confident that Obama will hold on to the lead shown in the poll and win the election in New Hampshire.

In the actual election, Obama outpolled McCain 55% to 44%, which is to say he did better than expected. Confidence of victory is not concerned with the margin of victory, just whether the favored candidate in the polls wins the actual election.
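The whole confidence of victory computation fits in a few lines of Python (my sketch of the method described above, not part of the original post; math.erf replaces the z-score lookup table):

from math import erf, sqrt

def confidence_of_victory(n, pct_leader, pct_second):
    """Drop the other respondents, re-normalize, and z-score the leader against 50%."""
    f1, f2 = n * pct_leader, n * pct_second
    new_n = f1 + f2
    p_hat = f1 / new_n
    sp_hat = sqrt(p_hat * (1 - p_hat) / new_n)
    z = (p_hat - 0.5) / sp_hat
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF

print(confidence_of_victory(700, 0.51, 0.44))  # 0.9716..., about 97%, the New Hampshire example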

In 2008, the final polls in the 50 states and Washington D.C. had two states too close to call, Missouri and Indiana. Both elections were very close, called late in the evening, Missouri for McCain and Indiana for Obama. In the other 49 contests where confidence of victory claimed an advantage for one side or the other, 48 contests were won by the person leading in the most recent poll, which is to say the confidence of victory method was vindicated about 98% of the time. The only state where the confidence of victory method did not get the right result was North Carolina. McCain had a 60% confidence of victory in North Carolina, but Obama actually won the state.

In 2004, there were two states that the confidence of victory method got wrong, Ohio and Florida, which looked to be favoring John Kerry in the final polls. Of course, 2004 was a much closer election, and either of those states could have turned the tide. 2008 was an electoral college landslide. Even if McCain had won North Carolina, he still would have lost the election.

Practice problems

Here are some final poll numbers from the 2008 election where the totals favoring either Obama or McCain add up to over 90%. Use the confidence of victory method.

Colorado
n = 600
Obama 52%
McCain 45%

Arizona
n = 600
Obama 46%
McCain 50%

Answers in the comments.