Statistics on a budget: Class notes for 4/22

Random and deterministic

It's common for people to use the word random in a casual manner, but in the field of the philosophy of science, discussion of whether something can be considered random or not is a subject of intense debate. The opposite of random is deterministic, which is to say that when we perform a task, we understand the possible outcomes thoroughly. For example, putting a key in a lock is a deterministic act. Will the door open? Not necessarily. It might be the wrong key. The key might be correct, but it might have worn out over time. A brand new key in a lock that has been used might have edges that are too sharp. The lock could malfunction. Maybe the person didn't turn the key in the right direction. So deterministic does not mean, "If you do a, then b will also happen." It can be more complicated than that. But in a completely deterministic act, we have an expected outcome, and even when it fails, we have explanations of why it fails.

The problem of determinism versus randomness is not a yes/no situation. We have some acts we consider random, like flipping a coin or rolling a die or choosing a card from a deck. If done under certain circumstances, even these can be deterministic. If the deck is removed from the pack and unshuffled, the top card will be the way the cards were sorted at the factory, and so it is completely deterministic. Some card tricks are done with decks that aren't actually randomly shuffled or the magician has ways of forcing the participant to pick a certain card, so picking a card is deterministic, or picking the card is random but returning it to the deck is deterministic, and so it can easily be found by being out of position. If we drop a coin or a die only a short distance, it might not bounce much, so the result is strongly determined by the original state of the die. Much of the randomness of things like coin flips or dice rolls or lottery balls being removed from a hopper have to do with physics problems that could be considered deterministic if we understood all the variables, but the equations are so difficult that solving them completely is beyond even the most sophisticated computer simulations.

This brings us to random numbers and computers. A computer is completely deterministic. It cannot truly produce a random number, though every computer and even many calculators, including the TI-30XIIs, have random number generators. Here, the computer takes some input unknown to the user, puts it through a function also unknown to the user and produces an output. This is called pseudorandom, and debate over these methods continue to this day. One of the fathers of computer science, the great Hungarian mathematician John Von Neumann, was quoted as saying, "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." A computer has nothing but arithmetical methods available. The tests done to see if a method produces output that is sufficiently random have to do with a distributions of a large set of pseudorandom output.

The probability of r successes in n independent trials, where p is the probability of success of any given trial

The simplest kind of problem in multiple trials is to assume independence between the trials, that the probability of success remains constant over all the trials, and we call that probability p. A basic question in these kinds of experiments is to ask what is the probability of getting a specific number of successes, calling that number r, in a specific number of trials, a number we will call n. If you have a TI-83 or TI-84, there is a function in the distribution menu (blue button then DISTR) called binompdf, that takes as its input n, p and r in that order. On the TI-30XIIs, we have to type in a formula.

Example #1: If we flip a fair coin, p = .5, ten times, what is the probability of exactly 6 heads and 4 tails?

TI-83 or TI-84: Go to the distribution menu, select binompdf and type in the following values

binompdf(10,.5,6) = 0.205078125

On the TI-30xIIs, here is what to type in

10[prb][right arrow]6x.5^6x.5^4[enter]

It will read
10 nCr 6 * .5^6 * .5^4
0.205078125

A different question would be what are the chances of at most r successes in n trials. Again, the TI-83 and TI-84 have a single function solution, this one called binomcdf(n, p, r). On the TI-30XIIs, we either have to calculate all the separate values, from the probability of 0 successes up through the probability of r successes then add these up, or if this is too much work, there is a method using normal distribution to approximate binomial distribution.

If you have Java on your computer, you can go to this website to see how the numbers across a row of Pascal's Triangle look like the bell-shaped curve. We can use z-scores and the lookup table to get values that will approximate these probabilities, and the approxomations get better as the numbers get bigger.

Example #2: If we flip a fair coin, p = .5, ten times, what is the probability of at most 6 heads?

TI-83 or TI-84: Go to the distribution menu, select binomcdf and type in the following values

binompcf(10,.5,6) = 0.828125

On the TI-30XIIs:

This entails a lot of work, but not an impossible amount. We need to find
p(exactly 0 heads) + p(exactly 1 head) + p(exactly 2 heads) + p(exactly 3 heads) + p(exactly 4 heads) + p(exactly 5 heads) + p(exactly 6 heads) = 0.828125

We have the normal approximation method, but we will see when n = 10, it's not very good.

n = 10, p = .5, r = 6

z = (6 + .5 - .5*10)/sqrt(.5*.5*10) = .3

Lookup table: .3 -> .6179

The approximate value at around 62% is very far from the true value at nearly 83%.

Different books have different cut-off points, but at a minimum, using the normal approximation to binomial should only be used when both np > 5 and nq > 5. In this case, both values equal 5, so this would not be a good candidate for using this method.

Example #3: If we flip a fair coin, p = .5, one hundred times, what is the probability of at most 60 heads?

TI-83 or TI-84: Go to the distribution menu, select binomcdf and type in the following values

binompcf(100,.5,60) = 0.982399899891... ~= .9824

TI-30XIIs: It's too many things to add up, so the approximation method is our only hope. Both np > 5 and nq > 5, so we can move forward.

(60 + .5 - .5*100)/sqrt(.5 * .5 * 100) = 2.1

Lookup table: 2.1 -> .9821

When n=100, the approximation is not perfect, but it is very close, unlike the results when n=10.

Another question that could be asked is the probability of getting at least r successes in n trials. "At least r" is the complement of "At most r-1", which means the two probabilities will add up to one.

Example #4: If a 70% free throw shooter takes ten shots, p = .7, what is the probability she makes 8 shots or more?

On the TI-83 or TI-84:

1 - binomcdf(10, .7, 7) = .3827827864..., which rounds to .3828 to four places.

On the TI-30XIIs

10[prb][right arrow]8x.7^8x.3^2[enter]

It will read
10 nCr 8 * .7^8 * .3^2
0.233474441

10[prb][right arrow]9x.7^9x.3^1[enter]

It will read
10 nCr 9 * .7^9 * .3^1
0.121060821

10[prb][right arrow]10x.7^10x.3^0[enter]

It will read
10 nCr 10 * .7^10 * .3^0
0.028247525

If we round these numbers to five places, add them up and round the answer to four places, it should agree with the answer above.

.23347 + .12106 + .02825 = .38278, which rounds to .3828, which agrees with the above answer.

Example #4: If a 70% free throw shooter takes 100 shots, p = .7, what is the probability she makes 75 shots or more?

On the TI-83 or TI-84:

1 - binomcdf(100, .7, 74) = .163130104..., which rounds to .1631 to four places.

On the TI-30XIIs: The problem is adding up too many parts, but np = 70 and nq = 30, both of which are more than 5, so let's go to the approximation method.

z = (75 - .5 - 70)/sqrt(.7*.3*100) = .98198..., which rounds to .98

Lookup table: .98 -> .8365, and 1-.8365 = .1635, which again is pretty close to the actual answer.

Statistics on a budget

Thursday, April 23, 2009

Class notes for 4/22

No comments:

Links to special posts

You need a calculator

Labels

Blog Archive

About Me

Site Meter