If a trait is very rare, only a very accurate test gives us useful information. For example, if a trait shows up in only 1 in 10,000 people but the test for the trait has an error rate of 1 in 1,000, we should expect about 10 false positives for every true positive. Here is the completed table for that situation, using a population of 10,000,000 people.
_________don't have____have____row total
test + _______9,999_____999_______10,998
test - ___9,989,001_______1____9,989,002
col.______9,999,000___1,000___10,000,000 grand total
In a situation such as this, testing positive twice could give us useful information, because a single positive result is wrong about 90.9% of the time (9,999 of the 10,998 positives are false). We have to assume the errors are random and not deterministic. For example, if a test for a chemical compound in opium also catches a similar compound found in poppy seed bagels, testing twice won't get rid of the errors. Assuming the errors are purely random, here is what we do.
Step 1: The top row of the first contingency table becomes the column total/grand total row of the second contingency table. This takes the counts of the people who tested positive the first time and makes them the totals for those who will be tested a second time.
_________don't have____have____row total
test + _________________________________
test - _________________________________
col.__________9,999_____999_______10,998 grand total
Step 2: Multiply the error rate by the have column total to find the number of people who have the trait but test negative. Round to the nearest whole number. (We didn't have to round before, but now we do.)
999*1/1000 = .999 ~= 1. This means the number who have the trait and test positive is 999 - 1 = 998.
_________don't have____have____row total
test + _________________998_____________
test - ___________________1_____________
col.__________9,999_____999_______10,998 grand total
Step 3: Multiply the error rate by the don't have column total to find the false positives. 9,999*1/1000 = 9.999 ~= 10. That means the test negative entry in that column is 9,999 - 10 = 9,989.
_________don't have____have____row total
test + __________10_____998_____________
test - _______9,989_______1_____________
col.__________9,999_____999_______10,998 grand total
Step 4: Add across each row to find the row totals. 10 + 998 = 1,008 and 9,989 + 1 = 9,990.
_________don't have____have____row total
test + __________10_____998________1,008
test - _______9,989_______1________9,990
col.__________9,999_____999_______10,998 grand total
Step 5: Find the error rate for testing positive twice. 10/1,008 = .0099... or about 1%.
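The five steps can also be followed in a few lines of Python. This is only a sketch of the arithmetic above, not a general tool. The variable names are made up for illustration, and the code assumes the second test has the same 1 in 1,000 error rate as the first.

```python
from fractions import Fraction

error_rate = Fraction(1, 1_000)

# Step 1: the first test's "test +" row becomes the new column totals.
dont_have_total, have_total = 9_999, 999
grand_total = dont_have_total + have_total       # 10,998

# Step 2: people who have the trait but test negative the second time.
have_neg = round(have_total * error_rate)        # .999 rounds to 1
have_pos = have_total - have_neg                 # 998

# Step 3: people who don't have the trait but test positive again.
dont_pos = round(dont_have_total * error_rate)   # 9.999 rounds to 10
dont_neg = dont_have_total - dont_pos            # 9,989

# Step 4: row totals, which must add back up to the grand total.
pos_total = dont_pos + have_pos                  # 1,008
neg_total = dont_neg + have_neg                  # 9,990
assert pos_total + neg_total == grand_total

# Step 5: share of the double positives that are wrong.
error_share = dont_pos / pos_total
print(pos_total, f"{error_share:.2%}")           # 1008 0.99%
```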
Of the ten million people tested, we would send letters to 1,008 telling them they tested positive twice. Of those people, ten don't have the trait and are getting false information, but 998 are getting the right information. In the first test, there was someone with the trait who tested negative, and the same is true in the second test, so there are two people with the trait who did not get two positive test results. While this isn't a perfect situation, it's much better than the over 90% error rate we got for positive tests the first time through.
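Since we rounded in the second table, it is fair to ask how much the rounding changed the answer. As a cross-check, here is a sketch that does the same calculation with exact fractions and no rounding. It assumes the two tests make their errors independently, which is the "random, not deterministic" assumption from above, and the variable names are made up for illustration.

```python
from fractions import Fraction

population = 10_000_000
p_have = Fraction(1, 10_000)  # 1 in 10,000 people have the trait
err = Fraction(1, 1_000)      # each test is wrong 1 time in 1,000

# Fraction of the whole population in each "two positives" group.
two_pos_dont = (1 - p_have) * err ** 2      # don't have, wrong twice
two_pos_have = p_have * (1 - err) ** 2      # have, right twice

# Share of the double positives who don't have the trait.
share_wrong = two_pos_dont / (two_pos_dont + two_pos_have)
print(f"{float(share_wrong):.2%}")          # 0.99%

# Expected number with the trait who miss getting two positives.
missed = population * p_have * (1 - (1 - err) ** 2)
print(float(missed))                        # 1.999
```

The unrounded share is still about 1%, and the expected number of people with the trait who don't get two positive results is 1.999, which matches the two people counted above.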