12.08.2008

The Paradox of the False Positive

An excerpt from Cory Doctorow's latest book, Little Brother:

If you ever decide to do something as stupid as build an automatic terrorism detector, here's a math lesson you need to learn first. It's called "the paradox of the false positive," and it's a doozy.

Say you have a new disease, called Super-AIDS. Only one in a million people gets Super-AIDS. You develop a test for Super-AIDS that's 99 percent accurate. I mean, 99 percent of the time, it gives the correct result -- true if the subject is infected, and false if the subject is healthy. You give the test to a million people.

One in a million people have Super-AIDS. One in a hundred people that you test will generate a "false positive" -- the test will say he has Super-AIDS even though he doesn't. That's what "99 percent accurate" means: one percent wrong.

What's one percent of one million?

1,000,000/100 = 10,000

One in a million people has Super-AIDS. If you test a million random people, you'll probably only find one case of real Super-AIDS. But your test won't identify *one* person as having Super-AIDS. It will identify *10,000* people as having it.

Your 99 percent accurate test will perform with 99.99 percent *inaccuracy*.

That's the paradox of the false positive. When you try to find something really rare, your test's accuracy has to match the rarity of the thing you're looking for. If you're trying to point at a single pixel on your screen, a sharp pencil is a good pointer: the pencil-tip is a lot smaller (more accurate) than the pixels. But a pencil-tip is no good at pointing at a single *atom* in your screen. For that, you need a pointer -- a test -- that's one atom wide or less at the tip.

This is the paradox of the false positive, and here's how it applies to terrorism:

Terrorists are really rare. In a city of twenty million like New York, there might be one or two terrorists. Maybe ten of them at the outside. 10/20,000,000 = 0.00005 percent. One twenty-thousandth of a percent.

That's pretty rare all right. Now, say you've got some software that can sift through all the bank-records, or toll-pass records, or public transit records, or phone-call records in the city and catch terrorists 99 percent of the time.

In a pool of twenty million people, a 99 percent accurate test will identify two hundred thousand people as being terrorists. But only ten of them are terrorists. To catch ten bad guys, you have to haul in and investigate two hundred thousand innocent people.

Guess what? Terrorism tests aren't anywhere *close* to 99 percent accurate. More like 60 percent accurate. Even 40 percent accurate, sometimes.

What this all meant was that the Department of Homeland Security had set itself up to fail badly. They were trying to spot incredibly rare events -- a person is a terrorist -- with inaccurate systems.
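The arithmetic in the excerpt is easy to check for yourself. Here is a minimal sketch in Python, assuming that "99 percent accurate" means the test is right 99 percent of the time for both infected and healthy subjects (the excerpt describes it that way, but real tests usually have different error rates for each group). The function name `screen` and its parameters are my own, made up for illustration, and are not from the book.

```python
def screen(population, true_cases, accuracy):
    """Estimate what a screening test flags in a population.

    Assumption: `accuracy` applies equally to real cases and to
    everyone else (the same 1-in-100 error rate for both groups).
    Returns (true_positives, false_positives, share_of_flags_that_are_real).
    """
    everyone_else = population - true_cases
    true_positives = true_cases * accuracy             # real cases correctly flagged
    false_positives = everyone_else * (1 - accuracy)   # innocent people wrongly flagged
    share_real = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_real


# Super-AIDS: one real case in a million people, 99 percent accurate test.
tp, fp, share = screen(1_000_000, 1, 0.99)
print(f"Super-AIDS: about {fp:,.0f} false positives; "
      f"{(1 - share) * 100:.2f}% of positives are wrong")

# The city example: ten terrorists among twenty million people.
tp, fp, share = screen(20_000_000, 10, 0.99)
print(f"City: about {fp:,.0f} innocent people flagged "
      f"to catch roughly {tp:.0f} real ones")
```

Dropping `accuracy` to 0.6 or 0.4, the figures the excerpt suggests for real terrorism tests, pushes the number of wrongly flagged people into the millions.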

DHS, in Doctorow's skillfully written book (full disclosure: I'm a long-time fan), has taken over San Francisco after a pretty horrific terrorist attack and implemented a security net so obnoxious that everyone is a suspect, turning the "innocent until proven guilty" maxim fundamental to the US justice system on its ear in the name of "protecting" the very citizens it interrogates on every corner.

As if we needed a reminder, the recent Mumbai attacks have thrust the extraordinarily rare specter of the large-scale, well-coordinated urban terror strike back into the public spotlight. Doctorow's greatest strength has always been his ability to make the future feel incredibly near; it is not even remotely difficult to imagine a scenario like the one described in Little Brother playing out were another major attack to strike a US city. Almost as frightening as the idea of an actual attack (or more frightening, depending on who you talk to) is the idea of what might come after. Imagine a massive citizen-tracking web, one to put London's CCTV to shame, layered on top of the already overanxious contemporary cityscape, and consider the paradox of the false positive. Not pretty.

Little Brother is available for free download under a Creative Commons licence right here.
