Thursday, May 17, 2012

Quick explanation of Bayesian probability (mondo typo fixes)

[Ack. Fixing a ton of horrible typos.]

Okay, I can't resist sharing the formula. The formula is provably rational (and optimal), though the tricky part lies in coming up with probabilities to plug into it.

Notation:
P(x) = probability of x occurring/being true/etc.
P(x|y) = probability of x given y

For example, P(person is male) is about 0.5, but P(person is male | person has deep voice) is high, say 0.995 if one deep-voiced person in two hundred is a woman who smokes or something.

Okay, now the way you do Bayesian inference is to take two or more competing hypotheses, say h1 and h2, to each of which you have already assigned a probability (the prior). Then you observe some evidence e and compute P(h1 | e) and P(h2 | e). Since you observed e and know it is true, these are the new probabilities you should use for h1 and h2.

There's some simple math involved in proving this next statement, but the result (Bayes' theorem) is:

             P(e | h) * P(h)
P(h | e) = -------------------
                  P(e)

That is, the probability of h given e is the probability that h would produce e, multiplied by how probable h already was and divided by the probability that you would have seen e anyway. The smaller P(e) is, the bigger the update, so surprising evidence carries much more weight than commonplace evidence.
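
Here's a rough sketch of the formula in Python, just to make the "surprising evidence" point concrete. The function name bayes_posterior and all of the numbers are made up for illustration. The two calls use the same prior and the same P(e | h) and only change P(e).

```python
def bayes_posterior(likelihood, prior, evidence):
    """Return P(h | e) = P(e | h) * P(h) / P(e)."""
    if not 0 < evidence <= 1:
        raise ValueError("P(e) must be in (0, 1]")
    return likelihood * prior / evidence


# Hypothetical numbers, chosen only for illustration.
prior = 0.1        # P(h)
likelihood = 0.9   # P(e | h)

# Commonplace evidence: e shows up 90% of the time whether or not h is true.
commonplace = bayes_posterior(likelihood, prior, evidence=0.9)

# Surprising evidence: e shows up only 18% of the time overall.
surprising = bayes_posterior(likelihood, prior, evidence=0.18)

print(f"commonplace evidence: P(h | e) = {commonplace:.3f}")
print(f"surprising evidence:  P(h | e) = {surprising:.3f}")
```

With these made-up inputs the commonplace evidence leaves the belief where it started, at about 0.1, while the surprising evidence moves it to about 0.5.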

Suppose I am evaluating whether my neighbor is a serial killer. Let's say that I believe, for some reason, that normal people have blood leaking out of their garage very infrequently, 1/10,000 of the time. Serial killers leak blood out of their garage 1/100 of the time. Say I sneak onto his property tonight and look for blood. I will show how I should adjust my beliefs based on whether or not I find blood, and I will show how to do it for two different levels of a priori belief in his serial killer-ness.

CASE 1: I'm 99.9% sure he's not a serial killer.
  P(blood) = 0.999/10,000 + 0.001/100 = 0.0001099 
  P(no blood) = (0.999 * 0.9999) + (0.001 * 0.99) = 0.9998901
  1a. I find no blood. Since even serial killers leak no blood 99% of the time, 
     P(killer | no blood) = (0.99) * (0.001) / 0.9998901 = 0.000990108813
     I didn't really think he was a killer, and I didn't really change my beliefs much by not finding anything.
  1b. I find blood!
     P(killer | blood) = (0.01) * (0.001) / 0.0001099 =  0.0909918107 
     I still don't really think he's a serial killer, but I was surprised to find the blood and I'm probably going to pay close attention to him in the future.

CASE 2: I think it's 50/50 that he might be a serial killer.
  P(blood) =  (0.50 / 10,000) + (0.50 / 100) = 0.00505
  P(no blood) =  (0.50 * 0.9999) + (0.50 * 0.99) = 0.99495 
  2a. I find no blood. Since serial killers leak no blood 99% of the time,
    P(killer | no blood) =  (0.99 * 0.50) / 0.99495 = 0.497512438
    I'm still pretty convinced he might be a serial killer. It's going to take many, many nights of lurking outside his garage to convince me otherwise, but if I only find blood about 1/10,000 of the time, eventually I'll decide he's innocent.
  2b. I find blood!
    P(killer | blood) = (0.01 * 0.50) / 0.00505 =  0.99009901 
    I thought he might be a killer before, and now I'm all but certain (about 99%).
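
If you'd rather not do the arithmetic by hand, the four cases above can be checked with a few lines of Python. This is only a sketch: the helper name update and the constant names are mine. It assumes there are just two hypotheses (killer or not), so P(e) is the sum of P(e | h) * P(h) over both.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Return P(h | e) when the only alternatives are h and not-h."""
    # P(e) by total probability over the two hypotheses.
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e


P_BLOOD_KILLER = 1 / 100      # serial killers leak blood 1/100 of the time
P_BLOOD_NORMAL = 1 / 10_000   # normal people leak blood 1/10,000 of the time

results = {}
for label, prior in (("1", 0.001), ("2", 0.50)):
    # Case a: no blood found, so use the complementary likelihoods.
    results[label + "a"] = update(prior, 1 - P_BLOOD_KILLER, 1 - P_BLOOD_NORMAL)
    # Case b: blood found.
    results[label + "b"] = update(prior, P_BLOOD_KILLER, P_BLOOD_NORMAL)

for case, p in sorted(results.items()):
    print(f"{case}: P(killer | evidence) = {p:.6f}")
```

The printed values should agree with the hand calculations in 1a, 1b, 2a and 2b up to rounding.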

I want to point out two things here:

1.) Probabilities are subjective, but even if you don't agree with someone about the probability of a hypothesis, you may be able to agree on how likely some evidence is under that hypothesis, P(e|h), and thus on how strongly to weight the evidence. "I don't think Senator Lugard is corrupt, but I agree that having lunch with that lobbyist all the time is suspicious. If I didn't know him so well personally I'd probably agree with you, but I think he's probably just being naive." If you can't even agree on P(e|h), though, there's not much to talk about.

2.) Falsifying a theory quickly requires looking for improbable evidence. Finding no blood barely changes my belief in either 1a or 2a, because no blood is what I'd expect to see almost every night under either hypothesis. To prove someone (likely) innocent, you need to come up with something that would be very unlikely for a serial killer and see if he has that characteristic, like doing anonymous good deeds. Surprising evidence is what changes minds quickly.
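
To put a rough number on point 2, here's a sketch that starts from the 50/50 prior and counts how many blood-free nights in a row it takes to get below 1%, then compares that with a single piece of surprising evidence. The helper name update, the 1% threshold, and especially the "good deeds" likelihoods (0.001 for a killer, 0.1 for a normal person) are invented for illustration, and it assumes each night is independent evidence.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Return P(h | e) when the only alternatives are h and not-h."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e


# Slow route: night after night of finding no blood.
belief = 0.50
nights = 0
while belief >= 0.01:
    # P(no blood | killer) = 0.99, P(no blood | normal) = 0.9999
    belief = update(belief, 0.99, 0.9999)
    nights += 1
print(f"blood-free nights needed to drop below 1%: {nights}")

# Fast route: one observation that would be very unlikely for a killer.
# Hypothetical likelihoods, not taken from anywhere.
P_DEEDS_KILLER = 0.001
P_DEEDS_NORMAL = 0.1
one_shot = update(0.50, P_DEEDS_KILLER, P_DEEDS_NORMAL)
print(f"belief after one surprising observation: {one_shot:.4f}")
```

Under these assumptions the slow route takes several hundred nights, while the single surprising observation gets to about the same place in one step.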

-Max

--
Hahahahaaaa!!! That is ME laughing at YOU, cruel world.
    -Jordan Rixon

I could not love thee, dear, so much,
Loved I not Honour more.
