I have experienced such a "condition" recently, so I documented its stages - for fear of relapse :‑)
Last Saturday, my older brother asked me about the probability of "something" related to a lottery that is played here in Greece. The lottery is called "Joker" (and no, it has nothing to do with the Batmobile). He described it to me as a set of balls with numbers from 1 to 40; I was told that they are thoroughly mixed in each lottery, and that 5 of them are randomly picked. Then another number (the "Joker") is randomly picked from a separate set of balls numbered from 1 to 20. If you have predicted all 6 of them, you instantly become a millionaire.
"What is the chance that the joker is equal to one of the 5 others?", I was asked.
It's been 15 years since I passed the probability and statistics exam at the university, but... like I said: "relapse" :‑)
When we calculate the odds for a logical AND of events, the overall probability of the combined event is the product of the probabilities of the events. At least, that's how I remember it...
    P(A and B and C and D and E) = P(A) x P(B) x P(C) x P(D) x P(E)

In this case, how can I form a "logical AND"? Err...
    P(A) = 39/40
    P(B) = ...

Now, the first one (A, the first ball not being equal to the joker) is easy. There are 40 balls, I get one - and the joker is somewhere between 1 and 20. Chances of not hitting that "target" ball are therefore 39 in 40.
The second one (B) is a tad more difficult, though: Event B says that the second ball is not equal to the joker - this means that

    either event (A) took place and now, event (B) takes place
    or event (A) didn't take place and now, event (B) takes place

or, in simpler terms,...

    either (B1) the 1st ball was not the joker and neither is the 2nd one
    or (B2) the 1st ball was equal to the joker and the 2nd one is not

Let's see...
    P(B) = P(B1 or B2) = P(B1) + P(B2) - P(B1 and B2)

However, B1 and B2? That can't happen - either event (A) takes place, or it doesn't. P(B1 and B2) = 0.
    P(B1) = P(1stEscapesJoker) x P(2ndEscapesJoker given that 1stAlsoEscaped) = 39/40 x 38/39

The second term is 38/39, not 38/40 and not 39/40: the 5 chosen balls are all different; now that we've picked the first one and it was not equal to the joker, only 38 of the remaining 39 balls differ from the joker (the 39th one is the joker itself).
    P(B2) = P(1stIsJoker) x P(2ndEscapesJoker given that 1stIsJoker) = 1/40 x 39/39

The second term here is "absolute certainty": with the first ball equal to the joker, the second one cannot possibly be equal to the joker (since it cannot be equal to the first ball!). All remaining 39 balls are therefore not equal to the joker.
So, what is P(B)?
    P(B) = P(B1) + P(B2) - P(B1 and B2)
         = 39/40 x 38/39 + 1/40 x 39/39 - 0
         = 38/40 + 1/40
         = 39/40

What do you know! P(A) = P(B)!
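For the skeptics (myself included), the arithmetic above can be double-checked with exact fractions. This is only a sketch using Python's standard `fractions` module; the variable names `p_b1`, `p_b2` and `p_b` are mine, chosen to mirror B1, B2 and B.

```python
from fractions import Fraction

# B1: the 1st ball escapes the joker, and so does the 2nd
p_b1 = Fraction(39, 40) * Fraction(38, 39)
# B2: the 1st ball is the joker, so the 2nd one escapes for sure
p_b2 = Fraction(1, 40) * Fraction(39, 39)

# B1 and B2 can't both happen, so their probabilities simply add up
p_b = p_b1 + p_b2
print(p_b)  # 39/40
```

Working with `Fraction` instead of floats keeps the result exact, so there is no rounding to argue about.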
Let's do it once more, for P(C):
    P(C) = P(third ball not equal to the joker)
         = P(C1 or C2 or C3)
         = P(C1) + P(C2) + P(C3)
           - P(C1 and C2) - P(C2 and C3) - P(C1 and C3)
           + P(C1 and C2 and C3)

Where:

    C1 = first escapes, second escapes, third escapes
    C2 = first escapes, second matches, third escapes
    C3 = first matches, second escapes, third escapes

and therefore,...

    P(C1) = 39/40 x 38/39 x 37/38 = 37/40
    P(C2) = 39/40 x  1/39 x 38/38 =  1/40
    P(C3) =  1/40 x 39/39 x 38/38 =  1/40

which, finally, leads us to...

    P(C) = P(C1) + P(C2) + P(C3)
           - P(C1 and C2) - P(C2 and C3) - P(C1 and C3)
           + P(C1 and C2 and C3)
         = 37/40 + 1/40 + 1/40 - 0 - 0 - 0 + 0
         = 39/40

At this point, it indeed appears that the chances of escaping the Joker are constant (39/40) regardless of which one of the 5 balls we are looking at. Think about it: even if we were picking 40 balls instead of 5, the 40th ball has the same overall chance of escaping the joker as does the first one! Not immediately apparent, but makes sense if you think about it...
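The "same chance at every position" claim can also be checked by brute force. The sketch below enumerates every ordered draw of 3 balls and counts, per position, how often the ball differs from the joker. The function name `escape_chance_per_position` and its parameters are made up for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def escape_chance_per_position(balls=40, jokers=20, picks=3):
    """Exact chance, per draw position, that the ball differs from the joker."""
    escapes = [0] * picks
    tries = 0
    for draw in permutations(range(1, balls + 1), picks):
        tries += jokers  # every draw is paired with every possible joker
        for pos, ball in enumerate(draw):
            # Of the `jokers` possible joker values, at most one equals this
            # ball (and only if the ball's number is within 1..jokers)
            escapes[pos] += jokers - (1 if ball <= jokers else 0)
    return [Fraction(e, tries) for e in escapes]


print([str(p) for p in escape_chance_per_position()])
# ['39/40', '39/40', '39/40']
```

Counting the matching jokers per ball, instead of looping over all 20 of them, is just a shortcut to keep the run time low; the result is the same as with the three nested loops.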
So, what is the chance of ALL 5 balls escaping the joker (the "logical AND")? The product of the 5 probabilities, so...
    P(all 5 escaping) = (39/40)^5

...and the chance of one of them being equal to the joker is therefore,

    P(one or more of them equal to joker) = 1 - P(all 5 escaping) = 1 - (39/40)^5 = .1189043

Looks good. Delusions of grandeur rapidly manifesting :‑)
Now let's put the theory to the test - Python to the rescue!
    #!/usr/bin/env python3
    # Simplified experiment: draw only 2 balls (instead of 5) plus the joker,
    # and count the lotteries where both balls escape the joker.

    success = 0
    tries = 0
    for joker in range(1, 21):             # From 1 to 20
        for first in range(1, 41):         # From 1 to 40
            for second in range(1, 41):    # From 1 to 40
                if first == second:
                    continue               # Ignore invalid lotteries
                if first != joker and second != joker:
                    success += 1
                tries += 1
    print(success / tries)
For two balls, the theory above predicts...

    P(A and B) = P(A) x P(B) = (39/40)^2 = 0.950625

...but the experiment says otherwise:

    bash$ python experiment1.py
    0.95

Why?
Don't read further down immediately - think this through, reread... treat this as a puzzle!
No, it is not a Python "floating point precision" issue...
Let's try to see what Python gives us for each of P(A) and P(B) (i.e. the chance of the first one escaping the joker, and - separately - the chance of the second one escaping the joker):
    #!/usr/bin/env python3
    # Count, separately, how often the 1st and the 2nd ball escape the joker.

    success1 = 0
    success2 = 0
    tries = 0
    for joker in range(1, 21):
        for first in range(1, 41):
            for second in range(1, 41):
                if first == second:
                    continue               # Ignore invalid lotteries
                if first != joker:
                    success1 += 1
                if second != joker:
                    success2 += 1
                tries += 1
    print(success1 / tries)
    print(success2 / tries)
    bash$ python experiment2.py
    0.975
    0.975

At least this one looks good (0.975 = 39/40). Both the first one and the second one have the same chance of escaping the joker, exactly 39/40. Not all our theory is wrong... We correctly calculated the probabilities of each one of the 5 balls escaping the joker. But... why is the end result for the overall question different?
    P(A and B) = P(A) x P(B) = (39/40)^2 = 0.950625

In other words, the real world experiment tells us that our theoretically calculated probability is slightly higher than the real one. Why, though? Isn't the probability of the logical AND of two events the product of their probabilities?
Time to dig up the textbooks... (boxes, dust, cough)
There, I found it...
The product rule, P(A and B) = P(A) x P(B), holds only when the events are independent. Ours are most definitely NOT independent! Whether the second one escapes the joker VERY MUCH DEPENDS on whether the first one did! If the first one was equal to the joker, the second one SURELY will NOT be (since it cannot be equal to the first one)! Don't get confused by the fact that we calculated the probabilities of each event and found them to be the same; the important point is that the outcome of the first event has implications on the second one.
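To see the dependence in numbers, here is a small sketch in the spirit of the experiments above: it counts how often the second ball escapes the joker, separately for the lotteries where the first ball escaped and for those where it didn't. The counter names are mine.

```python
from fractions import Fraction

a = 0            # lotteries where the 1st ball escapes the joker
a_and_b = 0      # ...and the 2nd ball escapes as well
not_a = 0        # lotteries where the 1st ball IS the joker
not_a_and_b = 0  # ...and the 2nd ball escapes
for joker in range(1, 21):
    for first in range(1, 41):
        for second in range(1, 41):
            if first == second:
                continue  # the balls drawn are always different
            if first != joker:
                a += 1
                a_and_b += second != joker
            else:
                not_a += 1
                not_a_and_b += second != joker

print(Fraction(a_and_b, a))          # P(B given A)     = 38/39
print(Fraction(not_a_and_b, not_a))  # P(B given not A) = 1
```

If the two events were independent, both lines would print 39/40, the overall chance of the second ball escaping. They don't, and that is why multiplying P(A) by P(B) gives the wrong answer.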
Wait a minute... By knowing P(A) and P(B), we also know P(not A) and P(not B)...
    P(first is equal to joker) = P(not A) = 1-P(A) = 1/40
    P(second is equal to joker) = P(not B) = 1-P(B) = 1/40

... and therefore ...

    P(1stIsJoker or 2ndIsJoker) = P(1stIsJoker) + P(2ndIsJoker) - P(1stIsJoker and 2ndIsJoker)

The last term (1stIsJoker and 2ndIsJoker) can't happen - the first and second ball are never equal!
Finally, I got it!
    P(1stIsJoker or 2ndIsJoker) = P(1stIsJoker) + P(2ndIsJoker) = 1/40 + 1/40 = 2/40

And yes, ...

    P(both first and second escape) = 1-P(1stIsJoker or 2ndIsJoker) = 1-(2/40) = 0.95

... as experiment1 showed.
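The same reasoning should carry over to all 5 balls: the events "the k-th ball is the joker" can never happen together, so their probabilities simply add up to 5 x 1/40. As a sanity check, here is a sketch that goes through every possible set of 5 balls and counts how often the joker is among them. Since the order of the balls doesn't matter for this question, it uses `itertools.combinations`; the counter names are mine.

```python
from fractions import Fraction
from itertools import combinations

hits = 0   # (draw, joker) pairs where the joker equals one of the 5 balls
tries = 0  # all (draw, joker) pairs
for draw in combinations(range(1, 41), 5):
    # The joker takes each value in 1..20 equally often; it matches this
    # draw once for every ball of the draw that lies in 1..20
    hits += sum(1 for ball in draw if ball <= 20)
    tries += 20

print(Fraction(hits, tries))  # 1/8, i.e. 5/40
```

That is 658008 sets of balls times 20 jokers, all accounted for, and the result is noticeably different from the .1189043 that the product of the five individual probabilities gave.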
    P(firstEscapes and secondEscapes) = P(B1)      (see above in Theory section)
                                      = 39/40 x 38/39
                                      = 38/40
                                      = 0.95

... I could have just extended this to get to the result easily...
    P(1st and 2nd and 3rd and 4th and 5th escape the joker)
        = P(1stEscapes)
          x P(2ndEscapes given 1stEscaped)
          x P(3rdEscapes given 1stAnd2ndEscaped)
          x P(4thEscapes given 1stAnd2ndAnd3rdEscaped)
          x P(5thEscapes given 1stAnd2ndAnd3rdAnd4thEscaped)
        = 39/40 x 38/39 x 37/38 x 36/37 x 35/36
        = 35/40

and thus, the answer to our problem is...

    P(one of the 5 is equal to the joker) = 1 - 35/40 = 5/40

Based on the above, I advised my brother as follows:
Updated: Sat Mar 8 22:58:16 2014