Probabilistic Tools
Today one of the first years studying for his qualification exams asked me a question that he came across on a previous year’s probability qual: If
people are removed from
couples, what is the expected number of remaining couples?
Give it a shot, then see under the cut for my solution
We spent a couple of minutes reluctantly thinking of a way to phrase this as a counting argument, but being at the coffee shop away from pencil and paper, we were forced to look for *nice* counting arguments, none of which were immediately evident (basically because the events of given couples remaining are not independent). If you can think of any, let me know.
Then, I remembered that a lot of the randomization algorithms I’ve seen use the expectation of indicator functions to calculate some pretty nontrivial averages. Once you make the observation that the expected number of surviving couples can be written as the expectation of a sum of indicator variables, the problem is pretty trivial. There are
copies of the same indicator variable,
, which is 1 when a couple remains, 0 otherwise. Note that the interdependence of these indicator variables doesn’t matter, because of the linearity of expectation. The probability that a particular couple remains after
removals is clearly
, so the desired expectation is
.
The moral of the story is, when you’re asked to calculate expectations of ‘binary’ events, try to introduce appropriate indicator variables. It’s a powerful tool.
Possibly relevant posts:
- Probability tricks (11/28/2006)
- Probablistic Integrals (10/27/2006)
- Subdividing the plane (10/13/2006)
The “hard” way to do it: If there are k couples amongst the M people eliminated, then there are N - M + k couples remaining. The probability of this is (N choose k) (N-k choose M-2k) 2^(M-2k) / (2N choose M). So you can get a nasty expression involving summation over k.
Comment by pete — 8/26/2008 @ 11:58 am
Thanks Peter!
Comment by Alex — 8/26/2008 @ 1:48 pm
The expression reduces nicely to N*(2*N-M)*(2*N-M-1)/(2*N*(2*N-M-1)), which makes the answer computationally feasible for even large values of N and M (no evaluation of Gamma or factorials).
-Mike
Comment by Mike — 8/27/2008 @ 1:42 pm