Finding e within the randoms

I got a message from Omar Camarena about my post “e and Clojure”; I had misstated an identity equating e to the expectation of a random process. The (now fixed) statement (taken from this site) is below:

… the cool fact that the expected (average) number of random numbers between 0 and 1 needed to sum to 1 or more is, well, 2.7182… e

I was curious why this was, so I decided to tackle it. The solution offered in the the above post takes the geometric approach of calculating the volume of the simplex which contains the set of points which add up to some number less than one. That simplex is the volume under the line $1-{x}_{n}-\dots -{x}_{1}$; drawing out the cases for 2 and 3 dimensions is a good way to visualize this. The equation representing that area is ${\int }_{0}^{\infty }{\int }_{0}^{1-{x}_{n}}\left(1-{x}_{1}-\dots -{x}_{n}\right)\phantom{\rule{0.167em}{0ex}}\mathrm{\text{d}}{x}_{1}\mathrm{\text{d}}{x}_{n}$. Finding a generalization (and, it turns out, a lovely closed-form version) of this requires induction using Fubini’s Theorem, which was more than I felt like dealing with.

So, instead, I took Tim’s suggestion to approach this from a probabilistic angle. We can enforce the same restrictions we did on our integral (that all the points must add up to one or less) using the right distributions. First, we’ll need a sprinkling of xs from a uniform distribution ${X}_{1},\dots ,{X}_{n}\sim \mathrm{\text{Unif(0, 1)}}$, and we’ll need the sum of them to be less than one, $Z=\mathrm{\text{sum(}}{X}_{1},\dots ,{X}_{n}\mathrm{\text{)}}$. $\mathrm{\text{P(}}Z\le 1\mathrm{\text{)}}$ is then what we’re interested in. It turns out there’s a nice distribution, called the Irwin-Hall distribution, which exactly models Z. Deriving the distribution is an exercise for another night.

We’re interested in $\mathrm{\text{P(}}N=n\mathrm{\text{)}}=\mathrm{\text{P(}}Z>1;n\mathrm{\text{)}}-\mathrm{\text{P(}}Z>1;n-1\mathrm{\text{)}}$, which is the probability that it takes exactly $n$ draws to get right past one.

From the CDF of the distribution we get $\mathrm{\text{P(}}Z\le 1\mathrm{\text{)}}=\frac{1}{n!}$, so we can find what we’re looking for with $\mathrm{\text{P(}}Z>1\mathrm{\text{)}}=1-\mathrm{\text{P(}}Z\le 1\mathrm{\text{)}}=1-\frac{1}{n!}$. From this and the above (plus a little bit of algebra), we find $\mathrm{\text{P(}}N=n\mathrm{\text{)}}=\frac{n-1}{n!}$. Now we merely have to use the definition of the expectation of a random variable to find that ${\sum }_{n=1}^{\infty }n\cdot \mathrm{\text{P(}}N=n\mathrm{\text{)}}={\sum }_{n=1}^{\infty }\frac{n}{n!}$. It turns out that that last sum is a definition of e.

Verifying this with a bit of Python is fun.

def fac(n): return reduce(lambda a, b: a * b, range(1, n+1), 1)

reduce(lambda a, n: a + float(n)/fac(n), range(1, 50), 0)
=> 2.7182818284590455

So there you have it. e is, in fact, the expected number of draws from our pool of uniform 0—1 until we hit or pass 1. The world is a saner place, and now I can sleep.