somewhere near the beginning.

What I’ve learned about Large Deviation Theory

Filed under: General — Alex @ 10:26 am 3/30/2007

We’re two lectures into the large deviations course, and I think I have some grasp of what large deviation theory is about. It deals with the probability of large deviations from expected values– “the analysis of the tail of probability distributions”. Here’s one example of a LDT result (a Large Deviation Principle):

Cramer’s Theorem
Let X_1, X_2, \ldots be a sequence of bounded i.i.d. random variables with mean m, and let M_n = \frac{1}{n}( X_1 + X_2 + \cdots + X_n) be the empirical means; then the tail of the probability distribution of M_n decay exponentially with increasing n at a rate given by a convex rate-function I(x):
\mathbb{P}\{M_n > x\} \asymp e^{-nI(x)} \text{ for } x>m
\mathbb{P}\{M_n < x\} \asymp e^{-nI(x)} \text{ for } x<m

Cramer apparently first proved this theorem using complex variables method and found I(x)– the rate function– as a power series expansion. I’m glad to have discovered this, because the way we were shown of finding I in class is very technical, and was supplied without motivation.

Specifically, we developed Cramer’s theorem by defining I as the Fenchel transform of the logarithm of the moment generating function of X_n, and proving (well, so far accepting the validity of) a technical lemma which has Cramer’s theorem as a corollary. Mathematically impressive, but it provides no motivation for why we knew that defining I in that way would give us useful results. Apparently this formulation of I is an application of another result in Large Deviation Theory, Varadhan’s Theorem on the asymptotics of expectations of random sequences.

An Introduction to Large Deviations for Teletraffic Engineers seems to be a good basic reference for LDT.

Possibly relevant posts:

1 Comment »

RSS feed for comments on this post. TrackBack URL

Leave a comment