What I’ve learned about Large Deviation Theory
We’re two lectures into the large deviations course, and I think I have some grasp of what large deviation theory is about. It deals with the probability of large deviations from expected values: “the analysis of the tail of probability distributions”. Here’s one example of an LDT result (a Large Deviation Principle):
Cramer’s Theorem
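Let $X_1, X_2, \ldots$ be a sequence of bounded i.i.d. random variables with mean $\mu$, and let $M_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ be the empirical means; then the tails of the probability distribution of $M_n$ decay exponentially with increasing $n$ at a rate given by a convex rate-function $I$:
$$\lim_{n\to\infty} \frac{1}{n} \log P(M_n \ge x) = -I(x) \quad \text{for } x > \mu,$$
and similarly for $P(M_n \le x)$ when $x < \mu$.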
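To make the exponential decay concrete, here is a small numerical check for the simplest bounded case I could think of: fair coin flips, where the empirical mean is the fraction of heads. It is only a sketch. The function names are mine, the threshold 0.7 is an arbitrary choice, and the closed-form rate function used for comparison is the standard one for a fair coin, not something derived in the course.

```python
import math

def log_binom_tail(n, k):
    """log P(Bin(n, 1/2) >= k), computed exactly with integer arithmetic."""
    count = sum(math.comb(n, j) for j in range(k, n + 1))
    return math.log(count) - n * math.log(2)

def rate_fair_coin(x):
    """Standard closed-form rate function for fair coin flips, 0 < x < 1."""
    return math.log(2) + x * math.log(x) + (1 - x) * math.log(1 - x)

def empirical_rate(n):
    """-(1/n) log P(fraction of heads >= 0.7); n should be a multiple of 10."""
    k = 7 * n // 10  # integer threshold, avoids float rounding in n * 0.7
    return -log_binom_tail(n, k) / n

if __name__ == "__main__":
    target = rate_fair_coin(0.7)
    for n in (10, 100, 1000, 10000):
        # The second column should approach the third as n grows.
        print(n, round(empirical_rate(n), 5), round(target, 5))
```

The empirical rate always sits slightly above the limit and approaches it slowly, because the tail probability also carries a polynomial prefactor that only disappears after taking the logarithm and dividing by n.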
Cramer apparently first proved this theorem using complex-variable methods and found $I$, the rate function, as a power series expansion. I’m glad to have discovered this, because the way we were shown to find $I$ in class is very technical, and was supplied without motivation.
Specifically, we developed Cramer’s theorem by defining the rate function $I$ as the Fenchel transform of the logarithm of the moment generating function of a single variable $X_1$ from the sequence, and proving (well, so far accepting the validity of) a technical lemma which has Cramer’s theorem as a corollary. Mathematically impressive, but it provides no motivation for why we knew that defining $I$ in that way would give us useful results. Apparently this formulation of $I$ is an application of another result in Large Deviation Theory, Varadhan’s Theorem on the asymptotics of expectations of random sequences.
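The Fenchel-transform definition is easier to believe after computing it once. The sketch below does this numerically for a Bernoulli(p) variable, taking the supremum over theta of theta*x minus the log moment generating function, and compares the result with the known closed form for that case (the relative entropy between Bernoulli(x) and Bernoulli(p)). The function names, the search interval and the ternary-search approach are my own choices for illustration, not the method from class.

```python
import math

def log_mgf_bernoulli(theta, p):
    """Lambda(theta) = log E[exp(theta * X)] for X ~ Bernoulli(p)."""
    return math.log(1 - p + p * math.exp(theta))

def fenchel_transform(x, log_mgf, lo=-50.0, hi=50.0, iters=200):
    """Approximate sup over theta of theta * x - log_mgf(theta).

    The objective is concave in theta, so a ternary search finds its
    maximum, assuming the maximiser lies inside [lo, hi].
    """
    def objective(theta):
        return theta * x - log_mgf(theta)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1  # maximum lies to the right of m1
        else:
            hi = m2  # maximum lies to the left of m2
    return objective((lo + hi) / 2)

def relative_entropy_bernoulli(x, p):
    """Closed-form rate function for Bernoulli(p), valid for 0 < x < 1."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

if __name__ == "__main__":
    p = 0.5
    for x in (0.5, 0.6, 0.7, 0.9):
        numeric = fenchel_transform(x, lambda t: log_mgf_bernoulli(t, p))
        print(x, numeric, relative_entropy_bernoulli(x, p))
```

The two columns should agree to many decimal places, and the value at x = p is zero: deviating to the mean costs nothing, which is at least consistent with what a rate function ought to do.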
An Introduction to Large Deviations for Teletraffic Engineers seems to be a good basic reference for LDT.
I think ‘Large Deviation Theory’ is a good tool for engineering applications, but I find that most of its theorems are too abstract to use in engineering practice. As far as I know, no one has applied it to communications or array signal processing. I’m interested. Could you give me some references or advice on how to use it for that? Thank you very much.
Comment by Bluesky — 3/10/2008 @ 10:09 pm