Chernoff Bound
Intro
- Markov: for nonnegative $f$, $\Pr[f(X) > z] < E[f(X)]/z$.
- Chebyshev used $f(X) = (X-E[X])^2$
- other functions yield other bounds
- Chernoff is the most popular; it uses $f(X)=e^{tX}$
Theorem:
- Let $X = \sum X_i$, where the $X_i$ are Poisson trials (i.e., independent 0/1 random variables, possibly with different probabilities $p_i$) and $E[X]=\mu$. Then, for any $\epsilon > 0$, \[ \Pr[X > (1+\epsilon) \mu ] < \left[ \frac{e^\epsilon}{(1+\epsilon)^{(1+\epsilon)}}\right]^\mu. \]
- note the bound is independent of $n$, the number of trials
- and exponential in $\mu$, base $< 1$.
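- As a quick sanity check (a minimal simulation sketch, not from the notes; the parameters $n$, $p$, $\epsilon$ and the trial count are arbitrary choices), compare the empirical tail of a binomial against the bound:

```python
import random
import math

# Empirically estimate Pr[X > (1+eps)*mu] for identical Bernoulli(p)
# trials (the theorem also allows distinct p_i) and compare it to the
# Chernoff bound; the empirical value should come out below the bound.
n, p, eps, trials = 100, 0.5, 0.2, 50_000
mu = n * p

exceed = sum(
    sum(random.random() < p for _ in range(n)) > (1 + eps) * mu
    for _ in range(trials)
)
bound = (math.exp(eps) / (1 + eps) ** (1 + eps)) ** mu

print("empirical Pr[X > (1+eps)mu]:", exceed / trials)
print("Chernoff bound:             ", bound)
```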
Proof.
- For any $t>0$, by Markov applied to $e^{tX}$, $$ \begin{eqnarray*} \Pr[X > (1+\epsilon)\mu] &= &\Pr[\exp(tX) > \exp(t(1+\epsilon)\mu)]\\ &< &\frac{E[\exp(tX)]}{\exp(t(1+\epsilon)\mu)} \end{eqnarray*} $$
- Use independence, then $1+x \le e^x$. $$ \begin{eqnarray*} E[\exp(tX)] &= &\prod E[\exp(tX_i)]\\ E[\exp(tX_i)] &= &p_ie^t+(1-p_i) \\ &= &1+p_i(e^t-1)\\ &\le &\exp(p_i(e^t-1)) \end{eqnarray*} $$ Since $\sum p_i = \mu$, $\prod \exp(p_i(e^t-1)) = \exp(\mu(e^t-1))$
- So the overall bound is \[ \frac{\exp((e^t-1)\mu)}{\exp(t(1+\epsilon)\mu)}, \] which holds for any $t>0$. To minimize, plug in $t=\ln(1+\epsilon)$ (found by setting the derivative of the exponent to zero).
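- Carrying out that substitution (so $e^t = 1+\epsilon$) makes the last step explicit: \[ \frac{\exp((e^t-1)\mu)}{\exp(t(1+\epsilon)\mu)} = \frac{e^{\epsilon\mu}}{e^{(1+\epsilon)\mu\ln(1+\epsilon)}} = \left[ \frac{e^\epsilon}{(1+\epsilon)^{(1+\epsilon)}}\right]^\mu. \]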
- Simpler (weaker) bounds; a numerical comparison follows this list:
- less than $e^{-\mu\epsilon^2/(2+\epsilon)}$ for all $\epsilon > 0$ (because $\ln(1+\epsilon) \ge 2\epsilon/(2+\epsilon)$)
- less than $e^{-\mu\epsilon^2/3}$ for $\epsilon < 1$
- less than $e^{-\mu \epsilon^2/4}$ for $\epsilon < 2e-1$.
- less than $2^{-(1+\epsilon)\mu}$ for $\epsilon > 2e-1$.
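- A quick check that each simplification is indeed weaker than the exact bound (a sketch; the sampled $\epsilon$ values are arbitrary):

```python
import math

# Exact Chernoff exponent per unit mu, i.e. log of [e^eps/(1+eps)^(1+eps)],
# versus the simplified exponent -eps^2/(2+eps).  The simplified value
# should always be >= the exact one (hence a weaker bound).
def exact_exponent(eps):
    return eps - (1 + eps) * math.log(1 + eps)

for eps in (0.1, 0.5, 1.0, 3.0):
    print(f"eps={eps}: exact={exact_exponent(eps):+.4f}, "
          f"simplified={-eps * eps / (2 + eps):+.4f}")
```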
- By the same argument applied to $\exp(-tX)$, \[ \Pr[X < (1-\epsilon)\mu] < \left[ \frac{e^{-\epsilon}}{(1-\epsilon)^{(1-\epsilon)}}\right]^\mu, \] which is bounded by $e^{-\mu \epsilon^2/2}$ for $0 < \epsilon < 1$.
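- The simpler form follows from the Taylor expansion \[ (1-\epsilon)\ln(1-\epsilon) = -\epsilon + \frac{\epsilon^2}{2} + \frac{\epsilon^3}{6} + \cdots \ge -\epsilon + \frac{\epsilon^2}{2}, \] so the exponent is $\mu\left(-\epsilon - (1-\epsilon)\ln(1-\epsilon)\right) \le -\mu\epsilon^2/2$.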
- The bound becomes trivial when $\epsilon \approx 1/\sqrt{\mu}$, i.e., at absolute deviations around $\sqrt{\mu}$
- no surprise, since the standard deviation is around $\sqrt{\mu}$ (recall Chebyshev)
- If $\mu = \Omega(\log n)$, the probability of a constant-$\epsilon$ deviation is $O(1/n)$; useful when there are polynomially many events to union-bound over.
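- For instance (constants chosen only for illustration), taking $\mu \ge (3c/\epsilon^2)\ln n$ in the $e^{-\mu\epsilon^2/3}$ form gives \[ \Pr[X > (1+\epsilon)\mu] < e^{-\mu\epsilon^2/3} \le n^{-c}, \] small enough to survive a union bound over polynomially many events for suitable $c$.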
- Note the similarity to the Gaussian distribution: the tails decay like $e^{-\Theta(\epsilon^2\mu)}$.
- Generalizes: the bound applies to any independent random variables taking values in $[0,1]$.
Alternative Proof by Van Vu
Theorem:
- Let $X_i$ be discrete independent r.v.'s
- with $E[X_i]=0$
- and $|X_i|\le 1$.
- Let $X=\sum X_i$
- and $\sigma^2$ be the variance of $X$. Then, for $\lambda \le 2\sigma$, \[ \Pr[|X| \ge \lambda \sigma] \le 2e^{-\lambda^2/4} \]
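- As a sanity check (a minimal Monte Carlo sketch, not part of the notes; $n$, $\lambda$, and the trial count are arbitrary choices), Rademacher variables satisfy the hypotheses with $\sigma^2 = n$:

```python
import random
import math

# Rademacher variables (+/-1 with probability 1/2 each) have mean 0 and
# |X_i| <= 1; their sum X has variance sigma^2 = n.
n, lam, trials = 100, 2.0, 50_000   # note lam <= 2*sigma holds here
sigma = math.sqrt(n)

exceed = sum(
    abs(sum(random.choice((-1, 1)) for _ in range(n))) >= lam * sigma
    for _ in range(trials)
)

print("empirical Pr[|X| >= lam*sigma]:", exceed / trials)
print("bound 2*exp(-lam^2/4):         ", 2 * math.exp(-lam**2 / 4))
```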
Proof:
- Prove upper tail and double by applying to $-X$
- Same moment trick: for $t > 0$, $\Pr[X \ge \lambda \sigma] \le E[e^{tX}]/e^{t\lambda\sigma}$
- Bound $E[e^{tZ}]$ when $t\le 1$, $|Z|\le 1$, and $E[Z]=0$ (i.e., $Z=X_i$); since $|tz_j|\le 1$, every term $(tz_j)^k$ with $k \ge 2$ is at most $(tz_j)^2$: $$ \begin{align*} E[e^{tZ}] &= \sum p_j e^{tz_j}\\ &= \sum p_j(1+tz_j+(tz_j)^2/2!+(tz_j)^3/3!+\cdots)\\ &= \sum p_j + t\sum p_jz_j + \sum p_j((tz_j)^2/2!+(tz_j)^3/3!+\cdots)\\ &\le 1+t\cdot 0+\sum p_j(tz_j)^2(1/2!+1/3!+\cdots)\\ &= 1+t^2\sum p_j z_j^2(e-2)\\ &\le 1+t^2 E[Z^2] \quad\text{(since $e-2<1$)}\\ &= 1+t^2\sigma_Z^2\\ &\le e^{t^2\sigma_Z^2} \end{align*} $$
- Then, by independence (variances add): $$ \begin{align*} E[e^{tX}] &= \prod E[e^{tX_i}]\\ &\le \prod e^{t^2\sigma_{X_i}^2}\\ &= e^{t^2\sigma_X^2} \end{align*} $$
- So the overall bound is at most $e^{t^2\sigma^2}/e^{t\lambda\sigma} = e^{t\sigma(t\sigma-\lambda)}$
- optimize at $t=\lambda/(2\sigma)$, which must be $\le 1$ for the bound on $E[e^{tZ}]$ to apply; this is where the restriction $\lambda \le 2\sigma$ comes from.
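- Plugging in the optimal $t$ completes the calculation: \[ e^{t\sigma(t\sigma - \lambda)}\Big|_{t=\lambda/(2\sigma)} = e^{(\lambda/2)(\lambda/2 - \lambda)} = e^{-\lambda^2/4}. \]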