### Basic Probability Tools

This lecture will review everything you learned in 6.042.

- Basic tools in probability
- Expectations
- High probability events
- Deviations from expectation

### Deviations from Expectation

Sometimes expectation isn't enough. Want to study *deviations*---the
**probability** and **magnitude** of deviation from expectation.

Example: coupon collection/stable marriage.

- Probability didn't get $k^{th}$ coupon after $r$ rounds is $(1-1/n)^r \le e^{-r/n}$
- which is $n^{-\beta}$ for $r=\beta n\ln n$
- so probability didn't get *some* coupon is at most $n\cdot n^{-\beta} = n^{1-\beta}$ (using **union bound**)
- we say “time is $O(n\ln n)$ **with high probability**” because we can make the probability $n^{-\beta}$ for **any** desired $\beta$ by changing a constant that doesn't affect the asymptotic claim.
- sometimes say “with high probability” when prove it for **some** $\beta > 1$ even if didn't prove it for all.
- Saying “almost never above $O(n\ln n)$” is a much stronger statement than saying “$O(n\ln n)$ on average.”
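A quick simulation (my sketch, not part of the notes) of the whp claim: taking $\beta = 2$, runs exceeding $2n\ln n$ should be vanishingly rare, while the average lands near $nH_n \approx n\ln n$.

```python
import math
import random

def coupon_collect(n, rng):
    """Draw uniform coupons until all n types are seen; return number of draws."""
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

n = 200
trials = [coupon_collect(n, random.Random(i)) for i in range(300)]
# beta = 2: the bound says Pr[T > 2 n ln n] <= n^{-1}
frac_over = sum(t > 2 * n * math.log(n) for t in trials) / len(trials)
avg = sum(trials) / len(trials)   # should be close to n ln n
```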

Lower bound:

- After $n\ln n$ steps, $1-1/n$ chance of getting a given coupon
- So chance of getting all $< (1-1/n)^n \approx 1/e$
- So reasonable chance of *not* being faster.
- Can show bound is very tight.

**Do Exact Balls in Bins here next time**

### Tail Bounds---Markov Inequality

At other times, don't want to get down and dirty with problem. So have developed set of bounding techniques that are basically problem independent.

- few assumptions, so applicable almost anywhere
- but for same reason, don't give as tight bounds
- the more you require of problem, the tighter bounds you can prove.

Markov inequality.

- For nonnegative $Y$: $\Pr[Y \ge t] \le E[Y]/t$
- Equivalently, $\Pr[Y \ge tE[Y]] \le 1/t$.
- Only requires an expectation! So very widely applicable.

Example: coupon collecting: $\Pr[T > t\, n\ln n]=O(1/t)$.
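Markov's inequality can be verified exactly on a toy distribution (a fair die; my example, not the notes'), since both sides are small rationals:

```python
from fractions import Fraction

# Y uniform on {1,...,6}: check Pr[Y >= t] <= E[Y]/t for every threshold t
vals = list(range(1, 7))
EY = Fraction(sum(vals), len(vals))  # 7/2
for t in range(1, 20):
    tail = Fraction(sum(1 for y in vals if y >= t), len(vals))
    assert tail <= EY / Fraction(t)  # Markov, exactly
```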

Application: $ZPP=RP \cap coRP$.

- If $RP \cap coRP$
- just run both
- if neither affirms, run again
- Each iter has probability $\ge 1/2$ of a definitive answer
- So expected iterations $\le 2$
- So $ZPP$.

- If $ZPP$
- suppose expected time $T(n)$
- Run for time $2T(n)$, then stop and give default answer
- Probability of default answer at most $1/2$ (Markov)
- So, $RP$.
- If flip default answer, $coRP$
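The $RP \cap coRP \Rightarrow ZPP$ direction can be sketched in code. The toy language and the `rp`/`corp` tests below are hypothetical stand-ins; all that matters is the one-sided guarantees (an `rp` affirmation is always correct, a `corp` denial is always correct).

```python
import random

def las_vegas(rp, corp, x, rng):
    """Run both one-sided tests until one gives a definitive answer.
    Expected number of iterations is at most 2."""
    while True:
        if rp(x, rng.random()):        # RP affirmation is always correct
            return True
        if not corp(x, rng.random()):  # coRP denial is always correct
            return False

# Toy language: "x is even". rp errs only by false negatives,
# corp errs only by false positives, each with probability 1/2.
rp = lambda x, r: x % 2 == 0 and r < 0.5
corp = lambda x, r: x % 2 == 0 or r < 0.5

rng = random.Random(0)
yes = las_vegas(rp, corp, 4, rng)   # even -> True
no = las_vegas(rp, corp, 7, rng)    # odd -> False
```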

Does this mean can switch Las Vegas to Monte Carlo? Yes.

Monte Carlo to Las Vegas? No, unless have checker.

### Chebyshev

Can make Markov much stronger by generalizing: $\Pr[h(Y) \ge t] \le E[h(Y)]/t$
for **any nonnegative** $h$.

Better than Markov because uses more info: variance.

- Recall variance and standard deviation: $\sigma_X^2 = E[(X-\mu_X)^2]=E[X^2]-\mu_X^2$
- For many distributions, this amount of variation is “right amount”
- $E[XY] = E[X]E[Y]$ if independent
- variance of independent variables: sum of variances
- $E[(\sum X_i)^2] = E[\sum X_i^2 + 2\sum_{i<j} X_iX_j]$
- cross terms expand as $E[X_i]E[X_j]$ by independence
- so canceled by the $\mu_{X_i}\mu_{X_j}$ terms in $(\sum \mu_{X_i})^2$

- $\Pr[|X-\mu| \ge t\sigma] = \Pr [(X-\mu)^2 \ge t^2\sigma^2] \le 1/t^2$
- Or $\Pr[|X-\mu| \ge d] \le \sigma^2/d^2$
- So Chebyshev predicts won't stray far beyond the stdev.
- binomial distribution: variance $np(1-p)$, stdev $O(\sqrt{n})$.
- requires (only) a mean and variance. less applicable but more powerful than Markov
- Real applications later.
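An empirical check (my sketch) of the Chebyshev tail on the binomial just mentioned:

```python
import math
import random

rng = random.Random(2)
n, p = 400, 0.5
mu, sigma = n * p, math.sqrt(n * p * (1 - p))  # mean and stdev of Binomial(n, p)

# 5000 draws from Binomial(400, 1/2)
draws = [sum(rng.random() < p for _ in range(n)) for _ in range(5000)]
tails = {}
for t in (2, 3, 4):
    # Chebyshev: Pr[|X - mu| >= t sigma] <= 1/t^2
    tails[t] = sum(abs(x - mu) >= t * sigma for x in draws) / len(draws)
```

(The empirical tails come in well under $1/t^2$, since the binomial actually has Gaussian-like tails; Chebyshev is deliberately crude.)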

Example: coupon collecting.

- Start with geometric random variable with success prob. $p$
- mean $1/p$
- Variance:
- $E[Y^2] = E[Y^2\mid 1 \mbox{ step}]\cdot p + E[Y^2\mid >1 \mbox{ step}]\cdot (1-p)$
- $E[Y^2]=p+E[(Y+1)^2](1-p)=p+(E[Y^2]+2E[Y]+1)(1-p)$
- writing $Z=E[Y^2]$ and using $E[Y]=1/p$: $Z=(1-p)Z+(2-p)/p$
- $Z=(2-p)/p^2$
- variance $(1-p)/p^2< 1/p^2$

- Coupon collection is sum of independent geometric vars: variance upper bound \[ \sum_{i=1}^n \left(\frac{n}{n-i+1}\right)^2=n^2 \sum_{i=1}^n\frac1{i^2}=O(n^2) \]
- so $\Pr[T \ge 2nH_n] \le O(n^2)/(nH_n)^2=O(1/\ln^2 n)$
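The geometric mean and variance formulas above can be sanity-checked by simulation (my sketch; $p=0.3$ is arbitrary):

```python
import random

rng = random.Random(6)
p = 0.3
samples = []
for _ in range(100000):
    y = 1
    while rng.random() >= p:  # keep trying until first success
        y += 1
    samples.append(y)

mean = sum(samples) / len(samples)
var = sum((y - mean) ** 2 for y in samples) / len(samples)
# theory: mean = 1/p, variance = (1-p)/p^2
```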

### Median Finding

In homework, simple randomized linear-time algorithm

- But, what's the constant?
- Optimum deterministic: $ \ge (2+\epsilon)n$
- Can we do better randomized?

Sampling idea

- List $L$
- median of a sample looks like median of the whole list, up to a small neighborhood.
- analysis via Chebyshev bound

Approximation

- choose $s$ samples *with replacement*
- (replacement simplifies analysis)
- what guess for median?
- sort, take $x_{s/2}$
- will it be near median?
- $\Pr[< x_{n/4}]$ is probability of choosing $\ge s/2$ items less than $x_{n/4}$
- each choice has probability $1/4$
- so expect to choose only $s/4$ in smallest quarter
- variance is $s(1-1/4)/4=3s/16$
- $\sigma = \sqrt{3s}/4$
- getting $s/2$ such choices means a deviation of $s/4=\sqrt{s/3}\,\sigma$
- so probability $\le 3/s$ by Chebyshev
- same analysis shows unlikely to exceed $x_{3n/4}$
- combine with union bound
- conclude probably in middle half
- similarly conclude *any* constant error has probability $O(1/s)$
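The estimator above is only a few lines (a sketch; the list and sample size are arbitrary):

```python
import random

def approx_median(L, s, rng):
    """Median of s samples drawn with replacement estimates the median of L."""
    sample = sorted(rng.choice(L) for _ in range(s))
    return sample[s // 2]

rng = random.Random(3)
L = list(range(10001))          # true median is 5000
est = approx_median(L, 1000, rng)
# With probability 1 - O(1/s) the estimate lands in the middle half:
in_middle_half = len(L) // 4 <= est <= 3 * len(L) // 4
```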

Exact Algorithm?

- choose $s$ samples *with replacement*
- take fences before and after sample median
- keep items between fences. sort.
- count items on both sides of fences
- find right element in between
- $s=n^{3/4}$ samples $x_1,\ldots, x_s$ in sorted order.
- Sample with replacement to keep analysis clean
- lemma: $x_r$ near $rn/s$ position of original list.
- To find median element, look at two positions in sorted order of sample: $a$ at pos. $s/2-\sqrt{n}$ and $b$ at $s/2+\sqrt{n}$
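A sketch of the whole algorithm, with fences at sample positions $s/2 \pm \sqrt{n}$ as above; the rare failure case is handled by returning `None` so the caller can rerun. (Details like tie-breaking for duplicate keys are glossed over.)

```python
import math
import random

def sampled_median(L, rng):
    """Median via s = n^{3/4} samples and fences at s/2 +- sqrt(n).
    Returns None (caller should rerun) if the median escapes the fences."""
    n = len(L)
    s = int(n ** 0.75)
    sample = sorted(rng.choice(L) for _ in range(s))
    d = int(math.sqrt(n))
    lo = sample[max(0, s // 2 - d)]          # fence a
    hi = sample[min(s - 1, s // 2 + d)]      # fence b
    below = sum(1 for x in L if x < lo)      # rank of fence a in L
    middle = sorted(x for x in L if lo <= x <= hi)  # only o(n) items to sort
    k = (n - 1) // 2                         # 0-indexed median rank
    if below <= k < below + len(middle):
        return middle[k - below]
    return None

rng = random.Random(4)
L = list(range(100001))
random.Random(5).shuffle(L)
m = sampled_median(L, rng)   # the true median is 50000, barring rare failure
```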

Analysis

- works so long as median is between fences
- fast so long as **few items** between fences
- Will prove (i) in class, (ii) in homework
- Expected number of samples preceding position $n/2$ is $s/2$.
- What about variance?
- each sample indicator has variance $p(1-p)\le 1/4$
- so the $s$ choices have variance $\le s/4$
- so $\sigma \le \sqrt{s}/2 = n^{3/8}/2$
- Chebyshev: falling below fence $a$ means the count strays from its mean $rs/n$ by $\sqrt{n} = 2n^{1/8}\sigma$, so probability $O(n^{-1/4})$.

Running time:

- Sorting takes $O(n^{3/4}\log n)=o(n)$ time
- Comparing to fences takes $2n$ time
- Fail probability $O(n^{-1/4})$

Want to improve?

- usually, repetition works
- but here, we're looking at the constant
- even repeating once kills the constant
- but we almost always succeed first time
- (later we'll see failure probability much smaller than shown here)

Randomized is strictly better:

- Gives important constant factor improvement
- Optimum deterministic: $ \ge (2+\epsilon)n$
- Optimum randomized: $\le (3/2) n+o(n)$ w.h.p.

### Pairwise Independence

pseudorandom generators.

- Motivation.
- Idea of randomness as (complexity theoretic) resource like space or time.
- pseudorandom generators turn a little “real” randomness into lots of “fake” randomness
- can be used in many randomized algorithms *without change to algorithm or analysis*

Pairwise independence

- $\Pr[X_i = a \wedge X_j=b] = \Pr[X_i=a]\Pr[X_j=b]$ for all $i\ne j$ and all $a,b$
- equivalently $\Pr[X_i=a \mid X_j=b] = \Pr[X_i = a]$
- sometimes full independence unnecessary
- e.g. pairwise independent vars sufficient for Chebyshev bounds

generating over $Z_p$.

- Want random numbers in range $\{0,\ldots,p-1\}$, $p$ prime
- pick random $a,b$
- $i^{th}$ random number is $ai+b \bmod p$
- Works because the map $(a,b)\mapsto (X_i,X_j)$ is invertible over the field
- If want a nonprime range, use a “slightly larger” prime $p$
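The invertibility argument can be verified exhaustively for a small prime (my sketch; $p=11$ and the indices are arbitrary): for $i \ne j$, each pair of output values occurs for exactly one of the $p^2$ seeds $(a,b)$, which is exactly pairwise independence.

```python
from collections import Counter
from itertools import product

p = 11            # small prime, for exhaustive checking
i, j = 2, 5       # any two distinct indices

counts = Counter()
for a, b in product(range(p), repeat=2):   # all p^2 seeds
    xi = (a * i + b) % p                   # i-th pseudorandom value
    xj = (a * j + b) % p
    counts[(xi, xj)] += 1

# each (value_i, value_j) pair appears exactly once: pairwise independent
uniform = len(counts) == p * p and all(c == 1 for c in counts.values())
```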

Alternative

- Given $t$ independent random vars $\mod n$
- For each subset, define random variable as sum over subset, $\mod n$
- Any two subsets differ on a variable, so pairwise independent
- from $t$ independent random bits, get $2^t-1$ pairwise independent bits
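For bits ($n=2$) this is easy to check exhaustively (my sketch; $t=4$ is arbitrary): any two distinct subset-XORs are uniform over the four outcome pairs, hence pairwise independent.

```python
from collections import Counter
from itertools import combinations, product

t = 4
subsets = [S for r in range(1, t + 1) for S in combinations(range(t), r)]
assert len(subsets) == 2 ** t - 1          # 15 derived variables

ok = True
for S, T in combinations(subsets, 2):
    counts = Counter()
    for bits in product((0, 1), repeat=t):  # all 2^t seeds
        xs = sum(bits[k] for k in S) % 2    # XOR of bits in subset S
        xt = sum(bits[k] for k in T) % 2
        counts[(xs, xt)] += 1
    # pairwise independent <=> uniform over the 4 outcome pairs
    ok = ok and all(c == 2 ** t // 4 for c in counts.values())
```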

Application: median

- median algorithm only depended on Chebyshev
- which only depended on pairwise independence
- so $O(\log n)$ random bits suffice for the same result

Application: conserving Random Bits

- Suppose $RP$ algorithm using $n$ bits.
- Reduce error to $2^{-t}$ using $t$ independent trials with $tn$ bits
- What do with $2n$ bits?
- two direct draws:
- error prob. 1/4.

- pseudorandom generator: from $2n$ bits, generate $t$ pairwise independent trials
- $Y$ = random variable for number of trials finding a certificate
- $\mu \ge t/2$.
- $\sigma \le \sqrt{t}/2$.
- error if no certificate found,
- i.e. $\mu - Y \ge t/2 = \sqrt{t}\,\sigma$,
- prob. $\le 1/t$ by Chebyshev
- even though trials only pairwise independent

- Trades time for randomness:
- with $tn$ random bits can get error $2^{-t}$ with $t$ trials
- with $2n$ random bits, can get same probability with $2^t$ trials