\documentclass{article} \usepackage{me,wide} \parindent0pt

### Basic Probability Tools

This lecture will review everything you learned in 6.042.

• Basic tools in probability
• Expectations
• High probability events
• Deviations from expectation

### Deviations from Expectation

Sometimes expectation isn't enough. Want to study deviations---both the probability and the magnitude of deviation from the expectation.

Example: coupon collection/stable marriage.

• Probability didn't get $k^{th}$ coupon after $r$ rounds is $(1-1/n)^r \le e^{-r/n}$
• which is $n^{-\beta}$ for $r=\beta n\ln n$
• so probability didn't get some coupon is at most $n\cdot n^{-\beta} = n^{1-\beta}$ (using union bound)
• we say “time is $O(n\ln n)$ with high probability” because we can make the probability $n^{-\beta}$ for any desired $\beta$ by changing a constant that doesn't affect the asymptotic claim.
• sometimes say “with high probability” when we prove it for some $\beta > 1$ even if we didn't prove it for all.
• Saying “almost never above $O(n\ln n)$” is a much stronger statement than saying “$O(n\ln n)$ on average.”
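The w.h.p. claim above is easy to check empirically. A minimal simulation sketch (the function name is mine, not from the notes):

```python
import math
import random

def coupon_collect_time(n, rng):
    """Rounds until all n coupon types have been drawn at least once."""
    seen = set()
    rounds = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))  # each round draws a uniform random coupon
        rounds += 1
    return rounds

# With beta = 4, the union bound gives Pr[time > 4 n ln n] <= n^{-3},
# so over a couple hundred trials we expect no exceedances at all.
rng = random.Random(0)
n = 50
times = [coupon_collect_time(n, rng) for _ in range(200)]
```

For $n=50$ the threshold $4n\ln n \approx 782$ should essentially never be crossed, even though the mean is only about $nH_n \approx 225$.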

Lower bound:

• After $n\ln n$ steps, $1-1/n$ chance of having gotten a given coupon
• So chance of getting all is roughly $(1-1/n)^n \approx 1/e$ (heuristically treating coupons as independent)
• So reasonable chance of not being faster.
• Can show bound is very tight.

Do Exact Balls in Bins here next time

### Tail Bounds---Markov Inequality

At other times, we don't want to get down and dirty with the problem. So a set of bounding techniques has been developed that is basically problem independent.

• few assumptions, so applicable almost anywhere
• but for same reason, don't give as tight bounds
• the more you require of problem, the tighter bounds you can prove.

Markov inequality.

• For nonnegative $Y$: $\Pr[Y \ge t] \le E[Y]/t$
• Equivalently, $\Pr[Y \ge tE[Y]] \le 1/t$.
• Only requires an expectation (and nonnegativity)! So very widely applicable.
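A quick empirical look at how Markov behaves on a nonnegative sample (a sketch; the helper name is mine):

```python
import random

def markov_check(sample, t):
    """Return (empirical Pr[Y >= t], Markov bound E[Y]/t) for nonnegative data."""
    mean = sum(sample) / len(sample)
    tail = sum(1 for y in sample if y >= t) / len(sample)
    return tail, mean / t

# exponential sample with mean about 1: true tail Pr[Y >= 3] = e^{-3} ~ 0.05,
# while Markov only promises <= 1/3 -- valid but loose
rng = random.Random(1)
sample = [rng.expovariate(1.0) for _ in range(10000)]
tail, bound = markov_check(sample, 3.0)
```

The gap between the empirical tail and the bound illustrates the point above: few assumptions, so the bound is rarely tight.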

Example: coupon collecting: $\Pr[\mbox{time} > tn\log n]=O(1/t)$.

Application: $ZPP=RP \cap coRP$.

• If $L \in RP \cap coRP$:
  • just run both algorithms
  • if neither affirms, run again
  • Each iteration affirms with probability $\ge 1/2$
  • So expected iterations $\le 2$.
  • So $L \in ZPP$.
• If $L \in ZPP$:
  • suppose expected time $T(n)$
  • Run for time $2T(n)$, then stop and give default answer
  • Probability of default answer at most $1/2$ (Markov)
  • So, $L \in RP$.
  • If flip default answer, $L \in coRP$

Does this mean can switch Las Vegas to Monte Carlo? Yes.

Monte Carlo to Las Vegas? No, unless have checker.
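The Las Vegas $\to$ Monte Carlo direction is just the truncation argument above. A toy sketch (all names hypothetical; `find7` stands in for any Las Vegas routine with a known expected number of attempts):

```python
import random

def truncated_run(try_once, budget, rng):
    """Run a Las Vegas-style search for at most `budget` attempts;
    return its answer, or None (the default answer) on timeout.
    With budget = 2 * (expected attempts), Markov gives Pr[timeout] <= 1/2."""
    for _ in range(budget):
        result = try_once(rng)
        if result is not None:
            return result
    return None

# hypothetical toy Las Vegas routine: guess until hitting 7 (expected 10 tries)
find7 = lambda rng: 7 if rng.randrange(10) == 7 else None
successes = sum(truncated_run(find7, 20, random.Random(i)) == 7
                for i in range(500))
```

With budget $2\cdot 10 = 20$, the truncated run answers correctly well over half the time, matching the Markov guarantee.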

### Chebyshev.

Can make Markov much stronger by generalizing: $\Pr[h(Y) \ge t] \le E[h(Y)]/t$ for any nonnegative function $h$.

• Remind variance, standard deviation. $\sigma_X^2 = E[(X-\mu_X)^2]=E[X^2]-\mu_X^2$
• For many distributions, this amount of variation is “right amount”
• $E[XY] = E[X]E[Y]$ if independent
• variance of independent variables: sum of variances
• $E[(\sum X_i)^2] = \sum E[X_i^2] + 2\sum_{i<j} E[X_iX_j]$
• cross terms factor as $E[X_i]E[X_j]$ by independence
• so canceled by the $2\sum_{i<j}\mu_{X_i}\mu_{X_j}$ terms in $(\sum \mu_{X_i})^2$
• $\Pr[|X-\mu| \ge t\sigma] = \Pr [(X-\mu)^2 \ge t^2\sigma^2] \le 1/t^2$
• Or $\Pr[|X-\mu| \ge d] \le \sigma^2/d^2$
• So Chebyshev predicts $X$ won't stray much beyond a standard deviation.
• binomial distribution: variance $np(1-p)$, stdev $O(\sqrt{n})$.
• requires (only) a mean and variance: less applicable but more powerful than Markov
• Real applications later.
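A small empirical comparison of the Chebyshev bound against the actual tail of a binomial (sketch; names mine):

```python
import math
import random

def chebyshev_check(sample, t):
    """Return (empirical Pr[|X - mean| >= t*stdev], Chebyshev bound 1/t^2)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    tail = sum(1 for x in sample if abs(x - mean) >= t * sd) / n
    return tail, 1 / t ** 2

# binomial(100, 1/2): mean 50, stdev 5 -- the 2-sigma tail is about 0.05,
# while Chebyshev only promises 0.25
rng = random.Random(2)
sample = [sum(rng.randrange(2) for _ in range(100)) for _ in range(5000)]
tail, bound = chebyshev_check(sample, 2.0)
```

Again the bound holds but is loose; Chernoff-style bounds (later) close the gap.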

Example: coupon collecting.

• Start with geometric random variable with success prob. $p$
• mean $1/p$
• Variance:
• $E[Y^2] = E[Y^2\mid 1 \mbox{ step}]\cdot p + E[Y^2\mid >1 \mbox{ step}]\cdot (1-p)$
• $E[Y^2]=p+E[(Y+1)^2](1-p)=p+(E[Y^2]+2E[Y]+1)(1-p)$
• writing $Z=E[Y^2]$ and using $E[Y]=1/p$: $Z=(1-p)Z+(2-p)/p$
• so $Z=(2-p)/p^2$
• variance $Z-1/p^2=(1-p)/p^2< 1/p^2$
• Coupon collection is a sum of independent geometric vars: variance upper bound $\sum_{i=1}^n \left(\frac{n}{n-i+1}\right)^2=n^2 \sum_i\frac1{i^2}=O(n^2)$
• so $\Pr[\mbox{time} > 2nH_n] \le O(n^2)/(nH_n)^2=O(1/\ln^2 n)$ (Chebyshev with deviation $nH_n$ from the mean $nH_n$)
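The geometric mean and variance formulas above are easy to sanity-check by simulation (a sketch; the helper name is mine):

```python
import random

def geometric_moments(p, trials, rng):
    """Empirical (mean, variance) of the number of coin flips up to and
    including the first success, with success probability p per flip."""
    vals = []
    for _ in range(trials):
        k = 1
        while rng.random() >= p:
            k += 1
        vals.append(k)
    mean = sum(vals) / trials
    var = sum((v - mean) ** 2 for v in vals) / trials
    return mean, var

# theory: mean 1/p = 4, variance (1-p)/p^2 = 12
mean, var = geometric_moments(0.25, 20000, random.Random(3))
```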

### Median Finding

In homework, simple randomized linear-time algorithm

• But, what's the constant?
• Optimum deterministic: $\ge (2+\epsilon)n$
• Can we do better randomized?

Sampling idea

• List $L$
• median of sample looks like median of whole, up to a small neighborhood.
• analysis via Chebyshev bound

Approximation

• choose $s$ samples with replacement
• (replacement simplifies analysis)
• what guess for median?
• sort, take $x_{s/2}$
• will it be near median?
• $\Pr[\mbox{guess} < x_{n/4}]$ is the probability we choose at least $s/2$ items less than $x_{n/4}$
• each choice has probability $1/4$
• so expect to choose only $s/4$ in smallest quarter
• variance is $s(1-1/4)/4=3s/16$
• $\sigma = \sqrt{3s}/4$
• $s/2$ such choices is deviation by $s/4=\sqrt{s/3}\sigma$
• so probability at most $3/s$ by Chebyshev
• same analysis shows unlikely to exceed $x_{3n/4}$
• combine with union bound
• conclude probably in middle half
• similarly conclude any constant error has probability $O(1/s)$
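The approximation scheme above is a few lines of code (sketch; names mine):

```python
import random

def approx_median(data, s, rng):
    """Guess the median: sort s samples drawn with replacement, take the middle.
    By the Chebyshev argument, the guess misses the middle half of the data
    with probability only O(1/s)."""
    sample = sorted(rng.choice(data) for _ in range(s))
    return sample[s // 2]

data = list(range(10001))  # true median 5000
guess = approx_median(data, 1000, random.Random(4))
```

With $s=1000$ the guess should land well inside the middle half $[x_{n/4}, x_{3n/4}]$.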

Exact Algorithm?

• choose $s$ samples with replacement
• take fences before and after sample median
• keep items between fences. sort.
• count items on both sides of fences
• find right element in between
• $s=n^{3/4}$ samples $x_1,\ldots, x_s$ in sorted order.
• Sample with replacement to keep analysis clean
• lemma: $x_r$ is near position $rn/s$ of the original list.
• To find median element, look at two positions in sorted order of sample: $a$ at pos. $s/2-\sqrt{n}$ and $b$ at $s/2+\sqrt{n}$
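A runnable sketch of the whole scheme (my naming; it assumes distinct elements, and falls back to a full sort when the fences miss, which the analysis below shows is rare):

```python
import math
import random

def sampled_median(data, rng):
    """Median via sampling: draw s = n^(3/4) samples with replacement, put
    fences sqrt(n) positions before and after the sample median, keep only
    items between the fences, and sort just those."""
    n = len(data)
    s = int(n ** 0.75)
    d = int(math.sqrt(n))
    sample = sorted(rng.choice(data) for _ in range(s))
    lo = sample[max(0, s // 2 - d)]   # fence before sample median
    hi = sample[min(s - 1, s // 2 + d)]  # fence after sample median
    middle = [x for x in data if lo <= x <= hi]
    below = sum(1 for x in data if x < lo)
    k = (n - 1) // 2  # 0-based rank of the median
    if below <= k < below + len(middle):
        return sorted(middle)[k - below]
    return sorted(data)[k]  # fences missed the median: fall back (rare)

data = list(range(1, 10002))  # n = 10001, median 5001
random.Random(5).shuffle(data)
```

Only the $O(n^{3/4})$ items between the fences get sorted; the rest are touched just twice, for the fence comparisons.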

Analysis

• works so long as median is between fences
• fast so long as few items between fences
• Will prove (i) in class, (ii) in homework
• Expected number of samples preceding the median (position $n/2$) is $s/2$.
• each sample indicator has variance $p(1-p) \le 1/4$
• so the count over $s$ samples has variance $\le s/4$
• so $\sigma \le \sqrt{s}/2 = n^{3/8}/2$
• Chebyshev: the count deviates from $s/2$ by more than $\sqrt{n} = 2n^{1/8}\sigma$ with probability $O(n^{-1/4})$, so the median lands between the fences w.h.p.

Running time:

• Sorting takes $O(n^{3/4}\log n)=o(n)$ time
• Comparing to fences takes $2n$ time
• Fail probability $O(n^{-1/4})$

Want to improve?

• usually, repetition works
• but here, we're looking at the constant
• even repeating once kills the constant
• but we almost always succeed first time
• (later we'll see failure probability much smaller than shown here)

Randomized is strictly better:

• Gives important constant factor improvement
• Optimum deterministic: $\ge (2+\epsilon)n$
• Optimum randomized: $\le (3/2) n+o(n)$ w.h.p.

### Pairwise Independence

pseudorandom generators.

• Motivation.
• Idea of randomness as (complexity theoretic) resource like space or time.
• pseudorandom generators turn a little “real” randomness into lots of “fake” randomness
• can be used in many randomized algorithms without change to algorithm or analysis

Pairwise independence

• $\Pr[X_i = a \wedge X_j=b] = \Pr[X_i=a]\Pr[X_j=b] \forall i,j,a,b$
• equivalently $\Pr[X_i=a \mid X_j=b] = \Pr[X_i = a]$
• sometimes full independence is unnecessary
• e.g. pairwise independent vars sufficient for Chebyshev bounds

generating over $Z_p$.

• Want random numbers in range $\{0,\ldots,p-1\}$, $p$ prime
• pick random $a,b \in Z_p$
• $i^{th}$ random number is $ai+b \bmod p$
• Works because the map is invertible over the field: for $i \ne j$, $(a,b) \mapsto (ai+b, aj+b)$ is a bijection, so every pair of values is equally likely
• If want a range of nonprime size, use a “slightly larger” prime $p$
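The bijection argument can be verified exhaustively for a tiny prime (sketch; names mine):

```python
import random

def pairwise_generator(p, rng):
    """Return f with f(i) = (a*i + b) mod p; for uniform a, b in Z_p the
    values f(i) are uniform and pairwise independent (p prime)."""
    a, b = rng.randrange(p), rng.randrange(p)
    return lambda i: (a * i + b) % p

# exact check for p = 5: as (a, b) ranges over Z_p^2, the pair (f(1), f(2))
# hits every value in Z_p^2 exactly once, so it is uniform
p = 5
pairs = [((a * 1 + b) % p, (a * 2 + b) % p) for a in range(p) for b in range(p)]
```

Two random seeds thus yield as many pairwise independent values as we like.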

Alternative

• Given $t$ independent random vars $\mod n$
• For each subset, define random variable as sum over subset, $\mod n$
• Any two distinct subsets differ on some variable, so their sums are pairwise independent
• from $t$ independent random bits, get $2^t-1$ pairwise independent bits
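For the bit case ($n=2$) the construction is subset parities, and pairwise independence can be checked exhaustively over all seeds (sketch; names mine):

```python
from itertools import product

def subset_parities(bits):
    """From t independent bits, one parity (sum mod 2) per nonempty subset:
    2^t - 1 pairwise-independent output bits."""
    t = len(bits)
    out = []
    for mask in range(1, 2 ** t):
        parity = 0
        for i in range(t):
            if (mask >> i) & 1:
                parity ^= bits[i]
        out.append(parity)
    return out

# enumerate all 2^t seeds for t = 3; every pair of output positions should
# take each of the 4 value combinations equally often
t = 3
outputs = [subset_parities(list(seed)) for seed in product([0, 1], repeat=t)]
```

Each pair of distinct nonzero masks gives two linearly independent parities over $GF(2)$, which is exactly the pairwise independence claim.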

Application: median

• median algorithm only depended on Chebyshev
• which only depended on pairwise independence
• so $O(\log n)$ bits suffice for result

Application: conserving Random Bits

• Suppose $RP$ algorithm using $n$ bits.
• Reduce error to $2^{-t}$ using $t$ independent trials with $tn$ bits
• What can we do with only $2n$ bits?
• two direct draws: error prob. $1/4$.
• pseudorandom generator: use the $2n$ bits as seeds $a,b$ to generate $t$ pairwise independent trials
  • random variable $Y$ = number of successful trials
  • $\mu = t/2$, $\sigma = \sqrt{t}/2$
  • error if no trial finds a certificate,
  • i.e. $Y = 0$, so $|Y-E[Y]| \ge t/2 = \sqrt{t}\,\sigma$,
  • prob. $\le 1/t$ by Chebyshev
  • even though trials are only pairwise independent
• summary: with $tn$ truly random bits, $t$ independent trials give error $2^{-t}$
• with only $2n$ random bits, the same error $2^{-t}$ takes $2^t$ pairwise independent trials (error $1/k$ with $k$ trials)
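The amplification scheme can be sketched end to end (all names hypothetical; the parity test stands in for an RP algorithm whose seeds are witnesses half the time, and $p=8191$ is just a convenient prime):

```python
import random

def pairwise_seeds(k, p, rng):
    """k pairwise-independent seeds in Z_p from only two random draws
    (about 2 log p truly random bits instead of k log p)."""
    a, b = rng.randrange(p), rng.randrange(p)
    return [(a * i + b) % p for i in range(k)]

def amplified_rp(check, k, p, rng):
    """One-sided-error amplification: accept iff some seed passes the check;
    pairwise independence suffices for the Chebyshev error bound O(1/k)."""
    return any(check(seed) for seed in pairwise_seeds(k, p, rng))

# hypothetical toy 'RP check': about half of all seeds are witnesses
is_witness = lambda seed: seed % 2 == 0
p = 8191  # prime (2^13 - 1)
failures = sum(not amplified_rp(is_witness, 64, p, random.Random(i))
               for i in range(200))
```

With $k=64$ trials, Chebyshev predicts a miss rate of at most $1/64$, so out of 200 runs only a handful should fail.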