6.856 Lecture Notes

Balls in Bins

Load balancing:

$n$ users want to choose among $n$ resources---eg web servers.
They can coordinate---but communication is expensive
Or, each can choose a random one

$n$ balls in $n$ bins

$n$ balls in $n$ bins
Expected balls per bin: 1 (not very interesting)
What is max balls we expect to see in a bin?

Markov:

$E[]=1$ so $\Pr[>t] < 1/t$
so one bin unlikely to be big
But $n$ bins so union bound only tells us less than $n$
Example? Distribution that puts all balls in random bin

Chebyshev a little better:

variance of one ball in one bin is $1/n(1-1/n)< 1/n$
variance of $n$ balls is $1$
$\Pr[|X-1|>t]< 1/t^2$
So $O(\sqrt{n})$ with probability $1-1/n$
So now can apply union bound to all $n$ bins
conclude max is probably $O(\sqrt{n})$
but still hard to bound expected max
true for any pairwise independent distribution (cf 2-universal hashing)
exists pairwise independent distributions where this is tight

Chernoff:

expected number of balls in bin $1$
$\Pr[>t]< 2^{-t}$ (for $t > 2e-1$)
So $O(\log n)$ for one bin with high probability.
now apply union bound
Now can bound expected max:
- With probability $1-1/n$, max is $O(\log n)$.
- With probability $1/n$, max is bigger, but at most $n$
- So, expected max $O(\log n)$

Exact analysis

Start by bounding probability of many balls $$ \begin{align*} \Pr[k \mbox{ balls in bin 1}] &= \binom{n}{k}(1/n)^k(1-1/n)^{n-k}\\ & \le \binom{n}{k}(1/n)^k\\ & \le \left(\frac{ne}{k}\right)^k (1/n)^k\\ & = \left(\frac{e}{k}\right)^k\\ \end{align*} $$
So prob at least $k$ balls is $\sum_{j \ge k} (e/j)^j=O((e/k)^k)$ (geometric series)
$ \le 1/n^2 \mbox{ if } k > (e\ln n)/\ln\ln n$
What is probability any bin is over $k$? $1/n$ union bound.
Now can bound expected max:
- With probability $1-1/n$, max is $O(\ln n/\ln\ln n)$.
- With probability $1/n$, max is bigger, but at most $n$
- So, expected max $O(\ln n/\ln\ln n)$
Typical approach: small expectation as small “common case” plus large “rare case”

Purely random has clusters

Saw $\log n/\log\log n$ balls in one bin
Similarly saw $\Omega(\log n)$ “runs” in coin flipping
Gives good party trick---deciding if sequence is random
How can we do better?

Two Choices

Two choices

Azar, Broder, Karlin, Upfal
Pick two random bins
Insert in less loaded

Intuitive analysis:

only $n/4$ bins with 4 items
two random bins: probability both have 4 is $(1/4)(1/4)=1/16$
So, only $n/16$ with 5
so $n/256$ have 6
In general, $n/2^{2^{k-3}}$ have $k$
So $< 1$, i.e. none, have $\log\log n$.
Problem: Expectations are not certainties
Problem: analysis bounds one layer based on assumption about previous. must cope with this conditioning.

From expectation to tail bounds.

Define $V_h=$ number of bins of height at least $h$
We know $V_4 \le n/4$
Write $Y_t=1$ is ball $t$ has height $\ge 5$
We said $\Pr[Y_t] \le 1/16$
We argued $E[V_5] \le n/16$
Can we use Chernoff to conclude $V_5 \le n/8$ w.h.p.?
Problem: dependence
- $Y_t$ not independent (next ball height depends on position of previous balls)
- however, no matter what, $\Pr[Y_t] \le 1/16$
- so intutively $V_5$ should be smaller than the Chernoff bound says
Formalize:
- Lemma: Let $Y_i$ be a function of $X_1,\ldots,X_i$. If $\Pr[Y_i=1 \mid X_1,\ldots,X_{i-1}]< p$ for all $X$'s then $\sum Y_i$ is stochastically dominated by $B(n,p)$,
  - i.e. $Z_i$ form $B(n,p)$ and $\Pr[Y_i \ge k] \le \Pr[Z_i \ge k]$.
  - so Chernoff holds.
  - Even if $Y_i$ dependent.
- proof philosophically subtle, by “coupling”:
  - imagine generating a sequence $Z_i$.
  - do it while generating $Y_i$
  - first, check $p_i=Pr[Y_i=1]$ which may be a messy function of $X$ sequence
  - Now generate $Y_i$. Set $Z_i=1$ if $Y_i=1$ or with prob. $\frac{p-p_i}{1-p_i}$ otherwise.
  - So each $Z_i$ is 1 with probability $p$ independent of all other $Z_i$.
  - so Chernoff applies to $\sum Z_i$
  - but (guaranteed) $Z_i \ge Y_i$.
  - conclude $Z_i$ stochastically dominates $Y_i$
  - so $\sum Z_i$ s.d. $\sum Y_i$
  - so same bound applies to $\sum Y_i$
- Application
  - in our experiment, each ball has height $\ge 5$ with probability at most $1/16$
  - no matter what previous balls did
  - so Chernoff bound still gives us $V_5 \le n/8$ w.h.p.

Generalize?

Idea: use same argument for all heights
- Suppose $V_h \le \beta_h n$.
- Then we argued $E[V_{h+1}] \le \beta_h^2 n$
- Coupled Chernoff suggests: $\Pr[V_{h+1} \ge 2\beta_h^2 n]< e^{-2\beta_h^2n/3}$ (using $\epsilon=1$ in Chernoff bound)
- $< 1/n^2$ so long as $\beta_h^2 n \ge 2\ln n.$
so define $\beta_4=1/4$, $\beta_{h+1} = 2\beta_h^2$
we've just speculated that $V_h \le \beta_h n \implies V_{h+1} \le \beta_{h+1}n$ w.h.p.
and $(2\beta_{h+1})=(2\beta_h)^2=(1/2)^{2^{h+1}}$
so $\beta_{O(\log\log n)}n$ is $O(\log n)$
(at that point, Chernoff weakens because expectation too small; need something else)

Chain of high probability events:

Let $Q_h$ be the event $V_h \le \beta_h n$.
Suppose $Q_0$ holds, and $\Pr[Q_{h} \mid Q_{h-1}] \ge 1-\epsilon$
Then $\Pr[Q_h] \ge (1-\epsilon)^h \ge 1-h\epsilon$
Proof: $$ \begin{align*} \Pr[Q_h] &\ge \Pr[Q_h \cap Q_{h-1}]]\\ &=\Pr[Q_h \mid Q_{h-1}]\cdot\Pr[Q_{h-1}]\\ &\ge (1-\epsilon)\Pr[Q_{h-1}]\\ &\ge (1-\epsilon)^h &\text{by induction} \end{align*} $$
In particular, if each $Q_h$ happens w.h.p. conditioned on $Q_{h-1}$, then each $Q_h$ happens w.h.p. unconditionally (for polynomial values of $h$)

Remains to prove $Q_{h+1}$ holds w.h.p. conditioned on $Q_h$

Problem: we only know if $Q_h$ holds after we've placed all the balls!
So how can we go back and analyze as if each ball is placed randomly?
What does this conditioning do to the probability that ball $t$ is high?
Note:
- we like to analyze by considering random events in the present conditioned on the past.
- Matches intution.
- But there's nothing formally wrong with conditioning on the future instead.
- Just hard to think about.

Avoid conditioning on the future

Prove $Q_{h+1}$ holds w.h.p. conditioned on $Q_{h}$
Write $Y_t=1$ if ball $t$ has height at least $h+1$
So $V_{h+1} \le \sum Y_t$
So bound $\sum Y_t \mid Q_h$
But conditioning $Y_t$ on $Q_h$ is confusing because at time $t$ event $Q_h$ is still determined in the future
So write $Y_t'=1$ if $Y_t=1$ and $V_h(t-1) \le \beta_h n$
Depends only on past, so easy to analyze $$ \begin{align*} \Pr[Y'_t=1] &= \Pr[Y'_t \mid Q_h(t-1)]\cdot\Pr[Q_h(t-1)]\\ &\le \beta_h^2 \cdot 1\\ &= \beta_h^2 \end{align*} $$
So $Y'_t$ fit our stochastic domination argument above
- $Y'_t$ depends on balls $1,\ldots,t$
- But has probabilty $\le \beta_h^2$ conditioned on any balls $1,\ldots,t-1$
So Chernoff bound applies. w.h.p. $\sum Y'_t \le 2\beta_h^2 n = \beta_{h+1} n$
Now note that if $Q_h$ holds ($V_h \le \beta_h n$) then every $V_h(t-1) \le \beta_h n$.
So every $Y_t=Y'_t$ and thus $\sum Y_t = \sum Y'_t$
so $Q_h \cap \sum Y'_t > \beta_{h+1}n$ holds iff $Q_h \cap \sum Y_t > \beta_{h+1}n$ does
so their probabilities are equal, so $$ \begin{align*} \Pr[Q_h \cap \neg Q_{h+1}] &= \Pr[Q_h \cap \neg V_{h+1} > \beta_{h+1}n]\\ &\le \Pr[Q_h \cap \sum Y_t > \beta_{h+1}n]\\ &= \Pr[ Q_h \cap \sum Y'_t > \beta_{h+1}n] &\text{(identical event)}\\ &\le \Pr[\sum Y'_t > \beta_{h+1}n] &\text{(bigger event),}\\ &\le 1/n^2 &\text{by the Chernoff bound above} \end{align*} $$
Thus, $$ \begin{align*} \Pr[\neg Q_{h+1} \mid Q_{h}] &= \Pr[\neg Q_{h+1} \cap Q_{h}]/\Pr[Q_h]\\ &= O(1/n^2)/(1-O(1/n^2)) \end{align*} $$
In other words, $Q_{h+1}$ happens w.h.p., conditioned on $Q_h$

What of last few stages?

At last $i$ above, number of tall bins is $\beta_i n= O(\log n)$ whp
At this point, prob. a ball is tall is $O((\log n)^2/n^2)$
So prob. of 2 distinct tall balls is $O((n^2)(\log n)^2/n^4)$ which is negligible (union bound on $n^2$ pairs of balls)
if only one ball is tall, done.
because it would need to stand on other tall balls to be really tall

More choices? Only $O(\log_d\log n)$

slight variation: break ties to the left
improves to $O((\log\log n)/d)$
intuition: a slight amount of imbalance helps cope with clusters all attacking same spots
ensures some spots will be smaller