Balls in Bins
Load balancing:
- $n$ users want to choose among $n$ resources---e.g., web servers.
- They can coordinate---but communication is expensive
- Or, each can choose a random one
$n$ balls in $n$ bins
- Model: throw $n$ balls independently and uniformly at random into $n$ bins
- Expected balls per bin: 1 (not very interesting)
- What is the maximum number of balls we expect to see in any bin?
Markov:
- $E[\text{balls in bin 1}]=1$, so Markov gives $\Pr[\text{more than } t \text{ balls}] < 1/t$
- so one bin unlikely to be big
- But there are $n$ bins, so the union bound only says the max is less than $n$
- Tight example: the distribution that puts all $n$ balls in one uniformly random bin has expectation 1 per bin but max load $n$
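- Spelled out (a worked step, with $X_i$ the number of balls in bin $i$): Markov plus the union bound gives $$ \Pr[\max_i X_i > t] \le \sum_{i=1}^n \Pr[X_i > t] < n/t, $$ which says nothing until $t > n$.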
Chebyshev a little better:
- variance of one ball's indicator for one bin (does it land there?) is $(1/n)(1-1/n)< 1/n$
- variance of a bin's count over all $n$ balls is $n\cdot(1/n)(1-1/n) \le 1$
- $\Pr[|X-1|>t]< 1/t^2$
- Taking $t=\sqrt{n}$: one bin holds $O(\sqrt{n})$ balls with probability $1-1/n$
- To union bound over all $n$ bins, take $t=\sqrt{cn}$ instead: each bin fails with probability $1/(cn)$, so some bin fails with probability at most $1/c$
- conclude max is probably (with constant probability) $O(\sqrt{n})$
- but still hard to bound expected max
- true for any pairwise independent distribution (cf 2-universal hashing)
- there exist pairwise independent distributions where this is tight
Chernoff:
- expected number of balls in bin $1$ is $1$
- $\Pr[>t]< 2^{-t}$ (for $t > 2e-1$)
- So $O(\log n)$ for one bin with high probability.
- now apply union bound
- Now can bound expected max:
- With probability $1-1/n$, max is $O(\log n)$.
- With probability $1/n$, max is bigger, but at most $n$
- So, expected max $O(\log n)$
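- The calculation behind the last three bullets, spelled out (with $M$ the max load and $c\log n$ the w.h.p. bound): $$ E[M] \le \Pr[M \le c\log n]\cdot c\log n + \Pr[M > c\log n]\cdot n \le c\log n + (1/n)\cdot n = O(\log n). $$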
Exact analysis
- Start by bounding probability of many balls $$ \begin{align*} \Pr[k \mbox{ balls in bin 1}] &= \binom{n}{k}(1/n)^k(1-1/n)^{n-k}\\ & \le \binom{n}{k}(1/n)^k\\ & \le \left(\frac{ne}{k}\right)^k (1/n)^k\\ & = \left(\frac{e}{k}\right)^k\\ \end{align*} $$
- So prob of at least $k$ balls is at most $\sum_{j \ge k} (e/j)^j=O((e/k)^k)$ (dominated by a geometric series)
- $ \le 1/n^2 \mbox{ if } k > (e\ln n)/\ln\ln n$
- What is the probability any bin exceeds $k$? At most $n \cdot 1/n^2 = 1/n$ by the union bound.
- Now can bound expected max:
- With probability $1-1/n$, max is $O(\ln n/\ln\ln n)$.
- With probability $1/n$, max is bigger, but at most $n$
- So, expected max $O(\ln n/\ln\ln n)$
- Typical approach for bounding an expectation: a small value in the “common case” plus a large (but bounded) value in the “rare case”
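- A simulation sketch (my own illustration, not part of the notes; sizes and trial counts are arbitrary) of the single-choice max load against the $\ln n/\ln\ln n$ prediction:

```python
import math
import random

def max_load_one_choice(n):
    """Throw n balls into n bins uniformly at random; return the largest bin load."""
    loads = [0] * n
    for _ in range(n):
        loads[random.randrange(n)] += 1
    return max(loads)

for n in (10**3, 10**4, 10**5):
    trials = 20  # arbitrary
    avg_max = sum(max_load_one_choice(n) for _ in range(trials)) / trials
    prediction = math.log(n) / math.log(math.log(n))  # ln n / ln ln n
    print(f"n={n:>6}  average max load={avg_max:5.2f}  ln n/ln ln n={prediction:5.2f}")
```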
Purely random has clusters
- Saw $\log n/\log\log n$ balls in one bin
- Similarly saw “runs” of length $\Omega(\log n)$ in coin flipping
- Gives a good party trick---deciding whether a sequence is truly random
- How can we do better?
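- A sketch of the party trick (my illustration; the length and comparison are arbitrary): a truly random bit sequence of length $n$ almost surely contains a run of roughly $\log_2 n$ equal symbols, while a hand-invented “random” sequence usually alternates too much:

```python
import math
import random

def longest_run(seq):
    """Length of the longest block of consecutive equal symbols."""
    best = cur = 1
    for prev, nxt in zip(seq, seq[1:]):
        cur = cur + 1 if nxt == prev else 1
        best = max(best, cur)
    return best

n = 200
flips = [random.randrange(2) for _ in range(n)]
print("longest run:", longest_run(flips), " expected around log2(n) =", round(math.log2(n), 1))
# A hand-faked sequence whose longest run is only 3 or 4 is a giveaway that it is not truly random.
```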
Two Choices
- Azar, Broder, Karlin, Upfal
- Each ball picks two random bins
- Insert it in the less loaded of the two
Intuitive analysis:
- at most $n/4$ bins have height $\ge 4$ (there are only $n$ balls)
- two random bins: probability both have height $\ge 4$ is at most $(1/4)(1/4)=1/16$
- So, only $n/16$ bins reach height $\ge 5$ (in expectation)
- so $n/256$ have 6
- In general, $n/2^{2^{k-3}}$ have $k$
- So $< 1$, i.e. none, have $\log\log n$.
- Problem: Expectations are not certainties
- Problem: the analysis bounds each layer based on an assumption about the previous one; we must cope with this conditioning.
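- A simulation sketch of the two-choice rule (again my own illustration, arbitrary sizes); the max load should track $\log_2\log_2 n$ up to an additive constant:

```python
import math
import random

def max_load_two_choices(n):
    """Each ball picks two uniform bins and goes into the currently less loaded one."""
    loads = [0] * n
    for _ in range(n):
        a, b = random.randrange(n), random.randrange(n)
        loads[a if loads[a] <= loads[b] else b] += 1
    return max(loads)

for n in (10**3, 10**4, 10**5):
    trials = 20  # arbitrary
    avg_max = sum(max_load_two_choices(n) for _ in range(trials)) / trials
    print(f"n={n:>6}  average max load={avg_max:4.2f}  log2 log2 n={math.log2(math.log2(n)):4.2f}")
```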
From expectation to tail bounds.
- Define $V_h=$ number of bins of height at least $h$
- We know $V_4 \le n/4$
- Write $Y_t=1$ if ball $t$ has height $\ge 5$
- We said $\Pr[Y_t] \le 1/16$
- We argued $E[V_5] \le n/16$
- Can we use Chernoff to conclude $V_5 \le n/8$ w.h.p.?
- Problem: dependence
- $Y_t$ not independent (next ball height depends on position of previous balls)
- however, no matter what, $\Pr[Y_t] \le 1/16$
- so intuitively $V_5$ should be smaller than the Chernoff bound says
Formalize:
- Lemma: Let $Y_i$ be a function of $X_1,\ldots,X_i$. If $\Pr[Y_i=1 \mid X_1,\ldots,X_{i-1}]\le p$ for all values of the $X$'s, then $\sum Y_i$ is stochastically dominated by $B(n,p)$,
- i.e., if the $Z_i$ are i.i.d. Bernoulli($p$) (so $\sum Z_i \sim B(n,p)$), then $\Pr[\sum Y_i \ge k] \le \Pr[\sum Z_i \ge k]$ for all $k$.
- so Chernoff holds.
- Even if $Y_i$ dependent.
- proof philosophically subtle, by “coupling” (a simulation sketch appears after this list):
- imagine generating a sequence $Z_i$.
- do it while generating $Y_i$
- first, compute $p_i=\Pr[Y_i=1 \mid X_1,\ldots,X_{i-1}]$, which may be a messy function of the $X$ sequence
- Now generate $Y_i$. Set $Z_i=1$ if $Y_i=1$ or with prob. $\frac{p-p_i}{1-p_i}$ otherwise.
- So each $Z_i$ is 1 with probability $p$ independent of all other $Z_i$.
- so Chernoff applies to $\sum Z_i$
- but (guaranteed) $Z_i \ge Y_i$.
- conclude $Z_i$ stochastically dominates $Y_i$
- so $\sum Z_i$ s.d. $\sum Y_i$
- so same bound applies to $\sum Y_i$
- Application
- in our experiment, each ball has height $\ge 5$ with probability at most $1/16$
- no matter what previous balls did
- so Chernoff bound still gives us $V_5 \le n/8$ w.h.p.
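- A toy sketch of the coupling (my illustration; the particular dependent process for the $Y_i$ is made up): the conditional probability of $Y_i=1$ never exceeds $p$; each $Z_i$ is forced to 1 whenever $Y_i=1$ and topped up to probability exactly $p$ otherwise, so the $Z_i$ are i.i.d. Bernoulli($p$) and $\sum Z_i \ge \sum Y_i$ in every run:

```python
import random

def coupled_run(n, p):
    """Dependent Y_i with conditional probability <= p, coupled with i.i.d. Bernoulli(p) Z_i."""
    y_sum = z_sum = 0
    for _ in range(n):
        p_i = p / (1 + y_sum)  # a made-up, history-dependent probability, always <= p
        y = 1 if random.random() < p_i else 0
        if y == 1:
            z = 1
        else:
            # Top up: p_i + (1 - p_i) * (p - p_i) / (1 - p_i) = p, so Pr[Z_i = 1] = p exactly.
            z = 1 if random.random() < (p - p_i) / (1 - p_i) else 0
        y_sum += y
        z_sum += z
    return y_sum, z_sum

runs = [coupled_run(1000, 0.1) for _ in range(100)]
print("sum(Z) >= sum(Y) in every run:", all(z >= y for y, z in runs))
print("largest sum(Y):", max(y for y, _ in runs), "  largest sum(Z):", max(z for _, z in runs))
```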
Generalize?
- Idea: use same argument for all heights
- Suppose $V_h \le \beta_h n$.
- Then we argued $E[V_{h+1}] \le \beta_h^2 n$
- Coupled Chernoff suggests: $\Pr[V_{h+1} \ge 2\beta_h^2 n]< e^{-\beta_h^2n/3}$ (using $\epsilon=1$ in the Chernoff bound)
- $< 1/n^2$ so long as $\beta_h^2 n \ge 6\ln n.$
- so define $\beta_4=1/4$, $\beta_{h+1} = 2\beta_h^2$
- we've just speculated that $V_h \le \beta_h n \implies V_{h+1} \le \beta_{h+1}n$ w.h.p.
- and $2\beta_{h+1}=(2\beta_h)^2$, so $2\beta_h=(1/2)^{2^{h-4}}$ (since $2\beta_4=1/2$)
- so $\beta_{O(\log\log n)}n$ is $O(\log n)$
- (at that point, Chernoff weakens because expectation too small; need something else)
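- A quick numeric check of the recursion (an illustration; $n$ is arbitrary, and the stopping rule mirrors the “expectation too small” remark above):

```python
import math

n = 10**6
beta, h = 0.25, 4                     # beta_4 = 1/4
while beta * n > math.log(n):         # stop once beta_h * n falls to about log n
    beta, h = 2 * beta * beta, h + 1  # beta_{h+1} = 2 * beta_h^2
# "+ 4" below because the recursion starts at height 4
print(f"stops at height h = {h}; for comparison, log2 log2 n + 4 = {math.log2(math.log2(n)) + 4:.1f}")
```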
Chain of high probability events:
- Let $Q_h$ be the event $V_h \le \beta_h n$.
- Suppose $Q_0$ holds, and $\Pr[Q_{h} \mid Q_{h-1}] \ge 1-\epsilon$
- Then $\Pr[Q_h] \ge (1-\epsilon)^h \ge 1-h\epsilon$
- Proof: $$ \begin{align*} \Pr[Q_h] &\ge \Pr[Q_h \cap Q_{h-1}]\\ &=\Pr[Q_h \mid Q_{h-1}]\cdot\Pr[Q_{h-1}]\\ &\ge (1-\epsilon)\Pr[Q_{h-1}]\\ &\ge (1-\epsilon)^h &\text{by induction} \end{align*} $$
- In particular, if each $Q_h$ happens w.h.p. conditioned on $Q_{h-1}$, then each $Q_h$ happens w.h.p. unconditionally (for polynomial values of $h$)
Remains to prove $Q_{h+1}$ holds w.h.p. conditioned on $Q_h$
- Problem: we only know if $Q_h$ holds after we've placed all the balls!
- So how can we go back and analyze as if each ball is placed randomly?
- What does this conditioning do to the probability that ball $t$ is high?
- Note:
- we like to analyze by considering random events in the present conditioned on the past.
- Matches intuition.
- But there's nothing formally wrong with conditioning on the future instead.
- Just hard to think about.
Avoid conditioning on the future
- Prove $Q_{h+1}$ holds w.h.p. conditioned on $Q_{h}$
- Write $Y_t=1$ if ball $t$ has height at least $h+1$
- So $V_{h+1} \le \sum Y_t$
- So bound $\sum Y_t \mid Q_h$
- But conditioning $Y_t$ on $Q_h$ is confusing because at time $t$ event $Q_h$ is still determined in the future
- So write $Y_t'=1$ if $Y_t=1$ and $V_h(t-1) \le \beta_h n$, where $V_h(t-1)$ counts bins of height $\ge h$ after the first $t-1$ balls (call this event $Q_h(t-1)$)
- Depends only on past, so easy to analyze $$ \begin{align*} \Pr[Y'_t=1] &= \Pr[Y_t=1 \mid Q_h(t-1)]\cdot\Pr[Q_h(t-1)]\\ &\le \beta_h^2 \cdot 1\\ &= \beta_h^2 \end{align*} $$
- So $Y'_t$ fit our stochastic domination argument above
- $Y'_t$ depends on balls $1,\ldots,t$
- But has probability $\le \beta_h^2$ conditioned on any outcome of balls $1,\ldots,t-1$
- So Chernoff bound applies. w.h.p. $\sum Y'_t \le 2\beta_h^2 n = \beta_{h+1} n$
- Now note that if $Q_h$ holds ($V_h \le \beta_h n$) then every $V_h(t-1) \le \beta_h n$.
- So every $Y_t=Y'_t$ and thus $\sum Y_t = \sum Y'_t$
- so $Q_h \cap \{\sum Y'_t > \beta_{h+1}n\}$ holds iff $Q_h \cap \{\sum Y_t > \beta_{h+1}n\}$ does
- so their probabilities are equal, so $$ \begin{align*} \Pr[Q_h \cap \neg Q_{h+1}] &= \Pr[Q_h \cap \{V_{h+1} > \beta_{h+1}n\}]\\ &\le \Pr[Q_h \cap \{\sum Y_t > \beta_{h+1}n\}]\\ &= \Pr[ Q_h \cap \{\sum Y'_t > \beta_{h+1}n\}] &\text{(identical event)}\\ &\le \Pr[\sum Y'_t > \beta_{h+1}n] &\text{(bigger event)}\\ &\le 1/n^2 &\text{by the Chernoff bound above} \end{align*} $$
- Thus, $$ \begin{align*} \Pr[\neg Q_{h+1} \mid Q_{h}] &= \Pr[\neg Q_{h+1} \cap Q_{h}]/\Pr[Q_h]\\ &\le O(1/n^2)/(1-O(1/n^2))\\ &= O(1/n^2) \end{align*} $$
- In other words, $Q_{h+1}$ happens w.h.p., conditioned on $Q_h$
What of last few stages?
- At the last height $i$ above, the number of tall bins (height $\ge i$) is $\beta_i n= O(\log n)$ w.h.p.
- At this point, the prob. a given ball is tall (placed at height $> i$, so both its choices must hit one of the $O(\log n)$ tall bins) is $O((\log n)^2/n^2)$
- So prob. of 2 distinct tall balls is $O(n^2 \cdot (\log n)^4/n^4) = O((\log n)^4/n^2)$, which is negligible (union bound over $n^2$ pairs of balls, each pair tall with probability $O((\log n)^4/n^4)$)
- if only one ball is tall, done.
- because it would need to stand on other tall balls to be really tall
More choices? Only $O(\log_d\log n)$
- slight variation: break ties to the left
- improves to $O((\log\log n)/d)$
- intuition: a slight amount of imbalance helps cope with clusters all attacking the same spots
- ensures some spots will be smaller
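- A final simulation sketch (my reading of the rule: split the bins into $d$ contiguous groups, pick one uniform bin per group, place in a least-loaded choice, breaking ties toward the leftmost group; all parameters are arbitrary):

```python
import random

def max_load(n, d, go_left=False):
    """d-choice balls in bins; optionally the go-left variant with one choice per group."""
    loads = [0] * n
    group = n // d  # when d doesn't divide n, a few trailing bins go unused; fine for a sketch
    for _ in range(n):
        if go_left:
            # one uniform bin from each of d contiguous groups, scanned left to right
            choices = [g * group + random.randrange(group) for g in range(d)]
        else:
            choices = [random.randrange(n) for _ in range(d)]
        best = choices[0]
        for c in choices[1:]:
            if loads[c] < loads[best]:  # strict '<' keeps ties on the earlier (leftmost) choice
                best = c
        loads[best] += 1
    return max(loads)

n = 10**5
for d in (1, 2, 3, 4):
    print(f"d={d}:  plain={max_load(n, d)}  go-left={max_load(n, d, go_left=True)}")
```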