Approximation Algorithms
- MAX-CUT, MAX-SAT, MAX-CLIQUE, and MIN-VERTEX-COVER are all NP-hard (their decision versions are NP-complete).
- Approximation algorithms provide a new dimension along which to view the hardness of these problems.
- Definition of approximation algorithms.
- For maximization problems:
Suppose $\OPT$ = $\max \Phi(\mathbf{x})$ subject to $\mathbf{x} \in \mathcal{C}$.
$\alpha$-approximation algorithm returns $\mathbf{\tilde{x}}$ such that $\Phi(\mathbf{\tilde{x}}) \ge \alpha \cdot \OPT$, where $\alpha \le 1$.
- For minimization problems:
Suppose $\OPT$ = $\min \Phi(\mathbf{x})$ subject to $\mathbf{x} \in \mathcal{C}$.
$\alpha$-approximation algorithm returns $\mathbf{\tilde{x}}$ such that $\Phi(\mathbf{\tilde{x}}) \le \alpha \cdot \OPT$, where $\alpha \ge 1$.
The Probabilistic Method---Value of Random Answers
Idea: to show an object with certain properties exists
- generate a random object
- prove it has properties with nonzero probability
- often, “certain properties” means “good solution to our problem”
- random routing as example
Max-Cut
- Definition
- Exact computation: NP-hard (the decision version is NP-complete)
- Approximation algorithms
- factor 2
- “expected performance,” so doesn't really fit our RP/ZPP framework
- but does show such a cut exists
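The factor-2 argument can be sketched in code (an illustrative sketch, not from the lecture; the function name and example graph are made up):

```python
import random

def random_cut(edges, n):
    """Put each vertex on a uniformly random side of the cut.

    Each edge crosses the cut with probability 1/2, so the expected
    number of cut edges is |E|/2 >= OPT/2: a 1/2-approximation
    in expectation.
    """
    side = [random.randrange(2) for _ in range(n)]
    cut = sum(1 for (u, v) in edges if side[u] != side[v])
    return side, cut

# Example: a 4-cycle.
side, cut = random_cut([(0, 1), (1, 2), (2, 3), (3, 0)], 4)
```

Since the guarantee holds only in expectation, one typically repeats the experiment and keeps the best cut found.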
Set balancing
- Given a family $\mathcal{S}$ of subsets $S_1, \ldots, S_n \subseteq [n]$
- Given a coloring $\varphi: [n] \to \{0,1\}$, define $\mathrm{bias}_{\varphi}(S_i) = \bigl|\, |\varphi^{-1}(1) \cap S_i| - |\varphi^{-1}(0) \cap S_i| \,\bigr|$.
- Minimize, over colorings $\varphi$, the max bias, i.e. $\max_i \mathrm{bias}_{\varphi}(S_i)$
- Claim: a uniformly random coloring achieves max bias at most $4\sqrt{n\ln n}$.
- Note that for a uniformly random coloring, the signed difference $|\varphi^{-1}(1) \cap S_i| - |\varphi^{-1}(0) \cap S_i|$ has expectation $0$.
- Using a Chernoff/Hoeffding bound, $\Pr[\mathrm{bias}_{\varphi}(S_i) > t] < 2e^{-t^2/2|S_i|} \le 2e^{-t^2/2n}$.
- require failure probability $< 1/n$ per set, to apply a union bound over the $n$ sets $$ \begin{align*} 2e^{-t^2/2n} &< 1/n\\ t^2/2n &>\ln 2n\\ t &> \sqrt{2n \ln 2n} \end{align*} $$
- so there exists a coloring with max bias at most $\sqrt{2n \ln 2n} \le 4\sqrt{n\ln n}$ (for $n \ge 2$)
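An empirical check of the random-coloring bound (an illustrative sketch; the instance below is arbitrary):

```python
import math
import random

def max_bias(sets, coloring):
    """Max over sets of |#{color-1 elements} - #{color-0 elements}|."""
    return max(abs(sum(1 if coloring[v] else -1 for v in s)) for s in sets)

n = 400
sets = [set(random.sample(range(n), n // 2)) for _ in range(n)]
coloring = [random.randrange(2) for _ in range(n)]  # uniform random coloring

# With probability at least 1 - 1/n, every set has bias below this bound.
bound = math.sqrt(2 * n * math.log(2 * n))
```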
- Spencer, "Ten Lectures on the Probabilistic Method":
- Six Standard Deviations Suffice
- There exists a coloring with max bias at most $6\sqrt{n}$
- But nonconstructive
- Recent works have given algorithms to find such a coloring; via convex programming and rounding
MAX SAT
Sometimes it's hard to get one's hands on a good probability distribution over random answers.
Definitions:
- literals $x_1, \lnot x_1, x_2, \lnot x_2, \ldots, x_n, \lnot x_n$.
- clauses $\mathcal{C} = \{C_1, \ldots, C_m\}$. Clause $C_i$ looks like $(x_1 \vee \lnot x_2 \vee x_3 \vee x_4)$.
- NP-complete
Random assignment
- achieves expected fraction $1-2^{-k}$ of clauses for MAX-$k$-SAT (every clause has exactly $k$ literals).
- very nice for large $k$, but only $1/2$ for $k=1$
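A sketch of random assignment in code (illustrative; I've assumed a DIMACS-like encoding where clauses are lists of nonzero ints):

```python
import random

def random_assignment(clauses, n):
    """Clauses are lists of nonzero ints: +i means x_i, -i means NOT x_i.

    A uniform random assignment falsifies a k-literal clause only when
    all k literals are false, probability 2^{-k}; so each clause is
    satisfied with probability 1 - 2^{-k}.
    """
    x = [None] + [random.randrange(2) for _ in range(n)]  # 1-indexed
    sat = sum(1 for c in clauses
              if any((lit > 0) == bool(x[abs(lit)]) for lit in c))
    return x, sat

# Example: 3 clauses over 4 variables.
x, sat = random_assignment([[1, -2, 3], [-1, 2], [2, 3, -4]], 4)
```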
LP: \[\max \sum_{j=1}^{m} z_j \] $$ \begin{align*} &\sum_{i \in C_j^+} y_i + \sum_{i \in C_j^-} (1-y_i) ~\ge~ z_j\\ &y_i, z_j ~\in~ [0,1] \qquad \text{(relaxation of $\{0,1\}$)} \end{align*} $$
- Note that $\OPT_{LP}(\mathcal{C}) \ge \OPT(\mathcal{C})$
- Suppose the optimal LP solution is $y_i = p_i$, $z_j = \zhat_j$.
- $p_i$ close to $1$ suggests that $x_i$ should probably be $1$; close to $0$ suggests that $x_i$ should probably be $0$.
- Set $x_i$ to $1$ with probability $p_i$ and to $0$ with probability $1 - p_i$.
- For simplicity, suppose $C_j$ has $k$ literals, all positive. Then $$ \begin{align*} \Pr[C_j \text{ is not satisfied}] &~=~ \prod_{i \in C_j} (1 - p_i)\\ &~\le~ \left ( \frac{\sum_{i \in C_j} (1-p_i)}{k}\right )^k \qquad \text{(AM--GM)}\\ &~=~ \left ( 1 - \frac{\sum_{i \in C_j} p_i}{k}\right )^k ~\le~ \left ( 1 - \frac{\zhat_j}{k}\right )^k, \end{align*} $$ using $\sum_{i \in C_j} p_i \ge \zhat_j$.
- Expected number of clauses satisfied is at least $\sum_{j=1}^{m} \left( 1 - \left ( 1 - \frac{\zhat_j}{k}\right )^k \right)$.
- We want to compare this to the objective of the LP.
- Lemma: $1 - \left ( 1 - \frac{\zhat_j}{k}\right )^k \ge \left ( 1 - \left ( 1 - \frac{1}{k}\right )^k \right ) \zhat_j$. (The left side is concave in $\zhat_j$, and the two sides agree at $\zhat_j = 0$ and $\zhat_j = 1$.)
- Define $\beta_k = 1-(1-1/k)^k$; values $1, 3/4, 0.704, \ldots \to (1-1/e)$.
- Lemma: $k$-literal clause sat w/pr at least $\beta_k \zhat_j$.
- Result: $(1-1/e)$ approximation (convergence of $(1-1/k)^k$)
- much better for small $k$: exact ($\beta_1 = 1$) for $k=1$
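The rounding step can be sketched as follows (illustrative; `p` stands in for an LP optimum, which in practice would come from an LP solver):

```python
import random

def lp_round(p, clauses):
    """Round an LP solution independently: x_i = 1 with probability p_i.

    By the lemma, each k-literal clause C_j is then satisfied with
    probability at least beta_k * zhat_j, beta_k = 1 - (1 - 1/k)^k.
    """
    x = [None] + [1 if random.random() < pi else 0 for pi in p]  # 1-indexed
    sat = sum(1 for c in clauses
              if any((lit > 0) == bool(x[abs(lit)]) for lit in c))
    return x, sat

# Hypothetical LP optimum for two variables:
x, sat = lp_round([0.9, 0.1], [[1, 2], [-2]])
```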
LP rounding is good for small clauses, random assignment for large ones.
- Better: try both methods and keep the better assignment.
- $n_1, n_2$: expected number of satisfied clauses under the two methods.
- Show $(n_1+n_2)/2 \ge (3/4)\sum_j \zhat_j$. Let $S^k$ denote the clauses with exactly $k$ literals.
- $n_1 = \sum_k \sum_{C_j \in S^k} (1-2^{-k}) \ge \sum_k \sum_{C_j \in S^k} (1-2^{-k})\zhat_j$, since $\zhat_j \le 1$.
- $n_2 \ge \sum_k \sum_{C_j \in S^k} \beta_k \zhat_j$
- $n_1+n_2 \ge \sum_k \sum_{C_j \in S^k} (1-2^{-k}+\beta_k) \zhat_j \ge \sum_j \frac32\zhat_j$, since $1-2^{-k}+\beta_k \ge 3/2$ for every $k$.
- Hence $\max(n_1, n_2) \ge (3/4)\sum_j \zhat_j \ge (3/4)\,\OPT$: a $3/4$-approximation.
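Taking the better of the two assignments might look like this (a sketch; it reuses the list-of-ints clause encoding, and `p` is a hypothetical LP optimum):

```python
import random

def count_sat(x, clauses):
    """Number of clauses satisfied by the 1-indexed assignment x."""
    return sum(1 for c in clauses
               if any((lit > 0) == bool(x[abs(lit)]) for lit in c))

def best_of_both(clauses, n, p):
    """Run the uniform-random and LP-rounding strategies; keep the winner.

    Since max(n1, n2) >= (n1 + n2)/2 >= (3/4) * sum_j zhat_j, the better
    of the two assignments is a 3/4-approximation in expectation.
    """
    x_rand = [None] + [random.randrange(2) for _ in range(n)]
    x_lp = [None] + [1 if random.random() < pi else 0 for pi in p]
    return max((x_rand, x_lp), key=lambda x: count_sat(x, clauses))
```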
Set Cover
Definition
- Given a collection of subsets $\mathcal{S} = \{S_1, \ldots, S_n\}$, where $S_i \subseteq [m]$.
- Find smallest collection of subsets $I \subseteq [n]$, such that $\bigcup_{i \in I} S_i = [m]$.
- Weighted version can also be defined.
- Remark: $n$ can be much larger than $m$.
LP: \[\min \sum_{i=1}^{n} y_i \] $$ \begin{align*} \text{for all } v \in [m] \qquad \sum_{i : v \in S_i} y_i \ge 1 & \\ \text{for all } i \in [n] \qquad 0 \le y_i \le 1 & \quad \text{(relaxation of $\{0,1\}$)} \end{align*} $$
Candidate Randomized Rounding:
- Let optimal solution be $(p_1, \ldots, p_n)$.
- Include set $S_i$ with probability $p_i$.
- Suppose $v \in [m]$ occurs in sets $S_1, S_2, \ldots, S_k$.
- We have that $p_1 + \cdots + p_k \ge 1$.
- Probability that $v$ is covered is at least $1 - (1-p_1) \cdots (1-p_k) \ge 1 - e^{-(p_1 + \cdots + p_k)} \ge 1 - 1/e$.
- Risk: some element may be left uncovered!
- It would suffice to succeed with some decent probability. But in fact, with high probability we might miss a constant fraction of $[m]$.
- Suppose each $j \in [m]$ appears in exactly $2$ sets, and the optimal LP solution sets $p_{i_1} = p_{i_2} = 1/2$ for them. Then $j$ is not covered with probability $1/4$. If there are $m/10$ such elements (with disjoint set pairs), at least $m/40$ elements remain uncovered with probability $1 - 2^{-\Omega(m)}$.
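A quick simulation of this failure mode (a sketch under the assumption above: each bad element is covered by two sets of LP value $1/2$, rounded independently):

```python
import random

def one_shot_uncovered(m):
    """One round of independent rounding on m elements, each covered by
    two sets with p = 1/2: an element is missed when both of its sets
    are dropped, probability 1/4, so about m/4 elements stay uncovered.
    """
    return sum(1 for _ in range(m)
               if random.random() >= 0.5 and random.random() >= 0.5)
```

By a Chernoff bound the count concentrates sharply around $m/4$, matching the $1 - 2^{-\Omega(m)}$ failure probability.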
Actual Rounding:
- While some elements are uncovered:
$\triangleright$ Include set $S_i$ with probability $p_i$.
- Lemma: Probability that this runs for $\ln m + k$ iterations is at most $e^{-k}$.
- For any element $v$, the probability that $v$ is not covered in $t$ iterations is at most $(1/e)^t$. Hence, after $\ln m + k$ iterations, each element is covered with probability at least $1 - e^{-k}/m$.
- A union bound gives that all elements are covered with probability at least $1 - e^{-k}$.
Expected set size:
- Each iteration includes sets of expected total number $\sum_i p_i = \OPT_{LP}(\mathcal{S})$.
- Hence, the expected number of sets chosen in $t$ iterations is at most $t \cdot \OPT_{LP}(\mathcal{S})$.
- By Markov's inequality, with probability at least $1/2$, the number of sets is at most $2 t \cdot \OPT_{LP}(\mathcal{S})$.
- For $t = \ln m + 3$, by a union bound, with constant probability all elements are covered and at most $2 t \cdot \OPT_{LP}(\mathcal{S})$ sets are included.
- We get a $(2 \ln m + 6)$-approximation, i.e., $O(\ln m)$.
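The repeated-rounding loop can be sketched as follows (illustrative; `p` again stands in for an optimal LP solution from a solver):

```python
import random

def round_set_cover(sets, m, p, max_iters=10000):
    """Repeat randomized rounding until all of [m] is covered.

    Each iteration includes set i with probability p_i, adding an
    expected OPT_LP sets; after about ln m + O(1) iterations every
    element is covered with high probability.
    """
    chosen, covered = set(), set()
    for _ in range(max_iters):
        if len(covered) == m:
            break
        for i, s in enumerate(sets):
            if random.random() < p[i]:
                chosen.add(i)
                covered |= s
    return chosen
```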
Wiring
Problem formulation
- $\sqrt{n} \times \sqrt{n}$ gate array
- Manhattan wiring
- boundaries between gates
- fixed width boundary means limit on number of crossing wires
- optimization vs. feasibility: minimize max crossing number
- focus on single-bend wiring. two choices for route.
- Generalizes if you know about multicommodity max-flow
Linear Programs, integer linear programs
- Black box
- Good to know, since great solvers exist in practice
- Solution techniques in other courses
- LP is polytime, ILP is NP-hard
- LP gives hints---rounding.
- $x_{i0}$ means $x_i$ starts horizontal, $x_{i1}$ vertical
- $T_{b0} = \{i \mid \text{net $i$ through $b$ if $x_{i0} = 1$}\}$
- $T_{b1} = \{i \mid \text{net $i$ through $b$ if $x_{i1} = 1$}\}$
- Solution $\xhat_{i0}$, $\xhat_{i1}$, value $\what$.
- rounding yields sums of independent $0/1$ variables (Poisson trials) with mean at most $\what$.
- For $\delta< 1$ (good approx) $\Pr[\ge (1+\delta)\what] \le e^{-\delta^2\what /4}$
- there are about $2n$ boundaries, so aim for per-boundary failure probability $1/(2n^2)$.
- solving, $\delta=\sqrt{(4\ln 2n^2)/\what}$.
- So absolute error $\sqrt{8\what\ln n}$
- Good ($o(1)$-error) if $\what \gg 8\ln n$
- Bad ($O(\ln n)$ error) if $\what=2$ (invoke other chernoff bound)
- General rule: randomized rounding good if target logarithmic, not if constant
- Multicommodity flow generalization
- Same rounding works
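The $\delta$ calculation above can be checked numerically (a sketch; `w` plays the role of the LP value $\what$, and the function name is made up):

```python
import math

def rounding_overflow(w, n):
    """Chernoff-based additive overflow for rounding the wiring LP.

    With per-boundary failure probability 1/(2 n^2) and about 2n
    boundaries, take delta = sqrt(4 ln(2 n^2) / w); the absolute
    overflow is then delta * w ~ sqrt(8 w ln n).
    """
    delta = math.sqrt(4 * math.log(2 * n * n) / w)
    return delta * w

# Logarithmic target: for w >> ln n the relative error is o(1).
err = rounding_overflow(1000.0, 100)
```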