6.854 Lecture Notes

Approximation Algorithms

MAXCUT, MAXSAT, MAX-CLIQUE, MIN-VERTEX-COVER are all NP-complete.
Approximation Algorithms provides a new dimension to view the hardness of these problems.
Definition of approximation algorithms.
For maximization problems:
Suppose $\OPT$ = $\max \Phi(\mathbf{x})$ subject to $\mathbf{x} \in \mathcal{C}$.
$\alpha$-approximation algorithm returns $\mathbf{\tilde{x}}$ such that $\Phi(\mathbf{\tilde{x}}) \ge \alpha \cdot \OPT$, where $\alpha \le 1$.
For minimization problems:
Suppose $\OPT$ = $\min \Phi(\mathbf{x})$ subject to $\mathbf{x} \in \mathcal{C}$.
$\alpha$-approximation algorithm returns $\mathbf{\tilde{x}}$ such that $\Phi(\mathbf{\tilde{x}}) \le \alpha \cdot \OPT$, where $\alpha \ge 1$.

The Probabilistic Method---Value of Random Answers

Idea: to show an object with certain properties exists

generate a random object
prove it has properties with nonzero probability
often, “certain properties” means “good solution to our problem”
random routing as example

Max-Cut

Definition
Exact computation : NP-complete
Approximation algorithms
factor 2
“expected performance,” so doesn't really fit our RP/ZPP framework
but does show such a cut exists

Set balancing

Given a family $\mathcal{S}$ of subsets $S_1, \ldots, S_n \subseteq [n]$
Given a coloring $\varphi: [n] \to \{0,1\}$, define $\mathrm{bias}_{\varphi}(S_i) = |\#\{\varphi^{-1}(1) \cap S_i\} - \#\{\varphi^{-1}(0) \cap S_i\}|$.
Minimize over coloring $\varphi$, the max bias, i.e. $\max_i \mathrm{bias}_{\varphi}(S_i)$
$4\sqrt{n\ln n}$.
- Note that for a uniform random coloring, expected bias of $S_i$ is $0$.
- Using Chernoff bound, $\Pr[\mathrm{bias}_{\varphi}(S_i) > t] < 2e^{-t^2/|S_i|} \le 2e^{-t^2/2n}$.
- require $< 1/n$ to apply union bound $$ \begin{align*} e^{-t^2/2n} &< 1/2n\\ t^2/2n &>\ln 2n\\ t &> \sqrt{2n \ln 2n} \end{align*} $$
- so there exists a coloring with max bias atmost $4\sqrt{n\ln n}$
Spencer---10 lectures on the probabilistic method:
- Six Standard Deviations Suffice
- There exists a coloring with max bias at most $6\sqrt{n}$
- But nonconstructive
- Recent works have given algorithms to find such a coloring; via convex programming and rounding

MAX SAT

Sometimes, it's hard to get hands on a good probability distribution of random answers.

Define.

literals $x_1, \lnot x_1, x_2, \lnot x_2, \ldots, x_n, \lnot x_n$.
clauses $\mathcal{C} = \{C_1, \ldots, C_m\}$. Clause $C_i$ looks like $(x_1 \vee \lnot x_2 \vee x_3 \vee x_4)$.
NP-complete

Random assignment

achieve $1-2^{-k}$ for MAX-$k$-SAT.
very nice for large $k$, but only $1/2$ for $k=1$

LP: \[\max \sum_{j=1}^{m} z_j \] $$ \begin{align*} &\sum_{i \in C_j^+} y_i + \sum_{i \in C_j^-} (1-y_1) ~\ge~ z_j\\ &y_i ~\in~ [0,1] \qquad \text{(relaxation of $\{0,1\}$)} \end{align*} $$

Note that $\OPT_{LP}(\mathcal{C}) \ge \OPT(\mathcal{C})$

Randomized Rounding

Suppose optimal LP solution $p_1, \ldots, p_n$.
$p_i$ close to $1$ suggests that $x_i$ must probably be $1$. Similarly close to $0$ suggests that $x_i$ must probably be $0$.
Set $x_i$ to be $1$ with probability $p_i$ and $0$ with prob. $1 - y_i$.

Analysis

For simplicity, $C_j$ has all positive variables. Probability that clause $C_j$ is satisfied is $$ \begin{align*} \Pr[C_j \text{ is not satisfied}] &~=~ \prod_{i \in C_j} (1 - p_i)\\ &~\le~ \left ( \frac{\sum_{i \in C_j} (1-p_i)}{k}\right )^k\\ &~=~ \left ( 1 - \frac{\zhat_j}{k}\right )^k \end{align*} $$
Expected number of clauses satisfied is at least $\sum_{j=1}^{m} 1 - \left ( 1 - \frac{\zhat_j}{k}\right )^k$.
We want to compare this to the objective of the LP.
Lemma: $1 - \left ( 1 - \frac{\zhat_j}{k}\right )^k \ge \left ( 1 - \left ( 1 - \frac{1}{k}\right )^k \right ) \zhat_j$.
$\beta_k = 1-(1-1/k)^k$. values $1,3/4,.704,\ldots, \to (1-1/e)$.
Lemma: $k$-literal clause sat w/pr at least $\beta_k \zhat_j$.
Result: $(1-1/e)$ approximation (convergence of $(1-1/k)^k$)
much better for small $k$: i.e. $1$-approx for $k=1$

LP good for small clauses, random for large.

Better: try both methods.
$n_1,n_2$ number in both methods
Show $(n_1+n_2)/2 \ge (3/4)\sum \zhat_j$
$n_1 \ge \sum_{C_j \in S^k} (1-2^{-k})\zhat_j$, since $\zhat_j \le 1$.
$n_2 \ge \sum \beta_k \zhat_j$
$n_1+n_2 \ge \sum (1-2^{-k}+\beta_k) \zhat_j \ge \sum \frac32\zhat_j$

Set Cover

Definition

Given a collection of subsets $\mathcal{S} = \{S_1, \ldots, S_n\}$, where $S_i \subseteq [m]$.
Find smallest collection of subsets $I \subseteq [n]$, such that $\bigcup_{i \in I} S_i = [m]$.
Weighted version can also be defined.
Remark: $n$ can be much larger than $m$.

LP: \[\min \sum_{i=1}^{n} y_i \] $$ \begin{align*} \text{for all } v \in [m] \qquad \sum_{i : v \in S_i} y_i \ge 1 & \\ \text{for all } i \in [n] \qquad 0 \le y_i \le 1 & \quad \text{(relaxation of $\{0,1\}$)} \end{align*} $$

Candidate Randomized Rounding:

Let optimal solution be $(p_1, \ldots, p_n)$.
Include set $S_i$ with probability $p_i$.
Suppose $v \in [m]$ occurs in sets $S_1, S_2, \ldots, S_k$.
We have that $p_1 + \cdots + p_k \ge 1$.
Probability that $v$ is covered is at least $1 - (1-p_1) \cdots (1-p_k) \ge 1 - e^{-(p_1 + \cdots + p_k)} \ge 1 - 1/e$.
Risks some element from not being covered!
It would have been sufficient if we succeeded with some decent probability. But in fact, with high probability might miss a constant fraction of $[m]$.
Suppose each $j \in [m]$ appears in $2$ sets, and the optimal LP solution is $p_{i_1} = p_{i_2} = 1/2$. Then, $j$ is not covered probability $1/4$. There might be $m/10$ such elements, in which case, $m/40$ elements will remain uncovered with probability $1 - 2^{-\Omega(m)}$.

Actual Rounding:

While some elements are uncovered:
$\triangleright$ Include set $S_i$ with probability $p_i$.
Lemma: Probability that this runs for $\ln m + k$ iterations is at most $e^{-k}$.
Since for any element $v$, probability that $v$ is not covered in $t$ iterations is atmost $(1/e)^t$. Hence, in $\ln m + k$ iterations, each element is covered with probability $1 - e^{-k}/m$.
Union bound gives that we covered elements with probability $e^{-k}$.

Expected set size:

If we run for $t$ iterations, expected size is $OPT_{LP}(\mathcal{S})$.
Hence, expected set size after $t$ iterations is at most, $t \cdot \OPT_{LP}(\mathcal{S})$.
With probability $1/2$, the size is at most $2 t \cdot \OPT_{LP}(\mathcal{S})$.
For $t = \ln m + 3$, by a union bound, it follows that all elements are covered and at most $2 t \cdot \OPT_{LP}(\mathcal{S})$ sets are included.
We have $(2 \ln m + 6)$ approximation.

Wiring

Problem formulation

$\sqrt{n} \times \sqrt{n}$ gate array
Manhattan wiring
boundaries between gates
fixed width boundary means limit on number of crossing wires
optimization vs. feasibility: minimize max crossing number
focus on single-bend wiring. two choices for route.
Generalizes if you know about multicommodity max-flow

Linear Programs, integer linear programs

Black box
Good to know, since great solvers exist in practice
Solution techniques in other courses
LP is polytime, ILP is NP-hard
LP gives hints---rounding.

IP formulation

$x_{i0}$ means $x_i$ starts horizontal, $x_{i1}$ vertical
$T_{b0} = \{i \mid \text{net $i$ through $b$ if $x_{i0} = 1$}\}$
$T_{b1} = \{i \mid \text{net $i$ through $b$ if $x_{i1} = 1$}\}$

LP: \[ \min w \] $$ \begin{align*} \text{for every } i : \quad & x_{i0}+x_{i1} ~=~ 1\\ \text{for every edge } b : \quad &\sum_{i \in T_{b0}} x_{i0} + \sum_{i \in T_{b1}} x_{i1} \le w \end{align*} $$ Rounding

Solution $\xhat_{i0}$, $\xhat_{i1}$, value $\what$.
rounding is Poisson Binomial vars with mean atmost $\what$.
For $\delta< 1$ (good approx) $\Pr[\ge (1+\delta)\what] \le e^{-\delta^2\what /4}$
need $2n$ boundaries, so aim for prob. bound $1/2n^2$.
solve, $\delta=\sqrt{(4\ln 2n^2)/\what}$.
So absolute error $\sqrt{8\what\ln n}$
- Good ($o(1)$-error) if $\what \gg 8\ln n$
- Bad ($O(\ln n)$ error) if $\what=2$ (invoke other chernoff bound)
- General rule: randomized rounding good if target logarithmic, not if constant

Generalize

Multicommodty flow generalization
Same rounding works