Approximation Algorithms
- or, what to do when there is no polynomial time solution, or the one we have is impractically slow?
- consider special cases
- assume input is random and analyze “expected performance.” Eg, Hamiltonian path in random graphs. Problem: agreeing on a good distribution.
- run a nonpolynomial (hopefully only slightly superpolynomial) algorithm such as branch and bound. Usually no proven bound on runtime, but sometimes there is.
- settle for a heuristic, but prove it does well enough (our focus)
Definitions:
- optimization problem, instances $I$, solutions $S(I)$ with values $f:S(I) \rightarrow \mathbb{R}$
- maximization/minimization: find solution in $S(I)$ maximizing/minimizing $f$
- called OPT$(I)$
- eg bin-packing. instance is a set of item sizes $s_i \in [0,1]$; partition into the fewest subsets (bins) so that no subset's total exceeds 1
Technical assumptions we'll often make:
- assumption: all inputs and range of $f$ are integers/rationals (can't represent reals, and allows, eg, LP, binary search).
- assumption: $f(\sigma)$ for any solution $\sigma$ is a polynomial size (num bits) number (else writing the output takes too long)
- look for polytime in bit complexity
NP-hardness
- optimization NP-hard if can reduce an NP-hard decision problem to it
- (eg, problem of “is opt solution to this instance $\le k$?”)
- but use the more general notion of Turing reducibility (Garey & Johnson).
Approximation algorithm:
- any algorithm that gives a feasible answer
- eg, each item in own bin.
- of course, want good algorithm. How measure?
Absolute Approximations
Definition: $k$-abs approx if on any $I$, have $|A(I)-OPT(I)| \le k$
Example: planar graph coloring.
- NP-complete to decide if 3-colorable
- always 4-colorable (four color theorem)
- easy to 5-color
- Ditto for edge coloring: Vizing's theorem says opt is $\Delta$ or (constructive) $\Delta+1$
Often, can show impossible by “scaling” the problem.
EG knapsack.
- define profits $p_i$, sizes $s_i$, sack $B$
- wlog, integers.
- suppose we have a $k$-absolute approximation algorithm
- multiply all $p_i$ by $k+1$, solve, scale down: error at most $k$ in the scaled instance means error less than $1$, hence $0$, in the original---so we'd solve NP-hard knapsack exactly
EG independent set (clique)
- take $k+1$ disjoint copies of $G$: optimum becomes $(k+1)\cdot OPT(G)$, so a $k$-absolute answer must contain an optimum independent set in at least one copy
Relative Approximation
Definitions:
- An $\alpha$-approximate solution has value at most $\alpha$ times optimum for minimization, at least $1/\alpha$ times optimum for maximization.
- an algorithm has approximation ratio $\alpha$ if on any input, it outputs an $\alpha$-approximate feasible solution.
- called an $\alpha$-approximation algorithm
How do we prove algorithms have relative approximations?
- Can't compute opt (that's the hard problem), so can't compare to it directly
- instead, compare to computable bounds on opt (lower bounds for minimization, upper bounds for maximization).
Greedy Algorithms
Do obvious thing at each step.
- Hard part is proving it works.
- Usually, by attention to right upper/lower bound.
Max cut
- Upper bound trivial: total edge weight $W \ge$ OPT
- Max-cut greedy: place each vertex on the side that cuts more of its already-placed edges; cuts at least $W/2$, a 2-approximation (sketch below)
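A minimal sketch of that greedy (the function name and the `(u, v, weight)` edge format are my choices, not from the notes): each vertex joins the side opposite the larger share of its already-placed neighbors.

```python
def greedy_max_cut(n, edges):
    """Greedy max cut: put each vertex on the side that cuts more of its
    edge weight to already-placed vertices; cut weight >= W/2."""
    side = {}
    for v in range(n):
        w = [0.0, 0.0]  # back-edge weight to placed neighbors on sides 0/1
        for a, b, wt in edges:
            u = b if a == v else a if b == v else None
            if u is not None and u in side:
                w[side[u]] += wt
        side[v] = 0 if w[1] >= w[0] else 1  # join side opposite the heavier one
    cut = sum(wt for a, b, wt in edges if side[a] != side[b])
    return side, cut
```

Each edge is counted once as a back-edge of its later endpoint, and at least half of every vertex's back-edge weight gets cut, so the total is at least half the trivial upper bound $W$.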
Set cover
- $n$ items
- $\OPT=k$
- At each step, can still cover remainder with $k$ sets
- So the greedy set covers at least a $1/k$ fraction of the remainder
- after $k\ln n$ steps, at most $n(1-1/k)^{k\ln n} < 1$ items remain: an $O(\log n)$ approximation (sketch below)
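A sketch of the greedy (interface is mine; `sets` is assumed to be a list of Python sets): repeatedly take the set covering the most uncovered items.

```python
def greedy_set_cover(universe, sets):
    """Greedy set cover: repeatedly take the set covering the most
    uncovered items; uses O(OPT * log n) sets."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("infeasible: some item is in no set")
        cover.append(best)
        uncovered -= sets[best]
    return cover
```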
Vertex cover:
- define problem
- greedy via set cover (repeatedly pick the vertex covering the most uncovered edges): only $O(\log n)$ from the set cover bound
- suppose instead we pick any uncovered edge and cover both sides: 2-approx
- sometimes, it pays to be extra greedy
- Explicit attention to where lower bound is coming from---lower bound informs algorithm (sketch below).
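A sketch of the “both sides” rule (names are mine): scanning edges and grabbing both endpoints of any still-uncovered edge builds a maximal matching $M$; any cover needs one endpoint per matched edge, so OPT $\ge |M|$ while we use $2|M|$ vertices.

```python
def matching_vertex_cover(edges):
    """2-approx vertex cover: both endpoints of every edge that is still
    uncovered when scanned (i.e., of a maximal matching)."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still uncovered
            cover.add(u)
            cover.add(v)
    return cover
```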
Graham's rule for $P||C_{\max}$ is a $2-\frac1m$ approximation algorithm
- explain problem: $m$ identical machines, $n$ jobs with processing times $p_j$, minimize makespan (max completion time)
- can also think of minimizing max load of continuously running jobs
- greedy (list scheduling): assign each job in turn to the currently least-loaded machine
- proof by comparison to lower bounds
- first lower bound: average load: $\OPT \ge \frac1m\sum p_j$
- second lower bound: $\OPT \ge \max p_j$
- consider max-load machine
- load before adding its last job was at most the average load, so at most OPT
- then add one job, of length at most OPT
- so final load is at most 2OPT
- Tighter: suppose added last-finishing/tight $p_j$ to machine with load $L$
- Then every machine has load at least $L$, so $\frac1m\sum p_i \ge L+p_j/m$, i.e. $L+p_j/m \le \OPT$
- We achieve $$ \begin{align*} L+p_j &= (L+p_j/m) + p_j(1-1/m)\\ &\le \OPT+(1-1/m)\OPT\\ &=(2-1/m)\OPT \end{align*} $$
- this algorithm is online, competitive ratio $2-\frac1m$
- we have no idea of optimum schedule; just used lower bounds.
- we used a greedy strategy
- tight bound: consider $m(m-1)$ size-1 jobs, one size-$m$ job
- where was problem? Last job might be big
- LPT achieves 4/3, but not online
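A sketch of Graham's rule with a heap (function name mine); note it is online: each job is placed as it arrives, without seeing the future.

```python
import heapq

def list_schedule(m, jobs):
    """Graham's rule for P||Cmax: put each job on the currently
    least-loaded machine; makespan <= (2 - 1/m) * OPT."""
    loads = [(0.0, i) for i in range(m)]  # (load, machine) min-heap
    heapq.heapify(loads)
    assignment = []
    for p in jobs:
        load, i = heapq.heappop(loads)
        assignment.append(i)
        heapq.heappush(loads, (load + p, i))
    return assignment, max(load for load, _ in loads)
```

Sorting `jobs` in decreasing order first gives LPT's $4/3$ bound, at the cost of no longer being online.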
Scheduling Theory
General notation: machines $\mid$ constraints/features $\mid$ objective
- $m$ machines $i$ (and $n$ jobs $j$)
- $1$: $1$ machine
- $P$: $m$ identical machines
- $Q$: $m$ machines of different speeds, but all related
- $R$: $m$ unrelated machines, may have completely different runtimes on same jobs
- $F$: Flow shop (job is specific sequence of tasks on many machines)
- $O$: Open shop (job is specific set of tasks on many machines)
- Objectives
- $\sum C_j$ average (total) completion time
- $\max C_j$ time to complete whole set
- $\sum U_j$ number of “late” jobs (post deadline)
- Constraints
- $p_j=1$ unit time tasks
- $r_j$ release dates before which jobs cannot run
- $d_j$ deadlines to complete
- pmtn jobs can be preempted/interrupted
- prec precedence constraints on jobs
Approximation Schemes
So far, we've seen various constant-factor approximations.
- What is best constant achievable?
- defer APX-hardness discussion until later
An approximation scheme is a family of algorithms $A_\epsilon$ such that
- each algorithm polytime
- $A_\epsilon$ achieves a $1+\epsilon$ approximation
But note: runtime might be awful as function of $\epsilon$
FPAS, Pseudopolynomial algorithms
Knapsack example.
Dynamic program for bounded integer profits
- $B(j,p)=$ smallest total size of a subset of items $1,\ldots,j$ with total profit $\ge p$.
- because if can achieve it, can achieve it at minimum size
- and now we have sub-problem optimality for dynamic program
- $B(j+1,p) = \min \left( B(j,p),\quad s_{j+1}+B(j,p-p_{j+1}) \right)$
- For each $p$, solve all $n$ values of $j$
- Gives runtime $O(nP)$ where $P$ is the optimum profit (sketch below)
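A sketch of this DP (interface mine; assumes positive integer profits); iterating profits downward gives the 0/1 (each-item-once) update, and for simplicity the table runs up to the sum of all profits rather than exactly OPT.

```python
def knapsack_profit_dp(profits, sizes, B):
    """Exact knapsack: min_size[p] = smallest total size achieving profit
    >= p. Runtime O(n * P) for P = sum of profits (an upper bound on OPT)."""
    P = sum(profits)
    INF = float('inf')
    min_size = [0] + [INF] * P  # profit 0 needs no items
    for p_j, s_j in zip(profits, sizes):
        for p in range(P, 0, -1):  # descending: 0/1 update
            cand = min_size[max(p - p_j, 0)] + s_j
            if cand < min_size[p]:
                min_size[p] = cand
    return max(p for p in range(P + 1) if min_size[p] <= B)
```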
did this prove $P=NP$?
- No
- recall pseudopolynomial algorithms: runtime $O(mU)$, polynomial in the magnitude $U$ of the input numbers
- contrast to weakly polynomial $O(m\log U)$, polynomial in their bit length
rounding
- Let opt be $P$.
- Scale profits to $\lfloor (n/\epsilon P) p_i \rfloor$
- New opt is at least $n/\epsilon-n = (1-\epsilon)n/\epsilon$ (the floor loses at most 1 per item)
- so seek solution of this value
- Table size/runtime $n^2/\epsilon$ ($n$ items, range $n/\epsilon$)
- Gives solution within a $(1-\epsilon)$ factor of original opt (sketch below)
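A sketch of the whole rounding pipeline (names mine). Anticipating the next question, it uses the simple lower bound $P=\max p_i$ over items that fit, so the scaled profits sum to $O(n^2/\epsilon)$ and the runtime is $O(n^3/\epsilon)$:

```python
def knapsack_fptas(profits, sizes, B, eps):
    """(1 - eps)-approximate knapsack: scale profits down so the scaled
    optimum is O(n^2/eps), solve exactly by the profit DP, return items."""
    n = len(profits)
    P = max(p for p, s in zip(profits, sizes) if s <= B)  # lower bound on OPT
    scaled = [int(p * n / (eps * P)) for p in profits]    # floor
    top = sum(scaled)
    INF = float('inf')
    # table[p] = (total size, chosen items) best for scaled profit >= p
    table = [(0, ())] + [(INF, ())] * top
    for j, (q, s) in enumerate(zip(scaled, sizes)):
        for p in range(top, 0, -1):
            base_size, base_set = table[max(p - q, 0)]
            if base_size + s < table[p][0]:
                table[p] = (base_size + s, base_set + (j,))
    best_p = max(p for p in range(top + 1) if table[p][0] <= B)
    chosen = list(table[best_p][1])
    return chosen, sum(profits[j] for j in chosen)
```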
Wait, how do we know $P$?
- simple option: use a lower bound on $P$, namely $\max p_i$
- then opt scales to at most $\frac{n}{\epsilon \max p_i}\sum p_i \le n^2/\epsilon$
- just slows algorithm to $n^3/\epsilon$
- Faster: suppose we guess $P>\OPT/(1-\epsilon)$
- OPT scales to at most $\frac{n}{\epsilon P}\OPT < \frac{n}{\epsilon P}(1-\epsilon)P = n/\epsilon-n$
- scaled solution of desired value $n/\epsilon-n$ doesn't exist
- so won't find
- Suppose we guess $P< \OPT$?
- OPT will scale to at least $n/\epsilon-n$
- so will find solution of this value
- possibly much less than $\OPT$, but that's ok
- Of the two cases, the outcome tells us whether we guessed $P$ too high or too low
- so, can use binary search to find $P$
- note once we are close---$P< OPT< P/(1-\epsilon)$---unclear which outcome happens
- but that's OK, because we are close
- quickly get within $(1-\epsilon)$ of $P$
- adds another $(1-\epsilon)$ factor to approximation ratio
- but that gives $(1-\epsilon)^2 \approx 1-2\epsilon$
- so just changes the constant.
- wait, need initial low and high values?
- lower bound? $\max p_i$
- upper bound? $\sum p_i$
- ratio of $n$
- so, look at powers of $(1+\epsilon)$ in that range
- $\log_{1+\epsilon}n \approx (\log n)/\epsilon$ of them
- gives a solution within $(1+\epsilon)$
pseudopoly gives FPAS; converse almost true
- Knapsack is only weakly NP-hard
- strong NP-hardness (NP-hard even when all numbers are polynomially bounded in the input size) means no pseudo-poly algorithm unless P=NP
From FPAS to pseudo poly:
- Suppose input instance has integers bounded by $t$
- Solution value is $O(nt)$
- Find $\epsilon$-approx with $\epsilon=1/(nt+1)$
- Solution will be integer, so equal optimum.
Enumeration
More powerful idea: $k$-enumeration
- Return to $P || C_{\max}$
- Schedule $k$ largest jobs optimally
- scheduling remainder greedily
- analysis: claim $A(I) \le OPT(I)+p_{k+1}$
- Consider job with max $c_j$
- If one of $k$ largest, done and at opt
- Else, was assigned to min load machine, so $c_j \le OPT+p_j \le OPT+p_{k+1}$
- so done if $p_{k+1}$ small
- but note $OPT(I) \ge (k/m)p_{k+1}$
- deduce $A(I) \le (1+m/k)OPT(I)$.
- So, for fixed $m$, can get any desired approximation ratio
- i.e., a PAS (take $k=m/\epsilon$; see the sketch below)
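A sketch of $k$-enumeration (names mine): brute-forcing all $m^k$ placements of the $k$ largest jobs includes their optimal schedule, so the result is at least as good as the analysis assumes.

```python
from itertools import product

def k_enumeration(m, jobs, k):
    """k-enumeration for P||Cmax: try every placement of the k largest
    jobs, finish greedily; makespan <= (1 + m/k) * OPT."""
    order = sorted(range(len(jobs)), key=lambda j: -jobs[j])
    big, rest = order[:k], order[k:]
    best_makespan, best_assign = float('inf'), None
    for placement in product(range(m), repeat=len(big)):
        loads = [0.0] * m
        assign = {}
        for j, i in zip(big, placement):
            loads[i] += jobs[j]
            assign[j] = i
        for j in rest:  # Graham greedy on the remainder
            i = min(range(m), key=loads.__getitem__)
            loads[i] += jobs[j]
            assign[j] = i
        if max(loads) < best_makespan:
            best_makespan, best_assign = max(loads), assign
    return best_assign, best_makespan
```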
Scheduling any number of machines
- As with Knapsack, focus on decision problem
- Can I complete all in time $T$?
- If can answer this, binary search will optimize $T$
- Again, need $T_{\min}$ and $T_{\max}$ but these are easy
Combine enumeration and rounding
- Suppose only $k$ job sizes
- Vector of “number of each type” on a given machine---gives “machine type”
- Only $n^k$ distinct vectors/machine types
- Similarly, only $n^k$ problem instances
- find out which instances are feasible with $m$ machines
- By deciding how many of each machine type.
- Use dynamic program:
- enumerate all input instances that can be completed by $j$ machines in time $T$
- Feasible instance is sum of feasible $j-1$ machine instance and 1-machine instance (machine type)
- Works because only poly many instances and types (sketch after this list).
- Use rounding to make few important job types
- Guess OPT $T$ to within $\epsilon$ (by binary search)
- All jobs of size exceeding $\epsilon T$ are “large”
- Round each up to next power of $(1+\epsilon)$ in range $\epsilon T \ldots T$
- Only $\log_{1+\epsilon}1/\epsilon = O(1/\epsilon \ln 1/\epsilon)$ large types
- Solve optimally
- Greedy schedule remainder
- If last job to finish is large, are optimal for rounded problem so within $\epsilon$ of opt
- If last job small, greedy analysis shows we are within $\epsilon$ of opt.
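A sketch of the feasibility DP for the large jobs (interface mine): a state is the vector of remaining counts per job type, and a transition removes one machine's worth (a machine type). In the rounded instance every large size exceeds $\epsilon T$, so a machine holds at most $1/\epsilon$ large jobs and the state space is polynomial for fixed $k$.

```python
from functools import lru_cache
from itertools import product

def min_machines(counts, sizes, T):
    """Minimum machines to finish all jobs by time T.
    counts[t]: number of jobs of type t; sizes[t]: processing time."""
    # all single-machine configurations ("machine types") fitting in T
    configs = [c for c in product(*(range(int(T // s) + 1) for s in sizes))
               if any(c) and sum(ct * s for ct, s in zip(c, sizes)) <= T]

    @lru_cache(maxsize=None)
    def solve(state):
        if not any(state):
            return 0
        best = float('inf')
        for c in configs:
            if all(ct <= st for ct, st in zip(c, state)):
                rest = tuple(st - ct for st, ct in zip(state, c))
                best = min(best, 1 + solve(rest))
        return best

    return solve(tuple(counts))
```

Deciding feasibility with $m$ machines is then `min_machines(...) <= m`, and binary search on $T$ optimizes.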
Hardness
- Is there always a PAS?
- MAX-SNP hard: some unbeatable constant
- Numerous problems in class (vertex cover, independent set, etc)
- For some (e.g., independent set), amplification proves no constant is achievable.
Relaxations
So far we've seen two techniques:
- Greedy: do the obvious
- Dynamic Programming: try everything
TSP
- Requiring a tour under general weights: no approximation ratio possible (deciding Hamiltonian cycle is NP-hard)
- Undirected Metric: MST relaxation 2-approx, Christofides 3/2
- recent progress has gotten below 3/2
- Directed: Cycle cover relaxation (HW) gives $O(\log n)$
- recent progress: $O(\log n / \log\log n)$
- Also, Euclidean TSP and planar graph TSP have a PAS
intuition: SPT for $1||\sum C_j$
- suppose a longer job is processed immediately before a shorter one
- then swapping them improves $\sum C_j$
- note haven't shown local opt implies global opt
- rather, have relied on global opt being local opt
$1 | r_j | \sum C_j$
- NP-hard
- relaxation: allow preemption
- optimum: SRPT
- assume no $r_j$: claim SPT optimal
- proof: local optimality argument
- consider swapping two pieces of two jobs
- suppose currently processing $k$ instead of SRPT $j$
- remaining times $p_j$ and $p_k$
- total $p_j+p_k$ time
- use first $p_j$ to process $j$, then do $k$
- some job must finish at $p_j+p_k$
- and no job can finish before $p_j$
- so this is locally optimal (for these 2 jobs)
- rounding: schedule jobs nonpreemptively in preemptive completion order (sketch after this list)
- take preemptive schedule and insert $p_j$ time at $C_j$
- now room to schedule nonpreemptively
- how much does this slow down $j$?
- $p_k$ space inserted before $j$ for every job $k$ completing before $j$ in preemptive schedule
- in other words, only inserted $C_j$ time
- so $j$ completes in $2C_j$ time
- so 2-approx
- More recently: rounding, enumeration gives PAS
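A sketch of the relax-and-round pipeline above (names and the `(release, proc)` job format are mine): simulate SRPT, preempting only at release times (which suffices), then run jobs nonpreemptively in preemptive completion order.

```python
import heapq

def srpt_order(jobs):
    """jobs: list of (release, proc). Returns indices in order of
    preemptive (SRPT) completion."""
    by_release = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    heap, order, t, i, n = [], [], 0, 0, len(jobs)
    while len(order) < n:
        if not heap:  # idle: jump to next release
            t = max(t, jobs[by_release[i]][0])
        while i < n and jobs[by_release[i]][0] <= t:
            j = by_release[i]
            heapq.heappush(heap, [jobs[j][1], j])  # [remaining time, job]
            i += 1
        rem, j = heapq.heappop(heap)
        nxt = jobs[by_release[i]][0] if i < n else float('inf')
        if t + rem <= nxt:  # finishes before the next release
            t += rem
            order.append(j)
        else:               # preempt at the next release
            heapq.heappush(heap, [rem - (nxt - t), j])
            t = nxt
    return order

def srpt_round(jobs):
    """2-approx for 1|r_j|sum C_j: run jobs nonpreemptively in SRPT
    completion order."""
    t, completion = 0, {}
    for j in srpt_order(jobs):
        r, p = jobs[j]
        t = max(t, r) + p
        completion[j] = t
    return completion
```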
LP relaxations
Three steps
- write integer linear program
- relax
- round
Vertex cover
$$ \begin{align*} &\min \sum x_i\\ x_i+x_j &\ge 1 & (i,j)\in E\\ x_i&\in\{0,1\} \end{align*} $$
Rounding (relax to $0 \le x_i \le 1$, solve the LP):
- At least one end of edge $\ge 1/2$
- round up to 1
- others to 0
- each vertex at most doubles
- so 2-approx
- this is tight (Nemhauser-Trotter)
Works even for weighted vertex cover (unlike greedy); sketch below.
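A sketch using `scipy.optimize.linprog` to solve the relaxation (the solver choice is mine; any LP solver works), then thresholding at $1/2$:

```python
import numpy as np
from scipy.optimize import linprog

def lp_round_vertex_cover(n, edges, weights=None):
    """2-approx weighted vertex cover: solve the LP relaxation, take
    every vertex with x_i >= 1/2."""
    w = np.ones(n) if weights is None else np.asarray(weights, float)
    # x_u + x_v >= 1 becomes -x_u - x_v <= -1 in linprog's A_ub form
    A = np.zeros((len(edges), n))
    for r, (u, v) in enumerate(edges):
        A[r, u] = A[r, v] = -1.0
    res = linprog(w, A_ub=A, b_ub=-np.ones(len(edges)), bounds=(0, 1))
    return {i for i in range(n) if res.x[i] >= 0.5}
```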
Approximation hardness:
- $7/6$ Håstad 2001
- $10\sqrt{5}-21 \approx 1.3606$ Dinur, Safra 2002
- $\sqrt{2}$ Khot, Minzer, Safra 2017
- $2$ if unique games conjecture is true
- $2-O(1/\sqrt{\log n})$ achieved (and useful).
Facility Location
Metric version, with triangle inequality.
- general version is as hard as set cover.
ILP ($y_i=1$ if facility $i$ is opened, $x_{ij}=1$ if client $j$ is assigned to facility $i$): $$ \begin{align*} &\min \sum f_iy_i + \sum c_{ij}x_{ij}\\ x_{ij} &\le y_i\\ \sum_i x_{ij} &\ge 1 \end{align*} $$
Step 1: filtering
- Want to assign $j$ to one of its “partial” assignments $x_{ij}>0$
- $C_j = \sum_i x_{ij}c_{ij}$ is “average” assignment cost
- and is amount accounted for in fractional optimum
- but some $x_{ij}>0$ may have huge $c_{ij}$
- which wouldn't be accounted for
- rely on “can't have everything above average”
- claim at most $1/\rho$ fraction of assignment can have $c_{ij}>\rho C_j$
- if more, average exceeds $(1/\rho)(\rho C_j)=C_j$, impossible
- so, zero out every $x_{ij}$ with $c_{ij} > \rho C_j$ (at most a $1/\rho$ fraction of $j$'s assignment)
- and compensate by scaling up other $x_{ij}$ by $1/(1-1/\rho)$ factor
- Also have to increase $y_i$ by $1/(1-1/\rho)$ factor to “make room”
- New feasible solution to LP
- no longer necessarily optimal
- Now, assignment of client $j$ to any nonzero $x_{ij}$ costs at most $\rho C_j$
- So, total assignment cost at most $\rho\sum C_j$
Step 2: Facility opening: intuition
- To assign, need to open facilities
- If $y_i$ small, opening facility isn't paid for
- So, find a cluster of facilities with total $\sum y_i \ge 1$
- Open minimum cost facility
- Cost $f_{\min}\le f_{\min}\sum y_i = \sum f_{\min}y_i \le \sum f_i y_i$ so LP upper bounds cost
- Everything in cluster nearby, so using opened facility as “proxy” for all others without adding much cost
Step 2: Facility opening: details
- Choose client with minimum $C_j$
- Take all of its “available” facilities (those $i$ with $x_{ij}>0$; after filtering these have $c_{ij} \le \rho C_j$)
- Open the cheapest and zero the others
- So cost at most $\sum f_i y_i$ over $i$ in cluster
- assign every client $j'$ with nonzero $x_{i'j'}$ for some $i'$ in the cluster to the opened facility
- cost of assigning $j'$
- $\le \rho C_{j'}$ to reach its nonzero $x_{i'j'}$ in cluster
- then distance from $i'$ to $i$ is at most $2\rho C_j \le 2\rho C_{j'}$ (both are available to the cluster center $j$, whose $C_j$ is minimum)
- so total at most $3\rho C_{j'}$ by the triangle inequality
- Now iterate
Combine:
- multiplied $y_i$ by $1/(1-1/\rho)=\rho/(\rho-1)$
- multiplied assignment costs by $3\rho$
- note $x_{ij}$ are used only to identify “feasible” facilities to connect to
- so assignment cost not affected by scaling up $x_{ij}$
- for balance, set $\rho=4/3$ and get 4-approx
- other settings of $\rho$ yield bicriteria approximation trading facility and connection approximation costs
Further research progress has yielded
- 1.5-approximation
- 1.463-hardness result.
- Other algorithms based on greedy and local search.
MAX SAT
Define.
- literals
- clauses
- NP-complete
random setting
- set each variable true with probability $1/2$: satisfies a $k$-literal clause with probability $1-2^{-k}$ (sketch below)
- very nice for large $k$, but only $1/2$ for $k=1$
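A sketch of the random setting (the encoding is mine: literal $+i$/$-i$ stands for $x_i$/$\bar x_i$):

```python
import random

def random_max_sat(n_vars, clauses):
    """Random setting for MAX SAT: each variable true w.p. 1/2.
    A clause with k distinct variables is satisfied w.p. 1 - 2^{-k}."""
    assign = {i: random.random() < 0.5 for i in range(1, n_vars + 1)}
    satisfied = sum(
        any(assign[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
    return assign, satisfied
```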
LP $$ \begin{align*} \max &\sum z_j\\ \sum_{i \in C_j^+} y_i + \sum_{i \in C_j^-} (1-y_i) &\ge z_j\\ 0 \le y_i, z_j &\le 1 \end{align*} $$
- $\beta_k = 1-(1-1/k)^k$. values $1,3/4,.704,\ldots$
- Random round $y_i$
- Lemma: $k$-literal clause sat w/pr at least $\beta_k \zhat_j$.
- proof:
- assume all positive literals.
- prob $1-\prod(1-y_i)$
- $\prod(1-y_i)$ is maximized (probability minimized) when all $y_i=\zhat_j/k$
- Show $1-(1-\zhat/k)^k \ge \beta_k \zhat$.
- at $z=0,1$ these two sides are equal
- in between, right hand side is linear
- first deriv of LHS is $(1-z/k)^{k-1}$, which is $1$ at $z=0$; second deriv is $-\frac{k-1}{k}(1-z/k)^{k-2}< 0$, so LHS is concave
- a concave function that meets a linear one at both endpoints stays above it in between
- Result: $(1-1/e)$ approximation (convergence of $(1-1/k)^k$)
- much better for small $k$: i.e. $1$-approx for $k=1$
LP good for small clauses, random for large.
- Better: try both methods.
- let $n_1,n_2$ be the expected number of satisfied clauses under each method
- Show $(n_1+n_2)/2 \ge (3/4)\sum \zhat_j$
- $n_1 \ge \sum_k \sum_{C_j \in S^k} (1-2^{-k})\zhat_j$ where $S^k$ is the set of $k$-literal clauses (using $\zhat_j \le 1$)
- $n_2 \ge \sum \beta_k \zhat_j$
- $n_1+n_2 \ge \sum (1-2^{-k}+\beta_k) \zhat_j \ge \sum \frac32\zhat_j$
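A quick check of the last inequality, clause length by clause length (this verification is mine, not in the notes):
$$ \begin{align*} k=1:&\quad (1-2^{-1})+\beta_1 = \tfrac12+1 = \tfrac32\\ k=2:&\quad (1-2^{-2})+\beta_2 = \tfrac34+\tfrac34 = \tfrac32\\ k\ge3:&\quad (1-2^{-k})+\beta_k \ge \tfrac78+(1-1/e) \approx 1.507 > \tfrac32 \end{align*} $$
so $(n_1+n_2)/2 \ge \frac34\sum \zhat_j$, i.e., the better of the two methods is a $3/4$-approximation in expectation.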