What do you do when a problem is NP-complete?
or, when the “polynomial time solution” is impractically slow?
consider special cases
assume input is random, do “expected performance.” E.g., Hamiltonian path in random graphs. Problem: agreeing on a good distribution.
run a nonpolynomial (hopefully only slightly) algorithm such as branch and bound. Usually no proven bound on runtime, but sometimes there is.
settle for heuristic, but prove it does well enough (our focus)
Definitions:
optimization problem, instances \(I\), solutions \(S(I)\) with values \(f:S(I) \rightarrow R\)
maximization/minimization: find solution in \(S(I)\) maximizing/minimizing \(f\)
called OPT\((I)\)
eg bin-packing. instance is a set of sizes \(s_i \in (0,1]\); partition into as few subsets (bins) as possible so that no subset’s total exceeds 1
Technical assumptions we’ll often make:
assumption: all inputs and range of \(f\) are integers/rationals (can’t represent reals, and allows, eg, LP, binary search).
assumption \(f(\sigma)\) is a polynomial size (num bits) number (else output takes too long)
look for polytime in bit complexity
NP-hardness
optimization NP-hard if can reduce an NP-hard decision problem to it
(eg, problem of “is opt solution to this instance \(\le k\)?”)
but use the more general notion of Turing reducibility (Garey & Johnson).
Approximation algorithm:
any algorithm that gives a feasible answer
eg, each item in own bin.
of course, want good algorithm. How measure?
Definition: \(k\)-abs approx if on any \(I\), have \(|A(I)-OPT(I)| \le k\)
Example: planar graph coloring.
NP-complete to decide if 3 colorable
know 4-colorable
easy to 5-color
Ditto for edge coloring: Vizing’s theorem says opt is \(\Delta\) or (constructive) \(\Delta+1\)
Absolute approximations are known only for such trivial cases, where OPT is bounded by a constant.
Often, can show impossible by “scaling” the problem.
EG knapsack.
define profits \(p_i\), sizes \(s_i\), sack \(B\)
wlog, integers.
suppose have \(k\)-absolute
multiply all \(p_i\) by \(k+1\), solve, scale down; an additive error of \(k\) is less than one original unit, so the result is exactly optimal (contradiction unless P=NP).
EG independent set (clique)
take \(k+1\) disjoint copies of \(G\); a \(k\)-absolute solution must then be optimal in at least one copy.
Definitions:
An \(\alpha\)-approximate solution has value at most \(\alpha\) times optimum for minimization, at least \(1/\alpha\) times optimum for maximization.
an algorithm has approximation ratio \(\alpha\) if on any input, it outputs an \(\alpha\)-approximate feasible solution.
called an \(\alpha\)-approximation algorithm
How do we prove algorithms have relative approximations?
Can’t describe opt, so can’t compare to it
instead, comparison to computable lower bounds.
Do obvious thing at each step.
Hard part is proving it works.
Usually, by attention to right upper/lower bound.
Max cut
Upper bound trivial: OPT cuts at most all \(|E|\) edges
Max-cut greedy: place each vertex on the side that cuts more of its already-placed edges; at least half of all edges get cut, so a 2-approximation (sketch below).
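A minimal sketch of this greedy in Python (the function name and edge-list format are illustrative, not from the notes):

```python
# Greedy max cut (unweighted): place each vertex on the side that cuts
# more of its already-placed edges. Each edge is "charged" when its second
# endpoint is placed, and at least half of those charges become cut edges,
# so the cut has at least |E|/2 >= OPT/2 edges.
def greedy_max_cut(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = [None] * n                       # None = not yet placed
    for v in range(n):
        on0 = sum(1 for u in adj[v] if side[u] == 0)
        on1 = sum(1 for u in adj[v] if side[u] == 1)
        side[v] = 1 if on0 >= on1 else 0    # face the larger placed group
    cut = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut
```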
Set cover
\(n\) items
\({\mbox{OPT}}=k\)
At each step, can still cover remainder with \(k\) sets
So the chosen set covers at least a \(1/k\) fraction of the remainder; after \(O(k\log n)\) steps nothing is left, giving an \(O(\log n)\)-approximation (sketch below)
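A hedged sketch of the greedy (names and the set-list input format are my own illustration):

```python
# Greedy set cover: repeatedly pick the set covering the most uncovered
# items. If OPT = k, each step covers >= 1/k of what remains, so about
# k ln n steps suffice: an O(log n) approximation.
def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    picked = []
    while uncovered:
        i = max(range(len(sets)), key=lambda s: len(sets[s] & uncovered))
        if not sets[i] & uncovered:
            raise ValueError("instance is not coverable")
        picked.append(i)
        uncovered -= sets[i]
    return picked
```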
Vertex cover:
define problem
suppose we run greedy set cover (repeatedly pick the vertex covering the most uncovered edges): only \(O(\log n)\) from the set cover analysis
suppose we pick an uncovered edge and cover both sides: the picked edges form a matching, and OPT must take one endpoint of each, so a 2-approx (sketch below)
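A minimal sketch of the matching-based 2-approx (function name is illustrative):

```python
# Vertex cover 2-approx: take both endpoints of any still-uncovered edge.
# The picked edges form a matching; any cover needs one endpoint of each
# matched edge, so |cover| = 2 * |matching| <= 2 * OPT.
def vertex_cover_2approx(edges):
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover
```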
sometimes, need to be extra greedy
Explicit attention to where lower bound is coming from—lower bound informs algorithm.
Graham’s rule for \(P||C_{\max}\) is a \(2-\frac1m\) approximation algorithm
explain problem: \(m\) machines, \(n\) jobs with processing times \(p_j\); minimize the makespan (maximum machine load).
can also think of minimizing max load of continuously running jobs
use greedy list scheduling: assign each job in turn to the currently least-loaded machine (sketch after the notes below)
proof by comparison to lower bounds
first lower bound: average load: \({\mbox{OPT}}\ge \frac1m\sum p_j\)
second lower bound: \({\mbox{OPT}}\ge \max p_j\)
consider max-load machine
load before adding its last job was at most the average load, so at most OPT
then add one job, of length at most OPT
so the final load is at most 2 OPT
Tighter: suppose the last-finishing job \(j\) was added to a machine that had load \(L\)
Then the average load is at least \(L+p_j/m\), so \(L+p_j/m \le {\mbox{OPT}}\)
We achieve \[\begin{aligned} L+p_j &= (L+p_j/m) + p_j(1-1/m)\\ &\le {\mbox{OPT}}+(1-1/m){\mbox{OPT}}\\ &=(2-1/m){\mbox{OPT}}\end{aligned}\]
Notice:
this algorithm is online, competitive ratio \(2-\frac1m\)
we have no idea of optimum schedule; just used lower bounds.
we used a greedy strategy
tight bound: consider \(m(m-1)\) size-1 jobs, then one size-\(m\) job; greedy gets makespan \(2m-1\) while OPT is \(m\)
where was problem? Last job might be big
LPT achieves 4/3, but not online
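A hedged sketch of list scheduling (a heap tracks the least-loaded machine; names are my own):

```python
import heapq

# Graham's list scheduling for P||Cmax: assign each job, in the given
# order, to the currently least-loaded machine. Makespan <= (2 - 1/m) OPT.
def list_schedule(m, jobs):
    loads = [(0.0, i) for i in range(m)]   # (load, machine id) min-heap
    heapq.heapify(loads)
    assignment = []
    for p in jobs:
        load, i = heapq.heappop(loads)
        assignment.append(i)
        heapq.heappush(loads, (load + p, i))
    return max(load for load, _ in loads), assignment

# LPT is the same algorithm on jobs sorted in decreasing size:
#   list_schedule(m, sorted(jobs, reverse=True))
```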
General notation: machines \(\mid\) constraints/features \(\mid\) objective
\(m\) machines \(i\) (and \(n\) jobs \(j\))
\(1\): \(1\) machine
\(P\): \(m\) identical machines
\(Q\): \(m\) machines of different speeds, but all related
\(R\): \(m\) unrelated machines, may have completely different runtimes on same jobs
\(F\): Flow shop (job is specific sequence of tasks on many machines)
\(O\): Open shop (job is specific set of tasks on many machines)
Objectives
\(\sum C_j\) average (total) completion time
\(\max C_j\) time to complete whole set
\(\sum U_j\) number of “late” jobs (post deadline)
Constraints
\(p_j=1\) unit time tasks
\(r_j\) release dates before which jobs cannot run
\(d_j\) deadlines to complete
pmtn jobs can be preempted/interrupted
prec precedence constraints on jobs
So far, we’ve seen various constant-factor approximations.
What is best constant achievable?
defer APX-hardness discussion until later
An approximation scheme is a family of algorithms \(A_\epsilon\) such that
each algorithm runs in polynomial time
\(A_\epsilon\) achieves a \(1+\epsilon\) approximation
But note: runtime might be awful as function of \(\epsilon\)
Knapsack example.
Dynamic program for bounded integer profits
\(B(j,p)=\) smallest (total size) subset of jobs \(1,\ldots,j\) of total profit \(\ge p\).
because if can achieve it, can achieve it at minimum size
and now we have sub-problem optimality for dynamic program
\(B(j+1,p) = \min \left( B(j,p),\quad s_{j+1}+B(j,p-p_{j+1}) \right)\)
For each \(p\), solve all \(n\) values of \(j\)
Gives runtime \(nP_{{\mbox{OPT}}}\)
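A minimal sketch of this dynamic program (the name `knapsack_by_profit` is my own; for simplicity the table runs up to the total profit rather than OPT):

```python
# Knapsack DP over profits: B[p] = smallest total size of a subset of the
# items seen so far with total profit >= p. Runtime O(n * sum(profits)).
def knapsack_by_profit(profits, sizes, capacity):
    P = sum(profits)
    INF = float("inf")
    B = [0] + [INF] * P                    # profit 0 needs no items
    for p_j, s_j in zip(profits, sizes):
        for p in range(P, 0, -1):          # backwards: use each item once
            B[p] = min(B[p], s_j + B[max(p - p_j, 0)])
    return max(p for p in range(P + 1) if B[p] <= capacity)
```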
did this prove \(P=NP\)?
No
recall pseudopoly algorithms \(O(mU)\)
contrast to weakly polynomial \(O(m\log U)\)
rounding
Let opt be \(P\).
Scale prices to \(\lfloor (n/\epsilon P) p_i \rfloor\)
New opt is at least \(n/\epsilon-n = (1-\epsilon)n/\epsilon\)
so seek solution of this value
Table size/runtime \(n^2/\epsilon\) (\(n\) items, range \(n/\epsilon\))
Gives solution within \(1-\epsilon\) of original opt
Wait, how do we know \(P\)?
simple option: use a lower bound on \(P\), namely \(\max p_i\)
then opt scales to at most \((n/(\epsilon \max p_i))\sum p_i \le n^2/\epsilon\)
just slows algorithm to \(n^3/\epsilon\)
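A hedged sketch of the resulting FPAS using the \(\max p_i\) lower bound (reusing the `knapsack_by_profit` sketch above; a full version would also recover the chosen item set):

```python
# Knapsack FPAS via profit scaling, with P = max(p_i) as a lower bound on
# OPT. Scaled profits sum to at most n^2/eps, so the DP table stays small;
# the value found is at least (1 - eps) * OPT.
def knapsack_fptas(profits, sizes, capacity, eps):
    n = len(profits)
    P = max(profits)                            # lower bound on OPT
    scale = n / (eps * P)
    scaled = [int(p * scale) for p in profits]  # floor((n / eps P) * p_i)
    best_scaled = knapsack_by_profit(scaled, sizes, capacity)
    return best_scaled / scale                  # estimate, >= (1-eps) OPT
```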
Faster: suppose we guess \(P>{\mbox{OPT}}/(1-\epsilon)\)
OPT scales to at most \((n/\epsilon P){\mbox{OPT}} < n(1-\epsilon)/\epsilon = n/\epsilon-n\)
scaled solution of desired value \(n/\epsilon-n\) doesn’t exist
so won’t find
Suppose we guess \(P<{\mbox{OPT}}\)?
OPT will scale to at least \(n/\epsilon-n\)
so will find solution of this value
possibly much less than \({\mbox{OPT}}\), but that’s ok
Of two cases, outcome tells us whether we guessed \(P\) too high or low
so, can use binary search to find \(P\)
note once we are close—\(P<OPT<P/(1-\epsilon)\)—unclear which outcome happens
but that’s OK, because we are close
quickly get within \((1-\epsilon)\) of \(P\)
adds another \((1-\epsilon)\) factor to approximation ratio
but that gives \((1-\epsilon)^2 \approx 1-2\epsilon\)
so just changes the constant.
wait, need initial low and high values?
lower bound? \(\max p_i\)
upper bound? \(\sum p_i\)
ratio of \(n\)
so, look at powers of \((1+\epsilon)\) in that range
\(\log_{1+\epsilon}n \approx (\log n)/\epsilon\) of them
gives a solution within \((1+\epsilon)\)
pseudopoly gives FPAS; converse almost true
Knapsack is only weakly NP-hard
strong NP-hardness (define) means no pseudo-poly
From FPAS to pseudo poly:
Suppose input instance has integers bounded by \(t\)
Solution value is \(O(nt)\)
Find \(\epsilon\)-approx with \(\epsilon=1/(nt+1)\)
Error is then less than 1; the solution value is an integer, so it equals the optimum. Runtime is polynomial in \(n\) and \(t\), i.e., pseudopolynomial.
More powerful idea: \(k\)-enumeration
Return to \(P || C_{\max}\)
Schedule \(k\) largest jobs optimally
schedule the remainder greedily
analysis: claim \(A(I) \le OPT(I)+p_{k+1}\)
Consider job with max \(c_j\)
If one of \(k\) largest, done and at opt
Else, was assigned to min load machine, so \(c_j \le OPT+p_j \le OPT+p_{k+1}\)
so done if \(p_{k+1}\) small
but note \(OPT(I) \ge (k/m)p_{k+1}\), since some machine must receive at least \((k+1)/m\) of the \(k+1\) jobs of size \(\ge p_{k+1}\)
deduce \(A(I) \le (1+m/k)OPT(I)\).
So, for fixed \(m\), can get any desired approximation ratio
i.e., PAS
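A hedged sketch of \(k\)-enumeration for \(P||C_{\max}\) (names are my own; trying all \(m^k\) placements of the \(k\) largest jobs subsumes the optimal one):

```python
from itertools import product

# k-enumeration: try every placement of the k largest jobs, finish each
# with greedy list scheduling, keep the best makespan. Runtime ~ m^k * nm;
# guarantees makespan <= (1 + m/k) * OPT.
def k_enum_makespan(m, jobs, k):
    jobs = sorted(jobs, reverse=True)
    big, rest = jobs[:k], jobs[k:]
    best = float("inf")
    for placement in product(range(m), repeat=len(big)):
        loads = [0.0] * m
        for p, i in zip(big, placement):
            loads[i] += p
        for p in rest:                     # greedy for the remaining jobs
            loads[loads.index(min(loads))] += p
        best = min(best, max(loads))
    return best
```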
Scheduling any number of machines
As with Knapsack, focus on decision problem
Can I complete all in time \(T\)?
If can answer this, binary search will optimize \(T\)
Again, need \(T_{\min}\) and \(T_{\max}\) but these are easy
Combine enumeration and rounding
Suppose only \(k\) job sizes
Vector of “number of each type” on a given machine—gives “machine type”
Only \(n^k\) distinct vectors/machine types
Similarly, only \(n^k\) problem instances
find out which instances are feasible with \(m\) machines
By deciding how many of each machine type.
Use dynamic program:
enumerate all input instances that can be completed by \(j\) machines in time \(T\)
Feasible instance is sum of feasible \(j-1\) machine instance and 1-machine instance (machine type)
Works because only poly many instances and types.
Use rounding to make few important job types
Guess OPT \(T\) to within \(\epsilon\) (by binary search)
All jobs of size exceeding \(\epsilon T\) are “large”
Round each up to next power of \((1+\epsilon)\) in range \(\epsilon T \ldots T\)
Only \(\log_{1+\epsilon}1/\epsilon = O(1/\epsilon \ln 1/\epsilon)\) large types
Solve optimally
Greedy schedule remainder
If the last job to finish is large, the schedule is optimal for the rounded problem, so within a \(1+\epsilon\) factor of opt
If the last job is small (size \(\le \epsilon T\)), the greedy analysis shows we are within \(\epsilon T\) of opt.
Is there always a PAS?
MAX-SNP hard: some unbeatable constant
Numerous problems in class (vertex cover, independent set, etc)
Amplification arguments can prove that some problems (e.g., clique) admit no constant factor.
So far we’ve seen two techniques:
Greedy: do the obvious
Dynamic Programming: try everything
Can we be more clever?
TSP
Requiring an exact tour in a general graph: no approximation possible (deciding Hamiltonian cycle is NP-hard)
Undirected metric: MST relaxation gives a 2-approx (sketch below); Christofides gives 3/2
recent progress has gotten below 3/2
Directed: Cycle cover relaxation (HW) gives \(O(\log n)\)
recent progress: \(O(\log n / \log\log n)\)
Also, Euclidean TSP and planar graph TSP have a PAS
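A minimal sketch of the MST-based 2-approx for the undirected metric case (Prim plus a shortcut preorder walk; `dist` is a full distance matrix, names my own):

```python
# Metric TSP 2-approx: an MST costs at most OPT (drop one tour edge to get
# a spanning tree), and shortcutting a preorder walk of the MST at most
# doubles its cost, by the triangle inequality.
def tsp_mst_2approx(dist):
    n = len(dist)
    in_tree, parent = [False] * n, [0] * n
    mindist = [float("inf")] * n
    mindist[0] = 0.0
    for _ in range(n):                     # Prim's MST
        u = min((v for v in range(n) if not in_tree[v]),
                key=lambda v: mindist[v])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and dist[u][v] < mindist[v]:
                mindist[v], parent[v] = dist[u][v], u
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)
    tour, stack = [], [0]                  # preorder walk = shortcut tour
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    cost = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return tour, cost
```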
intuition: SPT (shortest processing time first) for \(1||\sum C_j\)
suppose process longer before shorter
then swap improves
note haven’t shown local opt implies global opt
rather, have relied on global opt being local opt
\(1 | r_j | \sum C_j\)
NP-hard
relaxation: allow preemption
optimum: SRPT (shortest remaining processing time first)
assume no \(r_j\): claim SPT optimal
proof: local optimality argument
consider swapping two pieces of two jobs
suppose currently processing \(k\) instead of SRPT \(j\)
remaining times \(p_j\) and \(p_k\)
total \(p_j+p_k\) time
use first \(p_j\) to process \(j\), then do \(k\)
some job must finish at \(p_j+p_k\)
and no job can finish before \(p_j\)
so this is locally optimal (for these 2 jobs)
rounding: schedule jobs in preemptive completion order
take preemptive schedule and insert \(p_j\) time at \(C_j\)
now room to schedule nonpreemptively
how much does this slow down \(j\)?
\(p_k\) space inserted before \(j\) for every job completing before \(j\) in preemptive schedule
in other words, at most \(C_j\) time is inserted before \(j\) completes (the inserted jobs all finish within the first \(C_j\) preemptive time, so their sizes total at most \(C_j\))
so \(j\) completes in \(2C_j\) time
so 2-approx
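A hedged sketch combining both steps (simulate preemptive SRPT, then schedule nonpreemptively in preemptive completion order; the job format and names are my own):

```python
import heapq

# 2-approx for 1|r_j|sum C_j: simulate preemptive SRPT to get the order of
# preemptive completions, then run the jobs nonpreemptively in that order.
def srpt_2approx(jobs):                    # jobs: list of (release, size)
    by_release = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    heap, t, i, finish_order = [], 0.0, 0, []
    while heap or i < len(jobs):
        if not heap:                       # idle until the next release
            t = max(t, jobs[by_release[i]][0])
        while i < len(jobs) and jobs[by_release[i]][0] <= t:
            j = by_release[i]
            heapq.heappush(heap, (jobs[j][1], j))   # (remaining, job)
            i += 1
        rem, j = heapq.heappop(heap)
        nxt = jobs[by_release[i]][0] if i < len(jobs) else float("inf")
        run = min(rem, nxt - t)            # run until done or next release
        t += run
        if run < rem:
            heapq.heappush(heap, (rem - run, j))
        else:
            finish_order.append(j)
    t, total = 0.0, 0.0                    # nonpreemptive pass
    for j in finish_order:
        t = max(t, jobs[j][0]) + jobs[j][1]
        total += t
    return total
```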
More recently: rounding, enumeration gives PAS
Three steps
write integer linear program
relax
round
Example: vertex cover. \[\begin{aligned} &\min \sum x_i\\ x_i+x_j &\ge 1& (i,j)\in E\\ x_i&\in\{0,1\}\end{aligned}\]
Rounding
At least one end of edge \(\ge 1/2\)
round up to 1
others to 0
each vertex at most doubles
so 2-approx
this is tight (Nemhauser-Trotter)
Works even for the weighted version (unlike the matching-based greedy); sketch below.
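A hedged sketch of LP rounding for weighted vertex cover, assuming scipy's `linprog` solves the relaxation (the function name and I/O format are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

# Solve min w.x s.t. x_u + x_v >= 1 on every edge, 0 <= x <= 1, then keep
# every vertex with x_v >= 1/2. Rounding at most doubles the LP optimum,
# which lower-bounds OPT: a 2-approximation.
def vc_lp_round(n, edges, weights):
    A = np.zeros((len(edges), n))
    for r, (u, v) in enumerate(edges):
        A[r, u] = A[r, v] = -1.0   # x_u + x_v >= 1  ->  -x_u - x_v <= -1
    res = linprog(c=weights, A_ub=A, b_ub=-np.ones(len(edges)),
                  bounds=[(0, 1)] * n)
    return {v for v in range(n) if res.x[v] >= 0.5}
```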
Approximation hardness:
7/6 Hastad 2001
\(10\sqrt{5} -21 \approx 1.3606\ldots\) Dinur Safra 2002
\(\sqrt{2}\) Khot Minzer Safra 2017
\(2-\epsilon\) for every \(\epsilon>0\), if the unique games conjecture is true
\(2-O(1/\sqrt{\log n})\) achieved (and useful).
Facility location. Metric version, with triangle inequality.
The general (non-metric) version is as hard as set cover.
ILP \[\begin{aligned} &\min \sum_i f_iy_i + \sum_{i,j} c_{ij}x_{ij}\\ x_{ij} &\le y_i\\ \sum_i x_{ij} &\ge 1 & \forall j\\ x_{ij}, y_i &\in \{0,1\}\end{aligned}\]
Step 1: filtering
Want to assign \(j\) to one of its “partial” assignments \(x_{ij}>0\)
\(C_j = \sum_i x_{ij}c_{ij}\) is “average” assignment cost
and is amount accounted for in fractional optimum
but some \(x_{ij}>0\) may have huge \(c_{ij}\)
which wouldn’t be accounted for
rely on “can’t have everything above average”
claim at most \(1/\rho\) fraction of assignment can have \(c_{ij}>\rho C_j\)
if more, average exceeds \((1/\rho)(\rho C_j)=C_j\), impossible
so, zero out every \(x_{ij}\) with \(c_{ij} \ge \rho C_j\)
and compensate by scaling up other \(x_{ij}\) by \(1/(1-1/\rho)\) factor
Also have to increase \(y_i\) by \(1/(1-1/\rho)\) factor to “make room”
New feasible solution to LP
no longer necessarily optimal
Now, assignment of client \(j\) to any nonzero \(x_{ij}\) costs at most \(\rho C_j\)
So, total assignment cost at most \(\rho\sum C_j\)
Step 2: Facility opening: intuition
To assign, need to open facilities
If \(y_i\) small, opening facility isn’t paid for
So, find a cluster of facilities of total \(y_i > 1\)
Open minimum cost facility
Cost \(f_{\min}\le f_{\min}\sum y_i = \sum f_{\min}y_i \le \sum f_i y_i\) so LP upper bounds cost
Everything in cluster nearby, so using opened facility as “proxy” for all others without adding much cost
Step 2: Facility opening: details
Choose client with minimum \(C_j\)
Take all of its “available” facilities (\(c_{ij} < \rho C_j\))
Open the cheapest and zero the others
So cost at most \(\sum f_i y_i\) over \(i\) in cluster
assign every client with a nonzero \(x_{i'j'}\) for some \(i'\) in the cluster to the opened facility
cost of assigning \(j'\)
\(\le \rho C_{j'}\) to reach its nonzero \(x_{i'j'}\) in cluster
then distance from \(i'\) to \(i\) is at most \(2\rho C_j \le 2\rho C_{j'}\) by choice of \(j'\)
so total \(3\rho C_{j'}\) by the metric property
Now iterate
Combine:
multiplied \(y_i\) by \(1/(1-1/\rho)=\rho/(\rho-1)\)
multiplied assignment costs by \(3\rho\)
note the \(x_{ij}\) are used only to identify “feasible” facilities to connect to
so assignment cost not affected by scaling up \(x_{ij}\)
for balance, set \(\rho=4/3\): then \(3\rho=\rho/(\rho-1)=4\), giving a 4-approx
other settings of \(\rho\) yield bicriteria approximation trading facility and connection approximation costs
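A hedged sketch of filtering plus clustering, assuming a fractional LP solution is already in hand (all names, the input format, and the in-place mutation of `x` are my own illustration):

```python
# Filtering + clustering for metric facility location, given a fractional
# solution x[i][j], y[i] to the LP relaxation. Total cost is within
# max(3*rho, rho/(rho-1)) of the LP optimum; rho = 4/3 balances both at 4.
def facility_location_round(f, c, x, rho=4.0 / 3.0):
    m, n = len(f), len(c[0])
    C = [sum(x[i][j] * c[i][j] for i in range(m)) for j in range(n)]
    for j in range(n):                     # filtering step
        for i in range(m):
            if c[i][j] > rho * C[j]:       # too expensive: zero it out
                x[i][j] = 0.0
        s = sum(x[i][j] for i in range(m))  # >= 1 - 1/rho survives
        for i in range(m):
            x[i][j] /= s                   # rescale to a feasible solution
    open_fac, assign = set(), [None] * n
    for j in sorted(range(n), key=lambda j: C[j]):  # min C_j first
        if assign[j] is not None:
            continue
        cluster = [i for i in range(m) if x[i][j] > 0]
        best = min(cluster, key=lambda i: f[i])     # open the cheapest
        open_fac.add(best)
        for j2 in range(n):                # everyone touching the cluster
            if assign[j2] is None and any(x[i][j2] > 0 for i in cluster):
                assign[j2] = best          # costs at most 3 * rho * C_{j2}
    return open_fac, assign
```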
Further research progress has yielded
1.5-approximation
1.463-hardness result.
Other algorithms based on greedy and local search.
MAX SAT. Define:
literals
clauses
NP-complete
random setting: set each variable true independently with probability \(1/2\)
achieves \(1-2^{-k}\) (\(k\) = number of literals per clause), since that is each clause’s satisfaction probability (sketch below)
very nice for large \(k\), but only \(1/2\) for \(k=1\)
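A minimal sketch of the random assignment (the \(\pm v\) clause encoding is my own convention):

```python
import random

# Random MAX SAT: set each variable true with probability 1/2. A clause
# with k distinct literals fails only if all k literals are false, so it
# is satisfied with probability 1 - 2^{-k}.
def random_maxsat(num_vars, clauses):      # clause: list of +v / -v ints
    assign = [random.random() < 0.5 for _ in range(num_vars + 1)]
    sat = sum(any(assign[abs(l)] == (l > 0) for l in clause)
              for clause in clauses)
    return assign, sat
```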
LP \[\begin{aligned} \max &\sum z_j\\ \sum_{i \in C_j^+} y_i + \sum_{i \in C_j^-} (1-y_i) &\ge z_j & \forall j\\ 0 \le y_i, z_j &\le 1\end{aligned}\]
Analysis
\(\beta_k = 1-(1-1/k)^k\). values \(1,3/4,.704,\ldots\)
Randomized rounding: set each \(x_i\) true with probability \(\hat y_i\) (the LP optimum)
Lemma: \(k\)-literal clause sat w/pr at least \(\beta_k {\hat z}_j\).
proof:
assume all positive literals.
prob \(1-\prod(1-y_i)\)
failure probability \(\prod(1-y_i)\) is maximized when all \(y_i={\hat z}_j/k\).
Show \(1-(1-{\hat z}/k)^k \ge \beta_k {\hat z}\).
at \(z=0,1\) these two sides are equal
in between, right hand side is linear
first deriv of LHS is \((1-z/k)^{k-1}\), equal to \(1\) at \(z=0\); second deriv is \(-(1-1/k)(1-z/k)^{k-2}<0\)
so the LHS is concave; it meets the linear RHS at both endpoints, hence lies above it in between
Result: \((1-1/e)\) approximation (since \((1-1/k)^k \to 1/e\))
much better for small \(k\): i.e. \(1\)-approx for \(k=1\)
LP good for small clauses, random for large.
Better: try both methods.
\(n_1,n_2\) number in both methods
Show \((n_1+n_2)/2 \ge (3/4)\sum {\hat z}_j\)
\(n_1 \ge \sum_{C_j \in S^k} (1-2^{-k}){\hat z}_j\)
\(n_2 \ge \sum \beta_k {\hat z}_j\)
\(n_1+n_2 \ge \sum (1-2^{-k}+\beta_k) {\hat z}_j \ge \sum \frac32{\hat z}_j\), since \(1-2^{-k}+\beta_k = 3/2\) at \(k=1,2\) and is larger for \(k\ge 3\); so the better of the two methods satisfies at least \(\frac34\sum \hat z_j \ge \frac34{\mbox{OPT}}\) clauses