So far, our crowning achievement was an \(O(m^2 \log U)\) maximum flow algorithm
via scaling
polynomial time in representation size (including numeric values in binary)
but weakly polynomial because depends on (bits of) capacities
now, aim for strongly polynomial running time.
running time that does not depend at all on \(U\)
only number of vertices and edges of the graph.
(We assume that arithmetic operations can be performed in \(O(1)\) time.)
Why weakly polynomial?
Recall: Our algorithms so far were “primal” greedy.
improve our current solution in each step
lower bounded progress made by each such step relative to max flow.
so runtime will depend in some way on value of max-flow
which will depend on capacities.
We need a different progress measure, independent of max-flow value.
How strongly polynomial?
Recall: certified optimality of flow \(f\) by looking whether \(s\) and \(t\) are connected in residual \(G_f\).
If not, \(f\) is a max flow;
if yes, \(f\) not maximum
(and an \(s\)-\(t\) augmenting path exists to improve it).
Key: test is independent of flow value
Can we make this yes/no test quantitative?
to differentiate between the flows that are “almost maximum” and the ones that are “far” from being maximum?
compare an \(n\)-edge \(s\)-\(t\) path to \(n\) parallel \(s\)-\(t\) edges
both have same “room” for flow
why is one able to carry so much more?
because on path each unit of flow uses up a lot of capacity
suggests looking at \(s\)-\(t\)- distance
Idea: consider \(s\)-\(t\) distance \(d_f(s,t)\) in the residual graph \(G_f\).
edge with positive residual capacity has length \(1\)
edge with zero residual capacity has length \(+\infty\).
network has volume \(=\) sum of capacities
flow fits—has volume less than network
\(s\) and \(t\) far apart \(\Rightarrow\) each flow path uses a lot of volume.
\(\Rightarrow\) not many paths can fit.
(Consider \(s\)-\(t\) path vs. a multiple parallel \((s,t)\) edges.)
If \(d_f(s,t)\geq n\), \(s\) and \(t\) are disconnected in \(G_f\).
\(\Rightarrow\) \(f\) is already a maximum flow.
Our New Goal: Design an augmenting path based algorithm that aims to increase the \(s\)-\(t\) distance \(d_f(s,t)\) in the residual graph.
What are the best augmenting paths to increase \(d_f(s,t)\)?
Intuition: Shortest (residual) paths prevent \(d_f(s,t)\) being large.
So "destroy" them.
How? Augment the flow using them!
Note: Finding the shortest augmenting path correspond to running BFS in the residual graph.
So it take \(O(m)\) time.
same as the "normal" augmenting path search.
Challenge: How to ensure that this augmentation does not introduce new shortest paths in \(G_f\)?
Maybe even reduce \(d_f(s,t)\) instead of increasing it.
Fortunately, this can’t happen the case. But we need to prove that!
Lemma: For any vertex \(v\), if \(d(v)\) (resp. \(d'(v)\) is the distance from \(s\) to \(t\) in the residual graph before (resp. after) augmenting the flow along some shortest augmenting path, then \(d(v)\leq d'(v)\).
So, the distance from \(s\) does not decrease not only for \(t\) but for every vertex. (Note that the proof relies on establishing this stronger claim - it is one of the examples when sometime proving a stronger claim is actually easier.)
By symmetry, one can argue that the distance from any vertex \(v\) to \(t\) is non-decreasing as well.
Proof:
Assume for the sake of contradiction that this is not the case.
Let \(A\) be the (non-empty) set of vertices \(u\) for which \(d(u)>d'(u)\). Take \(v\) to be the vertex with minimal \(d'(v)\) among all vertices in \(A\).
Let \(P'\) be the shortest \(s\)-\(v\) path in the residual graph after augmentation, and let \(w\) be the vertex preceding \(v\) on this path. (Note that we cannot have that \(v=s\), so such path and \(w\) exist.)
Note \(d'(v)=d'(w)+1\). Moreover, we must have that \(d(w)\leq d'(w)\) as otherwise \(w\in A\) and \(d'(w)<d'(v)\), which would contradict minimality of \(v\).
We claim that the last edge of \(P'\), i.e., \((w,v)\), had zero residual capacity before augmentation by \(P\). Otherwise, we would have that \[d(v)\leq d(w)+1 \leq d'(w)+1 = d'(v),\] and thus contradict the fact that \(v\in A\).
The only way for the edge \((w,v)\) to have non-zero residual capacity after augmentation by \(P\) would be if the edge \((v,w)\) belonged to \(P\).
But \(P\) was the shortest path before the augmentation.
\(\Rightarrow\) \(d(w)=d(v)+1\).
However, all of that means that \[d(v)=d(w)-1\leq d'(w)-1=d'(v)-2\leq d'(v),\] which contradicts our assumption that \(v\in A\).
So, augmenting the flow using shortest paths indeed does not make things worse. But does it make them better?
Yes! But, again, we need to prove that.
Lemma: We have at most \(\frac{mn}{2}\) shortest path flow augmentations before \(d_f(s,t)\geq n\) (and thus \(f\) is the maximum flow).
Proof:
Each flow augmentation saturates at least one "bottlenecking" edge \((u,v)\).
Before this edge is used saturated again in some subsequent flow augmentation, we must have pushed some flow via an augmenting path that contained the opposite edge \((v,u)\).
Let \(d(w)\) be the distance of \(s\) to \(w\) in the residual graph just before the first saturation of \((u,v)\), and let \(d'(w)\) be the corresponding distance just before the flow was pushed along \((v,u)\).
As we always use shortest paths to augmenting flow we need to have that \(d(v)=d(u)+1\) and \(d'(u)=d'(v)+1\).
But, by the lemma we proved above, we know that \(d(v)\leq d'(v)\).
\(\Rightarrow\) We thus need to have that \[d'(u)=d'(v)+1\geq d(v)+1 = d(u)+2.\]
\(\Rightarrow\) The distance from \(s\) to \(u\) had to increased by at least \(2\) by the time the edge \((u,v)\) can again be saturated by some augmenting path.
\(\Rightarrow\) \((u,v)\) can be saturated at most \(\frac{n}{2}\) times before the \(s\)-\(u\) distance becomes \(\geq n\) and thus \(d_f(s,t)\geq n\) as well.
Each edge can be saturated at most \(\frac{n}{2}\) times. So, there is at most \(\frac{mn}{2}\) saturations and thus flow augmentations.
Total running time: \(O(m^2 n)\). Strongly polynomial!
Note: In this analysis we have no way of lower bounding how much flow a particular flow augmentation pushed. We just can argue that over all the \(\frac{mn}{2}\) augmentations we managed to push the whole max flow value (no matter how large it was!). This is an important feature of so-called primal-dual algorithms. (Will get back to this later on.)