I implemented Karger's randomized min cut algorithm, the improved Karger-Stein randomized min cut algorithm, and an algorithm for approximating the normalized cut of a graph, which I applied to image segmentation.
The min cut algorithms are structured to fit in with LightGraphs.jl, the graphs package for Julia.
#Pkg.add("BenchmarkTools")
using GraphPlot, LightGraphs
using BenchmarkTools
The minimum cut of a graph is the set of edges with minimum total weight whose removal disconnects the graph.
Mincut often comes up via a fun result called the max-flow min-cut theorem, which is itself a special case of a more general property of linear programming: duality.
The max-flow min-cut theorem says that in a directed graph, the maximum flow that can travel from a vertex s to another vertex t is equal to the weight of the minimum cut that disconnects s from t.
As a consequence, the global minimum cut of a graph is the minimum, over all pairs of vertices, of the maximum flow between them.
There exist max flow algorithms that take $O(V^2 E)$, so a naive way to find the mincut would be to run all $V$ choose 2 max flows, which is $O(V^4 E)$; that is slow and seems dumb. If we assume the graph is dense and $E \approx V^2$, then the algorithm is $O(V^6)$, which is dreadful.
So can we do better?
Maybe not exactly, but David Karger (a faculty member at MIT) came up with a randomized algorithm that will probably find the answer, much faster than the previously mentioned method.
The mincut of a graph with 2 vertices is easy! There is only one cut! So the basic game plan is to reduce the big graph to one with only 2 super-vertices!
The main operation is squishing (contracting) vertices together. Merge u and v into a single super-vertex uv, keeping every edge incident to u or v; any vertex that had an edge to both u and v now has a single edge with the two weights summed.
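For a concrete toy example (mine, not from the original): contracting vertices 1 and 2 of a weighted triangle merges their rows and columns, so the edge to vertex 3 picks up weight $2 + 3 = 5$:
$$\begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 3 \\ 2 & 3 & 0 \end{pmatrix} \xrightarrow{\text{contract } 1, 2} \begin{pmatrix} 0 & 5 \\ 5 & 0 \end{pmatrix}$$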
The pseudocode is this:
while number of vertices > 2:
    pick an edge (u, v) at random
    squish u and v together
return the remaining cut
So that gives a cut, but this seems like a pretty terrible idea. There is almost no chance that it gives the right answer!
That is true! But it is fast (how fast? $O(E)$)! And there is a non-zero chance of being correct! So if we run it many times, it won't take a lot of time and eventually it'll be right!
The tough part is: how many times do we need to try?
It turns out about $O(V^2 \log V)$ repetitions get the probability of being wrong down to $1/V$.
So in $O(V^2 E \log V)$ time, I can find the mincut with probability $1 - 1/V$. Even in a dense graph, this is $O(V^4 \log V)$, which is much faster than $O(V^6)$.
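To see where that repetition count comes from (a standard calculation, spelled out here for completeness): a single contraction run returns the true mincut with probability at least $2/(V(V-1)) \ge 2/V^2$, so over $N$ independent runs
$$P(\text{every run misses}) \le \left(1 - \frac{2}{V^2}\right)^N \le e^{-2N/V^2},$$
and choosing $N = \frac{1}{2} V^2 \ln V$ makes this at most $e^{-\ln V} = 1/V$.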
LightGraphs.jl only supports unweighted graphs. To accommodate weighted graphs (which most useful mincut applications require), we pass in an adjacency matrix where the (i, j) entry is the weight of the edge from vertex i to vertex j. By definition, the diagonal is zero.
#my implementation: contract random edges until two super-vertices remain;
#the weight left between them is the cut this run produces. Karger's algorithm
#samples a random *edge*, so we resample vertex pairs until they are adjacent.
#Assumes a connected graph.
function matrix_mincut(mat)
    top = size(mat)[1]
    inds = filter(i -> sum(mat[i, :]) > 0, collect(1:top)) #live vertices only
    while size(inds)[1] > 2
        pick = rand(inds)
        pick2 = rand(inds)
        while pick == pick2 || mat[pick, pick2] == 0
            pick = rand(inds)
            pick2 = rand(inds)
        end
        #merge pick2 into pick, summing parallel edge weights
        mat[:, pick] = mat[:, pick] + mat[:, pick2]
        mat[pick, :] = mat[pick, :] + mat[pick2, :]
        mat[pick, pick] = 0 #drop the self-loop created by the merge
        mat[pick2, :] = 0
        mat[:, pick2] = 0
        deleteat!(inds, findin(inds, [pick2]))
    end
    #all remaining weight crosses the final two super-vertices
    return maximum(mat)
end
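As a quick sanity check (my own toy input, not from the original), here is a single run on a small hand-built weighted graph; note that matrix_mincut mutates its argument, so pass a copy:
#hypothetical 4-vertex weighted cycle; entry (i, j) is the weight of edge i--j
wmat = [0.0 2.0 0.0 1.0;
        2.0 0.0 3.0 0.0;
        0.0 3.0 0.0 4.0;
        1.0 0.0 4.0 0.0]
matrix_mincut(copy(wmat)) #one randomized run; repeating and taking the min gives 3.0 (isolate vertex 1)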
#repeat the contraction enough times that the best cut found is the true
#mincut with probability about 1 - 1/n
function kargers_mincut(graph, mat_in, ep = 1)
    n = nv(graph)
    howmany = ceil(Int, n^2 * log(n) * ep) #ep trades accuracy for time
    mincut = Inf
    for i in 1:howmany
        mat = copy(mat_in) #each run destroys its matrix
        contender = matrix_mincut(mat)
        if contender < mincut
            mincut = contender
        end
    end
    return mincut
end
Does it work? Well, unlike my midterm project, yes! The ep parameter lets the user make a time/accuracy tradeoff by increasing or decreasing the number of iterations the algorithm goes through.
graphobj = Graph(10, 34)
gplot(graphobj)
cut = kargers_mincut(graphobj, adjacency_matrix(graphobj))
It is actually quite fast! At least compared to the first C++ and Python implementations that come up on Google! (I had to fix the C++ one; it did twice as much work as it had to. I am not a very good C++ programmer.)
I timed them on the same V = 200, E = 2517 graph, but because actually evaluating the mincut would take forever, I capped the iterations at 200.
3 Python times (sec): 14.1346, 14.2936, 14.2722
3 C++ times (sec): 4.7646, 4.8759, 4.6284
3 Julia times (sec): 4.0618, 4.0675, 4.0925
The C++ code is not optimal, but this still demonstrates Julia's ability to perform at a high level and compete with C++.
$O(V^4 \log V)$ for dense graphs still isn't very good. Can we do better?
Yes. It turns out Karger thought this too, got together with Stein, and they did much better!
Here is the important insight: the first squish probably isn't going to squish across the mincut, but as the graph shrinks, that becomes more and more likely. So we should spend more of our sampling effort on the later, riskier part of the process.
Here is the pseudocode for the Karger-Stein mincut algorithm:
fastmincut(G):
    if |V| < 6:
        return mincut(G)   # small enough to solve directly
    else:
        t = 1 + |V|/sqrt(2)
        squish G randomly down to t vertices --> G1
        squish G randomly down to t vertices --> G2
        return min( fastmincut(G1), fastmincut(G2) )
This process takes longer per run (it is $O(V^2 \log V)$), but it gets the right answer much more often, so we only need to repeat it $O(\log^2 V)$ times to achieve a $1/V$ probability of not being correct.
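For intuition (standard analysis, not spelled out in the original): the time per run follows the recurrence
$$T(V) = 2\,T\!\left(V/\sqrt{2}\right) + O(V^2) = O(V^2 \log V),$$
and a single run succeeds with probability $\Omega(1/\log V)$, so $O(\log^2 V)$ independent runs drive the failure probability down to $1/V$.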
This brings our final run time to $O(V^2 \log^3 V)$, which is much, much faster than our initial naive algorithm. The state-of-the-art deterministic algorithm (Stoer-Wagner) is $O(VE + V^2 \log V)$, so they are comparable for sparse graphs, but Karger-Stein is much faster for big, dense graphs.
Also, consider that for a dense graph an algorithm must touch at least every edge, which is $O(V^2)$. $O(\log^3 V)$ is a relatively small multiplicative factor to be away from optimal.
For my implementation, I used a slightly modified version of my original Karger's mincut to find the mincut once the graph gets down to 6 or fewer vertices.
# so here is my implementation of Karger-Stein
# base-case solver: plain Karger on the tiny graph that remains
function kargers_mincut_mat(mat_in, n)
    # the repetition count seems high, but the graph only has a handful of live
    # vertices and I can't afford mistakes here; they poison the whole recursion
    howmany = ceil(Int, n^2 * log(n) * 100)
    mincut = Inf
    for i in 1:howmany
        mat = copy(mat_in)
        contender = matrix_mincut(mat)
        if contender < mincut && contender != 0 #guard against degenerate runs
            mincut = contender
        end
    end
    return mincut
end
#randomly contract the graph until only t super-vertices remain;
#inds tracks which rows/columns of mat are still live
function squish(mat, inds, t)
    while size(inds)[1] > t
        #pick a random edge: resample vertex pairs until they are adjacent
        pick = rand(inds)
        pick2 = rand(inds)
        while pick == pick2 || mat[pick, pick2] == 0
            pick = rand(inds)
            pick2 = rand(inds)
        end
        #merge pick2 into pick, summing parallel edge weights
        mat[:, pick] = mat[:, pick] + mat[:, pick2]
        mat[pick, :] = mat[pick, :] + mat[pick2, :]
        mat[pick, pick] = 0
        mat[pick2, :] = 0
        mat[:, pick2] = 0
        deleteat!(inds, findin(inds, [pick2]))
    end
    return mat, inds
end
#the Karger-Stein recursion: contract to roughly |V|/sqrt(2) twice,
#independently, and recurse on both contracted graphs
function karger_stein_mincut(matrix, inds)
    big = size(inds)[1]
    if big <= 6
        return kargers_mincut_mat(matrix, 2 * big)
    else
        t = 1 + big / sqrt(2)
        in1 = copy(matrix)
        in2 = copy(matrix)
        #copy inds too: squish mutates it, and the two branches must be independent
        mat1, ind1 = squish(in1, copy(inds), t)
        mat2, ind2 = squish(in2, copy(inds), t)
        return min(karger_stein_mincut(mat1, ind1), karger_stein_mincut(mat2, ind2))
    end
end
function ks_mincut(graph, graph_mat)
    n = nv(graph)
    #O(log(n)^2) repetitions; dividing by log(10)^2 just keeps the constant small
    top = ceil(Int, log(n)^2 / (log(10)^2))
    mincut = Inf
    for i in 1:top
        inds = collect(1:n)
        graph_mat2 = copy(graph_mat)
        maybe = karger_stein_mincut(graph_mat2, inds)
        if maybe < mincut
            mincut = maybe
        end
    end
    return mincut
end
So how well does this work? It should be faster! Let's start with a small-ish graph: V = 30, E = 120.
graphobj = Graph(30, 120)
gplot(graphobj)
@time ks_mincut(graphobj, adjacency_matrix(graphobj))
@time kargers_mincut(graphobj, adjacency_matrix(graphobj))
So they're about the same at this point.
Let's try an even bigger graph: V = 60, E = 400.
graphobj = Graph(60, 400)
gplot(graphobj)
@time ks_mincut(graphobj, adjacency_matrix(graphobj))
@time kargers_mincut(graphobj, adjacency_matrix(graphobj))
Now there is starting to be a pronounced difference in run time between the two algorithms. Let's exaggerate it more and more.
A bigger graph: V = 80, E = 700.
graphobj = Graph(80, 700)
gplot(graphobj)
@time ks_mincut(graphobj, adjacency_matrix(graphobj))
@time kargers_mincut(graphobj, adjacency_matrix(graphobj))
It is clear that the Karger-Stein implementation is better.
This is all adapted from "Normalized Cuts and Image Segmentation" by Jianbo Shi and Jitendra Malik: http://repository.upenn.edu/cgi/viewcontent.cgi?article=1101&context=cis_papers
I was going to use min cut to do image segmentation, but it turns out there are better ways to segment images, and this approach is more interesting, so I'm going to focus on it.
We want to cluster together pixels that come from the same object. You can imagine this is important in a lot of applications, but let's focus on Photoshop: you want to be able to drag an object out of its background easily.
Min cut favors cutting off small, weakly connected regions, which isn't super helpful for image segmentation.
We want a "normalized cut": split the graph into pieces of roughly equal size with as little cost as possible. Unfortunately, this is NP-complete (Partition reduces to it in polynomial time).
Can we "almost" do it? Yes. Approximation algorithms don't always work well, but in this case we can phrase normalized cut as an integer linear programming problem, relax the integrality constraint, and end up with a decent answer.
Let's define normalized cut. For a graph V and a subset A of its vertices, define assoc(A, V) as the total weight of the edges connecting vertices in A to all vertices in the graph. Cut(A, B) is the total weight of the edges with one endpoint in A and the other in B.
We define the normalized cut problem as: minimize over partitions (A, B) the quantity NCut(A, B) = Cut(A, B)/assoc(A, V) + Cut(A, B)/assoc(B, V). Note that this quantity is small when Cut(A, B) is small and assoc(A, V) is close to assoc(B, V), so it rewards balanced cuts.
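To make the definition concrete, here is a small helper of my own (the name ncut_value is hypothetical, not from the project code) that evaluates NCut for a given partition directly from a weight matrix:
#evaluate NCut(A, B) for the partition given by a boolean mask inA over the vertices
function ncut_value(W, inA)
    inB = .!inA                #complement of A
    cutAB = sum(W[inA, inB])   #total weight crossing the partition
    assocA = sum(W[inA, :])    #total weight leaving A
    assocB = sum(W[inB, :])
    return cutAB / assocA + cutAB / assocB
end
#e.g. ncut_value(W, labels .== 1) for any symmetric weight matrix W and boolean mask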
By making a continuous approximation we can make this solvable-ish.
With a little bit of work, it reduces to the generalized eigenvalue problem $(D - W)y = \lambda D y$, or equivalently the standard eigenvalue problem $D^{-1/2}(D - W)D^{-1/2}z = \lambda z$ with $z = D^{1/2}y$.
This turns out to be an interesting setup from a strictly graph-theoretic perspective, but all we need is the second smallest eigenvalue and its eigenvector, which correspond to the (relaxed) optimal partition. In the end we will have an eigenvector of length |V|, and the cut is defined by whether each component of the vector is above or below a cutoff value.
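Concretely (with hypothetical names), once we have the second smallest eigenvector v2, the partition is just a threshold:
inA = v2 .> cutoff #vertices whose components exceed the cutoff form one side of the cut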
First step: it isn't trivially obvious how you would turn an image into a graph. So how do we do this?
So here is our image. It is 64 x 64. First we're gonna turn it into a matrix.
I'm going to convert it to HSV and make a tensor out of it.
using Images, Colors, ImageCore;
pic = Images.load("dog.jpg")
cimg = float(Array(ImageCore.channelview(pic)))
size(cimg)
imhsv = convert(Image{HSV}, pic)
size(imhsv)
hsvimg = float(Array(ImageCore.channelview(imhsv)))
size(hsvimg)
using PyPlot
How are we gonna turn this into a graph problem? Let's start by making a vertex for every pixel.
What about the edges? We're going to set up this graph as an adjacency matrix.
We're gonna make edges between nearby-ish pixels, and the magnitude of the weight will reflect how similar the two pixels look and how close together they are. Here is the equation:
$w_{ij} = e^{-\| F(i) - F(j) \|^2 / \sigma_I^2} \cdot \alpha$
where $\alpha = 0$ if $\| X(i) - X(j) \| > r$,
else $\alpha = e^{-\| X(i) - X(j) \|^2 / \sigma_X^2}$.
Here $F(i)$ is the feature (color) vector of pixel $i$ and $X(i)$ is its spatial position.
#HSV -> feature vector from the Shi-Malik paper: [v, v*s*sin(h), v*s*cos(h)]
#hue from Colors.jl is in degrees, so use sind/cosd
function getF(h, s, v)
    return [v, v * s * sind(h), v * s * cosd(h)]
end
#feature term: ||F(i) - F(j)||^2 / sigma (sigma plays the role of sigma_I^2)
function get_top(f1, f2, sigma)
    dif = f1 - f2
    return norm(dif)^2 / sigma
end
#spatial term: squared pixel distance, scaled (sigma plays the role of sigma_X^2)
function xtop(i1, j1, i2, j2, sigma)
    top = (i1 - i2)^2 + (j1 - j2)^2
    return top / sigma, top
end
#precompute the feature vector for every pixel
function precomputeF(img)
    one = size(img)[2]
    two = size(img)[3]
    out = ones(3, one, two)
    for i in 1:one
        for j in 1:two
            out[:, i, j] = getF(img[1, i, j], img[2, i, j], img[3, i, j])
        end
    end
    return out
end
#edge weight between two pixels; note that r is compared against the *squared*
#pixel distance, so it plays the role of r^2 in the formula above
function get_val(F1, F2, i1, i2, j1, j2, r, sigmaI, sigmaX)
    xtop1, dist = xtop(i1, j1, i2, j2, sigmaX)
    if dist > r
        return 0
    else
        return e^(-get_top(F1, F2, sigmaI)) * e^(-xtop1)
    end
end
#build the (pixels x pixels) weight matrix W
function makeW(img, r, sigmaI, sigmaX)
    one = size(img)[2]
    two = size(img)[3]
    precompute1 = precomputeF(img)
    out = eye(one * two, one * two) #self-similarity of 1 on the diagonal
    for i in 1:one
        for j in 1:two
            for k in 1:one
                for m in 1:two
                    F1 = precompute1[:, i, j]
                    F2 = precompute1[:, k, m]
                    val = get_val(F1, F2, i, k, j, m, r, sigmaI, sigmaX)
                    if i == k && j == m
                        val = 1
                    end
                    #row-major flattening of the pixel grid (stride = number of columns)
                    first_coor = (i - 1) * two + j
                    sec_coor = (k - 1) * two + m
                    out[first_coor, sec_coor] = val
                end
            end
        end
    end
    return out
end
@time W = makeW(hsvimg, 15, 0.1, 4)
Now we need to compute the degree matrix, which is a diagonal matrix with each diagonal element equal to the sum of the corresponding row of W.
function makeD(W)
one = size(W)[1]
out = ones(one)
for i in 1:one
val = sum(W[i,:])
out[i] = val
end
return out
end
d = makeD(W)
D = Diagonal(d)
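For reference (my shorthand, equivalent to makeD for this W):
D2 = Diagonal(vec(sum(W, 2))) #row sums of W along dimension 2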
We then set up our eigenvalue problem:
first = inv(sqrt(D)) #D^(-1/2), cheap since D is Diagonal
left = first*(D-W)*first #D^(-1/2) (D - W) D^(-1/2)
We use eigs, Julia's interface to ARPACK, to find the 6 smallest eigenvalues and corresponding eigenvectors. It is hard to overstate how much the success of this approach depends on having a good sparse eigensolver available.
u, v = eigs(left, nev = 6, which ="SM", maxiter = 600)
u = real(u)
v = real(v)
I wrote a quick function that rescales the values of an eigenvector to lie between 0 and 1, and then plotted the second smallest eigenvector as an image. Light and dark regions correspond to the two sides of the cut.
function vecZero2One(vec)
top = maximum(vec)
bottom = minimum(vec)
shape = size(vec)
outgoing = (vec - bottom*ones(shape))/(top-bottom)
return outgoing
end
maybe = reshape(v[:,2], (64, 64))
maybe = vecZero2One(maybe)
matshow(maybe, cmap="Greys")
function putting_it_all_together(cimg, r, sigmaI, sigmaX, how_many)
W = makeW(cimg, r, sigmaI, sigmaX)
d = makeD(W)
D = Diagonal(d)
first = inv(sqrt(D))
left = first*(D-W)*first
u, v = eigs(left, nev = how_many, which ="SM", maxiter=500)
return real(u), real(v)
end
@time u, v = putting_it_all_together(hsvimg, 50, 0.03, 4, 6)
All things considered, it runs pretty fast, given that it has to build and find eigenvectors of a 4096 x 4096 matrix. Note that the cut is quite sensitive to the parameters used in calculating the W matrix.
maybe = reshape(v[:,2], (64, 64))
maybe = vecZero2One(maybe)
matshow(maybe, cmap = "Greys")
pic2 = Images.load("dude.jpeg")
imhsv2 = convert(Image{HSV}, pic2)
hsvimg2 = float(Array(ImageCore.channelview(imhsv2)));
@time u2, v2 = putting_it_all_together(hsvimg2, 5, 0.1, 0.5, 6)
maybe2 = reshape(v2[:,2], (50, 50))
maybe2 = vecZero2One(maybe2)
matshow(maybe2, cmap = "Greys")
pic3 = Images.load("car2.jpg")
imhsv3 = convert(Image{HSV}, pic3)
hsvimg3 = float(Array(ImageCore.channelview(imhsv3)));
@time u3, v3 = putting_it_all_together(hsvimg3, 50, 0.03, 3, 6)
maybe3 = reshape(v3[:,2], (50, 50))
maybe3 = vecZero2One(maybe3)
matshow(maybe3, cmap="Greys")
It does a pretty good job of identifying the object in the image! It seems to pick out the object and half of the background, which is more or less what is expected. The full method for image segmentation using normalized cut is to iteratively cut each subgraph until it is satisfactorily segmented (a sketch of that recursion appears after the examples below).
#mask out one side of the cut: set every pixel whose eigenvector component
#falls below val to the constant `set`
function cut_pic(pic, v, val, x, set)
    pic3 = copy(pic)
    pic2 = reshape(pic3, (3, x^2)) #flatten the 3-by-x-by-x image
    vu = reshape(v', (1, x^2))     #flatten the eigenvector the same way
    for i in 1:x^2
        if vu[i] < val
            pic2[:, i] = set
        end
    end
    return reshape(pic2, (3, x, x))
end
vcut = vecZero2One(v3[:,2])
vcut = reshape(vcut, (50, 50))
gonna_try3 = cut_pic(hsvimg3, vcut, 0.26, 50, 0);
colorview(HSV, gonna_try3[1, :, :] , gonna_try3[2, :, :], gonna_try3[3, :, :])
colorview(HSV, hsvimg3[1, :, :] , hsvimg3[2, :, :], hsvimg3[3, :, :])
This worked really well! It found the border of the car precisely!
vcut = vecZero2One(v2[:,2])
vcut = reshape(vcut, (50, 50))
gonna_try2 = cut_pic(hsvimg2, vcut, 0.32, 50, 0);
colorview(HSV, gonna_try2[1, :, :] , gonna_try2[2, :, :], gonna_try2[3, :, :])
colorview(HSV, hsvimg2[1, :, :] , hsvimg2[2, :, :], hsvimg2[3, :, :])
It finds the edge of the head but unfortunately swallows up the hand and gets confused by the pipe's edge. The pipe makes sense, because it is a hard edge.
vcut = vecZero2One(v[:,2])
vcut = reshape(vcut, (64, 64))
gonna_try = cut_pic(hsvimg, vcut, 0.85, 64, 0);
colorview(HSV, gonna_try[1, :, :] , gonna_try[2, :, :], gonna_try[3, :, :])
colorview(HSV, hsvimg[1, :, :] , hsvimg[2, :, :], hsvimg[3, :, :])
This one works so well! It gets the edge of the paws precisely!!!
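For completeness, here is a hypothetical sketch (not part of this project's code) of the full recursive scheme mentioned above; best_ncut_split is an assumed helper that performs one two-way normalized cut on a vertex subset and reports its cost:
#hypothetical sketch: keep splitting each segment until its best
#normalized cut costs more than a threshold tau
function recursive_ncut(W, verts, tau)
    A, B, cost = best_ncut_split(W, verts) #assumed helper: one 2-way NCut
    if cost > tau || length(verts) < 10    #stop when the split is too expensive
        return [verts]
    end
    return vcat(recursive_ncut(W, A, tau), recursive_ncut(W, B, tau))
end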
Using normalized cut to do image segmentation worked really well! It only took a few tweaks beyond the code I showed in my presentation to end up with a viable product.
This project shows that graph cuts are both interesting and useful, and that Julia is an excellent platform for implementing high-performance graph algorithms.
I was able to implement two randomized min cut algorithms that perform at a speed competitive with C++.
I then represented images as graphs and used normalized cut to segment the images, with great success.