6.046 Introduction to Algorithms
April 6th, 2004

Today: Dynamic Programming

Dynamic Programming is a general method which is widely applied in operations research to solve optimization problems, and in computational biology to solve DNA sequencing problems. It often allows you to get an O(n^d) solution where the naive approach would take exponential time O(c^n).

Basic Idea:
1. Break the problem into sub-problems.
2. Solve the smallest sub-problems first, then use their solutions to solve the next smallest, and so on, until the original problem is solved.

The running time is usually dominated by the number of possible sub-problems. This is similar to the divide-and-conquer method in certain respects, but in divide and conquer you split the problem into sub-problems and solve them recursively, top-down. In Dynamic Programming, by contrast, many sub-problems repeat. Instead of solving each one from scratch every time it appears, we solve it once, store the solution in a table, and look it up whenever we encounter it again -- a bottom-up approach. Accordingly, the time complexity is the size of the table times the cost of computing an entry.

There are many fantastic examples to illustrate this approach. In class we will do a few (but I add a list here which I recommend you read and understand, as each illustrates an interesting case of this useful technique).

List of problems: World Series, Longest Common Subsequence, Matrix Multiplication, Knapsack, Traveling Salesman Problem.

First Example: Longest Common Subsequence

Given two strings x, y over some common alphabet, the problem is to find a longest common subsequence (not necessarily consecutive characters) of both x and y.

Example:
  x = A B C B D A B
  y = B D C A B A

Then BCBA is a subsequence of both x and y, and is of maximal length among all common subsequences. Note that it is not unique: BDAB is also a longest common subsequence. We are interested in finding one of maximal length.
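To make the definition concrete, here is a small Python check (a sketch added for illustration; the helper name is my own) that a candidate string appears left to right, not necessarily contiguously, in both x and y:

```python
def is_subsequence(z, s):
    """Return True if z appears left to right (not necessarily contiguously) in s."""
    it = iter(s)
    # "ch in it" scans the iterator forward until ch is found, consuming it,
    # so each character of z must appear after the previous one's position.
    return all(ch in it for ch in z)

x = "ABCBDAB"
y = "BDCABA"

for z in ("BCBA", "BDAB"):
    print(z, is_subsequence(z, x) and is_subsequence(z, y))  # both print True
```

Both BCBA and BDAB pass the check, matching the example above; the harder question, addressed next, is how to find such a sequence of maximal length efficiently.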
We define LCS(x,y) = a longest sequence of characters that appears left to right (not necessarily contiguously) in both strings.

This problem is useful in genetics: given 2 DNA fragments, the LCS gives information about what they have in common and what is the best way to line them up. Moreover, an equivalent problem is finding the minimum edit distance (sequence of inserts and deletes) to transform string x into string y. To see this: to transform x into y we need |x|-|LCS(x,y)| deletes (to get rid of all the characters of x which are not in the longest common subsequence with y) followed by |y|-|LCS(x,y)| inserts (to insert the characters necessary to turn the longest common subsequence into y).

So, how do we compute LCS(x,y)?

Brute Force Solution: O(n 2^m), where m = |x| and n = |y|. Simply try all possible subsequences of x and see if they appear in y. There are 2^m possible subsequences of x, and checking whether one appears in y takes O(n) time.

Dynamic Programming Idea:

SIMPLIFY: First let us find the LENGTH of LCS(x,y) rather than the longest common subsequence itself. Then we shall extend this to finding the sequence itself.

STRATEGY: Consider prefixes of x and of y, and express the answer in terms of those. Define

  c(i,j) = length of LCS(x[1,...,i], y[1,...,j]),

namely using only the i-length prefix of x and the j-length prefix of y. Clearly, our goal is then to compute c(m,n), which is the length of LCS(x[1,...,m], y[1,...,n]) = LCS(x,y).

The first stage is to define c(i,j) in terms of smaller sub-problems. Recursive formulation (working backwards):

Base case: c(i,0) = c(0,j) = 0 for all i, j.

CLAIM:
  c(i,j) = 1 + c(i-1,j-1)            if x[i] = y[j]
  c(i,j) = max{c(i-1,j), c(i,j-1)}   if x[i] != y[j]

PROOF: There are 2 cases to consider.

Case 1: x[i] = y[j]. In this case LCS(x[1,...,i], y[1,...,j]) might as well match up x[i] and y[j]. Why? Suppose there exists a longest common subsequence of x[1,...,i] and y[1,...,j], call it z[1,...,k], which does not match up x[i] with y[j].
Then, either both x[i] and y[j] are unmatched in z, in which case we can extend z by one more character from x, namely x[i], matched with one more character from y, namely y[j], so there exists a common subsequence of length k+1, contradicting the fact that z was longest. Or, x[i] is matched and y[j] is not (or vice versa). But in this case z[k] = x[i], and we might as well match it with y[j] instead. Thus z[1,...,k-1] is a longest common subsequence of x[1,...,i-1] and y[1,...,j-1], and c(i,j) = 1 + c(i-1,j-1).

Case 2: x[i] != y[j]. Then the LCS of x[1,...,i] and y[1,...,j] cannot contain both x[i] and y[j], so either the answer ignores x[i], or it ignores y[j], or both. Thus, c(i,j) = max{c(i-1,j), c(i,j-1)}.

Let us see the code corresponding to this recursive formulation.

LCS(x,y,i,j) {
  if (i=0 or j=0) return 0;
  if (x[i] = y[j]) return 1 + LCS(x,y,i-1,j-1);
  else return max( LCS(x,y,i-1,j), LCS(x,y,i,j-1) );
}

Unfortunately, the running time of this is 2^{n+m} !!!!

New Idea: let us store intermediate results in an array, and instead of re-computing them, look them up. This process is called memoization. Modify the code above as follows. Initialize an m by n array c[i,j] = nil for all i, j.

LCS(x,y,i,j) {
  if (i=0 or j=0) return 0;
  if (c[i,j] != nil) return c[i,j];
  else if (x[i] = y[j]) set c[i,j] = 1 + LCS(x,y,i-1,j-1);
  else set c[i,j] = max( LCS(x,y,i-1,j), LCS(x,y,i,j-1) );
  return c[i,j];
}

Call this routine as LCS(x,y,m,n). The running time and space are O(nm), since each of the nm entries is computed at most once, at constant cost per entry.

Finally, to compute an actual LCS(x,y): initialize the subsequence z to empty, go to the lower right corner c(m,n), and trace backwards as follows. Move up or to the left, whichever keeps the value unchanged (break ties arbitrarily). When neither move preserves the value, the current entry, say c(i,j), corresponds to a match x[i] = y[j]: set z = x[i] z and go to the diagonal entry c(i-1,j-1).
From there, continue in the same fashion until you arrive at either c(0,j) or c(i,0). You are done.

Example 2: Knapsack Problem

In the knapsack problem, you are given n items. Each has size s_i and value v_i. You have a knapsack of total size S. Your goal is to find a subset of the items of maximum possible total value whose total size fits within S.

We can solve this in time O(2^n) by trying all possible subsets. Using dynamic programming we shall solve it in time O(nS). Note that this is not polynomial time in the size of the input, since the running time is proportional to S rather than to log S. Indeed, this is the best algorithm we know for this problem.

The idea is to decide, item by item, whether it should or should not be included in the knapsack. We first find the maximal value that can be achieved. We initialize an n by S array VAL to NIL, where entry VAL[i,w] = the maximal value that can be achieved using only items 1,...,i and having space w to work with. Thus, what we are after is VAL[n,S]. The recursive formula is the following:

  VAL[i,w] = max { VAL[i-1,w], v_i + VAL[i-1,w-s_i] }

Namely, take the better of the 2 choices: either leave the i-th item out of the knapsack, or put it in (the latter is possible only if s_i <= w). The code is very simple.

Knapsack(n,S)
1. Initialize VAL[i,0] = VAL[0,w] = 0 for all i and all w.
2. for i = 1 to n
     for w = 1 to S do
       if s_i > w then VAL[i,w] = VAL[i-1,w]
       else VAL[i,w] = max { VAL[i-1,w], VAL[i-1,w-s_i] + v_i }
3. return VAL[n,S]

Clearly the time and space are bounded by the size of the array, which is O(nS).

To get the items themselves out of the array VAL: start at VAL[n,S] with remaining space w = S, and work your way up row by row. At entry VAL[i,w], if VAL[i,w] != VAL[i-1,w], then the value changed because item i was put in the knapsack: add item i and go to entry VAL[i-1, w-s_i]. Otherwise go to VAL[i-1,w]. Keep working your way up using the same rule until you reach either VAL[0,w] or VAL[i,0].
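The table-filling and item-recovery steps above can be sketched in Python as follows (a minimal sketch; the function and variable names, and the example items, are my own, and items are passed as 0-indexed lists while the returned indices follow the notes' 1-based numbering):

```python
def knapsack(sizes, values, S):
    """0/1 knapsack by dynamic programming: O(n*S) time and space.

    VAL[i][w] = best total value achievable using items 1..i with capacity w.
    Returns (best value, sorted list of chosen item indices, 1-based).
    """
    n = len(sizes)
    VAL = [[0] * (S + 1) for _ in range(n + 1)]  # row 0 / column 0 are the base case
    for i in range(1, n + 1):
        for w in range(1, S + 1):
            if sizes[i - 1] > w:  # item i does not fit in space w
                VAL[i][w] = VAL[i - 1][w]
            else:                 # leave item i out, or put it in
                VAL[i][w] = max(VAL[i - 1][w],
                                VAL[i - 1][w - sizes[i - 1]] + values[i - 1])
    # Trace back: a value change between rows i-1 and i means item i was taken.
    items, w = [], S
    for i in range(n, 0, -1):
        if VAL[i][w] != VAL[i - 1][w]:
            items.append(i)
            w -= sizes[i - 1]
    return VAL[n][S], sorted(items)

# Hypothetical example: 4 items, knapsack of size 10.
best, chosen = knapsack([5, 4, 6, 3], [10, 40, 30, 50], 10)
print(best, chosen)  # 90 [2, 4]
```

In the example, items 2 and 4 together use size 4 + 3 = 7 <= 10 and achieve value 40 + 50 = 90, which no other subset beats.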