Introduction to Programming, Aug-Dec 2008 Lecture 20, Wednesday 05 Nov 2008 Memoization and Dynamic Programming ----------------------------------- Last time, we saw how we could make the computation of an inductive definition with overlapping subcomputations more efficient by "memoizing" intermediate results. Today we look at some more examples and also study a related idea from the theory of algorithms called dynamic programming. Pinball game ------------ A board has obstacles arranged in a triangle, as follows. () / \ () () / \ / \ () () () / \ / \ / \ () () () () / \ / \ / \ / \ () () () () () Each obstacle has a number of points associated with it. For instance, we could have the following assignment of points to the obstacles in the board above. 15 / \ 28 33 / \ / \ 18 22 16 / \ / \ / \ 35 15 11 17 / \ / \ / \ / \ 29 13 14 26 12 When we drop a ball on the topmost obstacle, it bounces off and goes down, either left or right. Depending on which way it goes, it bounces off an obstacle at the next level, and again gets deflected left or right. This continues till it reaches the bottom row of obstacles. When the ball hits the final obstacle on its run, we add up the points of all the obstacles that it has collided against to obtain our score for the game. With sufficient skill, it is possible to control the ball as it moves down the board and select the path along with it travels. The aim is to choose a path that maximizes the score. For the moment, let us concentrate on calculating the maximum score that one can obtain on a given board (note that there may be more than one way to achieve this maximum score). Later, we will come back to the question of determining a path that actually yields this maximum score. First, we need a way to identify each obstacle unambiguously. We can assign a pair of coordinates (i,j) to each obstacle where i in {1,2,...,n} is the row in which the obstacle lies and j in {1,2,...,i} is the position of the obstacle within that row, from left to right. Let points(i,j) be the points assigned to the obstacle at location (i,j). If we rearrange the triangle a bit, we have the following arrangement. | 1 2 3 4 5 ---> j ---+---------------------------- 1 | 15 | 2 | 28 33 | 3 | 18 22 16 | 4 | 35 15 11 17 | 5 | 29 13 14 26 12 | | v | i | We define score(i,j) to be the maximum score that we can achieve if the ball starts its path at the obstacle labelled (i,j). The quantity we want to calculate is given by score(1,1), corresponding to the obstacle at the top of the triangle. There is a natural inductive definition of score(i,j). After bouncing of (i,j), the ball will next go to either (i+1,j), if it goes left, or (i+1,j+1), if it goes right. If we already know the scores at these two positions, we can computute score(i,j) as follows: score(i,j) = points(i,j) + max [ score(i+1,j), score(i+1,j+1) ] The base case is when i = n, corresponding to the last row. For all j in {1,2,...,n} we have score(n,j) = points(n,j) Computing score(i,j) naively would result in recomputing intermediate values. For instance, both score(3,2) and score(3,3) would require the value of score(4,3). Through memoiziation, we can avoid wasteful recomputation, just as we saw in the fibonacci example. Dynamic programming ------------------- Another way to approach the problem of overlapping subcomputations is to examine the order in which values are required. In each inductive definition, we have a base case for which the value is immediately available. The inductive definition specifies a relationship between a value and its "neighbours". Once we have calculated the base cases, we can compute the positions that lie in the neighbourhood of these base cases. In this way, we can systematically "grow" the set of known values, ensuring at each stage that we already have at hand the values that we need to compute the current value. For example, in the fibonacci case, the base cases are "fib 0" and "fib 1". Having calculated these, we can immediately calculate "fib 2", since the values it depends on are both known. With "fib 2" in hand, we can compute "fib 3". In this way, we proceed to "fib 4", "fib 5", ... We can regard this as filling up the memo table systematically for n = 0,1,2,... Notice that this corresponds to the natural way in which we enumerate the fibonnacci sequence as 1 1 2 3 5 ... Consider our second problem, the pinball game. The base cases in this computation are the values score(n,j), for j in {1,2,...,n}. With this in hand, we can immediately compute the values score(n-1,k), k in {1,2,...,n-1} since both neighbours of the obstacle at (n-1,k) lie in the range of values that are already known. Having calculated row n-1, we can proceed back to row n-2, n-3, ... till we reach the top row. This technique is called dynamic programming. In dynamic programming, we systematically compute the entries of the memo table "bottom up" by analyzing the order in which values would be computed had we used the normal "top down" inductive definition. In a sense, both approaches are equivalent, because we actually compute each value only once. However, in practice, dynamic programming is more efficient because we avoid the overhead associated with keeping many function calls pending while we move down to the base case. Expression evaluation --------------------- Consider infix arithmetic expressions over integers using the operators + and * without parentheses and without any assumptions about the order in which to evaluate subexpressions. Thus, an expression such as 6*3+2*5 may be evaluated as (6*3)+(2*5) = 28 or 6*((3+2)*5) = 150 or ((6*3)+2)*5 = 100, depending on the order of evaluation. Our aim is to find a way of bracketing the expression so as to achieve the maximum overall value. As with the pinball game, for the moment we disregard the problem of finding the actual bracketing and concentrate on calculating the best possible value we can achieve with this expression. For simplicity, let us assume that the numbers in the expression are all single digits. If we number the positions in the input expression from 1, every odd position is an integer and every even position is an arithmetic operator. When we bracket an expression, we explicitly fix an order in which subexpressions are evaluated. Starting with the initial expression, we combine values in pairs, using one of the operators, and keep reducing it till we end up with a single number. Let us focus on what happens in the last step. Consider the expression we saw earlier, 6*3+2*5. The last step could involve any of the three operators. We have no reason to believe, a priori, that one of these three is better than the others, so we have to consider all of them: Case 1: The first occurrence of * is the last operator applied. Then, prior to this, we have optimally evaluated subexpressions 6 and 3+2*5. Case 2: The + is the last operator applied. Then, prior to this, we have optimally evaluated subexpressions 6*3 and 2*5. Case 3: The second occurrence of * is the last operator applied. Then, prior to this, we have optimally evaluated subexpressions 6*3+2 and 5. In each case, the best value we can achieve by applying the last operator is obtained by maximizing the values we obtain from the corresponding subexpressions. (This is a consequence of the choice of operators + and *, which grow monotonically in both their arguments. What would happen if we allowed the operator - in our expression?) The subexpression we have to evaluate is a part of the original expression. The subexpressions generated by subcomputations could overlap. For instance, in Case 2, we have to compute the values of 6*3 and 2*5. The case 6*3 will also occur one step later in Case 3, while 2*5 will occur one step later in Case 1. Hence, it makes sense to memoize these values. How do we identify subexpressions unambiguously, to ensure that we can recognize when we have (or have not) yet computed the value for a given subexpression? Notice that each subexpression is actually a substring of the original expression. Thus, we can identify it by noting its starting and ending points. Thus, in general, we want to compute a function maxval(i,j) where i < j are indices in the range 1,2,...,length(e) for a input expression e. Moreover, assuming that all the integers in the expression are single digit numbers, we have already observed that a subexpression will start and end at an odd index. We can write down an inductive defintion for maxval(i,j). maxval(i,j) = max op(k)(maxval(i,k-1),maxval(k+1,j)) k in [i..j] k even Here, op(k) refers to the function corresponding to the operator at position k --- op(k) is multiplication if the symbol is * and addition if the symbol is +. All the arithmetic operators are at even positions, so we restrict ourselves to even numbers k in the range [i..j]. We pick all such k, inductively compute the subexpressions resulting from splitting the expression at k, and take the maximum. The base case is when the expression has length 1, so that we have a single value. For odd positions i in the range [1..length(e)] maxval(i,i) = value(i) where value(i) is the integer value corresponding to the digit at position i. Using memoization, we can calculate the values we need without recomputing any intermediate values. The final answer we want is maxval(1,length(e)). How would we systematically calculate thse values using dynamic programming. If we place the values maxval(i,j) in a two dimensional array, we have the following picture: | 1 3 5 7 9 ---> j ---+---------------------------- 1 | o x x x x | 3 | . o x x x | 5 | . . o x x | 7 | . . . o x | 9 | . . . . o | | v | i | Notice that we only consider odd values for i and j because each subexpression starts and ends at an odd position. Positions marked . correspond to pairs (i,j) where j < i and hence do not denote valid subexpressions. Positions marked "o" correspond to values of the form maxval(i,i) which are the base case. The remaining positions, marked "x" have to be computed using the inductive definition. According to the inductive definition, to compute maxval(i,j) we need to know maxval(i,k), where i < k <= j and maxval(k,j) where i <= k < j. Suppose we are trying to compute maxval(1,7). This requires us to know maxval(1,5), maxval(1,3) and maxval(1,1) along the row i = 1 and maxval(3,7), maxval(5,7) and maxval(7,7) along the column j = 7. Pictorially, we have the following situation, where y marks the value we are trying to compute and z marks the values that are required to compute it. | 1 3 5 7 9 ---> j ---+---------------------------- 1 | z z z y x | 3 | . o x z x | 5 | . . o z x | 7 | . . . z x | 9 | . . . . x | | v | i | Thus, in the array, to compute a value, we need to know all values to its left and all values directly below it. Notice that each value to the left pairs up with a value below it, according to the inductive definition --- for instance, maxval(1,1) goes with maxval(3,7) corresponding to splitting the expression at position 2, maxval(1,3) goes with maxval (5,7) corresponding to splitting the expression at position 4 and maxvao(5,7) goes with maxval(7,7) corresponding to splitting the expression at position 6. As we observed earlier, the base cases are the values on the diagonal. Having computed the values on the diagonal, we can next tackle the values one off the diagonal, of the form maxval(i,i+2), because, for these positions, we know all the values in the same row and column. We can then proceed to the elements two positions off the diagonal of the from maxval(i,i+4) and so on. In the picture below, we describe the order in which values are calculated. The diagonal elements can be calculated (in any order) in the first pass, so they are labelled 1. The off diagonal elements can then be calculated (in any order) in the second pass, so they are labelled 2. Thus, in 5 passes, we can compute the value maxval(1,9) that is four steps off the diagonal. | 1 3 5 7 9 ---> j ---+---------------------------- 1 | 1 2 3 4 5 | 3 | . 1 2 3 4 | 5 | . . 1 2 3 | 7 | . . . 1 2 | 9 | . . . . 1 | | v | i | ======================================================================