Introduction to Programming, Aug-Dec 2008
Lecture 20, Wednesday 05 Nov 2008

Memoization and Dynamic Programming
-----------------------------------

Last time, we saw how we could make the computation of an
inductive definition with overlapping subcomputations more
efficient by "memoizing" intermediate results.

Today we look at some more examples and also study a related idea
from the theory of algorithms called dynamic programming.

Pinball game
------------

A board has obstacles arranged in a triangle, as follows.


                           ()
                          /  \
                        ()    ()
                       /  \  /  \
                     ()    ()    ()
                    /  \  /  \  /  \
                  ()    ()    ()    ()
                 /  \  /  \  /  \  /  \
               ()    ()    ()    ()    ()


Each obstacle has a number of points associated with it.  For
instance, we could have the following assignment of points to the
obstacles in the board above.


                           15
                          /  \
                        28    33
                       /  \  /  \
                     18    22    16
                    /  \  /  \  /  \
                  35    15    11    17
                 /  \  /  \  /  \  /  \
               29    13    14    26    12


When we drop a ball on the topmost obstacle, it bounces off and
goes down, either left or right.  Depending on which way it goes,
it bounces off an obstacle at the next level, and again gets
deflected left or right.  This continues till it reaches the
bottom row of obstacles.  When the ball hits the final obstacle
on its run, we add up the points of all the obstacles that it has
collided against to obtain our score for the game.

With sufficient skill, it is possible to control the ball as it
moves down the board and select the path along which it travels.
The aim is to choose a path that maximizes the score.

For the moment, let us concentrate on calculating the maximum
score that one can obtain on a given board (note that there may
be more than one way to achieve this maximum score).  Later, we
will come back to the question of determining a path that
actually yields this maximum score.

First, we need a way to identify each obstacle unambiguously.  We
can assign a pair of coordinates (i,j) to each obstacle where i
in {1,2,...,n} is the row in which the obstacle lies and j in
{1,2,...,i} is the position of the obstacle within that row, from
left to right.  Let points(i,j) be the points assigned to the
obstacle at location (i,j).  If we rearrange the triangle a bit,
we have the following arrangement.

             |   1     2     3     4     5   ---> j
          ---+----------------------------
          1  |  15
             |
          2  |  28    33
             |
          3  |  18    22    16
             |
          4  |  35    15    11    17
             |
          5  |  29    13    14    26    12
          |  |
          v  |
          i  |

We define score(i,j) to be the maximum score that we can achieve
if the ball starts its path at the obstacle labelled (i,j).  The
quantity we want to calculate is given by score(1,1),
corresponding to the obstacle at the top of the triangle.

There is a natural inductive definition of score(i,j).  After
bouncing off (i,j), the ball will next go to either (i+1,j), if it
goes left, or (i+1,j+1), if it goes right.  If we already know
the scores at these two positions, we can compute score(i,j) as
follows:

  score(i,j) = points(i,j) + max [ score(i+1,j), score(i+1,j+1) ]

The base case is when i = n, corresponding to the last row.  For
all j in {1,2,...,n} we have

  score(n,j) = points(n,j)
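
As a minimal sketch, the inductive definition can be transcribed
directly into Python.  The names POINTS and score, and the choice
to store the board as a list of rows, are assumptions made for
this illustration; the notes only fix the mathematical definition.

```python
# Sample board from the notes; POINTS[i-1][j-1] holds points(i,j).
POINTS = [
    [15],
    [28, 33],
    [18, 22, 16],
    [35, 15, 11, 17],
    [29, 13, 14, 26, 12],
]

def score(points, i, j):
    """Maximum score for a ball starting at obstacle (i,j), 1-indexed."""
    n = len(points)
    here = points[i - 1][j - 1]
    if i == n:                       # base case: the last row
        return here
    # inductive case: go left to (i+1,j) or right to (i+1,j+1)
    return here + max(score(points, i + 1, j),
                      score(points, i + 1, j + 1))

print(score(POINTS, 1, 1))           # 125
```

On the sample board, the value 125 is achieved by the leftmost
path 15, 28, 18, 35, 29.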

Computing score(i,j) naively would result in recomputing
intermediate values.  For instance, both score(3,2) and
score(3,3) would require the value of score(4,3).  Through
memoization, we can avoid wasteful recomputation, just as we saw
in the Fibonacci example.
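
One possible way to memoize score in Python is sketched below.
The dictionary memo plays the role of the memo table; the names
max_score and memo, and the decision to also return the number of
table entries, are assumptions made for this illustration.

```python
# Sample board from the notes; POINTS[i-1][j-1] holds points(i,j).
POINTS = [
    [15],
    [28, 33],
    [18, 22, 16],
    [35, 15, 11, 17],
    [29, 13, 14, 26, 12],
]

def max_score(points):
    """Return (score(1,1), number of memo table entries filled)."""
    n = len(points)
    memo = {}                        # memo table: (i,j) -> score(i,j)

    def score(i, j):
        if (i, j) not in memo:       # compute each value only once
            here = points[i - 1][j - 1]
            if i == n:
                memo[(i, j)] = here
            else:
                memo[(i, j)] = here + max(score(i + 1, j),
                                          score(i + 1, j + 1))
        return memo[(i, j)]

    return score(1, 1), len(memo)

print(max_score(POINTS))             # (125, 15)
```

The table ends up with 15 entries, one per obstacle, so each
score(i,j) is computed exactly once.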

Dynamic programming
-------------------

Another way to approach the problem of overlapping
subcomputations is to examine the order in which values are
required.  In each inductive definition, we have a base case for
which the value is immediately available.  The inductive
definition specifies a relationship between a value and its
"neighbours".  Once we have calculated the base cases, we can
compute the positions that lie in the neighbourhood of these base
cases.  In this way, we can systematically "grow" the set of
known values, ensuring at each stage that we already have at hand
the values that we need to compute the current value.

For example, in the Fibonacci case, the base cases are "fib 0"
and "fib 1".  Having calculated these, we can immediately
calculate "fib 2", since the values it depends on are both known.
With "fib 2" in hand, we can compute "fib 3".  In this way, we
proceed to "fib 4", "fib 5", ...  We can regard this as filling
up the memo table systematically for n = 0,1,2,...  Notice that
this corresponds to the natural way in which we enumerate the
Fibonacci sequence as 1 1 2 3 5 ...
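
Filling the table in this order can be sketched in Python as
follows.  The function name fib_table is hypothetical, and we
assume fib 0 = fib 1 = 1, which matches the enumeration 1 1 2 3 5
given above.

```python
def fib_table(n):
    """Return [fib 0, fib 1, ..., fib n], filled in increasing order."""
    table = [1, 1]                   # base cases, assuming fib 0 = fib 1 = 1
    for k in range(2, n + 1):
        # both values that fib k depends on are already in the table
        table.append(table[k - 1] + table[k - 2])
    return table[:n + 1]

print(fib_table(4))                  # [1, 1, 2, 3, 5]
```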

Consider our second problem, the pinball game.  The base cases in
this computation are the values score(n,j), for j in {1,2,...,n}.
With these in hand, we can immediately compute the values
score(n-1,k), k in {1,2,...,n-1} since both neighbours of the
obstacle at (n-1,k) lie in the range of values that are already
known.  Having calculated row n-1, we can proceed back to row
n-2, n-3, ... till we reach the top row.
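
The row-by-row order described above might be written in Python
as in the sketch below.  The name max_score_bottom_up is
hypothetical, and keeping only the most recently computed row
(rather than the whole table) is a design choice of this sketch,
not something the notes prescribe.

```python
# Sample board from the notes; POINTS[i-1][j-1] holds points(i,j).
POINTS = [
    [15],
    [28, 33],
    [18, 22, 16],
    [35, 15, 11, 17],
    [29, 13, 14, 26, 12],
]

def max_score_bottom_up(points):
    """Compute score(1,1) by filling in rows n, n-1, ..., 1."""
    n = len(points)
    row = list(points[n - 1])        # row n: score(n,j) = points(n,j)
    for i in range(n - 1, 0, -1):    # rows n-1, n-2, ..., 1
        # row currently holds the scores of row i+1 (0-indexed by j)
        row = [points[i - 1][j] + max(row[j], row[j + 1])
               for j in range(i)]
    return row[0]

print(max_score_bottom_up(POINTS))   # 125
```

No recursive calls are pending at any point: each row is computed
from the row below it in a single loop.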

This technique is called dynamic programming.  In dynamic
programming, we systematically compute the entries of the memo
table "bottom up" by analyzing the order in which values would be
computed had we used the normal "top down" inductive definition.
In a sense, both approaches are equivalent, because we actually
compute each value only once.  However, in practice, dynamic
programming is more efficient because we avoid the overhead
associated with keeping many function calls pending while we move
down to the base case.

Expression evaluation
---------------------

Consider infix arithmetic expressions over integers using the
operators + and * without parentheses and without any assumptions
about the order in which to evaluate subexpressions.  Thus, an
expression such as 6*3+2*5 may be evaluated as (6*3)+(2*5) = 28
or 6*((3+2)*5) = 150 or ((6*3)+2)*5 = 100, depending on the order
of evaluation.  Our aim is to find a way of bracketing the
expression so as to achieve the maximum overall value.  As with
the pinball game, for the moment we disregard the problem of
finding the actual bracketing and concentrate on calculating the
best possible value we can achieve with this expression.

For simplicity, let us assume that the numbers in the expression
are all single digits.  If we number the positions in the input
expression from 1, every odd position is an integer and every
even position is an arithmetic operator.

When we bracket an expression, we explicitly fix an order in
which subexpressions are evaluated.  Starting with the initial
expression, we combine values in pairs, using one of the
operators, and keep reducing it till we end up with a single
number.  Let us focus on what happens in the last step.

Consider the expression we saw earlier, 6*3+2*5.  The last step
could involve any of the three operators.  We have no reason to
believe, a priori, that one of these three is better than the
others, so we have to consider all of them:

Case 1:  The first occurrence of * is the last operator applied.
         Then, prior to this, we have optimally evaluated
         subexpressions 6 and 3+2*5.

Case 2:  The + is the last operator applied.
         Then, prior to this, we have optimally evaluated
         subexpressions 6*3 and 2*5.

Case 3:  The second occurrence of * is the last operator applied.
         Then, prior to this, we have optimally evaluated
         subexpressions 6*3+2 and 5.

In each case, the best value we can achieve by applying the last
operator is obtained by maximizing the values we obtain from the
corresponding subexpressions.  (This is a consequence of the
choice of operators + and *, which grow monotonically in both
their arguments.  What would happen if we allowed the operator -
in our expression?)

The subexpression we have to evaluate is a part of the original
expression.  The subexpressions generated by subcomputations
could overlap.  For instance, in Case 2, we have to compute the
values of 6*3 and 2*5.  The case 6*3 will also occur one step
later in Case 3, while 2*5 will occur one step later in Case 1.
Hence, it makes sense to memoize these values.

How do we identify subexpressions unambiguously, to ensure that
we can recognize when we have (or have not) yet computed the
value for a given subexpression?  Notice that each subexpression
is actually a substring of the original expression.  Thus, we can
identify it by noting its starting and ending points.

Thus, in general, we want to compute a function maxval(i,j) where
i <= j are indices in the range 1,2,...,length(e) for an input
expression e.  Moreover, assuming that all the integers in the
expression are single digit numbers, we have already observed
that a subexpression will start and end at an odd index.  For
i < j, we can write down an inductive definition for maxval(i,j).

  maxval(i,j) =    max        op(k)(maxval(i,k-1),maxval(k+1,j))
                k in [i..j]
                  k even

Here, op(k) refers to the function corresponding to the operator
at position k --- op(k) is multiplication if the symbol is * and
addition if the symbol is +.  All the arithmetic operators are at
even positions, so we restrict ourselves to even numbers k in the
range [i..j].  We pick all such k, inductively compute the
subexpressions resulting from splitting the expression at k, and
take the maximum.

The base case is when the expression has length 1, so that we
have a single value.  For odd positions i in the range
[1..length(e)]

  maxval(i,i) = value(i)

where value(i) is the integer value corresponding to the digit at
position i.

Using memoization, we can calculate the values we need without
recomputing any intermediate values.  The final answer we want is
maxval(1,length(e)).
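
A memoized version of maxval can be sketched in Python as
follows.  We assume the expression is given as a string of single
digits separated by + and *, with no spaces; the names max_value,
value and op are chosen for this illustration to mirror the
notation above.

```python
def max_value(e):
    """Best value over all bracketings of e, e.g. e = "6*3+2*5"."""
    memo = {}                        # memo table: (i,j) -> maxval(i,j)

    def value(i):                    # digit at (1-indexed) position i
        return int(e[i - 1])

    def op(k, a, b):                 # apply the operator at position k
        return a * b if e[k - 1] == '*' else a + b

    def maxval(i, j):
        if (i, j) not in memo:
            if i == j:               # base case: a single digit
                memo[(i, j)] = value(i)
            else:                    # try every even k between i and j
                memo[(i, j)] = max(
                    op(k, maxval(i, k - 1), maxval(k + 1, j))
                    for k in range(i + 1, j, 2))
        return memo[(i, j)]

    return maxval(1, len(e))

print(max_value("6*3+2*5"))          # 150
```

For 6*3+2*5 the result is 150, which agrees with the bracketing
6*((3+2)*5) seen earlier.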

How would we systematically calculate these values using dynamic
programming?  If we place the values maxval(i,j) in a two
dimensional array, we have the following picture:


             |   1     3     5     7     9   ---> j
          ---+----------------------------
          1  |   o     x     x     x     x
             |
          3  |   .     o     x     x     x
             |
          5  |   .     .     o     x     x
             |
          7  |   .     .     .     o     x
             |
          9  |   .     .     .     .     o
          |  |
          v  |
          i  |

Notice that we only consider odd values for i and j because each
subexpression starts and ends at an odd position.  Positions
marked "." correspond to pairs (i,j) where j < i and hence do not
denote valid subexpressions.  Positions marked "o" correspond to
values of the form maxval(i,i), which are the base cases.  The
remaining positions, marked "x", have to be computed using the
inductive definition.

According to the inductive definition, to compute maxval(i,j) we
need to know maxval(i,k), where i <= k < j, and maxval(k,j), where
i < k <= j (with k odd in both cases).

Suppose we are trying to compute maxval(1,7).  This requires us
to know maxval(1,5), maxval(1,3) and maxval(1,1) along the row i
= 1 and maxval(3,7), maxval(5,7) and maxval(7,7) along the column
j = 7.

Pictorially, we have the following situation, where y marks the
value we are trying to compute and z marks the values that are
required to compute it.

             |   1     3     5     7     9   ---> j
          ---+----------------------------
          1  |   z     z     z     y     x
             |
          3  |   .     o     x     z     x
             |
          5  |   .     .     o     z     x
             |
          7  |   .     .     .     z     x
             |
          9  |   .     .     .     .     o
          |  |
          v  |
          i  |


Thus, in the array, to compute a value, we need to know all
values to its left and all values directly below it.  Notice that
each value to the left pairs up with a value below it, according
to the inductive definition --- for instance, maxval(1,1) goes
with maxval(3,7) corresponding to splitting the expression at
position 2, maxval(1,3) goes with maxval(5,7) corresponding to
splitting the expression at position 4 and maxval(1,5) goes with
maxval(7,7) corresponding to splitting the expression at position
6.
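
The pairing of row entries with column entries can be listed
mechanically.  The helper split_pairs below is a hypothetical
name used only for this sketch; for each even split position k it
returns the entry to the left and the entry below that are
combined.

```python
def split_pairs(i, j):
    """List (k, left entry, below entry) for each even k between i and j."""
    return [(k, (i, k - 1), (k + 1, j)) for k in range(i + 1, j, 2)]

for k, left, below in split_pairs(1, 7):
    print(k, left, below)
# 2 (1, 1) (3, 7)
# 4 (1, 3) (5, 7)
# 6 (1, 5) (7, 7)
```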

As we observed earlier, the base cases are the values on the
diagonal.  Having computed the values on the diagonal, we can
next tackle the values one off the diagonal, of the form
maxval(i,i+2), because, for these positions, we know all the
values in the same row and column.  We can then proceed to the
elements two positions off the diagonal of the form maxval(i,i+4)
and so on.

In the picture below, we describe the order in which values are
calculated.  The diagonal elements can be calculated (in any
order) in the first pass, so they are labelled 1.  The elements
one position off the diagonal can then be calculated (in any
order) in the second pass, so they are labelled 2, and so on.
Thus, in 5 passes, we can compute the value maxval(1,9) that is
four steps off the diagonal.


             |  1     3     5     7     9   ---> j
          ---+----------------------------
          1  |  1     2     3     4     5
             |
          3  |  .     1     2     3     4
             |
          5  |  .     .     1     2     3
             |
          7  |  .     .     .     1     2
             |
          9  |  .     .     .     .     1
          |  |
          v  |
          i  |
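
The diagonal-by-diagonal order in the picture can be sketched in
Python as follows.  As before, we assume the expression is a
string of single digits separated by + and * with no spaces; the
names max_value_dp and width are chosen for this illustration.

```python
def max_value_dp(e):
    """Fill in maxval(i,j) one diagonal at a time; return maxval(1,n)."""
    n = len(e)
    maxval = {}                      # table: (i,j) -> maxval(i,j)

    for i in range(1, n + 1, 2):     # pass 1: the diagonal (base cases)
        maxval[(i, i)] = int(e[i - 1])

    # later passes: j - i = 2, then 4, ..., up to n - 1
    for width in range(2, n, 2):
        for i in range(1, n - width + 1, 2):
            j = i + width
            candidates = []
            for k in range(i + 1, j, 2):      # operator positions
                a = maxval[(i, k - 1)]        # entry to the left
                b = maxval[(k + 1, j)]        # entry below
                candidates.append(a * b if e[k - 1] == '*' else a + b)
            maxval[(i, j)] = max(candidates)

    return maxval[(1, n)]

print(max_value_dp("6*3+2*5"))       # 150
```

Every entry read inside the inner loop lies on an earlier
diagonal, so it is already in the table by the time it is needed.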


======================================================================