Introduction to Programming, Aug-Dec 2008
Lecture 10, Monday 15 Sep 2008

Conditional polymorphism
------------------------

So far, we have seen two ways of assigning types to functions.
One way is to assign a specific type, in terms of some of the
built-in types.  Examples include:

   power :: Float -> Int -> Float
   times :: Int -> Int -> Int

The other option is to give a "generic" type in terms of type
variables.  In such a generic type, any real type can be
uniformly substituted for a type variable.  Examples include:

   reverse :: [a] -> [a]
   length  :: [a] -> Int

As the type of length shows, we can mix type variables and
concrete types in the definition of a function.

Can we use these types to arrive at the type for quicksort?  If
we use concrete types, we have to define a separate version of
quicksort for each type of list:

   iquicksort :: [Int] -> [Int]
   fquicksort :: [Float] -> [Float]
   ilistquicksort :: [[Int]] -> [[Int]]

This is clearly undesirable because the actual definition of
quicksort in all these cases is the same. 

The other option is to declare it to be of type [a] -> [a].  Is
this reasonable?  Can we sort a list of functions of type
Int->Int->Int such as [plus, times, max]?

The answer is that we can sort a list of values provided we can
compare them with each other.  In other words, we need to assign
the following type to quicksort:

   quicksort :: [a] -> [a], provided we can compare values of
                            type a

This information about the additional properties is represented
in Haskell by identifying a subset of types (called a type class)
with the required property.  In this case, the type class in
question is called Ord and contains all types which support
comparison.  (Aside:  From a formal point of view, it is not
clear that the collection of all types forms a set, but we use
set and subset informally to describe collections of types.)

A subset X of a set Y can also be described in terms of its
characteristic function f_X where

   f_X(x) = True, if x in X
          = False, otherwise

Thus, we can think of Ord as a function that maps types to Bool
and write "Ord t" to denote whether or not t belongs to Ord.  The
type of quicksort now becomes:

   quicksort :: (Ord a) => [a] -> [a]

This is read as "If a is in Ord, then quicksort is of type [a] ->
[a]".  Note the different double arrow that follows (Ord a).
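
To make this type concrete, here is a minimal sketch of one possible
definition of quicksort that type-checks under this signature.  The
choice of the first element as pivot and the helper names smaller and
larger are assumptions made for illustration.  The only operations
applied to values of type a are < and >=, which is what (Ord a)
provides.

```haskell
-- One possible quicksort; the first element is used as the pivot.
quicksort :: (Ord a) => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = quicksort smaller ++ [p] ++ quicksort larger
  where
    smaller = [x | x <- xs, x < p]    -- needs (<) on type a
    larger  = [x | x <- xs, x >= p]   -- needs (>=) on type a
```

The same definition can be applied to values of type [Int], [Float]
and [[Int]], since all of these element types belong to Ord.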

An even more basic type class is Eq, the set of all types that
support checking for equality.  Why is this a nontrivial type
class?  Once again, it is not clear how to define equality for
functions, just as it was not clear how to compare functions.
At the very least, we would expect two functions that are equal
to agree on all inputs.

   If f == g then for all x, (f x) == (g x)

Note that this is a fairly weak definition --- if this were the
only property defining equality of functions in a computational
setting, then all functions that sort lists would be equal, since
they produce the same output on a given input.  This is clearly
not what we intuitively understand --- we do not expect insertion
sort, mergesort and quicksort to all be equal to each other.
However, this is certainly a minimum requirement for two
functions to be deemed equal.

In fact, we need to first check an even more basic property ---
for an input x, do both f and g terminate in a finite amount of
time with a sensible output?  Recall, for example, what happens
when we supply a negative argument to this definition of
factorial.
 
   factorial 0 = 1
   factorial n = n * factorial (n-1)

With this definition, factorial (-1) results in an infinite
sequence of rewriting steps and never terminates.  Thus, even
before checking whether the values produced by f and g agree on
all inputs, we would need to check 

   If f == g then for all x, f terminates on x iff 
                             g terminates on x

Asking whether "f terminates on x" is called the Halting
Problem.  Alan Turing showed that this cannot be computed --- in
other words, we cannot write a function

      halting :: (a->b) -> a -> Bool

that takes as input a function f and an input x to f and
returns True if the computation of f halts on x and False if f
does not halt on x.  Note that halting itself is therefore
expected to terminate on all inputs and report True or False in a
finite amount of time.  This is one of the earliest and most
fundamental results in the theory of computable functions, and
since we cannot check this most basic property, we surely cannot
check if two functions are equal, in general, no matter what
definition of equality we choose for functions.

Thus, functions do not belong to Eq.  A typical example of a
function that depends on Eq is the built-in function elem that
checks if a value belongs to a list.  Here is an inductive
definition of elem.

   elem x [] = False
   elem x (y:ys) 
     | x == y     = True
     | otherwise  = elem x ys

The most general type for elem is

   elem :: (Eq a) => a -> [a] -> Bool
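
As a small illustration, the built-in elem can be used at several
different types, as long as each of them belongs to Eq.  The names
inInts, inString and inLists below are hypothetical and are introduced
only for this sketch.

```haskell
-- Each definition uses elem at a different type in Eq.
inInts :: Bool
inInts = elem 3 [1, 2, 3 :: Int]             -- uses == on Int

inString :: Bool
inString = elem 'z' "haskell"                -- uses == on Char

inLists :: Bool
inLists = elem [1, 2] [[1], [1, 2 :: Int]]   -- uses == on [Int]
```

Here inInts and inLists evaluate to True, while inString evaluates to
False.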

Observe that a type can belong to Ord only if it belongs to Eq.
This is because comparison involves not only the functions < and
> but also <=, >= etc which imply that we can check equality.
Thus, as subsets of types, Ord is a subset of Eq.  Alternatively,
we have that Ord a implies Eq a for any type a.
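
One consequence can be sketched as follows: a function whose type
mentions only (Ord a) may still use ==.  The function insertUnique is
a hypothetical example written for this note, not a library function.

```haskell
-- Insert x into a sorted list, leaving the list unchanged if x is
-- already present.  Only (Ord a) is declared, yet == is available
-- because Ord a implies Eq a.
insertUnique :: (Ord a) => a -> [a] -> [a]
insertUnique x [] = [x]
insertUnique x (y:ys)
  | x == y    = y : ys
  | x < y     = x : y : ys
  | otherwise = y : insertUnique x ys
```

For instance, insertUnique 3 [1,2,4] gives [1,2,3,4], while
insertUnique 2 [1,2,4] gives back [1,2,4].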

Another typical type class in Haskell is Num, the collection of
all types that support "numeric" operations such as +, - and *.
For instance, we had written a function sum that adds up values
in a list.  For specific types of lists, we have:

   sum :: [Int] -> Int
   
etc.

In general, sum will work on any list whose underlying type
supports addition.  This means that we can assign the following
generic type to sum.

   sum :: (Num a) => [a] -> a

The output of sum is the same type as the underlying type of the
list.
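
A minimal sketch of such a definition is given below.  The name
sumList is hypothetical and is used only to avoid a clash with the
built-in sum.

```haskell
-- Adds up a list over any type in Num.
sumList :: (Num a) => [a] -> a
sumList []     = 0                 -- the literal 0 is usable at any Num type
sumList (x:xs) = x + sumList xs    -- needs (+) on type a
```

The same definition gives sumList [1,2,3] = 6 on a list of Int and
sumList [1.5,2.5] = 4.0 on a list of Float.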

When we invoke a function whose type depends on some property of
the underlying type, Haskell will first check that this property
is satisfied.  Otherwise, there will be an error message.  For
instance, if we write

   elem reverse [tail, reverse]

Haskell will complain that it cannot find an instance of Eq for
[a] -> [a], the type of reverse.  In other words, it cannot
establish that this type supports equality checking.


Defining type classes
---------------------

We saw that Haskell organizes types into subsets called type
classes that satisfy additional properties.  How are these
subsets defined?  Haskell uses a very simple idea:  a type
belongs to a class if it has defined on it a specific collection
of functions that the class requires.

For instance, a type belongs to Eq if it has defined on it the
functions == and /=, both of which take two elements of the type
and return Bool.  Classes are defined using the following syntax:

  class Eq a where
    (==) :: a -> a -> Bool
    (/=) :: a -> a -> Bool

This definition specifies that the class Eq holds for type a
provided a has defined on it the functions == and /= of the
appropriate type signature.  In fact, since /= is derivable from
== and vice versa, the definition of Eq even fills in these
connections, so that only one of /= or == actually needs to be
defined.

  class Eq a where
    (==) :: a -> a -> Bool
    (/=) :: a -> a -> Bool

    x /= y = not (x == y)
    x == y = not (x /= y)

An important fact to note is that these only serve as default
definitions so that if the user defines ==, /= is automatically
instantiated and vice versa.  However, the definition does not
prevent the user from defining both == and /= and, in particular,
does not attempt to check that such definitions have the property
that == and /= are complementary, as suggested in the default
definitions.  Thus, the default definitions only provide a way of
deriving one function from another, but cannot enforce any
internal consistency between the actual functions defined by the
users. 
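
The behaviour of default definitions can be sketched with a
hypothetical class MyEq that mirrors Eq.  The names MyEq, eq and neq
are assumptions made so that the example does not clash with the
built-in class.

```haskell
-- A hypothetical copy of Eq, renamed to avoid clashing with the Prelude.
class MyEq a where
  eq, neq :: a -> a -> Bool

  neq x y = not (eq x y)    -- default, used if only eq is given
  eq x y  = not (neq x y)   -- default, used if only neq is given

-- Only eq is defined here; neq comes from the default definition.
instance MyEq Bool where
  eq True  True  = True
  eq False False = True
  eq _     _     = False

-- Both functions are defined, and inconsistently.  This is accepted.
instance MyEq Int where
  eq  x y = x == y
  neq _ _ = True
```

With these instances, neq True False is True via the default, while
for Int both eq 1 1 and neq 1 1 are True, which nothing in the class
definition rules out.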

Likewise, for a type a to be in the class Ord, it must support
the functions ==, /=, <, <=, >= and >.  The first two come from
Eq, so if a belongs to Ord, it must belong to Eq.  Once again,
some of the functions can be derived from others, so we might
have a definition of Ord that looks as follows:

  class (Eq a) => Ord a where
    (<)  :: a -> a -> Bool
    (<=) :: a -> a -> Bool
    (>)  :: a -> a -> Bool
    (>=) :: a -> a -> Bool

    x <= y = (x < y) || (x == y)
    x > y  = not (x <= y)
    x >= y = (x > y) || (x == y)

    ... [ Similar definitions using <=, >, >= as basic operator ]

Notice the conditional dependence in the class definition,
similar to the way we use conditional dependence in type
definitions.  We can read this as "if type a belongs to Eq, then
it belongs to Ord provided the following functions are defined
..."
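
The same pattern can be tried out on a small hypothetical class.  The
names Ranked, before and beforeOrSame below are assumptions made for
this sketch, chosen so that the built-in Ord is not redefined.

```haskell
-- Hypothetical class: Eq a is a precondition for Ranked a.
class (Eq a) => Ranked a where
  before       :: a -> a -> Bool
  beforeOrSame :: a -> a -> Bool

  -- The default may use == because every Ranked type is also in Eq.
  beforeOrSame x y = before x y || x == y

-- Bool is already in Eq, so it can be made an instance of Ranked.
instance Ranked Bool where
  before False True = True
  before _     _    = False
```

Only before is supplied for Bool; beforeOrSame is derived from it
together with the == that Bool inherits from Eq.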

We can add a type to a class by giving an instance definition
that declares the relevant function.  For instance, we can make
all functions belong to Eq as follows:

  instance Eq (a->b) where
    _ == _ = False

This provides a trivial definition of equality for all functions
of type a->b, which says that no two functions are equal.  The
definition makes the function type (a->b) a member, or an
instance, of the class Eq.

It is important to recognize that this definition only defines ==
and /= for functions of the same underlying type.  For instance,
with this instance definition we can compare reverse and tail,
because both these functions are of type [a] -> [a], but not tail
and head because these functions have different types [a] -> [a]
and [a] -> a, respectively.
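
The following sketch puts the instance into a file and tries it out.
The names sameFunction and differentFunction are hypothetical, and the
type annotations on reverse are added only to fix the element type to
Int.

```haskell
-- The instance from the text: no two functions are considered equal.
instance Eq (a -> b) where
  _ == _ = False

-- Both arguments have type [Int] -> [Int], so == type-checks.
sameFunction :: Bool
sameFunction = (reverse :: [Int] -> [Int]) == tail

-- /= is filled in by the default definition in Eq.
differentFunction :: Bool
differentFunction = (reverse :: [Int] -> [Int]) /= tail

-- Rejected by the type checker, since tail and head have different types:
--   tail == head
```

Here sameFunction is False and differentFunction is True.  Under this
trivial instance, a function is not even equal to itself.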

In an instance declaration, all type variables have to be
distinct.  Thus, we cannot, for instance, modify the previous
instance definition as

  instance Eq (a->a) where
    _ == _ = False

to restrict our definition to only those functions whose input
types are the same as their output types.

For lists and tuples, equality and ordering are defined based on
the underlying type.  If the underlying type is in Eq, list
equality is defined by demanding that all elements of the lists
be equal.  The instance definition for lists is given as follows:

  instance (Eq a) => Eq [a] where
    [] == []         = True
    (x:xs) == (y:ys) = (x == y) && (xs == ys)
    _ == _           = False   -- lists of different lengths

First of all, observe the condition that Eq a must hold to have
Eq [a].  Secondly, note that the definition of == for lists is
given in a familiar inductive form, like many other functions we
have seen for lists.
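
Since the list instance of Eq is already built in, it cannot be
declared a second time.  The sketch below therefore uses a
hypothetical class Same, with a single function same, to show the
same conditional pattern in a form that can be loaded and tested.

```haskell
-- Hypothetical class standing in for Eq.
class Same a where
  same :: a -> a -> Bool

instance Same Int where
  same x y = x == y

-- Same [a] is available only when Same a is.
instance (Same a) => Same [a] where
  same []     []     = True
  same (x:xs) (y:ys) = same x y && same xs ys
  same _      _      = False   -- lists of different lengths
```

Because the instance is conditional, it applies repeatedly: Same Int
gives Same [Int], which in turn gives Same [[Int]].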

A nontrivial example of adding instances
----------------------------------------

The examples given above are all builtin to Haskell.  Let us look
at a nontrivial example of adding an instance.  (This example is
due to Vipul Naik.)

In mathematics, we can lift operations from sets to functions
over those sets.  For instance, given functions f and g on
integers, we can define f+g as the function:

  (f+g)(x) = f(x) + g(x)

Similarly, we can define

  (f*g)(x) = f(x) * g(x)

Recall that Num is the type class consisting of types that
support +,-,*.  In fact, types in Num support four additional
functions:

  abs         :: a -> a   (Absolute value)
  signum      :: a -> a   (Sign function, -1 if neg, 0 if zero,
                           +1 if pos, as a value of type a)
  negate      :: a -> a   (Negates the value)
  fromInteger :: Integer -> a 
                 (Constructs a value of type a from an Integer)

The last function requires a little explanation.  This tells how
to construct a value of the given type from an Integer.  We have
not seen the type Integer before --- Integer is like Int except
that it has no upper and lower bound.  So, Integer can represent
integers that are arbitrarily large or small.  Coming back to
fromInteger --- the intention is that an Integer is the most
basic type of Num value and we can use an Integer in any Num
expression, which will be automatically converted to the right
type by fromInteger.  For instance, if a is Float, fromInteger
might be the function that converts an Integer to a Float with
the same magnitude (e.g., 4 becomes 4.0 etc).
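
A small sketch of fromInteger at work is given below.  The names
fourAsInt, fourAsFloat and big are hypothetical.

```haskell
-- The same Integer converted to two different Num types.
fourAsInt :: Int
fourAsInt = fromInteger 4

fourAsFloat :: Float
fourAsFloat = fromInteger 4

-- An Integer well beyond the range of Int.
big :: Integer
big = 2 ^ 100
```

The result type requested in each signature determines which
definition of fromInteger is used.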

Our goal is to say that if the type b belongs to Num, then all
functions of type a->b belong to Num, using the lifting of +,-,*
and other operations from the underlying type b to functions a->b
as described earlier.

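In Haskell 98, Num implies Eq (but not Ord), so we need the first
instance below.  The Ord instance is not required by Num and can
be omitted.  (In GHC 7.4 and later, Num no longer implies Eq, so
there both instances are optional.)

  instance (Num b) => Eq (a->b) where
      _ == _ = False

  instance (Num b) => Ord (a->b) where
      _ < _ = False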

There is a type class Show that describes when values of a type
can be printed out on the screen using a function show :: a->String.
In Haskell 98, Num is a subset of Show, so we need the following
definition.

  instance (Num b) => Show (a->b) where
      show f = "Cannot display functions"

Now, we can get to the juicy part, where we define how to lift
+, -, *, abs, signum, negate and fromInteger from b to (a->b).

  instance (Num b) => Num (a->b) where
      (+) f g x     = (f x) + (g x)
      (-) f g x     = (f x) - (g x)
      (*) f g x     = (f x) * (g x)
      abs f x       = abs (f x)
      signum f x    = signum (f x)
      negate f x    = negate (f x)
      fromInteger n = always n
                        where
                        always n y = (fromInteger n)

These definitions are self-explanatory except, perhaps, the last
one.  The last one defines how to convert an Integer to a
function of type a->b.  The class definition permits us to define
any function we want here, but our choice is to map an integer n
to a constant function that returns the value n on all inputs.
This would normally be achieved by a function such as

   almostalways n
   where
     almostalways n y = n

However, note that the type of almostalways is
Integer->a->Integer, so the type of (almostalways n) is
a->Integer.  Our goal is to associate with an integer n a
function of type (a->b).  Since the type b is known to belong to
Num, it has its own definition of fromInteger::Integer->b.  So,
we fix almostalways by invoking fromInteger for type b on n, which
gives us the definition above in which always::Integer->a->b and
hence (always n)::a->b. 

[Note: Actually, adding the type signature always::Integer->a->b
to the local definition yields an error.  This is because the
definition of always is local to fromInteger, and the type
variables in a local signature are treated as fresh ones,
unrelated to the a and b of the instance declaration.  In this
context, ghci/hugs lose track of the assumption (Num b), which is
required to invoke fromInteger within always.  Thus, we have to
write always::(Num b)=>Integer->a->b to get the type of always
correct.]

Note the similarity of the way +, - and * are lifted.  All are
instances of a general operation lift 

   lift::(b->b->b) -> (a->b)->(a->b)->a->b
   lift oper f g x = oper (f x) (g x)

that lifts an operator over b to functions a->b.  Using lift, we
can define:

  instance (Num b) => Num (a->b) where
      (+) = lift (+)
      (-) = lift (-)
      (*) = lift (*)
      ... rest as before ...

You can check that once these instance declarations have been
added, we can write definitions such as

   plustwo :: Int -> Int
   plustwo m = m+2

   timestwo :: Int -> Int
   timestwo m  = m*2

and then evaluate

   (plustwo + timestwo) 7

to get the value

   (plustwo 7) + (timestwo 7) = 9 + 14 = 23

In fact, the definition also works for functions with two inputs
such as

   plus :: Int -> Int -> Int
   plus m n = m+n

   times :: Int -> Int -> Int
   times m n  = m*n

   (plus + times) 7 8 = (plus 7 8) + (times 7 8) = 15 + 56 = 71
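
For reference, the pieces above can be assembled into a single file
along the following lines.  This is a sketch under the assumption of
GHC 7.4 or later, where Num does not require Eq or Show; under
Haskell 98 (for example in hugs) the Eq and Show instances for a->b
are needed as well.  The names example1, example2 and example3 are
hypothetical.

```haskell
-- Lift a binary operator on b to functions of type a -> b.
lift :: (b -> b -> b) -> (a -> b) -> (a -> b) -> a -> b
lift oper f g x = oper (f x) (g x)

-- Functions into a Num type form a Num type, pointwise.
instance (Num b) => Num (a -> b) where
  (+) = lift (+)
  (-) = lift (-)
  (*) = lift (*)
  abs f x       = abs (f x)
  signum f x    = signum (f x)
  negate f x    = negate (f x)
  fromInteger n = always n
    where
      -- constant function; the inner fromInteger is the one for type b
      always m _ = fromInteger m

plustwo :: Int -> Int
plustwo m = m + 2

timestwo :: Int -> Int
timestwo m = m * 2

plus :: Int -> Int -> Int
plus m n = m + n

times :: Int -> Int -> Int
times m n = m * n

example1 :: Int
example1 = (plustwo + timestwo) 7   -- 9 + 14

example2 :: Int
example2 = (plus + times) 7 8       -- 15 + 56

example3 :: Int
example3 = (plustwo + 1) 7          -- the literal 1 is a constant function here
```

In example3, the literal 1 is converted by fromInteger into the
constant function that returns 1, so the result is 9 + 1 = 10.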

----------------------------------------------------------------------