Introduction to Programming, Aug-Dec 2008
Lecture 4, Wed 13 Aug 2008

Tuple types
-----------

What if we want to group together values of different types?  For
instance, we may want to maintain a collection of lists of type
[Float] in which we store, along with each list, its length.  Thus,
each element in our collection is a pair (fs,n) where fs::[Float] and
n::Int.  In Haskell terminology, the values we store belong to the
tuple type ([Float],Int).

A list of tuples of this form will therefore have type
[([Float],Int)].  Notice that this does not violate our requirement
that a list has a uniform underlying type: each element of the list
is of type ([Float],Int).

For instance, here is a function that takes a list of strings and
returns, for each string in the list, the string and its length.

  stringlengths :: [String] -> [(String,Int)]
  stringlengths []     = []
  stringlengths (x:xs) = (x,length x):(stringlengths xs)

Here is an example in which we compute the distance between points in
two-dimensional space.  Each point is represented as a pair of Floats.

  distance :: (Float,Float) -> (Float,Float) -> Float
  distance (x1,y1) (x2,y2) =
    sqrt ((x2-x1)*(x2-x1) + (y2-y1)*(y2-y1))

Notice that in the definition of the function, we can use pattern
matching to directly decompose the tuple into its constituent parts.

We are not restricted to pairs when defining tuple types: we can
construct n-tuples.  For instance, we could easily generalize the
earlier definition to three-dimensional points as follows:

  distance :: (Float,Float,Float) -> (Float,Float,Float) -> Float
  distance (x1,y1,z1) (x2,y2,z2) =
    sqrt ((x2-x1)*(x2-x1) + (y2-y1)*(y2-y1) + (z2-z1)*(z2-z1))

Sometimes, we might want to supply a new name for a compound type to
ease understanding of a program.  For instance, we might want to use
the word Point to denote (Float,Float) so that we can define the type
of distance to be

  distance :: Point -> Point -> Float

which reads better than the original definition.
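As a quick check of the tuple functions above, here is a sketch that
places them in a single file.  The three-dimensional version is called
distance3 here; that name is an assumption made only for this sketch,
since one file cannot contain two top-level functions that are both
called distance.

```haskell
-- Pair each string in the list with its length.
stringlengths :: [String] -> [(String,Int)]
stringlengths []     = []
stringlengths (x:xs) = (x,length x):(stringlengths xs)

-- Distance between two points in the plane.
distance :: (Float,Float) -> (Float,Float) -> Float
distance (x1,y1) (x2,y2) =
  sqrt ((x2-x1)*(x2-x1) + (y2-y1)*(y2-y1))

-- Three-dimensional variant.  The name distance3 is hypothetical,
-- chosen so that it does not clash with distance above.
distance3 :: (Float,Float,Float) -> (Float,Float,Float) -> Float
distance3 (x1,y1,z1) (x2,y2,z2) =
  sqrt ((x2-x1)*(x2-x1) + (y2-y1)*(y2-y1) + (z2-z1)*(z2-z1))
```

In the interpreter, stringlengths ["ab","cde"] evaluates to
[("ab",2),("cde",3)], distance (0,0) (3,4) to 5.0 and
distance3 (0,0,0) (1,2,2) to 3.0.  The type (Float,Float) is written
out twice in the signature of distance; the remaining step is to give
it the name Point.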
A name such as Point is introduced using a "type" definition, as
follows:

  type Point = (Float,Float)

It is important to recognize that this definition does not define a
new type: it just says that Point is a synonym for (Float,Float).
Thus, if we have functions as follows:

  f :: Float -> Float -> Point
  g :: (Float,Float) -> (Float,Float) -> Float

it is legal to write an expression of the form

  g (f x1 y1) (f x2 y2)

In this expression, the two instances of f produce outputs of type
Point, while g is expecting its inputs to be of type (Float,Float).
However, Point and (Float,Float) are just different names for the same
basic type, so the outputs of the two instances of f are compatible
with the type declared for the inputs to g.

Defining local functions using where
------------------------------------

Let us return to our function distance.

  distance :: (Float,Float) -> (Float,Float) -> Float
  distance (x1,y1) (x2,y2) =
    sqrt ((x2-x1)*(x2-x1) + (y2-y1)*(y2-y1))

The expressions (x2-x1)*(x2-x1) and (y2-y1)*(y2-y1) are instances of a
more general function that squares its input.  So, we could write

  sqr :: Float -> Float
  sqr z = z*z

  distance :: (Float,Float) -> (Float,Float) -> Float
  distance (x1,y1) (x2,y2) = sqrt (sqr (x2-x1) + sqr (y2-y1))

It is immediately clear that this version of distance is more readable
than the previous version, thanks to the use of an auxiliary function
sqr.

However, one undesirable aspect of this definition is that the
auxiliary function sqr, which is required only in distance, is now
globally available.  One problem with this is that we cannot now
define any other function called sqr in the same program, because the
name is in use.

What we need is a way to temporarily define sqr so that it is part of
the definition of distance, but is not visible outside.
This can be achieved using the word where, as follows:

  distance :: (Float,Float) -> (Float,Float) -> Float
  distance (x1,y1) (x2,y2) = sqrt (sqr (x2-x1) + sqr (y2-y1))
    where
    sqr :: Float -> Float
    sqr z = z*z

In this version of distance, the definition of sqr is local to
distance and is not visible outside.

Observe that the word where is indented with respect to the main
definition.  Since the word where has a special status (Haskell knows
that it is not the name of the function being defined) we need not
indent sqr with respect to where.

Another reason to use local declarations is to identify common
subexpressions and ensure that they are computed only once.  Returning
to the original version of distance

  distance :: (Float,Float) -> (Float,Float) -> Float
  distance (x1,y1) (x2,y2) =
    sqrt ((x2-x1)*(x2-x1) + (y2-y1)*(y2-y1))

we observe two occurrences each of the subexpressions (x2-x1) and
(y2-y1) in the definition.  Haskell makes no promise to notice that
the two occurrences are the same, so in general these quantities may
be evaluated afresh each time they are encountered.  To indicate that
these expressions are the same, we could write

  distance :: (Float,Float) -> (Float,Float) -> Float
  distance (x1,y1) (x2,y2) = sqrt (xdiff*xdiff + ydiff*ydiff)
    where
    xdiff :: Float
    xdiff = x2-x1
    ydiff :: Float
    ydiff = y2-y1

In this definition, xdiff and ydiff can be thought of as constant
functions of 0 arguments that always return a fixed value.

Functions on lists
------------------

Map
---

Often, we need to operate on lists by transforming each element in a
fixed manner.  For instance, suppose we want to square each element in
a list of integers.  We can do this inductively, as usual:

  sqrall :: [Int] -> [Int]
  sqrall []     = []
  sqrall (n:ns) = (n*n):(sqrall ns)

The function stringlengths we wrote above is also of the same general
variety.  Here, each string in the input list is transformed into a
pair of values in the output list.

The builtin function map allows us to apply a function f "pointwise"
to each element of a list.
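To see what "pointwise" means on concrete data, here is a small sketch
that uses the builtin map.  The helper functions sqr and strlen and
the value names squares and withLengths are chosen only for this
illustration.

```haskell
-- Helper that squares an integer.
sqr :: Int -> Int
sqr n = n*n

-- Helper that pairs a string with its length.
strlen :: String -> (String,Int)
strlen s = (s,length s)

-- Apply sqr to every element: the result is [1,4,9,16].
squares :: [Int]
squares = map sqr [1,2,3,4]

-- Apply strlen to every element: the result is [("ab",2),("cde",3)].
withLengths :: [(String,Int)]
withLengths = map strlen ["ab","cde"]
```

Each element of the result is obtained by applying the function to the
element in the same position of the input, so the output list always
has the same length as the input list.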
In other words,

  map f [x0,x1,...,xk] = [(f x0),(f x1),...,(f xk)]

Here is an inductive definition of map.

  map f []     = []
  map f (x:xs) = (f x):(map f xs)

What is the type of map?  The function f is in general of type a->b.
The list that map operates on must be compatible with the function f,
so it must be of type [a].  The list generated by map is of type [b].
Thus, we have

  map :: (a -> b) -> [a] -> [b]

So map is a polymorphic function, like the functions length, sum,
reverse etc. that we wrote for lists earlier.  Notice that there are
two type variables, a and b, in the type of map.  These can be
independently instantiated to different types, and these
instantiations apply uniformly to all occurrences of a and b.

An important point to notice is that we can pass a function to another
function, as in the definition and use of "map", without any fuss in
Haskell.  There is no restriction in Haskell about what we can pass as
an argument to a function: if it can be assigned a type, it can be
passed.

We can now write the functions sqrall and stringlengths in terms of
map:

  sqrall l = map sqr l
    where
    sqr :: Int -> Int
    sqr n = n*n

  stringlengths l = map strlen l
    where
    strlen :: String -> (String,Int)
    strlen s = (s,length s)

In sqrall, the input and output types of the function sqr passed to
map are the same, Int, whereas in stringlengths, the input type of
strlen is String and the output type is (String,Int).

Filter
------

Another useful operation on lists is to select elements that match a
certain property.  For instance, we can select the even numbers in a
list of integers as follows.

  evenonly :: [Int] -> [Int]
  evenonly [] = []
  evenonly (n:ns)
    | mod n 2 == 0 = n:(evenonly ns)
    | otherwise    = evenonly ns

We have used the builtin function mod to check that a number is even:
mod m n returns the remainder that results when m is divided by n.  A
related function is div: div m n returns the integer part of m
divided by n.
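The following sketch exercises mod, div and evenonly on a few concrete
values.  The names remainderExample, quotientExample and evens are
assumptions made only for this illustration.

```haskell
-- Keep only the even numbers, as in the definition above.
evenonly :: [Int] -> [Int]
evenonly [] = []
evenonly (n:ns)
  | mod n 2 == 0 = n:(evenonly ns)
  | otherwise    = evenonly ns

-- 17 divided by 5 is 3 with remainder 2.
remainderExample :: Int
remainderExample = mod 17 5   -- 2

quotientExample :: Int
quotientExample = div 17 5    -- 3

-- The odd numbers 1 and 3 are dropped: the result is [2,4,6].
evens :: [Int]
evens = evenonly [1,2,3,4,6]
```

The two functions fit together: for nonzero n, m is always equal to
n * (div m n) + (mod m n).  For negative m and positive n, div rounds
down rather than towards zero, so div (-7) 2 is -4 and mod (-7) 2 is
1; this is why the test mod n 2 == 0 also works for negative numbers.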
We can think of evenonly as the result of applying the check

  iseven :: Int -> Bool
  iseven n = (mod n 2 == 0)

to each element of the input list and retaining those values that pass
this check.

This is a general principle that we can use to "filter" out values
from a list.  Let l be of type [a] and let p be a function from a to
Bool.  Then, we have the function

  filter :: (a -> Bool) -> [a] -> [a]
  filter p [] = []
  filter p (x:xs)
    | (p x)     = x:(filter p xs)
    | otherwise = filter p xs

Notice that the output of filter is a sublist of the original list, so
the output list has the same type as the original list.
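As a sketch of how the pieces fit together, evenonly can be rewritten
in terms of filter and iseven.  Since filter is also a builtin
function, the file below hides the builtin version so that the
definition given above can be used under the same name; this hiding
line is an addition made for the sketch, not part of the definition.

```haskell
import Prelude hiding (filter)  -- avoid a clash with the builtin filter

-- Retain the elements of the list for which p returns True.
filter :: (a -> Bool) -> [a] -> [a]
filter p [] = []
filter p (x:xs)
  | (p x)     = x:(filter p xs)
  | otherwise = filter p xs

-- The check used to select even numbers.
iseven :: Int -> Bool
iseven n = (mod n 2 == 0)

-- evenonly expressed through filter instead of explicit induction.
evenonly :: [Int] -> [Int]
evenonly l = filter iseven l
```

With these definitions, evenonly [1,2,3,4,6] evaluates to [2,4,6], the
same result as the inductive version.  Only the check passed to filter
is specific to even numbers; passing a different function of type
a -> Bool selects a different sublist.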