Introduction to Programming, Aug-Dec 2008

Lecture 16, Monday 20 Oct 2008

Lazy IO
-------

In the last lecture, we discussed an example of "lazy IO" in
Haskell.

  import System.IO

  copyfile :: Handle -> Handle -> IO ()
  copyfile fromhandle tohandle = do
    s <- hGetContents fromhandle
    hPutStr tohandle s

We said that hGetContents will not necessarily read in the entire
file associated with fromhandle. Typically, if its argument s is
longer than a certain limit, hPutStr will write out its argument in
blocks, generating a fixed amount of text at a time. What happens is
that hGetContents reads the contents of fromhandle on demand, in
blocks corresponding to the way hPutStr writes out its values.

Lazy IO has to be handled carefully. Suppose we expand copyfile to
do the opening and closing of handles as well as the actual reading
of the file. Then, the function we have written above expands as:

  newcopyfile :: FilePath -> FilePath -> IO ()
  newcopyfile fromfile tofile = do
    fromhandle <- openFile fromfile ReadMode
    tohandle   <- openFile tofile WriteMode
    s <- hGetContents fromhandle
    hPutStr tohandle s
    hClose fromhandle
    hClose tohandle

This works in the same way as the previous copyfile.

Logically speaking, since hGetContents is supposed to read the
entire file into s, it would seem that we can close fromhandle
before we write s to tohandle. This gives us the following variant
of copyfile.

  badcopyfile :: FilePath -> FilePath -> IO ()
  badcopyfile fromfile tofile = do
    fromhandle <- openFile fromfile ReadMode
    tohandle   <- openFile tofile WriteMode
    s <- hGetContents fromhandle
    hClose fromhandle        -- closed before anything has been read
    hPutStr tohandle s
    hClose tohandle

How does this version behave? Well, hGetContents is lazy and there
is no demand made on s before the hClose, so nothing is read before
fromhandle is closed. So, this version copies nothing: at best,
tofile is created but left empty.

The moral of the story is that lazy IO should be handled with care.
The results you get may vary unexpectedly with slight perturbations
of your code, as we saw here.

Actions are like values
-----------------------

Actions can be thought of as special types of functions.
Thus, just as we can use functions in place of simple types (for
instance, we can construct lists of functions, pass a function as
an argument, or obtain a function as a result), we can use actions
like simple types.

Here is a list of actions of type [IO ()]

  [ putChar 'c', putChar 'z', echo ]

We can write, for instance, a function that takes a list of actions
and executes them as a sequence:

  dolist :: [IO ()] -> IO ()
  dolist []     = return ()
  dolist (c:cs) = do
    c
    dolist cs

Haskell has a builtin function sequence which, for IO actions, has
the following type:

  sequence :: [IO a] -> IO [a]

In other words, sequence executes a list of actions in order and
combines their results into a single list. Here is how sequence is
defined:

  sequence []     = return []
  sequence (c:cs) = do
    r  <- c
    rs <- sequence cs
    return (r:rs)

Notice the similarity in structure between sequence and getLine.

  getLine = do
    c <- getChar
    if c == '\n'          -- this test has no counterpart in sequence
      then return ""
      else do
        cs <- getLine
        return (c:cs)

This is not surprising, since getLine combines the results of a
sequence of getChar's into a list of Char, or String. The only
difference is that the list of actions in getLine is terminated by
reading '\n', so there is a condition to be checked before getLine
makes a recursive call to itself.

User defined "control" structures
---------------------------------

getLine is an example of a "loop" in which we call an action
recursively. We can easily control this behaviour a bit more.
Suppose we want to write a version of getLine that reads n lines,
for an input integer n, and returns a list of strings, one per line
read.

  getNlines :: Int -> IO [String]
  getNlines 0 = return []
  getNlines n = do
    thisline  <- getLine
    morelines <- getNlines (n-1)
    return (thisline:morelines)

In general, if we want to repeat an action n times (for n >= 1), we
can write

  doNtimes 1 act = act
  doNtimes n act = do
    act
    doNtimes (n-1) act

Using let
---------

So far, for local values in a function, we have used where.
For example:

  mergesort l = merge (mergesort left) (mergesort right)
    where
      n     = (length l) `div` 2
      left  = take n l
      right = drop n l

Dually, we can put the local definitions before the expression that
uses them using let, as follows:

  mergesort l =
    let n     = (length l) `div` 2
        left  = take n l
        right = drop n l
    in  merge (mergesort left) (mergesort right)

Inside a do block, we can use a variation of let, without "in", to
reuse the return value of a function.

  do line <- getLine
     let revline = reverse line
     putStr revline

In other words, <- allows us to "remember" the return value of an
action and "let" allows us to "remember" the return value of a
function.

Using the Haskell compiler
--------------------------

One of the standard Haskell compilers is the Glasgow Haskell
Compiler, which can be invoked using the command ghc.

When you use an interpreter, you interact directly and can choose
the function you want to evaluate. A compiled program runs
autonomously, so there has to be an unambiguous way of specifying
where the computation should start. As in many other languages,
there is a designated entry point: ghc expects computation to start
with an action called main of type IO (), located in a module Main.

One useful way to organize Haskell code is to put the actual code in
a separate module and use main in module Main to just call the
relevant function and print out its result using the builtin
function print.

  print :: Show a => a -> IO ()

For instance, suppose all our code is in a module called MyModule
and the function to be invoked in MyModule is mymainfunction. Then,
the module Main would look like the following:

  module Main where

  import MyModule

  main = print mymainfunction

How do we actually compile the file? The command is

  ghc --make Main.hs -o outputfilename

In this command, ghc is the name of the compiler while Main.hs is
the file containing the module to compile. The flag "--make" tells
ghc to look up and compile all modules referred to and required by
Main.hs. The flag "-o" is used to specify the name of the final
executable.
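If this is left out, older versions of ghc produce an executable
called a.out, while recent versions used with "--make" name the
executable after the source file (Main for Main.hs).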
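
As a small illustration of the layout described above, here is a
single-file sketch of Main.hs. It is only a sketch under stated
assumptions: mymainfunction is defined locally as a hypothetical
stand-in (in the arrangement described in these notes it would be
imported from MyModule), and its body, a sum of the numbers 1 to
10, is invented purely for the example.

```haskell
-- Main.hs: a single-file sketch of a compiled Haskell program.
-- In the two-module layout, mymainfunction would live in MyModule
-- and this file would contain "import MyModule" instead.
module Main where

-- Hypothetical stand-in for the function exported by MyModule.
mymainfunction :: Int
mymainfunction = sum [1 .. 10]

-- The compiled program starts here: print shows the value and
-- writes it, followed by a newline, to standard output.
main :: IO ()
main = print mymainfunction
```

Compiling this with "ghc --make Main.hs -o outputfilename" and
running ./outputfilename should print 55.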