Introduction to Programming, Aug-Dec 2008

Lecture 15, Wednesday 15 Oct 2008

return: promoting Values to Actions
-----------------------------------

Suppose we want to write a function that reads a character and
checks if it is '\n', the character corresponding to a newline.
The function we want has the following type

    isnewline :: IO Bool

because it takes no argument, does some IO (reads a character) and
generates a result of type Bool (was the character equal to '\n'?).

Here is an attempt at writing isnewline.

    isnewline = do c <- getChar
                   if (c == '\n') then ? else ??

At ? and ?? we have to generate a result of type Bool.
Unfortunately, it is not enough at this point to just generate True
or False, as follows.

    isnewline = do c <- getChar
                   if (c == '\n') then True else False

This is because we want the final item in the do to be an action of
type IO Bool, not a value of type Bool.

The function return allows us to promote a simple value to an
action. We could write return True and return False in the two
branches, but the test (c == '\n') is already the Bool we want, so
we can return it directly. This leads us to the following version
of isnewline.

    isnewline = do c <- getChar
                   return (c == '\n')

Composing actions recursively
-----------------------------

Suppose we want to read a line of characters, terminated by '\n',
and return the line as a String. The function we want is

    getLine :: IO String

which clearly requires a sequence of getChar actions.

Here is an inductive definition of getLine:

    getLine = read a character c
              if c is '\n'
                the line is empty, return ""
              else
                read the rest of the line as s
                return (c:s)

The actual definition is pretty much the same:

    getLine = do c <- getChar
                 if (c == '\n')
                   then return ""
                   else do cs <- getLine
                           return (c:cs)

Notice the recursive call to getLine within the do. Notice also that
the else branch contains two actions, so it needs a do of its own.

Exception handling
------------------

One of the complications associated with input/output in any
language is that interaction with the real world often produces
unexpected problems. For instance, the program may be asked to read
from a nonexistent file, or the disk might become full when writing.
A program that interacts over the network might find that the
network connection has temporarily failed.

Such problems are called exceptions and should be clearly
differentiated from computational errors such as dividing by zero or
trying to extract the head of an empty list. There is no sensible
way to continue execution when such a computational error arises.
However, an exception such as "file not found" can be dealt with by
asking the user to supply an alternate file name.

Like many other languages, Haskell provides a mechanism for
exception handling. The idea is to bundle an external "exception
handler" along with the main function that may "raise" (or "throw")
an exception. If no exception arises, the main function executes
normally and produces a result. If an exception arises, the
exception is passed to the handler function, which takes suitable
action.

The main point to be kept in mind is that the exception handler is
expected, in the best case, to restore the computation to its normal
course. Thus, the type of the result produced by the exception
handler should be identical to that of the original function so
that, when composed with other functions, the function+exception
handler combination looks the same as the function alone.

Schematically, we have the following picture in mind:

                 ---------------------------------
   Argument      |     --------                  |     Result
  ------------------->|function|--------------------------->
                 |     --------            |     |
                 |        |                |     |
                 |        | exception      |     |
                 |        |                |     |
                 |        v                |     |
                 |    ---------            |     |
                 |   |exception|           |     |
                 |   | handler |------------     |
                 |    ---------                  |
                 ---------------------------------

The argument is passed to the function. If the function terminates
normally, it produces a result. If it terminates abnormally,
information about the exception is passed to the handler. If the
handler can recover from the exception, it produces a result whose
type is compatible with the original function.

Haskell provides a function "catch" to combine a function with an
exception handler into a single entity.
Before going into the details of catch, let us look at a concrete
example.

In the function getLine that we wrote above, we assume that each
line of text is terminated by a '\n'. It is possible, however, that
the last line of text terminates with an end of file marker, without
an explicit '\n'. Thus, while reading till the end of the current
line, getLine could encounter an exceptional situation where end of
file is reached.

How should we recover from this? If we assume that the program can
recognize an end of file situation, a good strategy is to treat this
as a pseudo "end of line" and return the characters read so far as
the last line of text.

Since the actual reading of input is done by getChar, the end of
file error will arise there. In getLine, we check whether the
character returned by getChar is '\n'. What we can do is to bundle
getChar with an exception handler that generates a '\n' when an end
of file error occurs. Since getLine uses getChar as a black box to
read the next character, it has no way of knowing whether the '\n'
that getChar returns was genuinely supplied in the input or was
inserted by the exception handler in response to an end of file
error.

Thus, our aim is to provide an exception handler for getChar. This
handler takes an exception as its argument and generates a Char as
its result. Haskell has a type IOError for exceptions that arise
from input/output operations. Since getChar is of type IO Char, the
exception handler we seek can be written as

    eofhandler :: IOError -> IO Char

so that its result type is compatible with that of getChar.

What eofhandler has to do is to check if the error it receives is
indeed an end of file error --- any error that getChar encounters
would be passed on to eofhandler, but only an end of file error can
be dealt with in the way we have described. The function

    isEOFError :: IOError -> Bool

can be used to test whether the argument passed to eofhandler is an
end of file error.
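
To see such a predicate at work, here is a minimal sketch that
builds two IOError values by hand and classifies them. It assumes a
current GHC, where these names are exported by the module
System.IO.Error. The library functions mkIOError, eofErrorType,
doesNotExistErrorType and isDoesNotExistError are not otherwise used
in these notes, and describe is a hypothetical helper made up for
the illustration.

```haskell
import System.IO.Error
  ( isEOFError, isDoesNotExistError
  , mkIOError, eofErrorType, doesNotExistErrorType )

-- Hypothetical helper: name the kind of error an IOError represents,
-- using only the library's predicates.
describe :: IOError -> String
describe e
  | isEOFError e          = "end of file"
  | isDoesNotExistError e = "no such file"
  | otherwise             = "some other IO error"

main :: IO ()
main = do
  -- Two errors constructed by hand, purely for demonstration.
  let eof     = mkIOError eofErrorType "demo" Nothing Nothing
      missing = mkIOError doesNotExistErrorType "demo" Nothing
                          (Just "nofile.txt")
  putStrLn (describe eof)
  putStrLn (describe missing)
```

In a real program the IOError would come from a failed action such
as getChar rather than from mkIOError.
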
Note that we do not actually check the value of IOError explicitly
but rely on an abstract predicate to let us know what kind of error
we have got. An alternative would be to make IOError a datatype
that enumerated all possible IO errors. The main reason for not
doing this is that some implementations may support different kinds
of such errors, and datatypes in Haskell are not extensible.
Although the predicate approach is more cumbersome to use, it is
easy to add support for a new IO error by simply including an
additional predicate.

Here then is a definition for eofhandler.

    eofhandler :: IOError -> IO Char
    eofhandler e
      | isEOFError e = return '\n'

Note the "return" to make sure that the result type is IO Char and
not just Char. Note also that eofhandler will itself generate an
error if any other IOError is passed to it, since it can only
respond to one case, where the argument is an end of file error. We
will return to this point later.

First, we return to catch. The function catch simply combines a
function and its handler into a single unit. For instance, if we
write

    getCharEOF = catch getChar eofhandler

we get a function

    getCharEOF :: IO Char

that can be used in place of getChar and that converts end of file
exceptions into '\n'. Thus, we can seamlessly replace getChar by
getCharEOF in getLine as follows.

    getLine = do c <- getCharEOF
                 if (c == '\n')
                   then return ""
                   else do cs <- getLine
                           return (c:cs)

The type of catch is

    catch :: IO a -> (IOError -> IO a) -> IO a

In other words, catch combines a function and an exception handler
to produce a new function with the same type as the original one.

Let us get back to the question of what happens when getCharEOF
receives an IOError other than an end of file error. As things
stand, eofhandler will crash, saying "pattern match error", because
none of its guards applies. Instead, we can explicitly pass the
error back up one level to the function that called getCharEOF
using ioError, as follows.

    eofhandler :: IOError -> IO Char
    eofhandler e
      | isEOFError e = return '\n'
      | otherwise    = ioError e

The error passed on by ioError can be caught by getLine. For
example, we could write

    getLinehandler :: IOError -> IO String
    getLinehandler e = return ("Error: " ++ (show e))

and

    getLinenew = catch getLine getLinehandler

In this way, each exception or error can be caught and passed on to
a handler associated with the function where it occurs. The handler
can either explicitly deal with the error, pass it on or ignore it.
If the handler ignores the error, the program terminates. If the
handler passes it on, the error propagates one level up.
Ultimately, the error will reach the top level. We can assume that
there is an implicit error handler at the top level that prints out
information about the error and terminates the program. This is the
error message we see when a program fails.

Reading and writing files
-------------------------

In principle, reading from and writing to a file is no different
from keyboard input and screen output. We just need to supply an
extra parameter, namely the file name.

In Haskell, a file name (or a fully qualified path) is just a
String, but for the sake of abstraction, Haskell uses the type
definition

    type FilePath = String

We could then conceive of a function getCharFile that reads a
character from a file with type

    getCharFile :: FilePath -> IO Char

In other words, getCharFile takes the file name as its argument and
returns the next character from that file.

In practice, however, naming the file explicitly each time we read
or write is very inefficient. If we supply the file name with
getChar and putChar, each time we read or write a character, we have
to open and close the file. Operating systems incur some overhead
with opening and closing files so it is better not to repeatedly do
this.

Instead, we set up a connection to a file through a device called a
"handle". When we want to use a file, we first open it and
associate a handle with it.
We then perform our input/output with respect to the handle.
Finally, when we are done, we close the file and discard the handle.

To begin with, we need to import the IO module to use the file
functions.

    import IO

Opening a file requires the file name and the mode in which to open
it. The different modes in which a file can be opened are given by
an enumerated type, whose values are self-explanatory.

    data IOMode = ReadMode | WriteMode | AppendMode | ReadWriteMode

Here, then, is the function to open a file.

    openFile :: FilePath -> IOMode -> IO Handle

When we open a file, we get back a Handle. We can then use the
functions hGetChar and hPutChar to read and write characters via the
given Handle (the h at the beginning of the function name indicates
that the function works with respect to handles). Here are the
types of hGetChar and hPutChar.

    hGetChar :: Handle -> IO Char
    hPutChar :: Handle -> Char -> IO ()

There are also functions

    hGetLine :: Handle -> IO String
    hPutStr  :: Handle -> String -> IO ()

that read a line from a file and write a string to a file,
respectively. Note that hPutStr does not automatically insert '\n'
to signify the end of a line --- you have to put this in explicitly
wherever you want a line to end.

Finally,

    hClose :: Handle -> IO ()

closes a file.

There is also a function

    hGetContents :: Handle -> IO String

that reads in the entire text from a file as a String. This is done
lazily, so a function such as

    copyfile :: Handle -> Handle -> IO ()
    copyfile fromhandle tohandle =
      do s <- hGetContents fromhandle
         hPutStr tohandle s

will not necessarily read in the entire file associated with
fromhandle at once. Typically, if its argument s is longer than a
certain limit, hPutStr will write out its argument in blocks,
generating a fixed amount of text at a time. What happens is that
hGetContents reads the contents of fromhandle in blocks
corresponding to the way hPutStr writes out its values.
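
The handle functions above can be tried out with the following
self-contained sketch. It assumes a current GHC, where the file
functions are exported by the module System.IO rather than by a
module named IO. The file names in.txt and out.txt and the helper
copyViaHandles are made up for the illustration.

```haskell
import System.IO

-- Copy everything readable from one handle to another.
-- hGetContents is lazy, so the text is read as hPutStr demands it.
copyfile :: Handle -> Handle -> IO ()
copyfile fromhandle tohandle = do
  s <- hGetContents fromhandle
  hPutStr tohandle s

-- Hypothetical helper: open both files, copy, then close both handles.
copyViaHandles :: FilePath -> FilePath -> IO ()
copyViaHandles from to = do
  fromhandle <- openFile from ReadMode
  tohandle   <- openFile to WriteMode
  copyfile fromhandle tohandle
  hClose fromhandle
  hClose tohandle

main :: IO ()
main = do
  -- Create a small input file; note the explicit '\n' after each line.
  h <- openFile "in.txt" WriteMode
  hPutStr h "first line\n"
  hPutStr h "second line\n"
  hClose h
  copyViaHandles "in.txt" "out.txt"
  -- Read the copy back, one line at a time.
  r  <- openFile "out.txt" ReadMode
  l1 <- hGetLine r
  l2 <- hGetLine r
  hClose r
  putStrLn l1
  putStrLn l2
```

Each file is opened once and closed once, however many characters
pass through its handle.
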

Here is now a more elaborate program that reads two file names from
the keyboard and copies the contents of the first file into the
second one. putStr is a builtin function that prints a String to
the screen.

    main = do fromhandle <- getAndOpenFile "Copy from: " ReadMode
              tohandle   <- getAndOpenFile "Copy to: " WriteMode
              copyfile fromhandle tohandle
              hClose fromhandle
              hClose tohandle
              putStr "Done."

    getAndOpenFile :: String -> IOMode -> IO Handle
    getAndOpenFile prompt mode =
      do putStr prompt
         name <- getLine
         catch (openFile name mode) (openhandler prompt mode name)

    openhandler :: String -> IOMode -> FilePath -> IOError -> IO Handle
    openhandler prompt mode name e =
      do putStr ("Cannot open " ++ name ++ "\n")
         getAndOpenFile prompt mode

getAndOpenFile reads a file name and opens it with the appropriate
mode. If opening the file generates an error, openhandler prints an
error message and repeats the process of asking for a filename to
open. The handler needs prompt, mode and name to do this, so they
are passed to it as extra arguments. Applying openhandler to these
three values leaves a function of type IOError -> IO Handle, which
is what catch expects.

======================================================================