
Type inference with shallow types

Consider the basic lambda calculus, enhanced with constants. We have some built-in types $i,j,k,\ldots$ and a set of constants $C_i$ for each built-in type $i$. For instance, if $i$ is the type Char, $C_i$ is the set of character constants. The syntax of lambda terms is then

$\begin{displaymath} \Lambda = c \mid x \mid \lambda x. M \mid M N \end{displaymath}$

where $c \in C_i$ for some built-in type $i$, $x$ is a variable and $M,N \in \Lambda$.

To infer types for these terms, we can set up equations involving type variables inductively as follows. Let $M \in \Lambda$ be a lambda term. Then:

If $M$ is a constant $c \in C_i$, the type of $M$ is $i$.
If $M$ is a variable $x$, assign a fresh type variable $\alpha$ to $x$.
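If $M$ is of the form $\lambda x. M'$, assign $M$ the type $\alpha \to \beta$ for fresh type variables $\alpha$ and $\beta$. Inductively, let $\gamma$ be the type of $x$ in $M'$. Then, the input type $\alpha$ of $\lambda x. M'$ should match the type of $x$, so add the equation $\alpha = \gamma$ to the set of type equations. Likewise, the output type $\beta$ should match the type $\delta$ inductively assigned to the body $M'$, so add the equation $\beta = \delta$.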
If $M$ is of the form $M' N'$, we must inductively have assigned $M'$ a type of the form $\alpha \to \beta$ and $N'$ a type of the form $\gamma$. Assign $M$ the type $\beta$ and add the equation $\alpha = \gamma$ to the set of type equations to enforce that this application is well-typed.
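The rules above can be sketched as a small equation generator in Haskell. This is only an illustrative sketch, not the formulation used in these notes: the names Type, Term, Equation, Env and infer are introduced here, type variables are numbered integers, and two simplifications are assumed. First, an environment records the type variable of each bound variable, so the equation $\alpha = \gamma$ for abstractions is satisfied by construction rather than emitted. Second, for an application the sketch emits the single equation "type of $M'$ = type of $N'$ $\to \beta$" for a fresh $\beta$, which covers the case where the type of $M'$ is still a bare type variable.

```haskell
import qualified Data.Map as Map

-- Types: built-in types, numbered type variables, function types.
data Type = TBase String | TVar Int | TFun Type Type
  deriving (Eq, Show)

-- Lambda terms with constants; a constant carries the name of its built-in type.
data Term = Con String String   -- constant and the name of its built-in type
          | Var String
          | Lam String Term
          | App Term Term
  deriving Show

type Equation = (Type, Type)
type Env = Map.Map String Type

-- infer env n t returns the type assigned to t, the next unused
-- type variable index, and the equations collected along the way.
infer :: Env -> Int -> Term -> (Type, Int, [Equation])
infer _   n (Con _ ty) = (TBase ty, n, [])
infer env n (Var x) =
  case Map.lookup x env of
    Just ty -> (ty, n, [])               -- bound variable: reuse its type
    Nothing -> (TVar n, n + 1, [])       -- free variable: fresh type variable
infer env n (Lam x body) =
  let a = TVar n                         -- fresh type for the bound variable
      (tb, n', eqs) = infer (Map.insert x a env) (n + 1) body
  in (TFun a tb, n', eqs)
infer env n (App f arg) =
  let (tf, n1, eqs1) = infer env n f
      (ta, n2, eqs2) = infer env n1 arg
      b = TVar n2                        -- fresh type for the result
  in (b, n2 + 1, (tf, TFun ta b) : eqs1 ++ eqs2)
```

For the term $(\lambda x. x)$ applied to a character constant, infer returns the type variable numbered 1 together with the single equation equating TFun (TVar 0) (TVar 0) with TFun (TBase "Char") (TVar 1). Solving the equations is left to a unification algorithm.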

With this syntax, suppose we want to write a function equivalent to the following Haskell definition.

   applypair f x y = (f x,f y)

In this definition, we might like to permit f to be a polymorphic function and x and y to be of different types. For instance, we might ask for the following expression to be well typed, where id is the identity function id z = z.

   applypair id 7 'c' = (id 7, id 'c') = (7,'c')

However, if we try to assign a type to this function, we find that we have the following set of constraints, which cannot be unified.

   id  :: a -> a
   7   :: Int
   'c' :: Char
   a = Int            (from id 7)
   a = Char           (from id 'c')

In fact, the type that Haskell assigns to applypair is (a -> b) -> a -> a -> (b,b): both x and y must have the input type of f. To see why this is so, let us look at how this function would be defined in the version of the lambda calculus we have just defined. The corresponding expression is

$\begin{displaymath} \lambda f x y. \mathit{pair~} (f x) (f y) \mbox{~where~} \mathit{pair} \equiv \lambda xyz. (z x y) \end{displaymath}$

As the type inference rules we have provided suggest, when we pass a function to this term for the argument $f$ with type $\alpha \to \beta$, the value of $\alpha$ will have to be unified with the types of both $x$ and $y$ because of the expressions $(f x)$ and $(f y)$ in the body of the expression. This then forces $x$ and $y$ to have the same type.
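This behaviour can be checked directly in Haskell. In the sketch below, the signature is the type inferred for the definition, and the name sameTypeUse is introduced here only for illustration.

```haskell
-- f is lambda-bound, so it has one type a -> b throughout the body.
applypair :: (a -> b) -> a -> a -> (b, b)
applypair f x y = (f x, f y)

-- Well typed: both arguments have the same type.
sameTypeUse :: (Int, Int)
sameTypeUse = applypair id 7 8

-- Rejected by the type checker if uncommented, because the single
-- variable a would have to be both Int and Char:
-- mixedUse = applypair id (7 :: Int) 'c'
```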

One way to permit the definition of the richer version of applypair is to extend the syntax of lambda terms to allow local definitions as follows.

$\begin{displaymath} \Lambda = c \mid x \mid \lambda x. M \mid M N \mid \mbox{\textsf{let}~}f = e \mbox{~\textsf{in}~}M \end{displaymath}$

Here, let ...in ... provides a local declaration. We can now define the lambda term for applypair id as

$\begin{displaymath} \mbox{\textsf{let}~}f = \lambda z. z \mbox{~\textsf{in}~} \lambda x y. \mathit{pair~} (f x) (f y) \end{displaymath}$

which translates into Haskell as

  applypair x y = (f x,f y) where f z = z

or, equivalently,

  applypair x y = let f z = z in (f x,f y)

What is the type inference rule for let ...in ...? Here is a first attempt at formulating the rule.

Let $M = \mbox{\textsf{let}~}f = e \mbox{~\textsf{in}~}M'$ where inductively $e$ has type $\tau$. Let $\{\alpha,\beta,\ldots\}$ be the set of type variables that occur in $\tau$. Then for each instance of $f$ in $M'$, make copies of these variables. Thus, the first instance of $f$ in $M'$ will be assigned type $\tau$ with the variables modified to $\alpha_1,\beta_1,\ldots$, the second instance will be assigned type $\tau$ with the variables modified to $\alpha_2, \beta_2,\ldots$, and so on.

With this rule, since $\lambda z. z$ has type $\alpha \to \alpha$, we find that the two occurrences of $f$ in $\mbox{\textsf{let}~}f = \lambda z. z \mbox{~\textsf{in}~}\lambda x y. \mathit{pair~} (f x) (f y)$ have types $\alpha_1 \to \alpha_1$ and $\alpha_2 \to \alpha_2$. The expression $(f x)$ unifies the type variable $\alpha_1$ with the type of $x$ while the expression $(f y)$ unifies the type variable $\alpha_2$ with the type of $y$. However, since $\alpha_1$ and $\alpha_2$ are different variables, this does not force the type of $x$ to match the type of $y$.
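Haskell treats local definitions in this way, so the let version accepts arguments of different types. A minimal check, where the name mixedUse is introduced here only for illustration:

```haskell
-- f is let-bound, so each use of f gets its own copy of the type a -> a.
applypair :: a -> b -> (a, b)
applypair x y = let f z = z in (f x, f y)

-- Well typed even though 7 and 'c' have different types.
mixedUse :: (Int, Char)
mixedUse = applypair 7 'c'
```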

Notice that from the point of view of $\beta$ -reduction, the terms $\mbox{\textsf{let}~}f = e \mbox{~\textsf{in}~}\lambda x. M$ and $(\lambda f x. M) e$ are equivalent. But, as we have seen, the type inference rules for the two expressions are quite different.

In a more extreme example, one form may be typable while the other is not. Consider the expressions $(\lambda I. (I I))(\lambda x.x)$ and $\mbox{\textsf{let}~}I = \lambda x.x \mbox{~\textsf{in}~}(I I)$. Here, the first form cannot be typed because no self-application can be typed. However, in the let ...in ... version, the type of $\lambda x.x$ is $\alpha \to \alpha$ and this is copied as $\alpha_1 \to \alpha_1$ and $\alpha_2 \to \alpha_2$ for the two copies of $I$ in the body $(I I)$. We can then unify $\alpha_1$ with $\alpha_2 \to \alpha_2$ to derive that the first $I$ in $(I I)$ has the type $(\alpha_2 \to \alpha_2) \to (\alpha_2 \to \alpha_2)$ and, overall, $(I I)$ has type $(\alpha_2 \to \alpha_2)$.
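The same contrast shows up in Haskell. In this sketch the names selfApply and selfApplyLam are introduced only for illustration.

```haskell
-- Let-bound identity: each occurrence of i is instantiated separately,
-- the first at (a -> a) -> (a -> a) and the second at a -> a.
selfApply :: a -> a
selfApply = let i = \x -> x in i i

-- The lambda-bound form is rejected if uncommented, because i would
-- need a single type t satisfying t = t -> r:
-- selfApplyLam = (\i -> i i) (\x -> x)
```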

There is a subtlety that we have overlooked in our type inference rule for let ...in .... Consider the function

 applypair2 w x y = ((tag x),(tag y))
   where 
     tag      = pair w
     pair s t = (s,t)

Intuitively, it is clear that the type of applypair2 is a -> b -> c -> ((a,b),(a,c)) because applypair2 constructs the pair of pairs ((w,x),(w,y)) from its three inputs w, x and y.

However, if we apply our type inference strategy, we begin with the types:

  applypair2 :: a -> b -> c -> (d,e)
  pair       :: f -> g -> (f,g)
  tag        :: h -> (i,h)

Using our variable copying rule for let ...in ..., we get that the two instances of tag in applypair2 have types h1 -> (i1,h1) and h2 -> (i2,h2). The application tag x gives us the constraints h1 = b and d = (i1,h1), and the application tag y gives us the constraints h2 = c and e = (i2,h2). The application pair w gives us the constraint a = i, but this constraint is not propagated to the copies i1 and i2 that we have made of i when expanding the type of applypair2. Thus, we get the type applypair2 :: a -> b -> c -> ((i1,b),(i2,c)) instead of the correct type a -> b -> c -> ((a,b),(a,c)).

Clearly, the problem lies in the copies that we made of the type variable i when instantiating the types for the two copies of tag. The distinction between the variables h and i that appear in the type of tag is that the variable i is unified with the type of one of the arguments to the main function within which tag is defined. In the literature, variables like h that are "free" within the local definition are called generic variables. The corrected type inference rule for let ...in ... requires that copies be made only for generic variables.

Let $M = \mbox{\textsf{let}~}f = e \mbox{~\textsf{in}~}M'$ where inductively $e$ has type $\tau$. Let $\{\alpha,\beta,\ldots\}$ be the set of generic type variables that occur in $\tau$. Then for each instance of $f$ in $M'$, make copies of these generic variables. Thus, the first instance of $f$ in $M'$ will be assigned type $\tau$ with the generic variables modified to $\alpha_1,\beta_1,\ldots$, the second instance will be assigned type $\tau$ with the generic variables modified to $\alpha_2, \beta_2,\ldots$, and so on. The non-generic type variables in $\tau$ retain their identity in all instances of $f$.
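The copying step of the corrected rule might be sketched as follows. Everything named here (Type, typeVars, instantiate, tagType) is a hypothetical illustration, and the sketch assumes that appending the copy number to a variable name cannot clash with an existing variable; an actual implementation would draw names from a supply of fresh variables.

```haskell
import qualified Data.Set as Set

data Type = TVar String | TFun Type Type | TPair Type Type
  deriving (Eq, Show)

-- All type variables occurring in a type.
typeVars :: Type -> Set.Set String
typeVars (TVar v)    = Set.singleton v
typeVars (TFun a b)  = typeVars a `Set.union` typeVars b
typeVars (TPair a b) = typeVars a `Set.union` typeVars b

-- instantiate nonGeneric k ty makes the k-th copy of ty: variables not
-- listed in nonGeneric are renamed by appending k, the rest are kept.
instantiate :: Set.Set String -> Int -> Type -> Type
instantiate nonGeneric k ty = rename ty
  where
    generic = typeVars ty `Set.difference` nonGeneric
    rename (TVar v)
      | v `Set.member` generic = TVar (v ++ show k)
      | otherwise              = TVar v
    rename (TFun a b)  = TFun (rename a) (rename b)
    rename (TPair a b) = TPair (rename a) (rename b)

-- The type h -> (i,h) of tag from the applypair2 example.
tagType :: Type
tagType = TFun (TVar "h") (TPair (TVar "i") (TVar "h"))
```

With i marked as non-generic, instantiate (Set.fromList ["i"]) 1 tagType renames h to h1 and leaves i alone, giving h1 -> (i,h1).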

If we apply this new rule when assigning a type to applypair2, we observe that the type variable i should not be duplicated. This results in the type a -> b -> c -> ((i,h1),(i,h2)) being assigned to applypair2 after copying the generic variables of the type of tag into the two instances of tag that occur in the body of applypair2. The constraints a = i, b = h1 and c = h2 then result in the correct type a -> b -> c -> ((a,b),(a,c)) being assigned to applypair2.
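The definition can be checked against this type in Haskell by adding the signature explicitly. This assumes GHC's default settings: with the MonoLocalBinds extension enabled, tag would not be generalized because it mentions w, and the definition would be rejected.

```haskell
-- tag is generalized only over its argument type; the first component
-- of every pair it builds has the type of w.
applypair2 :: a -> b -> c -> ((a, b), (a, c))
applypair2 w x y = (tag x, tag y)
  where
    tag      = pair w
    pair s t = (s, t)
```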

The type inference strategy that we have sketched here can be efficiently implemented using a set of inference rules in the same style as those that we described for the second-order polymorphic typed lambda calculus in an earlier section. This algorithm was first described by Milner (he called it Algorithm W) and was implemented in the programming language ML and, subsequently, in other typed functional languages like Haskell. The important fact is that for these languages, type inference is decidable, so the programmer can write function definitions without providing explicit type information. The type inference system can then recognize whether the expression is well-typed. Freeing the programmer from the burden of providing explicit types makes these strongly typed functional languages significantly easier and more attractive to use. In a sense, the combination of let ...in ... definitions in the lambda calculus and shallow types seems to provide an optimum combination of expressiveness and usability in the typed lambda calculus.


Madhavan Mukund 2004-04-29