
Type inference with shallow types

Consider the basic lambda calculus, enhanced with constants. We have some built-in types $i,j,k,\ldots$ and a set of constants $C_i$ for each built-in type $i$. For instance, if $i$ is the type Char, $C_i$ is the set of character constants. The syntax of lambda terms is then


\begin{displaymath}
\Lambda = c \mid x \mid \lambda x. M \mid M N
\end{displaymath}

where $c \in C_i$ for some built-in type $i$, $x$ is a variable and $M,N \in \Lambda$.

To infer types for these terms, we can set up equations involving type variables inductively as follows. Let $M \in \Lambda$ be a lambda term. Then:
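
One standard way to set up these equations is sketched below. The notation is an assumption made for illustration: $\tau(M)$ denotes the type assigned to $M$, each variable $x$ carries a single type variable $\alpha_x$, and $\beta$ is a fresh type variable introduced at each application.

\begin{displaymath}
\begin{array}{lcl}
M = c,\ c \in C_i & : & \tau(M) = i \\
M = x & : & \tau(M) = \alpha_x \\
M = \lambda x. M' & : & \tau(M) = \alpha_x \to \tau(M') \\
M = M' N' & : & \tau(M') = \tau(N') \to \beta \mbox{~and~} \tau(M) = \beta
\end{array}
\end{displaymath}

Only the rule for application generates an equation between types; these equations are then solved by unification.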

With this syntax, suppose we want to write a function equivalent to the following Haskell definition.

   applypair f x y = (f x,f y)

In this definition, we might like to permit f to be a polymorphic function and x and y to be of different types. For instance, we might ask for the following expression to be well typed, where id is the identity function id z = z.

   applypair id 7 'c' = (id 7, id 'c') = (7,'c')

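However, if we try to assign a type to this expression, giving the argument f (instantiated here to id) a single type a -> a, we obtain the following set of constraints, which cannot be unified because a would have to be both Int and Char.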

   id  :: a -> a
   7   :: Int
   'c' :: Char
   a = Int            (from id 7)
   a = Char           (from id 'c')

In fact, the type that Haskell assigns to applypair is (a -> b) -> a -> a -> (b,b). To see why this is so, let us look at how this function would be defined in the version of the lambda calculus we have just defined. The corresponding expression is


\begin{displaymath}
\lambda f x y. \mathit{pair~} (f x) (f y)
\mbox{~where~}
\mathit{pair} \equiv \lambda xyz. (z x y)
\end{displaymath}

As the type inference rules we have provided suggest, when the argument $f$ of this term is given the type $\alpha \to \beta$, the type variable $\alpha$ has to be unified with the types of both $x$ and $y$ because of the applications $(f x)$ and $(f y)$ in the body of the expression. This forces $x$ and $y$ to have the same type.
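
This behaviour can be observed directly in a Haskell compiler. The following is a minimal sketch, assuming GHC; the name sameType and the commented-out expression are ours and serve only as illustration.

```haskell
-- The signature below is the one GHC infers if it is omitted.
applypair :: (a -> b) -> a -> a -> (b, b)
applypair f x y = (f x, f y)

-- Accepted: both arguments have the same type.
sameType :: (Int, Int)
sameType = applypair id 7 8

-- Rejected by the type checker if uncommented, because x and y
-- must share one type and Int cannot be unified with Char:
-- mixed = applypair id (7 :: Int) 'c'
```

The file can be loaded in GHCi, where `:type applypair` reports the signature shown above.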

One way to permit the definition of the richer version of applypair is to extend the syntax of lambda terms to allow local definitions as follows.


\begin{displaymath}
\Lambda = c \mid x \mid \lambda x. M \mid M N \mid
\mbox{\textsf{let}~}f = e \mbox{~\textsf{in}~}M
\end{displaymath}

Here $f = e$ is a local declaration of $f$, with $e \in \Lambda$, whose scope is the body $M$. We can now define the lambda term for applypair id as


\begin{displaymath}
\mbox{\textsf{let}~}f = \lambda z. z \mbox{~\textsf{in}~}
\lambda x y. \mathit{pair~} (f x) (f y)
\end{displaymath}

which translates into Haskell as

  applypair x y = (f x,f y) where f z = z

or, equivalently,

  applypair x y = let f z = z in (f x,f y)
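
As a quick check, assuming GHC, the let version does accept arguments of different types. The signature is the one GHC infers, and the name mixed is ours.

```haskell
-- f is bound by let, so each use of f is typed separately.
applypair :: a -> b -> (a, b)
applypair x y = let f z = z in (f x, f y)

-- x and y may now have different types.
mixed :: (Int, Char)
mixed = applypair 7 'c'
```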

What is the type inference rule for let ...in ...? Here is a first attempt at formulating the rule.
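
A rule of this kind can be sketched as follows. The notation is ours and is an assumption: $\tau(e)$ is the type inferred for $e$, and $\alpha_1,\ldots,\alpha_n$ are the type variables occurring in it. The $k$th occurrence of $f$ in $M$, written $f^{(k)}$, is given a copy of $\tau(e)$ in which every type variable is replaced by a fresh one.

\begin{displaymath}
\tau(f^{(k)}) = \tau(e)[\alpha_1 := \alpha_1^{(k)}, \ldots, \alpha_n := \alpha_n^{(k)}]
\end{displaymath}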

With this rule, we find that the two occurrences of $f$ in $
\mbox{\textsf{let}~}f = \lambda z. z \mbox{~\textsf{in}~}\lambda x y. \mathit{pair~}
(f x) (f y) $ have types $\alpha_1 \to \beta_1$ and $\alpha_2 \to
\beta_2$. The expression $f x$ unifies the type variable $\alpha_1$ with the type of $x$ while the expression $f y$ unifies the type variable $\alpha_2$ with the type of $y$. However, since $\alpha_1$ and $\alpha_2$ are different variables, this does not force the type of $x$ to match the type of $y$.

Notice that from the point of view of $\beta$-reduction, the terms $\mbox{\textsf{let}~}f = e \mbox{~\textsf{in}~}\lambda x. M$ and $(\lambda f x. M) e$ are equivalent. But, as we have seen, the type inference rules for the two expressions are quite different.

In a more extreme example, one form may be typable while the other is not. Consider the expressions $(\lambda I. (I I))(\lambda x.x)$ and $\mbox{\textsf{let}~}I = \lambda x.x \mbox{~\textsf{in}~}(I I)$. Here, the first form cannot be typed because the self-application $I I$ of a lambda-bound variable would require a type $\alpha$ satisfying $\alpha = \alpha \to \beta$, which unification rejects. However, in the let ...in ... version, the type of $\lambda x.x$ is $\alpha \to \alpha$ and this is copied as $\alpha_1 \to \alpha_1$ and $\alpha_2 \to \alpha_2$ for the two occurrences of $I$ in the body. We can then unify $\alpha_1$ with $\alpha_2 \to \alpha_2$ to derive that the first $I$ in $I I$ has the type $(\alpha_2 \to \alpha_2) \to (\alpha_2 \to \alpha_2)$ and, overall, $I I$ has type $\alpha_2 \to \alpha_2$.
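
The same contrast shows up in Haskell. This is a small sketch, assuming GHC; the names selfApp and bad are ours.

```haskell
-- Accepted: i is let-bound, so its two occurrences are typed separately.
selfApp :: a -> a
selfApp = let i x = x in i i

-- Rejected if uncommented (GHC reports an infinite type), since the
-- lambda-bound i would need a type t satisfying t = t -> u:
-- bad = (\i -> i i) (\x -> x)
```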

There is a subtlety that we have overlooked in our type inference rule for let ...in .... Consider the function

 applypair2 w x y = ((tag x),(tag y))
   where 
     tag      = pair w
     pair s t = (s,t)

Intuitively, it is clear that the type of applypair2 is a -> b -> c -> ((a,b),(a,c)) because applypair2 constructs the pair of pairs ((w,x),(w,y)) from its three inputs w, x and y.

However, if we apply our type inference strategy, we begin with the types:

  applypair2 :: a -> b -> c -> (d,e)
  pair       :: f -> g -> (f,g)
  tag        :: h -> (i,h)

Using our variable copying rule for let ...in ..., the two instances of tag in applypair2 have the types h1 -> (i1,h1) and h2 -> (i2,h2). The application tag x gives us the constraints h1 = b and d = (i1,h1), and the application tag y gives us the constraints h2 = c and e = (i2,h2). The application pair w gives us the constraint a = i, but this constraint is not propagated to the copies i1 and i2 that we have made of i when expanding the type of applypair2. Thus, we get the type applypair2 :: a -> b -> c -> ((i1,b),(i2,c)) instead of the correct type a -> b -> c -> ((a,b),(a,c)).

Clearly, the problem lies in the copies that we made of the type variable i when instantiating the types for the two copies of tag. The distinction between the variables h and i that appear in the type of tag is that i is unified with the type of one of the arguments to the main function within which tag is defined, whereas h is not constrained by anything outside the definition of tag. In the literature, variables like h that are "free" within the local definition are called generic variables. The corrected type inference rule for let ...in ... requires that copies be made only for generic variables.
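
The copying step can be sketched in a few lines of Haskell. Everything here is hypothetical and for illustration only: the miniature Type representation, the names typeVars, genericVars and copyFor, and the shortcut of making a variable fresh by appending the occurrence number (which assumes no existing variable already has that name). The environment env lists the types of the enclosing function's arguments, with a already replaced by i.

```haskell
import Data.List (nub, (\\))

-- Hypothetical miniature type language, for illustration only.
data Type = TVar String | TCon String | Type :-> Type | TPair Type Type
  deriving (Eq, Show)
infixr 5 :->

-- All type variables occurring in a type, without duplicates.
typeVars :: Type -> [String]
typeVars (TVar v)    = [v]
typeVars (TCon _)    = []
typeVars (a :-> b)   = nub (typeVars a ++ typeVars b)
typeVars (TPair a b) = nub (typeVars a ++ typeVars b)

-- Generic variables: those that do not occur in the types of the
-- arguments of the enclosing function.
genericVars :: [Type] -> Type -> [String]
genericVars env t = typeVars t \\ nub (concatMap typeVars env)

-- Copy a type for the n-th occurrence, renaming only generic variables.
-- Simplification: freshness is obtained by appending n to the name.
copyFor :: Int -> [Type] -> Type -> Type
copyFor n env t = go t
  where
    gen = genericVars env t
    go (TVar v)
      | v `elem` gen = TVar (v ++ show n)
      | otherwise    = TVar v
    go (TCon c)    = TCon c
    go (a :-> b)   = go a :-> go b
    go (TPair a b) = TPair (go a) (go b)

-- The type h -> (i,h) of tag, and the argument types of applypair2.
tagType :: Type
tagType = TVar "h" :-> TPair (TVar "i") (TVar "h")

env :: [Type]
env = [TVar "i", TVar "b", TVar "c"]
```

With these definitions, copyFor 1 env tagType renames h to h1 but leaves i untouched, since i occurs in env.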

If we apply this new rule when assigning a type to applypair2, we observe that the type variable i should not be duplicated. This results in the type a -> b -> c -> ((i,h1),(i,h2)) being assigned to applypair2 after copying the generic variables of the type of tag into the two instances of tag that occur in the body of applypair2. The constraints a = i, b = h1 and c = h2 then result in the correct type a -> b -> c -> ((a,b),(a,c)) being assigned to applypair2.
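
Assuming GHC with its default settings, the definition is accepted with exactly this type. The signature is written out here so that the compiler checks it; the name example is ours.

```haskell
-- tag is generalized over h only; the type of w is shared by both uses.
applypair2 :: a -> b -> c -> ((a, b), (a, c))
applypair2 w x y = (tag x, tag y)
  where
    tag      = pair w
    pair s t = (s, t)

-- Three arguments of three different types.
example :: ((Bool, Char), (Bool, Int))
example = applypair2 True 'x' 3
```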

The type inference strategy that we have sketched here can be efficiently implemented using a set of inference rules in the same style as those that we described for the second-order polymorphic typed lambda calculus in an earlier section. This algorithm was first described by Milner (he called it Algorithm W) and was implemented in the programming language ML and, subsequently, in other typed functional languages like Haskell. The important fact is that for these languages, type inference is decidable, so the programmer can write function definitions without providing explicit type information. The type inference system can then recognize whether the expression is well-typed. Freeing the programmer from the burden of providing explicit types makes these strongly typed functional languages significantly easier and more attractive to use. In a sense, the combination of let ...in ... definitions in the lambda calculus and shallow types seems to provide an optimum combination of expressiveness and usability in the typed lambda calculus.


Madhavan Mukund 2004-04-29