Albert Y. C. Lai, trebla [at] vex [dot] net
A type inference algorithm uses tools that are not obvious at first sight. To help motivate them, I start with showing how an experienced or clever person does type inference heuristically, which calls for the same tools.
Infer the type of \x > x
I don’t know what should be the type of x, but I can call it u, consider it unknown for now. Then I can deduce:
(\x > x) :: u > u
I will use names starting with u for unknowns, e.g., u1, u2. (Note: They are not necessarily type variables for polymorphism—to be determined later. In contrast, I will write polymorphic types like “∀a. a → a”.)
Assume 5::Integer, True::Bool
Infer the type of \x y > if False then (x, 5) else (True, y)
(x,5) :: (u1,Integer) with u1 unknown for x
(True,y) :: (Bool,u2) with u2 unknown for y
ifthenelse wants them to be the same type, so:
(u1,Integer) = (Bool,u2)
Solving that equation gives u1 = Bool, u2 = Integer.
Answer: Bool > Integer > (Bool, Integer)
let b=False in \x y > if b then (x, 5) else (True, y)
If you’re fast, it’s just like the above.
If you go just a bit slower, it goes like this: I want to zoom into
\x y > if b then (x, 5) else (True, y)
but with the knowledge b::Bool at the back of my mind, so that I can check that “if b” makes sense.
This indicates that an algorithm, or even a person, keeps track of a “type environment” consisting of the variables in scope and their types for easy lookup.
The algorithm also does that when working on lambdas, e.g., now zoom into
if b then (x, 5) else (True, y)
and the type environment has b::Bool, x::u1, y::u2.
\x > if False then (x, 5) else (True, x)
Similar thought process but the equation is
(u1,Integer) = (Bool,u1)
This means Integer = Bool. This is a type error, the “type mismatch” kind.
The other kind of type errors goes like this:
\x > if False then x else [x, x]
I use unknown u1 for x. The equation to solve goes like
u1 = [u1]
Now, theoretically there is nothing wrong with that; the feature name is “infinite types”, and a few (very few) academic toy languages have it.
But people find that it is rarely useful, it is hard to follow even when it is useful, and if a piece of practical code seems to do it, more than 99.9% of the time it’s a typo rather than an intention, e.g., the author meant “then [x]
”. So all practical languages and almost all academic languages ban it. (Statically typed languages, that is.)
In general, if an equation to solve goes like
u1 = … something that mentions u1 …
and is not simply “u1 = u1”, we say that it is a type error, and give the reason “infinite type”. You sometimes also see the wording “occurs check”; it means that the algorithm checks whether u1 occurs on the right side, and finds it.
In the rare case that you find an infinite type useful, the recommendation is to define an ordinary recursive wrapper type:
newtype T = MkT [T]
 so MkT :: [T] > T
unT :: T > [T]
unT (MkT list) = list
You now have to keep wrapping and unwrapping, but the logic is clearer because the code is explicit about when to think of one single T, when to think of a list of T’s.
I need to create new unknowns all the time.
I will run into equations like (u1,Integer) = (Bool,u2), and I need to solve them and find out u1 = Bool, u2 = Integer. This solving is called “unification”; I was unifying (u1,Integer) with (Bool,u2).
Because of that, I need a mutable table of solved unknowns. Every time I solve an equation, it causes an update. Since each entry is telling me to replace an unknown by its solution, this table is a table of substitutions.
I need the type inference algorithm to take one more parameter: a type environment to tell me which variables are in scope and what their types are. It is also a good place to store the types of library functions.
Unification solves two problems in one go: Checking whether two types match up, and if they contain unsolved unknowns, solve them, i.e., find out whether and what can be substituted for the unsolved unknowns to make the two types match up.
It doesn’t return any value. It gives its answer by how it terminates: If the two types match up (perhaps after adding substitutions to the table), return normally; if not, throw an error.
A mutable table of substitutions is maintained, initially empty. Every call to the unification algorithm can update the table. Notation for substitutions: “u := type”.
For clarity, I will make sure that solved unknowns don’t appear in the RHS, e.g., I won’t let this table happen:
u1 := [u2] (I won’t write this, I’ll write u1 := [Integer])
u2 := Integer
Remark: Real implementations allow this for efficiency and play clever tricks elsewhere to compensate. But I am going for least confusion for beginners.
Notation for applying all substitutions in the current table to rewrite a type: “applysubst type”. This returns the rewritten type. I omit coding it up. Example: If the current table is
u3 := Integer
u2 := u1 → u1
then applysubst (u2 → u4 → Bool) returns ((u1→u1) → u4 → Bool).
The type inference algorithm is not always aware that an unknown is already solved and has an entry in the substitution table. It can call the unification algorithm with solved unknowns. For clarity, I first use applysubst to get rid of them:
Then the algorithm for unifyintern can assume that unknowns in inputs are unsolved.
Remark: Real implementations don’t do this, they play clever tricks elsewhere to compensate.
unifyintern is just structural recursion on the syntax of type expressions:
In this example, I start with a nonempty substitution table: It already has an entry “u1 := u2 → u3” (imagine that it was caused by a previous call to unify). We are now requested to unify u1 with u3→u4.
Current table:
u1 := u2 → u3  
unify u1 (u3→u4)
unifyintern (u2→u3) (u3→u4) unifyintern u2 u3 add u3:=u2 replace u3 in the rest of the table  
Updated table:
u1 := u2 → u2 u3 := u2  
unifyintern (applysubst u3) (applysubst u4)
unifyintern u2 u4 add u4:=u2 replace u4 in the rest of the table  
Updated table:
u1 := u2 → u2 u3 := u2 u4 := u2 
Later on, in example steps of the type inference algorithm, when it comes to unification, I will just write the call and the updated table (usually easy to see), like this:
Current table:
u1 := u2 → u3  
unify u1 (u3→u4)
unifyintern (u2→u3) (u3→u4)  
Updated table:
u1 := u2 → u2 u3 := u2 u4 := u2 
Notation:
“infer env term”: Procedure call to infer the type of term; look up variables and types in env, the type environment. The answer is meant to be monomorphic, i.e., if there are unsolved unknowns, let them be, don’t generalize to a polymorphic type. Example: If term is \x > x
, return u1→u1, leaving u1 as a new unsolved unknown.
“inferpoly env term”: call infer, then generalize to polymorphic type. Example: If the \x > x
is part of
let f = \x > x in (f True, f 1)
then we will call inferpoly on \x > x
because we do want ∀a.a→a, so that f can be used on both booleans and integers.
If success, both return the type inferred; if failure, both throw an error.
Again, for least confusion for beginners, we also use applysubst on all answers before returning them.
Notation for environments: For example, an environment that says “the type of x is u3, the type of y is Bool” is written “{x::u3, y::Bool}”. The empty environment is written “{}”. When I clone an environment and add something, I write like “env plus z::u2” to add z of type u2.
The type inference algorithm is a structural recursion over the syntax of term expressions. Here are the cases:
Literals have basically hardcoded types, so just return them. For simplicity, I ignore type classes, so for example I just pretend that 4 is always Integer.
Similarly for most other literals.
Some literals are data constructors of parametrized algebraic data types, e.g., Nothing, the empty list []. I postpone them to the section on algebraic data types later.
The type of a variable is simply given in the environment, or else it’s a “variable not in scope” error—the 3rd and final source of errors in type inference. (If you care about certain library functions, be sure to include them in the initial environment!)
If the type is polymorphic, there is some followup work to do. But first why it is possible: 1. many library functions have polymorphic types; 2. the let
construct (shown later) can put polymorphic types in the environment.
The followup work is to instantiate the type variables with new unknowns. (Let the context worry about solving them.)
Example of finding a polymorphic type and instantiating it:
For a lambda like \v → expr, we can create an unknown, say u, for the type of v, and wait for it to be solved later. Then we can zoom into expr, but keep in mind the additional v::u; in other words, a recursive call but give it a larger environment that has the additional v::u.
If the recursive call figures out that expr has type T, that means the whole lambda has the function type u→T.
Remark: Although u starts as a new unknown, after the recursive call for expr, u may have been solved, so u→T may need a rewrite, even though T doesn’t.
Example: \x → x
For lambdas of two or more parameters, you could treat as nested lambdas, e.g., \x → (\y → expr). If you find that tedious, we can shortcut it:
Example:
For a function application like f e, we can first use recursion to find the types of f and e respectively.
f is being used as a function, so we also need to check whether its type is a function type. Its argument is e, so we also need to check whether the domain type matches the type of e. These checks are what unification is for! We don’t have any requirement on the codomain type at this point, so just create a new unknown for it.
The type of the whole f e is the codomain type, i.e., whatever the new unknown turns out to be.
A small example: The library has id :: ∀a. a→a. Let’s use it like “id True”.
infer {id :: ∀a. a→a} (id True)
Tf := infer {id :: ∀a. a→a} id T := look up id, it’s ∀a. a→a it’s polymorphic create new unknown u1 for a T’ := instantiate ∀a. a→a T’ = u1→u1 return (applysubst (u1→u1)) = u1→u1 Tf = u1→u1 Te := infer {id :: ∀a. a→a} True return Bool Te = Bool create new unknown u2 unify Tf (Te → u2) unifyintern (u1 → u1) (Bool → u2)  
Updated table:
u1 := Bool u2 := Bool  
return (applysubst u2) = Bool

A larger example: \f x → f (f x)
infer {} (\f x → f (f x)):
create new unknowns u1 (for f), u2 (for x) T1 := infer {f::u1, x::u2} (f (f x)): Tf := infer {f::u1, x::u2} f return u1 Tf = u1 T2 := infer {f::u1, x::u2} (f x) Tf2 := infer {f::u1, x::u2} f return u1 Tf2 = u1 Tx := infer {f::u1, x::u2} x return u2 Tx = u2 create new unknown u3 unify Tf2 (Tx → u3) unifyintern u1 (u2 → u3)  
Updated table:
u1 := u2 → u3  
return (applysubst u3) = u3
T2 = u3 create new unknown u4 unify Tf (T2 → u4) unifyintern (u2 → u3) (u3 → u4)  
Updated table:
u1 := u2 → u2 u3 := u2 u4 := u2  
return (applysubst u4) = u2
T1 = u2 return (applysubst (u1→u2→T1)) = (u2→u2)→u2→u2 
For multiple parameters, e.g., f e1 e2, again you can either treat as two function applications, (f e1) e2, or do the obvious shortcut.
ifthenelse requires:
An algebraic data type definition creates two aspects:
Accordingly, it means adding two groups of inference rules. I will illustrate by an example type. Then you extrapolate to other types.
data MI = Non  Has Integer
Making:
Using: For simplicity I just cover caseexpressions with simple patterns:
case eMI of { Non > e0 ; Has v > e1 }
This is like ifthenelse on steroid. We require:
So here we go:
Parameterized algebraic data types are similar, but there are additional unknowns to create for instantiating the type variables. Example:
data Maybe a = Nothing  Just a
The data constructors are polymorphic—you can think of them as:
Nothing :: ∀a. Maybe a
Just :: ∀a. a > Maybe a
Making:
Using:
Extrapolate this to standard parameterized types such as lists and tuples.
The 2tuple type can be understood as
data Tuple2 a b = MkTuple2 a b
except for the syntactic difference between (e1, e2) :: (T1, T2)
and MkTuple2 e1 e2 :: Tuple2 T1 T2
.
Likewise, with a syntactic change, the list type can be understood as
data List a = Nil  Cons a (List a)
People expect these to be legal:
let f = \x > x  expect f :: ∀a. a > a
in (f True, f 4)
let e = []  expect e :: ∀a. [a]
in (True : e, 4 : e)
This means when “let” is used for local definitions (and similarly for “where”), we need to generalize an inferred type to a polymorphic type, e.g., \x > x
is normally just u→u with an unknown u, but here it is generalized to ∀a. a → a so f is polymorphic.
Example of generalize:
Suppose T = u2 → u3 → u4
Suppose u3 appears in envupdated, u2 and u4 don’t.
Then we generalize u2 to a2, u4 to a4, but leave u3 as is.
Result: ∀a2 a4. a2 → u3 → a4
If an unknown ui appears in envupdated, we don’t want to generalize it in the answer. This is because ui participates in an outer context that may solve it to a specific type later. So ui is not supposed to be generalized, at least not now.
Example: let f = \x→x in (f True, f 4)
infer {} (let f = \x → x in (f True, f 4))
T := infer {} (\x → x) … T = u → u Tg := generalize {} (u → u) envupdated := {} create new type variable a for u in (u → u) return (∀a. (u → u but replace u by a)) = ∀a. a → a Tg = ∀a. a → a infer {f :: ∀a. a → a} (f True, f 4)  
T1 := infer {f :: ∀a. a → a} (f True)
Tf := infer {f :: ∀a. a → a} f (∀a. a → a needs instantiation) create new unknown u2 return (u2 → u2) Tf = (u2 → u2) Te := infer {f :: ∀a. a → a} True return Bool Te = Bool create new unknown u3 unify Tf (Te → u3) unifyintern (u2 → u2) (Bool → u3)  
Updated table:
u2 := Bool u3 := Bool  
return (applysubst u3) = Bool
T1 = Bool T2 := infer {f :: ∀a. a → a} (f 4) Tf := infer {f :: ∀a. a → a} f create new unknown u4 return (u4 → u4) Tf = (u4 → u4) Te := infer {f :: ∀a. a → a} 4 return Integer Te = Integer create new unknown u5 unify Tf (Te → u5) unifyintern (u4 → u4) (Integer → u5)  
Updated table:
u2 := Bool u3 := Bool u4 := Integer u5 := Integer  
return (applysubst u5) = Integer
T2 = Integer return (applysubst (T1, T2)) = (Bool, Integer) 
Multiple definitions and recursive definitions. Programmerprovided type signatures. Type classes.
I have more Haskell Notes and Examples