Type Inference

Albert Y. C. Lai, trebla [at] vex [dot] net

Preparation

A type inference algorithm uses tools that are not obvious at first sight. To help motivate them, I start with showing how an experienced or clever person does type inference heuristically, which calls for the same tools.

Creating unknowns

Infer the type of \x -> x

I don’t know what should be the type of x, but I can call it u, consider it unknown for now. Then I can deduce:

(\x -> x) :: u -> u

I will use names starting with u for unknowns, e.g., u1, u2. (Note: They are not necessarily type variables for polymorphism—to be determined later. In contrast, I will write polymorphic types like “∀a. a → a”.)

Solving for unknowns

Assume 5::Integer, True::Bool

Infer the type of \x y -> if False then (x, 5) else (True, y)

(x,5) :: (u1,Integer) with u1 unknown for x

(True,y) :: (Bool,u2) with u2 unknown for y

if-then-else wants them to be the same type, so:

(u1,Integer) = (Bool,u2)

Solving that equation gives u1 = Bool, u2 = Integer.

Answer: Bool -> Integer -> (Bool, Integer)

Type environment

let b=False in \x y -> if b then (x, 5) else (True, y)

If you’re fast, it’s just like the above.

If you go just a bit slower, it goes like this: I want to zoom into

\x y -> if b then (x, 5) else (True, y)

but with the knowledge b::Bool at the back of my mind, so that I can check that “if b” makes sense.

This indicates that an algorithm, or even a person, keeps track of a “type environment” consisting of the variables in scope and their types for easy lookup.

The algorithm also does that when working on lambdas, e.g., now zoom into

if b then (x, 5) else (True, y)

and the type environment has b::Bool, x::u1, y::u2.

Type mismatch error

\x -> if False then (x, 5) else (True, x)

Similar thought process but the equation is

(u1,Integer) = (Bool,u1)

This means Integer = Bool. This is a type error, the “type mismatch” kind.

“Infinite type” error

The other kind of type errors goes like this:

\x -> if False then x else [x, x]

I use unknown u1 for x. The equation to solve goes like

u1 = [u1]

Now, theoretically there is nothing wrong with that; the feature name is “infinite types”, and a few (very few) academic toy languages have it.

But people find that it is rarely useful, it is hard to follow even when it is useful, and if a piece of practical code seems to do it, more than 99.9% of the time it’s a typo rather than an intention, e.g., the author meant “then [x]”. So all practical languages and almost all academic languages ban it. (Statically typed languages, that is.)

In general, if an equation to solve goes like

u1 = … something that mentions u1 …

and is not simply “u1 = u1”, we say that it is a type error, and give the reason “infinite type”. You sometimes also see the wording “occurs check”; it means that the algorithm checks whether u1 occurs on the right side, and finds it.

In the rare case that you find an infinite type useful, the recommendation is to define an ordinary recursive wrapper type:

newtype T = MkT [T]
-- so MkT :: [T] -> T

unT :: T -> [T]
unT (MkT list) = list

You now have to keep wrapping and unwrapping, but the logic is clearer because the code is explicit about when to think of one single T, when to think of a list of T’s.

The tools needed

I need to create new unknowns all the time.

I will run into equations like (u1,Integer) = (Bool,u2), and I need to solve them and find out u1 = Bool, u2 = Integer. This solving is called “unification”; I was unifying (u1,Integer) with (Bool,u2).

Because of that, I need a mutable table of solved unknowns. Every time I solve an equation, it causes an update. Since each entry is telling me to replace an unknown by its solution, this table is a table of substitutions.

I need the type inference algorithm to take one more parameter: a type environment to tell me which variables are in scope and what their types are. It is also a good place to store the types of library functions.

Unification

Unification solves two problems in one go: Checking whether two types match up, and if they contain unsolved unknowns, solve them, i.e., find out whether and what can be substituted for the unsolved unknowns to make the two types match up.

It doesn’t return any value. It gives its answer by how it terminates: If the two types match up (perhaps after adding substitutions to the table), return normally; if not, throw an error.

Substitutions

A mutable table of substitutions is maintained, initially empty. Every call to the unification algorithm can update the table. Notation for substitutions: “u := type”.

For clarity, I will make sure that solved unknowns don’t appear in the RHS, e.g., I won’t let this table happen:

u1 := [u2]    (I won’t write this, I’ll write u1 := [Integer])
u2 := Integer

Remark: Real implementations allow this for efficiency and play clever tricks elsewhere to compensate. But I am going for least confusion for beginners.

Notation for applying all substitutions in the current table to rewrite a type: “apply-subst type”. This returns the rewritten type. I omit coding it up. Example: If the current table is

u3 := Integer
u2 := u1 → u1

then apply-subst (u2 → u4 → Bool) returns ((u1→u1) → u4 → Bool).

Unification algorithm

The type inference algorithm is not always aware that an unknown is already solved and has an entry in the substitution table. It can call the unification algorithm with solved unknowns. For clarity, I first use apply-subst to get rid of them:

unify T1 T2:
    unify-intern (apply-subst T1) (apply-subst T2)

Then the algorithm for unify-intern can assume that unknowns in inputs are unsolved.

Remark: Real implementations don’t do this, they play clever tricks elsewhere to compensate.

unify-intern is just structural recursion on the syntax of type expressions:

unify-intern u T
or
unify-intern T u
where u is an unknown:
    if T=u:
        return
    else if u occurs in T:
        infinite type error
    else:
        (update table)
        add u:=T to the table
        in the rest of the table, replace u by T

(If it is unify-intern u2 u4, should you add u2:=u4 or u4:=u2?
Answer: Doesn’t matter, up to you.)

unify-intern TC1 TC2
where TC1 and TC2 are type constants, e.g., Integer, Bool, Char:
    if TC1 ≠ TC2:
        type mismatch error
    else
        return

unify-intern (Maybe T1) (Maybe T2)
or
unify-intern [T1] [T2]
or similar cases:
    unify-intern T1 T2

unify-intern (T1S1) (T2S2)
or
unify-intern (T1, S1) (T2, S2)
or similar cases:
    unify-intern T1 T2
    (Unknowns in S1 and S2 may now become solved, so I need apply-subst:)
    unify-intern (apply-subst S1) (apply-subst S2)

unify-intern (T1, S1, R1) (T2, S2, R2):
    similar, you get the idea

All other cases:
(e.g., unify-intern Integer Bool, unify-intern [u1] (Maybe u2))
    type mismatch error

Example

In this example, I start with a non-empty substitution table: It already has an entry “u1 := u2 → u3” (imagine that it was caused by a previous call to unify). We are now requested to unify u1 with u3→u4.

Current table:
u1 := u2 → u3
unify u1 (u3→u4)
    unify-intern (u2→u3) (u3→u4)
        unify-intern u2 u3
            add u3:=u2
            replace u3 in the rest of the table
Updated table:
u1 := u2 → u2
u3 := u2
        unify-intern (apply-subst u3) (apply-subst u4)
        unify-intern u2 u4
            add u4:=u2
            replace u4 in the rest of the table
Updated table:
u1 := u2 → u2
u3 := u2
u4 := u2

Later on, in example steps of the type inference algorithm, when it comes to unification, I will just write the call and the updated table (usually easy to see), like this:

Current table:
u1 := u2 → u3
unify u1 (u3→u4)
    unify-intern (u2→u3) (u3→u4)
Updated table:
u1 := u2 → u2
u3 := u2
u4 := u2

Type inference algorithm

Notation:

“infer env term”: Procedure call to infer the type of term; look up variables and types in env, the type environment. The answer is meant to be monomorphic, i.e., if there are unsolved unknowns, let them be, don’t generalize to a polymorphic type. Example: If term is \x -> x, return u1→u1, leaving u1 as a new unsolved unknown.

“infer-poly env term”: call infer, then generalize to polymorphic type. Example: If the \x -> x is part of

let f = \x -> x in (f True, f 1)

then we will call infer-poly on \x -> x because we do want ∀a.a→a, so that f can be used on both booleans and integers.

If success, both return the type inferred; if failure, both throw an error.

Again, for least confusion for beginners, we also use apply-subst on all answers before returning them.

Notation for environments: For example, an environment that says “the type of x is u3, the type of y is Bool” is written “{x::u3, y::Bool}”. The empty environment is written “{}”. When I clone an environment and add something, I write like “env plus z::u2” to add z of type u2.

The type inference algorithm is a structural recursion over the syntax of term expressions. Here are the cases:

Literal

Literals have basically hardcoded types, so just return them. For simplicity, I ignore type classes, so for example I just pretend that 4 is always Integer.

infer env True
    return Bool

infer env 4
    return Integer
    (For simplicity, I don’t cover type classes, I just stick to Integer)

Similarly for most other literals.

Some literals are data constructors of parametrized algebraic data types, e.g., Nothing, the empty list []. I postpone them to the section on algebraic data types later.

Variable

The type of a variable is simply given in the environment, or else it’s a “variable not in scope” error—the 3rd and final source of errors in type inference. (If you care about certain library functions, be sure to include them in the initial environment!)

If the type is polymorphic, there is some follow-up work to do. But first why it is possible: 1. many library functions have polymorphic types; 2. the let construct (shown later) can put polymorphic types in the environment.

The follow-up work is to instantiate the type variables with new unknowns. (Let the context worry about solving them.)

infer env v
where v is a variable:
    T := look up the type of v in env
    if not found, “variable not in scope” error
    if T is polymorphic:
        create new unknowns to instantiate with
        T’ := instantiate T
        (Example below.)
        return (apply-subst T’)
    else
        return (apply-subst T)

Example of finding a polymorphic type and instantiating it:

infer {x::u2, f::∀a b. a→b→a} f:
    T := look up the type of f in env
    T = ∀a b. a→b→a
    create unknowns u4 for a, u5 for b
    T’ := instantiate ∀a b. a→b→a
    T’ = u4→u5→u4
    return u4→u5→u4

Lambda

For a lambda like \vexpr, we can create an unknown, say u, for the type of v, and wait for it to be solved later. Then we can zoom into expr, but keep in mind the additional v::u; in other words, a recursive call but give it a larger environment that has the additional v::u.

If the recursive call figures out that expr has type T, that means the whole lambda has the function type u→T.

infer env (\vexpr):
    create new unknown, say u (for the type of v)
    T := infer (env plus v::u) expr
    return (apply-subst (u→T))

Remark: Although u starts as a new unknown, after the recursive call for expr, u may have been solved, so u→T may need a rewrite, even though T doesn’t.

Example: \x → x

infer {} (\x → x):
    create new unknown u1 for x
    T := infer {x::u1} x
        return u1
    T = u1
    return (apply-subst (u1→T)) = u1→u1

For lambdas of two or more parameters, you could treat as nested lambdas, e.g., \x → (\y → expr). If you find that tedious, we can shortcut it:

infer env (\v w → expr):
    create new unknowns, say uv (for v), uw (for w)
    T := infer (env plus v::uv, w::uw) expr
    return (apply-subst (uv→uw→T))

Example:

infer {} (\x y → x):
    create new unknowns ux, uy
    T := infer {x::ux, y::uy} x
        return ux
    T = ux
    return (apply-subst (ux→uy→ux)) = ux→uy→ux

Function application

For a function application like f e, we can first use recursion to find the types of f and e respectively.

f is being used as a function, so we also need to check whether its type is a function type. Its argument is e, so we also need to check whether the domain type matches the type of e. These checks are what unification is for! We don’t have any requirement on the codomain type at this point, so just create a new unknown for it.

The type of the whole f e is the codomain type, i.e., whatever the new unknown turns out to be.

infer env (f e):
    Tf := infer env f
    Te := infer env e
    (We now check Tf = Te→???)
    create new unknown u
    unify Tf (Te → u)
    return (apply-subst u)

A small example: The library has id :: ∀a. a→a. Let’s use it like “id True”.

infer {id :: ∀a. a→a} (id True)
    Tf := infer {id :: ∀a. a→a} id
        T := look up id, it’s ∀a. a→a
        it’s polymorphic
        create new unknown u1 for a
        T’ := instantiate ∀a. a→a
        T’ = u1→u1
        return (apply-subst (u1→u1)) = u1→u1
    Tf = u1→u1
    Te := infer {id :: ∀a. a→a} True
        return Bool
    Te = Bool
    create new unknown u2
    unify Tf (Te → u2)
        unify-intern (u1 → u1) (Bool → u2)
Updated table:
u1 := Bool
u2 := Bool
    return (apply-subst u2) = Bool

A larger example: \f x → f (f x)

infer {} (\f x → f (f x)):
    create new unknowns u1 (for f), u2 (for x)
    T1 := infer {f::u1, x::u2} (f (f x)):
        Tf := infer {f::u1, x::u2} f
            return u1
        Tf = u1

        T2 := infer {f::u1, x::u2} (f x)
            Tf2 := infer {f::u1, x::u2} f
                return u1
            Tf2 = u1
            Tx := infer {f::u1, x::u2} x
                return u2
            Tx = u2
            create new unknown u3
            unify Tf2 (Tx → u3)
                unify-intern u1 (u2 → u3)
Updated table:
u1 := u2 → u3
            return (apply-subst u3) = u3
        T2 = u3

        create new unknown u4
        unify Tf (T2 → u4)
            unify-intern (u2 → u3) (u3 → u4)
Updated table:
u1 := u2 → u2
u3 := u2
u4 := u2
        return (apply-subst u4) = u2
    T1 = u2
    return (apply-subst (u1→u2→T1)) = (u2→u2)→u2→u2

For multiple parameters, e.g., f e1 e2, again you can either treat as two function applications, (f e1) e2, or do the obvious shortcut.

if-then-else

if-then-else requires:

infer env (if cond then e1 else e2):
    (check cond :: Bool)
    Tcond := infer env cond
    unify Tcond Bool

    (find types of e1, e2)
    Te1 := infer env e1
    Te2 := infer env e2

    (but they have to be the same type!)
    unify Te1 Te2
    return (apply-subst Te1)

Algebraic data types (unparameterized)

An algebraic data type definition creates two aspects:

Accordingly, it means adding two groups of inference rules. I will illustrate by an example type. Then you extrapolate to other types.

data MI = Non | Has Integer

Making:

infer env Non
    return MI

infer env (Has expr):
    (check expr :: Integer first)
    T := infer env expr
    unify T Integer
    return MI

Using: For simplicity I just cover case-expressions with simple patterns:

case eMI of { Non -> e0 ; Has v -> e1 }

This is like if-then-else on steroid. We require:

So here we go:

infer env (case eMI of { Non → e0 ; Has ve1 }):
    (check eMI :: MI)
    T := infer env eMI
    unify T MI

    (infer e0)
    T0 := infer env e0

    (infer e1, note that local var v::Integer is in scope)
    T1 := infer (env plus v::Integer) e1

    (but they have to be the same type!)
    unify T0 T1
    return (apply-subst T0)

Algebraic data types (parameterized)

Parameterized algebraic data types are similar, but there are additional unknowns to create for instantiating the type variables. Example:

data Maybe a = Nothing | Just a

The data constructors are polymorphic—you can think of them as:

Nothing :: ∀a. Maybe a
Just :: ∀a. a -> Maybe a

Making:

infer env Nothing
    create new unknown u
    return (Maybe u)

infer env (Just expr)
    T := infer env expr
    return (Maybe T)

Using:

infer env (case eM of { Nothing → e0; Just ve1 }):
    (check eM :: Maybe ???)
    T := infer env eM
    create new unknown u
    unify T (Maybe u)

    (infer e0)
    T0 := infer env e0

    (infer e1, note local var v::u since eM = Just v :: Maybe u)
    T1 := infer (env plus v::u) e1

    (but they have to be the same type!)
    unify T0 T1
    return (apply-subst T0)

Extrapolate this to standard parameterized types such as lists and tuples.

The 2-tuple type can be understood as

data Tuple2 a b = MkTuple2 a b

except for the syntactic difference between (e1, e2) :: (T1, T2) and MkTuple2 e1 e2 :: Tuple2 T1 T2.

Likewise, with a syntactic change, the list type can be understood as

data List a = Nil | Cons a (List a)

let (single definition, non-recursive)

People expect these to be legal:

let f = \x -> x       -- expect f :: ∀a. a -> a
in (f True, f 4)

let e = []            -- expect e :: ∀a. [a]
in (True : e, 4 : e)

This means when “let” is used for local definitions (and similarly for “where”), we need to generalize an inferred type to a polymorphic type, e.g., \x -> x is normally just u→u with an unknown u, but here it is generalized to ∀a. a → a so f is polymorphic.

infer env (let v = exprv in expr)
    Tv := infer env exprv
    Tvg := generalize env Tv
    infer (env plus v::Tvg) expr

generalize env T:
    env-updated := like env but all types are rewritten by apply-subst
    (Unknowns in env may have become solved recently,
    so we need env-updated for the following to make sense.)

    for each unknown ui in T:
        if ui doesn’t also appear in env-updated:
            create new type variable, say ai
    return (∀(the ai’s) . (T but replace each ui by ai))

Example of generalize:
Suppose T = u2 → u3 → u4
Suppose u3 appears in env-updated, u2 and u4 don’t.
Then we generalize u2 to a2, u4 to a4, but leave u3 as is.
Result: ∀a2 a4. a2 → u3 → a4

If an unknown ui appears in env-updated, we don’t want to generalize it in the answer. This is because ui participates in an outer context that may solve it to a specific type later. So ui is not supposed to be generalized, at least not now.

Example: let f = \x→x in (f True, f 4)

infer {} (let f = \x → x in (f True, f 4))
    T := infer {} (\x → x)
        …
    T = u → u
    Tg := generalize {} (u → u)
        env-updated := {}
        create new type variable a for u in (u → u)
        return (∀a. (u → u but replace u by a)) = ∀a. a → a
    Tg = ∀a. a → a
    infer {f :: ∀a. a → a} (f True, f 4)
        T1 := infer {f :: ∀a. a → a} (f True)
            Tf := infer {f :: ∀a. a → a} f
                (∀a. a → a needs instantiation)
                create new unknown u2
                return (u2 → u2)
            Tf = (u2 → u2)
            Te := infer {f :: ∀a. a → a} True
                return Bool
            Te = Bool
            create new unknown u3
            unify Tf (Te → u3)
                unify-intern (u2 → u2) (Bool → u3)
Updated table:
u2 := Bool
u3 := Bool
            return (apply-subst u3) = Bool
        T1 = Bool

        T2 := infer {f :: ∀a. a → a} (f 4)
            Tf := infer {f :: ∀a. a → a} f
                create new unknown u4
                return (u4 → u4)
            Tf = (u4 → u4)
            Te := infer {f :: ∀a. a → a} 4
                return Integer
            Te = Integer
            create new unknown u5
            unify Tf (Te → u5)
                unify-intern (u4 → u4) (Integer → u5)
Updated table:
u2 := Bool
u3 := Bool
u4 := Integer
u5 := Integer
            return (apply-subst u5) = Integer
        T2 = Integer

        return (apply-subst (T1, T2)) = (Bool, Integer)

Not covered this time

Multiple definitions and recursive definitions. Programmer-provided type signatures. Type classes.


I have more Haskell Notes and Examples