Type Checking
in JavaScript

posted by Darryl on 16 Jan 2015

I'd like to follow up to my previous blog post on implementing a simply typed lambda calculus in JavaScript by demonstrating a more complete, more natural type system. If you haven't read it already, you should do so. The previous type system was a little unnatural because it had no type inference mechanism, and so all of the relevant type information had to be provided in the programs themselves. This time, however, we'll build a type checker that has inference, so we can write more natural programs.

One thing to note regarding type inference like this is that well-designed type programming languages, with properly defined type systems, can infer a lot in the way of types. One often hears that having to write a type signature is just a pain in the butt, and therefore strictly typed programming languages aren't as pleasant to use as untyped or duck-typed ones. But with type inference, you frequently don't need to write type signatures at all. In a language like Haskell, for instance, type signatures can very frequently be omitted.

The GitHub repo for this type checker can be found here.

To begin, let's define the type theory we'd like to embody. We'll again have just pair types and function types, as well as three primitive types Foo, Bar, and Baz which will serve as place holders until we implement some other types in a later post. Again we'll start with the type formation judgment A type that tells us when something is a type or not:

A type
======

-------- Foo Formation      -------- Bar Formation
Foo type                    Bar type


            -------- Baz Formation
            Baz type


A type    B type                  A type    B type
---------------- * Formation      ---------------- -> Formation
    A*B type                        A -> B type

The JavaScript implementation will be the same as before:

var Foo = { tag: "Foo" };
var Bar = { tag: "Bar" };
var Baz = { tag: "Baz" };

function prod(a,b) {
    return { tag: "*", left: a, right: b };
}

function arr(a,b) {
    return { tag: "->", arg: a, ret: b };
}

function judgment_type(a) {
  if ("Foo" == a.tag || "Bar" == a.tag || "Baz" == a.tag) {
      
      return true;
      
  } else if ("*" == a.tag) {
      
      return judgment_type(a.left) && judgment_type(a.right);
      
  } else if ("->" == a.tag) {
      
      return judgment_type(a.arg) && judgment_type(a.ret);
      
  } else {
      
      return false;
      
  }
}

As before, we'll use a snoc linked-list for encoding variable contexts:

var empty = { tag: "<>" }

function snoc(g,x,a) {
    return { tag: ",:", rest: g, name: x, type: a };
}

We'll have new intro and elim rules for the judgment G !- M : A which defines well-typed-ness. These will define programs that have less annotation for types, and which are therefore more natural and easier to use.

G !- M : A
==========

G !- M : A    G !- N : B
------------------------ * Intro
   G !- (M,N) : A*B

G !- P : A*B    G, x : A, y : B !- M : C
---------------------------------------- * Elim
     G !- split P as (x,y) in M : C

This time, our split elim, which does pattern matching for pairs, is not annotated for the types of x and y. If you look at the stuff above the inference line, you see A and B, but these don't appear below the line. If we wanted to simply type check, we'd need to invent these out of thin air, and there are a lot of ways to do that. So instead we can do inference on P, to get back it's type, which had better be of the form A*B, and proceed from there. So, we'll have the old checking function from before, but also a new inferring function:

function pair(m,n) {
  return { tag: "(,)", first: m, second: n };
}

function split(p, x, y, m) {
    return { tag: "split",
             pair: p,
             name_x: x, name_y: y,
             body: m };
}

// judgment_check will be modified shortly
function judgment_check(g, m, a) {
    
    if ("(,)" == m.tag && "*" == a.tag) {
        
        return judgment_check(g, m.first, a.left) &&
               judgment_check(g, m.second, a.right);
        
    } else if ("split" == m.tag) {
        
        var inferred_pair = judgment_infer(g, m);
        
        if (!inferred_pair || "*" != inferred_pair.tag) {
            return false;
        }
        
        return judgment_check(snoc(snoc(g, m.name_x, inferred_pair.left),
                                   m.name_y, inferred_pair.right),
                              m.body,
                              a);
    
    } else {
        
        return false;
        
    }
    
}

This much is basically the same as before, but we'll also define now a function judgment_infer:

// judgment_infer will also be modified shortly
function judgment_infer(g, m) {
    
    if ("(,)" == m.tag) {
        
        var inferred_left = judgment_infer(g, m.first);
        var inferred_right = judgment_infer(g, m.second);
        
        if (!inferred_left || !inferred_right) {
            return null;
        }
        
        return prod(inferred_left, inferred_right);
        
    } else if ("split" == m.tag) {
        
        var inferred_pair = judgment_infer(g, m.pair);
        
        if (!inferred_pair || "*" != inferred_pair.tag) {
            return null;
        }
        
        return judgment_infer(snoc(snoc(g, m.name_x, inferred_pair.left),
                                   m.name_y, inferred_pair.right),
                              m.body);
        
    } else {
        
        return null;
        
    }
    
}

This new function has some obvious similarities to judgment_check, but instead of checking if a term has the appropriate type, it computes the correct type and returns that.

Our definitions for function types are very similar to what we had before as well:

 G, x : A !- M : B
-------------------- -> Intro
G !- \x:A.M : A -> B

G !- M : A -> B    G !- N : A
----------------------------- -> Elim
        G !- M N : B

Unlike before, however, we now tag lambdas with their argument type. This particular type inference algorithm isn't smart enough to solve that for us (to do that would require a unification-based algorithm).

And the implementation of judgment_check is only slightly modified, including the syntax for function application. Again, tho, we use judgment_infer to recover the missing content of the elim rule.

function lam(x,a,m) {
    return { tag: "lam", name: x, arg_type: a, body: m };
}

function app(m,n) {
    return { tag: "app", fun: m, arg: n };
}

// the new version
function judgment_check(g, m, a) {
    
    ... else if ("lam" == m.tag && "->" == a.tag) {
        
        return type_equality(m.arg_type, a.arg) &&
               judgment_check(snoc(g, m.name, a.arg), m.body, a.ret);
        
    } else if ("app" == m.tag) {
        
        var inferred_arg = judgment_infer(g, m.arg);
        
        if (!inferred_arg) {
            return false;
        }
        
        return judgment_check(g, m.fun, arr(inferred_arg, a));
        
    } ...
    
}

And now the modification to judgment_infer:

// judgment_infer will also be modified shortly
function judgment_infer(g, m) {
    
    ... else if ("lam" == m.tag) {
        
        var inferred_body = judgment_infer(snoc(g, m.name, m.arg_type),
                                           m.body);
        
        if (!inferred_body) { return null; }
        
        return arr(m.arg_type, inferred_body);
        
    } else if ("app" == m.tag) {
        
        var inferred_fun = judgment_infer(g, m.fun);
        
        if (!inferred_fun || "->" != inferred_fun.tag ||
            !judgment_check(g, m.arg, inferred_fun.arg)) {
            
            return null;
        }
        
        return inferred_fun.ret;
        
    } ...
    
}

And of course we have the variable rule as before:

function v(n) {
    return { tag: "variable", name: n };
}

function judgment_check(g, m, a) {
    
    ... else if ("variable" == m.tag) {
        
        return type_equality(type_lookup(g, m.name), a);
        
    } ...
    
}

To infer the type of a variable, we'll need to look it up in the context:

function type_lookup(g, n) {
    
    if ("<>" == g.tag) {
        
        return null;
        
    } else if (n == g.name) {
        
        return g.type
        
    } else {
        
        return type_lookup(g.rest, n);
        
    }
    
}

But with that, inference is easy:


function judgment_infer(g, m) {
    
    ... else if ("variable" == m.tag) {
        
        return type_lookup(g, m.name);
        
    } ...
    
}

Lastly, type equality, which is unchanged:

function type_equality(a,b) {
    
    if (("Foo" == a.tag && "Foo" == b.tag) ||
        ("Bar" == a.tag && "Bar" == b.tag) ||
        ("Baz" == a.tag && "Baz" == b.tag)) {
        
        return true;
        
    } else if ("*" == a.tag && "*" == b.tag) {
        
        return type_equality(a.left, b.left) &&
               type_equality(a.right, b.right);
        
    } else if ("->" == a.tag && "->" == b.tag) {
        
        return type_equality(a.arg, b.arg) &&
               type_equality(a.ret, b.ret);
        
    } else {
        
        return false;
        
    }
    
}

For convenience in inspecting what's going on with these functions, it'll help to have a function to convert types to strings, so let's define one:

function show(a) {
    if ("Foo" == a.tag) { return "Foo"; }
    if ("Bar" == a.tag) { return "Bar"; }
    if ("Baz" == a.tag) { return "Baz"; }
    if ("*" == a.tag) {
        return "(" + show(a.left) + "*" + show(a.right) + ")";
    }
    if ("->" == a.tag) {
        return "(" + show(a.arg) + " -> " + show(a.ret) + ")";
    }
    return "Unknown";
}

If we now infer the types for some common functions, we'll get exactly what we expect:

// \x : Foo. x
// infers to  Foo -> Foo
judgment_infer(empty, lam("x",Foo, v("x")));

// \x : Foo. \f : Foo -> Bar. f x
// infers to  Foo -> (Foo -> Bar) -> Bar
judgment_infer(empty, lam("x",Foo,
                        lam("f",arr(Foo,Bar),
                          app(v("f"), v("x")))));

// \p : Foo*Bar. split p as (x,y) in x
// infers to  Foo*Bar -> Foo
judgment_infer(empty, lam("p",prod(Foo,Bar),
                        split(v("p"),"x","y", v("x"))));

If you have comments or questions, get it touch. I'm @psygnisfive on Twitter, augur on freenode (in #languagengine and #haskell). Here's the HN thread if you prefer that mode, and also the Reddit thread.