Making Sense of Subtypes

posted by Darryl on 16 Apr 2015

Subtypes are notoriously easy to mess up when designing a programming language, but they really shouldn't be. I think part of the reason is that most people try to set out their subtyping rules by gut instinct, because they don't really know how to do it properly. I don't know of any good tutorial on this, so fair enough. Here's that tutorial.


The general idea of subtyping is that some things (some values, some objects, whatever) of one type are more or less a variety of some other type. For instance, integers are a variety of real numbers, oaks are a variety of trees, and so on. Most programmers are familiar with this in the context of object-oriented programming and subclasses. This is summed up quite nicely by the Liskov Substitution Principle, which states that if P(x) is true for any x of type A, and B is a subtype of A, then P(y) is true for any y of type B. Or to put it in concrete terms: if we know something about trees, and oaks are a kind of tree, then we know that same something about oaks.
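As a quick sketch of the principle in code (a toy TypeScript example of my own, with made-up `Tree`/`Oak` classes and a made-up `isTall` predicate):

```typescript
// A toy class hierarchy: Oak is a subtype of Tree.
class Tree { heightM = 10; }
class Oak extends Tree { acorns = 100; }

// A property P stated for all Trees...
const isTall = (t: Tree): boolean => t.heightM > 5;

// ...automatically applies to any Oak, because Oak is a subtype of Tree.
console.log(isTall(new Oak())); // true
```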

This leads to some initially counter-intuitive facts, however. Contravariance, for instance, is a frequent issue that language designers get wrong, and this leads to quite a number of bugs. Of the following two possible rules for subtyping, one is very wrong, the other is very right. Which is which?

Proposal 1:
  If   A is a subtype of A',
  And  B is a subtype of B',
  Then A -> B is a subtype of A' -> B'

Proposal 2:
  If   A is a subtype of A',
  And  B is a subtype of B',
  Then A' -> B is a subtype of A -> B' (swapping A and A')

If you picked (1), congratulations! You just broke your programming language! But don't worry, you're in good company.

Rules for Pair Subtyping

To pin down subtypes, I'm going to use a type theoretic presentation of a number of common types from the Simply Typed Lambda Calculus, namely pair types, function types, and sum/disjoint union types. Hopefully this won't be too far out of people's comfort zones. If it is, I suggest checking out my blog post on building a type checker in JS or looking at some of the things on my suggested reading/watching list.

Let's consider pairs first. If you've got a pair p : Integer * Oak, we expect that we should be able to use it wherever we need a pair of type Real * Tree, because, after all, Integers are Reals, and Oaks are Trees, right? Pairing them up doesn't change that. The key words here are use and need. We should be able to use the one wherever we need the other.

In type theoretic terms, this means we want to be able to write a typing derivation that looks like this:

p : A*B     A*B ⊑ A'*B'
----------------------- subtype
      p : A'*B'
  some proof needing
a program of type A'*B'

To make this work, we need to explain when we're allowed to conclude A*B ⊑ A'*B' (that is, A*B is a subtype of A'*B'). Suppose, however, that we didn't have subtyping. What would we do then? Well, we would need some kind of conversion function convert : A*B -> A'*B' that we could apply, built from conversions like integerToReal or oakToTree. When the types are merely convertible, not truly subtypes, this is necessary; but if one type really is just a special kind of another type, no conversion should be necessary, and we just need to convince the type system that it's ok. In some sense, that's exactly what the statement A*B ⊑ A'*B' does. It means almost the same thing as A*B -> A'*B', but restricted to functions that are, more or less, the identity function, and which therefore don't even have to be written because they don't do anything. Of course, we probably need to know how to break this problem down into sub-problems based on A, B, A', and B', since we don't think it should work for any random choice of those.

This now gives us a hint at the right approach to subtyping: if A*B ⊑ A'*B' is more or less an identity function, but one that relies on breaking down the problem into subproblems by the types, then perhaps we can think in terms of a well-known identity function that does the very same thing: eta expansion. The process of eta expanding a value is simply one of converting it into an equivalent value (equivalent in terms of how you can use it), given ways of doing the same for sub-problems based on the types. For pairs, this just means splitting the pair into its parts, then putting them back together again. Given a pair p : A*B, we split it into fst(p) : A and snd(p) : B, and then re-pair them as (fst(p), snd(p)). Or if you prefer case analysis, case p of { (x,y) -> (x,y) }.
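Here's the pair eta expansion written out in TypeScript (representing pairs as two-element tuples, which is my own choice of encoding):

```typescript
// Projections for a pair represented as a two-element tuple.
const fst = <A, B>(p: [A, B]): A => p[0];
const snd = <A, B>(p: [A, B]): B => p[1];

// Eta expansion: split the pair into its parts, then re-pair them.
// Behaviorally, this is just the identity function on pairs.
const etaPair = <A, B>(p: [A, B]): [A, B] => [fst(p), snd(p)];

const p: [number, string] = [3, "oak"];
console.log(etaPair(p)); // a pair with the same components as p
```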

In terms of behavior, these expansions do nothing at all, which is why they constitute an identity. What if, however, we used subtyping rules instead of the recursive eta expansions? Well, we'd turn fst(p) : A into fst(p) : A' using A ⊑ A', and similarly, we'd turn snd(p) : B into snd(p) : B' using B ⊑ B', so when we paired them up, we'd get (fst(p), snd(p)) : A'*B'. Hey, wait a minute! We just turned p : A*B into (fst(p), snd(p)) : A'*B' in a way that we know preserves behavior, and it works for any such p! Surely that means A*B is a subtype of A'*B'!

Let's look at the full derivation we just proposed:

 p : A*B                             p : A*B
---------- fst                      ---------- snd
fst(p) : A       A ⊑ A'             snd(p) : B          B ⊑ B'
----------------------- subtyping   -------------------------- subtyping
      fst(p) : A'                          snd(p) : B'
      ------------------------------------------------ pair
                  (fst(p), snd(p)) : A'*B'

In some sense, this proof schema justifies the claim that A*B ⊑ A'*B' whenever we know A ⊑ A' and B ⊑ B', because we could, if necessary, decompose the larger claim into a proof built from the smaller pieces, using a proof that's almost an eta expansion. Or to put it another way, we're allowed to add the following subtyping rule because if we ever use its conclusion (which we might be skeptical about), we can always convert to using its premises (which we're less skeptical about):

A ⊑ A'   B ⊑ B'
--------------- subtyping
  A*B ⊑ A'*B'

The trick in general would seem to then be: construct an eta expansion, but insert uses of the subtyping rule wherever we can. This tells you the premises for the new subtyping rule you want. The input type and the resulting type tell you the left and right sides of the subtyping relation, respectively, for the conclusion.
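As a minimal sketch of what this rule looks like in an actual checker (the `Ty` representation, the `subtype` function, and the sample base facts Integer ⊑ Real and Oak ⊑ Tree are all my own illustration, not something from the post):

```typescript
// A toy type representation: base types and pair types only.
type Ty =
  | { tag: "base"; name: string }
  | { tag: "pair"; fst: Ty; snd: Ty };

// Assumed base-type facts, just for the example.
const baseFacts: [string, string][] = [
  ["Integer", "Real"],
  ["Oak", "Tree"],
];

function subtype(a: Ty, b: Ty): boolean {
  if (a.tag === "base" && b.tag === "base")
    return a.name === b.name ||
      baseFacts.some(([x, y]) => x === a.name && y === b.name);
  if (a.tag === "pair" && b.tag === "pair")
    // the pair rule: A*B ⊑ A'*B'  iff  A ⊑ A' and B ⊑ B'
    return subtype(a.fst, b.fst) && subtype(a.snd, b.snd);
  return false;
}

const base = (name: string): Ty => ({ tag: "base", name });
const pair = (fst: Ty, snd: Ty): Ty => ({ tag: "pair", fst, snd });

console.log(subtype(
  pair(base("Integer"), base("Oak")),
  pair(base("Real"), base("Tree"))
)); // true
```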

Rules for Sum Subtyping

Sum types are a little trickier, because their eta expansion puts us inside a case expression, but the same trick works. Here's the eta expansion for sums:

              ----- var            ----- var
              x : A                y : B
          ------------- left   -------------- right
s : A+B   left(x) : A+B        right(y) : A+B
--------------------------------------------- case
       case s of
         { left(x) -> left(x)
         ; right(y) -> right(y)
         } : A+B
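In TypeScript, the same eta expansion can be sketched with a tagged union standing in for the sum type (the `Sum` encoding and the `left`/`right` constructors are my own choices):

```typescript
// A sum type A+B as a tagged union.
type Sum<A, B> = { tag: "left"; value: A } | { tag: "right"; value: B };

const left = <A, B>(value: A): Sum<A, B> => ({ tag: "left", value });
const right = <A, B>(value: B): Sum<A, B> => ({ tag: "right", value });

// Eta expansion for sums: case on the tag and rebuild the same injection.
// Behaviorally, this is the identity function on sums.
const etaSum = <A, B>(s: Sum<A, B>): Sum<A, B> =>
  s.tag === "left" ? left(s.value) : right(s.value);

console.log(etaSum(left<number, string>(3))); // an equivalent left(3)
```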

Now, we can't use the subtyping rule before the variables are given types A and B, respectively, because the case rule prevents us, but we can use subtyping after that, to produce

          ----- var                    ----- var
          x : A       A ⊑ A'           y : B       B ⊑ B'
          ------------------ subtype   ------------------ subtype
                x : A'                       y : B'
            --------------- left        ---------------- right
s : A+B     left(x) : A'+B'             right(y) : A'+B'
-------------------------------------------------------- case
            case s of
              { left(x) -> left(x)
              ; right(y) -> right(y)
              } : A'+B'

So now we know we can add the subtyping rule

A ⊑ A'   B ⊑ B'
--------------- subtype
  A+B ⊑ A'+B'

Rules for Function Subtyping

Let's now take a look at the often tricky case of functions. First, let's construct an eta expansion for a function:

             ----- var
f : A -> B   x : A
------------------ app
     f x : B
 ---------------- lambda
 \x. f x : A -> B

If we now try to insert uses of the subtyping rule wherever possible, we can put it before the variable has type A, and after the application has type B:

             ----- var
             x : A'      A' ⊑ A
             ------------------ subtype
f : A -> B         x : A
------------------------ app
        f x : B                 B ⊑ B'
        ------------------------------ subtype
                   f x : B'
              ------------------ lambda
              \x. f x : A' -> B'

And what do you know, this leads us to the correct subtyping rule for functions with a contravariant argument:

A' ⊑ A     B ⊑ B'
----------------- subtype
A -> B ⊑ A' -> B'
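Putting all three rules together gives a small checker; the contravariant flip shows up as one swapped recursive call. (As before, the `Ty` representation, helper names, and the sample base facts Integer ⊑ Real and Oak ⊑ Tree are my own illustration.)

```typescript
// A toy type representation: base types, pairs, sums, and functions.
type Ty =
  | { tag: "base"; name: string }
  | { tag: "pair"; fst: Ty; snd: Ty }
  | { tag: "sum"; inl: Ty; inr: Ty }
  | { tag: "fun"; arg: Ty; ret: Ty };

// Assumed base-type facts, just for the example.
const baseFacts: [string, string][] = [
  ["Integer", "Real"],
  ["Oak", "Tree"],
];

function subtype(a: Ty, b: Ty): boolean {
  if (a.tag === "base" && b.tag === "base")
    return a.name === b.name ||
      baseFacts.some(([x, y]) => x === a.name && y === b.name);
  if (a.tag === "pair" && b.tag === "pair")
    // A*B ⊑ A'*B'  iff  A ⊑ A' and B ⊑ B'
    return subtype(a.fst, b.fst) && subtype(a.snd, b.snd);
  if (a.tag === "sum" && b.tag === "sum")
    // A+B ⊑ A'+B'  iff  A ⊑ A' and B ⊑ B'
    return subtype(a.inl, b.inl) && subtype(a.inr, b.inr);
  if (a.tag === "fun" && b.tag === "fun")
    // A -> B ⊑ A' -> B'  iff  A' ⊑ A (contravariant!) and B ⊑ B'
    return subtype(b.arg, a.arg) && subtype(a.ret, b.ret);
  return false;
}

const base = (name: string): Ty => ({ tag: "base", name });
const fn = (arg: Ty, ret: Ty): Ty => ({ tag: "fun", arg, ret });

// Tree -> Integer ⊑ Oak -> Real: argument contravariant, result covariant.
console.log(subtype(fn(base("Tree"), base("Integer")),
                    fn(base("Oak"), base("Real")))); // true
// But Oak -> Integer is NOT a subtype of Tree -> Real:
console.log(subtype(fn(base("Oak"), base("Integer")),
                    fn(base("Tree"), base("Real")))); // false
```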


This approach can work in general, provided that there's some notion of eta expansion. Making it work correctly for objects in an OO language will be somewhat tricky because objects don't have to expose all of their innards to allow you to reconstruct them, but you can deal with that by treating the object's hidden internals as an existentially quantified type. This produces exactly the right sort of subtyping rules for the various methods (constructor methods included!).

The uniformity of this approach is also nice because it can be completely mechanized. Any type can be automatically given appropriate subtyping rules without human input. This means that if you've designed your programming language in a type theoretic way (and have the right developer tools), you'd never mess up your subtyping rules, thereby avoiding lots of deep bugs in your language.

If you have comments or questions, get in touch. I'm @psygnisfive on Twitter, augur on freenode (in #languagengine and #haskell).