posted by Darryl on 21 September 2015
Pattern matching is absolutely central to modern functional programming languages, and in the type-theoretically informed ones, it forms the basis of all use of user-defined data. Inspecting a piece of data, extracting part of it, and so on are all done by pattern matching. Since it's so central, and so all-permeating, it's a good idea to make it as efficient as possible.
A non-optimized pattern matching implementation that just pairs a pattern with a clause, taking the case expression at face value, will be inefficient because it will often do a lot of redundant work. Consider, for instance, this Haskell expression:
case foo of
  (True,True)  -> bar
  (True,False) -> baz
  (False,y)    -> quux
If we implement this by matching the foo value against each pattern in turn, we end up looking at the outermost constructor of foo three separate times, each time to see if it's the (,) constructor for pairs. But once we've inspected it and know that it's (,) the first time, we never need to look at it again; it's not going to change. Similarly, if we inspect the first element of foo and find that it's True, the non-optimized matcher will look at each clause independently, even though once we find it to be True, we can rule out the third clause entirely.
One option for optimizing these is to remove all the nesting of pattern matches, so that we only ever inspect the outermost constructor. Some languages, including Haskell, if I'm not mistaken, take this approach. It's a code-to-code transformation rather than a deep implementation strategy. So for example, we can transform the expression above into the equivalent
case foo of
  (x,y) -> case x of
    True -> case y of
      True  -> bar
      False -> baz
    False -> quux
And now if we use the unoptimized matching algorithm, the resulting code is efficient. Some nuance arises when we try to actually define the transformation, because of expressions like this:
case foo of
  (True,True)  -> bar
  (False,True) -> baz
  (x,y)        -> quux
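To make the nuance concrete, here's one way the transformation might handle the overlapping catch-all, sketched as a runnable function (match is a name I'm introducing, and I've used strings in place of the clause bodies bar, baz, and quux): the catch-all clause has to be duplicated into every branch where a more specific pattern can fail.

```haskell
-- A hand-compiled sketch of the case expression above. The catch-all
-- clause (x,y) is duplicated into each branch a specific pattern misses.
match :: (Bool, Bool) -> String
match foo =
  case foo of
    (x, y) -> case x of
      True -> case y of
        True  -> "bar"
        False -> "quux"  -- falls through to the (x,y) clause
      False -> case y of
        True  -> "baz"
        False -> "quux"  -- falls through to the (x,y) clause
```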
However, there's another way to implement efficient pattern matching that comes at it from the perspective of treating efficiency as a core component, rather than a code optimization decision. I don't know if this is a better approach, but it's at least an interesting one, so I figured I'd explain it here.
A hint at a solution?
When I was first thinking about this, I got the idea to play around with defining some kind of "fused" pattern that could represent multiple patterns simultaneously. For instance, if we have the natural number patterns Suc Zero (for 1) and Suc (Suc Zero) (for 2), we could somehow splice them together so that the outer Suc constructor has both Zero and Suc Zero as "alternative" daughters. Something vaguely like Suc (Zero `or` Suc Zero), which would match 1 and also 2 (so it means "either 1 or 2"). Or we might have something like Suc (Zero `or` Suc (Suc (Zero `or` Suc (Suc Zero)))), which matches 1, 3, and 5, and so on.
We can implement this, and a matcher, pretty easily for natural numbers. Here's an example, without treating data as general trees as we would in an implementation of an actual language:
data Nat
  = Zero
  | Suc Nat

data NatPattern
  = ZeroP
  | SucP NatPattern
  | AnyNatP
  | NatChoiceP NatPattern NatPattern

natMatch :: NatPattern -> Nat -> Bool
natMatch ZeroP Zero = True
natMatch (SucP p) (Suc n) = natMatch p n
natMatch AnyNatP _ = True
natMatch (NatChoiceP p p') n = natMatch p n || natMatch p' n
natMatch _ _ = False
This approach seems quite nice, because we can define a function that takes a bunch of patterns that make no use of NatChoiceP and combines them into a single minimal pattern, with NatChoiceP splicing them together appropriately, so that the following equivalences hold:
NatChoiceP ZeroP ZeroP ≅ ZeroP
NatChoiceP (SucP p) (SucP p') ≅ SucP (NatChoiceP p p')
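A minimal sketch of such a splicing function, assuming the NatPattern type above (combineNat is a name I'm introducing here, not from the post):

```haskell
-- Splice two choice-free patterns together, pushing the choice as deep
-- as the equivalences above allow. Assumes the NatPattern type defined
-- earlier.
combineNat :: NatPattern -> NatPattern -> NatPattern
combineNat ZeroP    ZeroP     = ZeroP
combineNat (SucP p) (SucP p') = SucP (combineNat p p')
combineNat p        p'        = NatChoiceP p p'
```

For example, combineNat (SucP ZeroP) (SucP (SucP ZeroP)) yields SucP (NatChoiceP ZeroP (SucP ZeroP)), the "either 1 or 2" pattern from before.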
However, this idea isn't without its fatal flaws. So far, we've only seen what happens with a simple type like Nat, which only has constructors with 0 or 1 argument, but what do we do for a type that has a constructor with multiple arguments? Consider this type:
data BinTree
  = Leaf
  | Branch BinTree BinTree

data BinTreePattern
  = LeafP
  | BranchP BinTreePattern BinTreePattern
  | AnyBinTreeP
  | BinTreeChoiceP BinTreePattern BinTreePattern

binTreeMatch :: BinTreePattern -> BinTree -> Bool
binTreeMatch LeafP Leaf = True
binTreeMatch (BranchP l r) (Branch l' r')
  = binTreeMatch l l' && binTreeMatch r r'
binTreeMatch AnyBinTreeP _ = True
binTreeMatch (BinTreeChoiceP p p') t
  = binTreeMatch p t || binTreeMatch p' t
binTreeMatch _ _ = False
When we have a choice between a Leaf and either a Leaf or a Branch, the corresponding pattern is easy enough: it's just BinTreeChoiceP. But what about when it's a choice between two Branches?
Perhaps what we want is to have the following equivalencies:
BinTreeChoiceP LeafP LeafP ≅ LeafP
BinTreeChoiceP (BranchP l r) (BranchP l' r')
≅ BranchP (BinTreeChoiceP l l') (BinTreeChoiceP r r')
But this is definitely wrong, because in BinTreeChoiceP (BranchP l r) (BranchP l' r'), the choice of l for the left node always goes with the choice of r for the right node, and similarly for l' and r'. However, in the "equivalent" BranchP (BinTreeChoiceP l l') (BinTreeChoiceP r r'), the choices of the left and right branches are independent. To use a concrete example:
tree = Branch Leaf (Branch Leaf Leaf)

pattern1 = BinTreeChoiceP
             (BranchP LeafP LeafP)
             (BranchP (BranchP LeafP LeafP) (BranchP LeafP LeafP))

pattern2 = BranchP
             (BinTreeChoiceP LeafP (BranchP LeafP LeafP))
             (BinTreeChoiceP LeafP (BranchP LeafP LeafP))
If we match tree against each pattern, we get binTreeMatch pattern1 tree == False while binTreeMatch pattern2 tree == True. pattern2 will also match the mirror of tree, for a total of 4 matching trees, while pattern1 only matches the two in its choice.
This really is no surprise. There are many ways to view this that make sense of it, but I think the best way is to realize that multiple arguments produce conjunctive patterns (p has to match and p' has to match) whereas choice produces disjunctive patterns (p has to match or p' has to match), and conjunction and disjunction don't commute: (a && b) || (c && d) is not the same as (a || c) && (b || d). In regular expressions we get the same thing: ab|cd is not the same as (a|c)(b|d). There's a deeper reason for all of these, of course (having to do with the non-commutation of products and coproducts), but the intuition should be clear now.
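A two-line witness of the non-commutation, using concrete truth values (a sketch with a = d = True and b = c = False):

```haskell
-- (a && b) || (c && d) vs (a || c) && (b || d) with a = d = True and
-- b = c = False: the first is False, the second is True.
lhs, rhs :: Bool
lhs = (True && False) || (False && True)
rhs = (True || False) && (False || True)
```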
In addition to this problem, there is an even bigger, more fatal problem: we're supposed to be doing pattern matching, not just testing if some piece of data is equal to any of a specified collection. Pattern matching requires that we form some kind of lookup, mapping elements of the specified collection to the associated clause. So these "choice" patterns are analogous to sets (Set Nat, Set BinTree), while what we actually need is something more like a map (Map Nat Clause, Map BinTree Clause). As it is, once we've found a matching choice, we have no way of saying what the associated clause is.
The better solution
If we take a look at the types NatPattern and BinTreePattern, ignoring choice for now, we can notice that any such pattern defines an "upper" portion of data-as-tree. That is to say, patterns are exactly like values, except patterns have an "any" label that indicates anything (of the right type) can go there.
A piece of data matches a pattern when it has an upper portion that's the same as the pattern. Matching involves traversing the pattern and the data at the same time, making the same traversal moves, as determined by the pattern; when the labels don't match and the moves can't be made, the data doesn't match the pattern.
This provides the key insight we need. We can view matching as simultaneous traversal. Given any pattern, and any fixed traversal order, we can convert the pattern to a list of label names, and then traverse the data in the same order, checking each pattern name in turn. So, for instance, the pattern SucP (SucP ZeroP) would instead be given as, say, [SucPL,SucPL,ZeroPL], using preorder traversal, and similarly, BranchP LeafP (BranchP LeafP LeafP) would instead be [BranchPL,LeafPL,BranchPL,LeafPL,LeafPL].
While a list of constructor names isn't quite the solution yet, we can see how matching would work in this alternative formulation of patterns:
data BinTreePatternCons
  = LeafC
  | BranchC
  | AnyBinTreeC
  deriving (Eq, Ord, Show)

type BinTreePatternC = [BinTreePatternCons]

binTreeMatchC :: BinTreePatternC -> BinTree -> Bool
binTreeMatchC p t = go p [t]
  where
    go [] [] = True
    go (LeafC:p) (Leaf:ts) = go p ts
    go (BranchC:p) (Branch l r:ts) = go p (l:r:ts)
    go (AnyBinTreeC:p) (_:ts) = go p ts
    go _ _ = False
This uses a typical tail-recursive traversal of a tree on a stack, which is quite easy to implement and which produces a preorder traversal.
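To connect the two representations, here's a sketch of the flattening step plus a usage example (flattenPat is a name I'm introducing; this assumes the BinTreePattern, BinTreePatternC, and binTreeMatchC definitions above, and only handles choice-free patterns):

```haskell
-- Flatten a choice-free pattern into its preorder list of labels.
flattenPat :: BinTreePattern -> BinTreePatternC
flattenPat LeafP         = [LeafC]
flattenPat (BranchP l r) = BranchC : flattenPat l ++ flattenPat r
flattenPat AnyBinTreeP   = [AnyBinTreeC]
flattenPat (BinTreeChoiceP _ _) =
  error "flattenPat: choice patterns are handled later"
```

For example, flattenPat (BranchP LeafP AnyBinTreeP) is [BranchC,LeafC,AnyBinTreeC], and matching that list against Branch Leaf (Branch Leaf Leaf) with binTreeMatchC succeeds.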
As I said, though, this isn't the final way we're going to do this, because we still haven't taken choices into account, nor the fact that clauses have to pair with them. But we at least now have a sense of the process. To handle choices, we need to be able to say that at any given point, there are some choices of constructors to read next, not just one, and for each option we need to give a traversal. So a traversal becomes a kind of map, from constructor names to traversals. But it can't be just a map, because at some point the traversal has to end, so we'll have two kinds of traversals:
import qualified Data.Map as M

data Traversal a
  = Done
  | Choice (M.Map a (Traversal a))

binTreeMatchT :: Traversal BinTreePatternCons -> BinTree -> Bool
binTreeMatchT p t = go p [t]
  where
    go Done [] = True
    go (Choice m) (Leaf:ts) = goCon m LeafC ts ts
    go (Choice m) (Branch l r:ts) = goCon m BranchC (l:r:ts) ts
    go _ _ = False

    goCon m c ts ts'
      = case M.lookup c m of
          Just p | go p ts -> True
          _ -> case M.lookup AnyBinTreeC m of
                 Just p -> go p ts'
                 Nothing -> False
The auxiliary function go is relatively straightforward: if the pattern traversal wants to end, and there are no more trees to decompose, we've succeeded. On the other hand, if we have a choice, and we can proceed by trying to decompose a tree with a specified constructor, we call the goCon auxiliary function to handle that decomposition. Otherwise, we fail.
The auxiliary function goCon is somewhat subtle, however. Because we have variable patterns like AnyBinTreeC, we might find that the pattern doesn't have an option for the specified constructor but does have a catch-all option. So we need to provide goCon with the stack of trees for both eventualities. For instance, if a tree is Branch l r, we need to proceed with a stack that has l and r on it if the pattern can inspect the branch node, but if it can only match with AnyBinTreeC, we don't need to put l and r on the stack.
Furthermore, in the case that the constructor does show up in the pattern, that doesn't guarantee that the associated traversal will succeed. If it doesn't, the pattern match as a whole can still succeed, because AnyBinTreeC can be tried as an alternative. For example, consider this case expression from earlier:
case foo of
  (True,True)  -> bar
  (False,True) -> baz
  (x,y)        -> quux
In the first and second cases, the traversals for their patterns can both fail even after a partial match is made. Suppose, for instance, that foo = (True,False). If we go through the traversal of (True,True), we'll succeed in matching the pair constructor and then the first True, and then fail at the second True. There is no path through the traversal that goes pair-true-false or even pair-true-any. However, there is a path that goes pair-any-any, namely, the third pattern (x,y). The traversal for this case expression's patterns would look like this, in tree form:
(,)
├── True
│   └── True
├── False
│   └── True
└── x
    └── y
This is why, in the definition of goCon, even if we successfully find a traversal choice that matches the input data's constructor, we have the guard in Just p | go p ts that requires the whole rest of that traversal to succeed before we consider that choice successful.
If we now try out this matching function with a traversal, it works exactly the same as the list version. This is to be expected: when the choice traversals have exactly one choice each, the traversal is isomorphic to a list. However, if we have 0 or 2+ alternatives, we get a full set of trees to match.
So we've solved the problem of choice in the patterns from earlier; all that remains is to figure out how to pair up the appropriate clauses with their patterns. This is very simple, however. We can augment the Done constructor to carry a Clause, and then augment the binTreeMatchT function to return Maybe Clause instead of a Bool. The change is almost too trivial to bother showing:
data Traversal a
  = Done Clause
  | Choice (M.Map a (Traversal a))
  deriving (Show)

binTreeMatchT :: Traversal BinTreePatternCons -> BinTree -> Maybe Clause
binTreeMatchT p t = go p [t]
  where
    go (Done c) [] = Just c
    go (Choice m) (Leaf:ts) = goCon m LeafC ts ts
    go (Choice m) (Branch l r:ts) = goCon m BranchC (l:r:ts) ts
    go _ _ = Nothing

    goCon m c ts ts'
      = case M.lookup c m of
          Just p | Just c' <- go p ts -> Just c'
          _ -> case M.lookup AnyBinTreeC m of
                 Just p -> go p ts'
                 Nothing -> Nothing
Lastly, it would be useful to have a way to combine patternsastraversals, so that we can take a list of traversals (one for each pattern+clause) and turn them into the appropriate total traversal by combining them one at a time. This is pretty easy:
combine :: Ord a => Traversal a > Traversal a > Traversal a
combine (Done c) (Done c') = Done c
combine (Choice m) (Choice m') = Choice (M.unionWith combine m m')
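With combine in hand, the one missing piece is compiling a single pattern into a traversal. Here's a sketch of how that might look (toTraversal is my name for it; it assumes the BinTreePattern, Traversal, and combine definitions above, and threads a continuation traversal through the pattern in preorder):

```haskell
-- Compile one pattern and its clause into a traversal by threading the
-- "rest of the traversal" (the continuation k) through in preorder.
toTraversal :: BinTreePattern -> Clause -> Traversal BinTreePatternCons
toTraversal pat c = go pat (Done c)
  where
    go LeafP k = Choice (M.singleton LeafC k)
    go (BranchP l r) k = Choice (M.singleton BranchC (go l (go r k)))
    go AnyBinTreeP k = Choice (M.singleton AnyBinTreeC k)
    go (BinTreeChoiceP p p') k = combine (go p k) (go p' k)
```

A whole case expression would then be foldr1 combine (zipWith toTraversal patterns clauses), matched with binTreeMatchT.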
Discussion
Now, obviously a real functional programming language won't be using this BinTree type for data; it'll use something more like RoseTree String, where RoseTree is defined as

data RoseTree a = RoseTree a [RoseTree a]

But the matching function for this is basically the same, if not simpler. Implementation is left as an exercise for the reader.
Outside of the domain of implementing a functional programming language, this way of viewing tree-structured data has some interesting manifestations. We can step back from the pattern matching and look instead at data in general encoded this way, and it turns out that in Haskell, we can define a data family that can encode various types. I call it Lin (because this represents a way to linearize or serialize data), and it comes with an encoding and a decoding operation:
data family Lin a r

class Linearizable a where
  linearize :: a -> r -> Lin a r
  delinearize :: Lin a r -> (a, r)
The a parameter represents the data type being encoded, and the r parameter represents the "rest" of the data, whatever shows up after the a stuff. On simple data, the definition of Lin is easy: we just wrap up an r in constructors that correspond to the simple data type. For example, on booleans:
data instance Lin Bool r where
  LinBoolTrue  :: r -> Lin Bool r
  LinBoolFalse :: r -> Lin Bool r

instance Linearizable Bool where
  linearize True r = LinBoolTrue r
  linearize False r = LinBoolFalse r
  delinearize (LinBoolTrue r) = (True, r)
  delinearize (LinBoolFalse r) = (False, r)
For a simple recursive type like Nat the encoding is also simple, except that for the recursive constructors we just have a recursive argument. The r argument is only present for non-recursive constructors.
data Nat = Zero | Suc Nat

data instance Lin Nat r where
  LinNatZero :: r -> Lin Nat r
  LinNatSuc  :: Lin Nat r -> Lin Nat r

instance Linearizable Nat where
  linearize Zero r = LinNatZero r
  linearize (Suc n) r = LinNatSuc (linearize n r)
  delinearize (LinNatZero r) = (Zero, r)
  delinearize (LinNatSuc nr) = let (n, r) = delinearize nr
                               in (Suc n, r)
But the interesting stuff happens with multiple arguments. Like our traversals, we end up representing compound data in a very weird way, where something that is "after" something else is "inside" that something else. The definition of Lin has constructors corresponding to the encoded data type, but because "after" means "inside", when we have multiple arguments, we put the "later" arguments in the r position of the "earlier" arguments. For instance, pairs:
data instance Lin (a,b) r where
  LinPair :: Lin a (Lin b r) -> Lin (a,b) r

instance (Linearizable a, Linearizable b) => Linearizable (a,b) where
  linearize (a,b) r = LinPair (linearize a (linearize b r))
  delinearize (LinPair abr) = let (a, br) = delinearize abr
                                  (b, r)  = delinearize br
                              in ((a,b), r)
Or lists:
data instance Lin [a] r where
  LinNil  :: r -> Lin [a] r
  LinCons :: Lin a (Lin [a] r) -> Lin [a] r

instance Linearizable a => Linearizable [a] where
  linearize [] r = LinNil r
  linearize (x:xs) r = LinCons (linearize x (linearize xs r))
  delinearize (LinNil r) = ([], r)
  delinearize (LinCons aasr) = let (a, asr) = delinearize aasr
                                   (as, r)  = delinearize asr
                               in (a:as, r)
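To see the "after means inside" nesting concretely, here's a small round-trip sketch, using () as the trivial "rest of the data" (this assumes the Linearizable instances above):

```haskell
-- linearize [True, False] () builds the nested value
--   LinCons (LinBoolTrue (LinCons (LinBoolFalse (LinNil ()))))
-- and delinearize unwinds it back to the original list.
roundTrip :: ([Bool], ())
roundTrip = delinearize (linearize [True, False] ())
```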
The most interesting aspect of this, however, is not that this can be done. We ought to expect this can be done, since this is just a variation on the traversal theme where, instead of using maps and constructor names and so forth, we define this funky Lin data family. No, what's most interesting is the types of the Lin constructors. Let's define some auxiliary types to make things cleaner:
type (~>) f g = forall a. f a -> g a
type (.:) f g a = f (g a)
type Id a = a
The first can be seen as roughly the arrow for natural transformations, and the second is roughly functor composition. So what happens when we rewrite the Lin constructors with these?
LinBoolTrue  :: Id ~> Lin Bool
LinBoolFalse :: Id ~> Lin Bool
LinNatZero   :: Id ~> Lin Nat
LinNatSuc    :: Lin Nat ~> Lin Nat
LinPair      :: Lin a .: Lin b ~> Lin (a,b)
LinNil       :: Id ~> Lin [a]
LinCons      :: Lin a .: Lin [a] ~> Lin [a]
These look suspiciously like the types (in category-theoretic form) of the constructors of the types being encoded. The functorial arrow ~> corresponds to the type arrow ->, and functorial composition .: corresponds to pairing (,).
I don't quite know what the right categorical treatment of this is, but it seems that Lin is a functor from Set to Set^Set that somehow preserves algebras in some way?
If you have comments or questions, get in touch. I'm @psygnisfive on Twitter, augur on freenode (in #languagengine and #haskell).