# Chapter 5: Efficient Compilation of Pattern-Matching

Example:

```
mappairs f [] ys = []
mappairs f (x:xs) [] = []
mappairs f (x:xs) (y:ys) = f x y : mappairs f xs ys
```

The trivial compilation could check 5 times in total -- for the last case. But we can compile it into a more efficient version with `case`

.

```
mappairs f xs ys =
case xs of
NIL => NIL
CONS x xs =>
case ys of
NIL => NIL
CONS y ys => CONS (f x y) (mappairs f xs ys)
```

We will study the algorithm which can transform the pattern-matching to case syntax.

First, let's define the `match`

syntax, which can be transformed to trivially from pattern matching syntax for function definition.

## The Variable Rule

can be simplified to

## The Constructor Rule

can be simplified to

in which each $qs'_i$ if of form

## The Empty Rule

$$

$$

is reduced to $E_1 | E_2 | ... | E_m | E$, which is $E_1$ if $m > 0$ and is $E$ otherwise.

## The Mixture Rule

When the first pattern is *mixed*, i.e., composed of both variable and constructor, we can't use the above rules. Instead, we may use nested match to decompose the original match's branches, and then transform each single-branch match into a `case`

expression. However, that could produce redundant computation due to the nested `case`

might involve same variable. Thus, we need a way to optimize it.

## The Pattern-Matching Compiler

I shall read through the text and blind-implement it in Haskell.

See my gist.github.com

## Optimization

Suppose that `FAIL`

is returned by an expression `e`

, then one of the following conditions must hold

`FAIL`

is mentioned explicitly in`e`

`E`

contains a pattern-matching lambda abstraction whose application may fail`FAIL`

is the value of one of the free variables of`e`

After compilation, case 2 doesn't exist. And programmer can't write `FAIL`

explicitly, so case 3 can't hold as well.

So we focus on case 1, i.e., on all the places where `FAIL`

can be introduces explicitly by the compiler.

### Rules for transforming $|$ and $FAIL$

- if $E_1$ can't return $FAIL$, then $E_1 | E_2 \equiv E_1$
- $E | FAIL \equiv E$
- $FAIL | E \equiv E$
- $(IF \, E_1 \, E_2 \, E_3) | E \equiv IF \, E_1 \, E_2 \, (E_3 | E)$

### Eliminating $|$ and $FAIL$

Transform the pairs of guard and alternative into nested `IF`

and finalized the innermost $else$ with either $FAIL$ or default case provided by $otherwise$.

However, we can also do some tricks in the process of compilation to machine code.

## Uniform equations

Sometimes, the order in which we write clauses of matching in function definition might effect the determinism property.

For example:

```
diagonal x True False = 1 -- (1)
diagonal False y True = 2 -- (2)
diagonal True False z = 3 -- (3)
```

If we evaluate $diagonal bottom True False$, then the above order will be `1`

, but if we interchange `(1)`

with `(3)`

, then it will not terminate.

### Definition of "Uniformity"

- All equations begin with a variable pattern, and applying the variable rule yields a new set of uniform equations
- Or, all equations begin with a constructor pattern, and applying the constructor rule yields a new set of uniform equations
- Or, all equations have an empty list of pattern, and the empty rule applies and there is at most one equations in the set

### Properties of "Uniformity"

If a definition is uniform, *changing the order* of the equations does not change the meaning of the definition. And vice versa.