# The expression problem as a litmus test

The expression problem is a famous problem in programming languages.

“The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).”

Using *interfaces* (like in Java) as the datatype example, the problem simply asks whether it is
possible to derive the interface and add new methods to the interface, without having to recompile
existing code or to resort to using casts.

Obviously, in a OOP language it’s easy to derive interfaces, but the problem uncovers the rigidity of the type system: you can’t modify (i.e. extend) the interface, because you have to modify all the classes classes that implement the interface.

Conversely, in functional programming languages, adding new methods operating on the interface is easy. Consider the canonical OCaml example:

```
type shape = Circle of float | Rectangle of float * float
let area shp = match shp with
Circle radius -> 3.14159 *. radius *. radius
| Rectangle (width, height) -> width *. height
let vertices shp = match shp with
Circle _ -> infinity
| Rectangle (_, _) -> 4
```

So in FP, you could create a function called `volume`

that computes the volume for the existing
types, and you needn’t touch the above code. However, as soon as you do that, you realize you’ve
made a silly mistake: our shapes are *flat* so their volume is zero. Quickly, you realize you need a
three-dimensional `Cube`

shape.

```
let volume shp = match shp with
Circle _ -> 0.
| Rectangle _ -> 0.
| Cube s -> a *. a *. a (* Cube isn't defined *)
```

Oops!

Here’s the onion: to get the `Cube`

working, you’ll have to modify the existing code in two places:
the definition of `shape`

*and* both methods `area`

and `vertices`

.

In OOP, this isn’t the case, since you can just derive the new `IShape`

interface and be done with
it, but the problem arises when you’re adding the `volume`

function, because you need to modify
`IShape`

, and thus every class that derives it.

In FP, adding new functions over the datatypes is easy, but adding new cases to the datatype is tricky, because you have to modify existing functions. In OOP, adding new functions over the datatype is hard, because you need to modify each implementation; adding new cases to the datatype is easy, since all you need to do is derive the interface. FP has invented a multitude of ways to deal with this problem, ranging from type classes, traits to protocols; OOP usually solves with either patterns or open classes. Ruby’s refinements can be used for this purpose as well.

## Polymorphic variants

That’s a quick introduction to the problem. I think the expression problem is a perfect litmus test of sorts for programming languages, that is, the measure of the expressive power of the language is the quality of the solutions the language presents to the expression problem.

The expression problem is theoretically solvable in any language, but to varying degrees of
elegance. In Java one must resort to using the
visitor pattern, and in my mind this is the most
inelegant way of going about it. I would rate the solutions on a spectrum: with the most *basic*
solution being the visitor pattern, at the other end we have something like
polymorphic variants and type
classes. Multimethods and protocols are somewhere in between.

When you compare polymorphic variants of OCaml with Haskell’s type classes, there’s a marked difference in brevity. Polymorphic variants are succincter than type classes but cannot provide the same level of type safety.

```
type shape = [ `Circle of float | `Rectangle of float * float ]
let area shp = match shp with
`Circle radius -> radius *. radius *. 3.14159
| `Rectangle (w, h) -> w *. h
let vertices shp = match shp with
`Circle radius -> infinity
| `Rectangle (w, h) -> 4.
```

Not too different from the above declaration, the type is surrounded with brackets and the types are preceded with backticks. Recreating the volume function is easy.

```
let volume shp = match shp with
`Circle _ -> 0.
| `Rectangle _ -> 0.
| `Cube a -> a *. a *. a
```

So now I’ve extended the `shape`

type with another type `Cube`

, and I haven’t touched `vertices`

and
`area`

functions. The `volume`

function can be done even more succinctly:

```
let short_volume shp = match shp with
(* no volume in two dimensions! *)
#shape -> 0.
| `Cube a -> a *. a *. a
```

It is also possible to constrain the polymorphic variants:

```
let flatten shp = match shp with
#shape as x -> x
| `Cube a -> `Rectangle a
```

The type of this function is
`[ < `Circle of float | `Cube of float | `Rectangle of float * float ] -> [> shape]`

. The
`[< A | B]`

means a closed type: it can be only `A`

or `B`

, but nothing else, and `[> Foo]`

means
“Foo or something else”. So the `flatten`

function accepts `Circle`

, `Rectangle`

or `Cube`

and
returns a `shape`

(or possibly something else). Trying to run `flatten (`Sphere 4)`

produces a
type error:

```
# flatten (`Sphere 3);;
Characters 8-19:
flatten (`Sphere 3);;
^^^^^^^^^^^
Error: This expression has type [> `Sphere of int ]
but an expression was expected of type
[< `Circle of float
| `Cube of float * float
| `Rectangle of float * float ]
The second variant type does not allow tag(s) `Sphere
```

However, the following code compiles:

```
type polytope = [ shape | `Cube | `Octahedron ]
let frobnicate pt =
let flattened = flatten pt in
match flattened with
#shape -> "Already flaaat!"
| `Octagon -> "Eight coorneeeeerss"
```

The compiles, although we didn’t tell the compiler that `flatten`

does not return
`Octagon`

. There are two ways to fix this: either explicitly annotate `pt`

to be of type
`polytope`

, which produces this error:

```
Error: This expression has type polytope
but an expression was expected of type
[< `Circle of float | `Cube of float | `Rectangle of float * float ]
The second variant type does not allow tag(s) `Octahedron
```

It is possible to further constrain the type with type annotations. We can make sure that the
`flatten`

function returns *only* flat shapes:

```
let safe_flatten shp : [< shape] = match shp with
#shape as x -> x
| `Cube a -> `Rectangle a
| `Sphere r -> `Circle r
```

This produces the error:

```
Error: This pattern matches values of type [? `Octagon ]
but a pattern was expected which matches values of type shape
The second variant type does not allow tag(s) `Octagon
```

## Not a silver bullet

Unfortunately, polymorphic variants are *problematic*. The problem with polymorphic variants is you
quickly reach an absurd level of complexity and are forced to use annotations or subtyping to ensure
maximal type safety. So although polymorphic variants are *nice*, and they do let us solve the
expression problem, they’re an unsteady compromise between type safety and brevity. You can
certainly make elegant abstractions with them but they get unwieldy quickly. They aren’t as
efficient compared to regular variants either.

So what are the options? In OCaml 4.02, you can use *extensible variant types*:

```
type boring_shape = ..
type boring_shape += Circle of float | Square of float
let boring_area shp = match shp with
Circle r -> r *. r *. 3.14159
| Square a -> a *. a
| _ -> 0.
type boring_shape += Rectangle of float * float
let radical_area shp = match shp with
Circle _ as c -> boring_area c
| Square _ as s -> boring_area s
| Rectangle (w, h) -> w *. h
| _ -> 0.
```

An extensible variant is defined using `..`

, and extension is done with the `+=`

operator. The
caveat is that you must handle the default `_`

case in pattern matching. Extensible variants are
another neat trick for solving the expression problem.

## A measure of expressive power

The expression problem is a great litmus test that measures the expressive power of a
programming language. The actual measurement of the test can be either the brevity of the code or
its type safety. The solutions range from the clumsy Visitor Pattern in Java to polymorphic and
extensible variants in OCaml and to type classes in Haskell. Clojure and Elixir have
protocols that are both quite nice but not so type-safe since both
are dynamically typed languages. What is more, since the expression problem is also about type
safety, then strictly speaking the problem isn’t valid in a dynamic language. Any Lisper knows that
Lisps *are* super expressive anyway.