# Susan Potter

## Algebraic Data Types

### Tue November 11, 2012

An algebraic data type can take many forms:

• Sum type

• Product type

• Hybrid Sum/Product type

• Recursive type (not covered in this post)

### Sum Types (aka Tagged Unions)

A sum type is a type that has a known and exhaustive list of constructors where a value of that type must be constructed by exactly one of the constructors. Also known as a tagged union or coproduct.

A sum type is a type definition where our type has a finite number of known constructions. Each construction may hold zero or more components of specific types (although these components may vary from construction to construction).

The simplest definition of a sum type is the `Unit` type which would look something like this in Scala:

``case object Unit``

It has example one constructor (`Unit`) with no arguments.

Let's step back to a basic example of a sum type (which is one variety of ADTs) to illustrate the difference between a sum type and a pure product type.

``````// Scala here to mix it up :)

sealed trait USCoin                 { def value: Int  }
case object Penny   extends USCoin  { def value = 1   }
case object Nickel  extends USCoin  { def value = 5   }
case object Dime    extends USCoin  { def value = 10  }
case object Quarter extends USCoin  { def value = 25  }
case object Dollar  extends USCoin  { def value = 100 }``````

The above just tells Scala that there is this trait (in this case we can pretend it is the same as the LHS of Haskell's data declaration, but it isn't always this way), USCoin, that has a finite, and known number of constructors upfront (typically how case classes in Scala get used [well]). Note that the trait is sealed. For non-Scala people, this just means that outside of this source file, no extensions of this trait are allowed. i.e. the compiler can guarantee that consumers of our libraries or toolkits cannot extend `USCoin`.

In this particular scenario that is probably what we want (not allowing consumers of our code to extend this). The likelihood that the US central bank would introduce new coins or take existing coins out of circulation before we update our library in time to cater for it, is pretty unlikely. However, there is another good reason why we might want this too: we can know we have exhaustively catered for all constructions of `USCoin` in our supporting logic or calculations.

You might be wondering how this can be done. Let us try to use this code in `sbt console`:

``````scala> import funalgebra.algdt._
import funalgebra.algdt._

scala> object VendingMachine {
|   def accepts(coin: USCoin): Boolean = coin match {
|     case Penny => false
|     case Nickel => false
|     case Dime => true
|     case Quarter => true
|   }
| }
:12: warning: match is not exhaustive!
missing combination         Dollar

def accepts(coin: USCoin): Boolean = coin match {
^
defined module VendingMachine``````

What happened here is that we forgot to match on the Dollar constructor for USCoin sum type and the Scala compiler warned us about it. If you find you want this behavior for a particular data type definition, then you probably want to define it as a sum type this way.

### Product Types (aka Record Types)

In the language of abstract algebra, which computer scientists in programming language theory appear to have adopted, a data type with exactly one constructor is isomorphic to what is often called a product type that takes an ordered list of typed operands. Informally these are sometimes called record types.

The archetypal product type is the tuple. For example, perhaps we want to represent an image element in an HTML page. We might initially represent it as the following tuple:

``(String, Int, Int)``

Here we take a string (the source URL), an integer (the width), and a second integer (the height). The problem with tuples is that this might also represent any number of issues. It is hard to know what it is referring to. Enter product types.

In Scala, we can represent an image element in HTML like the following case class:

``case class ImageElement(sourceUrl: String, width: Int, height: Int)``

Since pure product types only have one constructor for a type, we can eliminate the trait or abstract class type definition that we used in the sum type example above with `USCoin`.

### Sum-Product Hybrids

So a first stab at modeling this with an algebraic data type in Haskell might look like:

``````import Data.DateTime

-- Assumes User is defined elsewhere and imported

-- Assume Status and Comment types are already defined

What did I actually do? Well if we look at the different notifications we see there are a variety of constructions of notification events, including:

• When one of your statuses is liked it tells you who and at what time the last action was taken

• When one of your comments is liked it tells you who and the time the last action was taken

• When a post you commented on is commented on it tells you who and the time the last action was taken

Let us dissect the Haskell code a little. The identifier on the left-hand-side (LHS), `Notification`, is the type name. Then the right-hand-side (RHS) contains an exhaustive list of possible constructors such as `CommentNotification`, `CommentLikeNotification`, `PostLikeNotification` for our simple model.

Now we could have modeled Notification data type a little differently. Let us consider the following modeling of the domain:

``````import Data.DateTime

-- Assumes User is defined elsewhere and imported

-- We might want to add more constructors for Post sum type of a more
-- complete model of Facebook notifications, but left as a homework to
-- reader, because every algebra lesson has this ;)
data Post = Status Text DateTime | Comment Text DateTime
This model of the domain of Facebook notifications uses a pure product type for the `Notification` type definition and a hybrid sum-product type for the `Post` type.