Susan Potter

Dhall's merge keyword and union types

Mon September 9, 2020

DRAFT

At work, my team uses Dhall to generate configuration for services and environments. One thing we often need to do is define union types (also known as sum types) and then branch on their values.

Background: pattern matching in Haskell/PureScript

In Haskell and PureScript, we can use a language feature called pattern matching to match on the particular constructor used to build a value of a sum type.

For instance, let us start off with a simple special case of a sum type, an enumeration type:

data Shell = Bash | Zsh | Fish

Suppose we want to distinguish between the shells a developer might be using and produce the shebang line for each one:

shebang :: Shell -> String
shebang Bash = "#!/bin/bash"
shebang Zsh  = "#!/bin/zsh"
shebang Fish = "#!/bin/fish"

Because a value of type Shell can only be constructed with one of the three data constructors (Bash, Zsh, Fish), none of which take arguments, we can see from the shebang definition above that we have handled every possible input and always produce a value. This gives us totality in this simple case. The Haskell compiler I use (GHC) can also check this for us.
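As a minimal sketch of what that compiler check looks like, assume GHC's -Wincomplete-patterns warning is enabled (it is included in -Wall). The function shebangPartial below is a hypothetical name used only for illustration: it drops one equation on purpose so that GHC reports the unmatched constructor.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}
module Main where

data Shell = Bash | Zsh | Fish

-- Total: one equation per data constructor.
shebang :: Shell -> String
shebang Bash = "#!/bin/bash"
shebang Zsh  = "#!/bin/zsh"
shebang Fish = "#!/bin/fish"

-- Hypothetical partial variant: the Fish equation is deliberately missing,
-- so GHC warns that the pattern matches are non-exhaustive.
shebangPartial :: Shell -> String
shebangPartial Bash = "#!/bin/bash"
shebangPartial Zsh  = "#!/bin/zsh"

main :: IO ()
main = putStrLn (shebang Fish)
```

The module still compiles, because this is a warning rather than an error, but the warning names Fish as the pattern that is not matched.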

Simulating pattern matching in Dhall (simple case)

To provide the same functionality in Dhall we can do the following:

let Shell : Type = < Bash | Zsh | Fish >
let shebang = \(shell : Shell) ->
      merge { Bash = "#!/bin/bash"
            , Zsh  = "#!/bin/zsh"
            , Fish = "#!/bin/fish" }
            shell
in shebang Shell.Zsh

Above, we defined our union type on the first line like so:

let Shell : Type = < Bash | Zsh | Fish >

Next we define the function shebang using the merge keyword. merge takes two arguments: a record of handlers, with one field named after each alternative (data constructor) of the union type, and the union value to match on.

In this case, none of the alternatives of our union type take arguments, so each handler is simply the value we want returned on a match.
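For readers coming from Haskell, one way to picture merge is as a function that takes a record with one field per constructor and selects the field that corresponds to the value it is given. The sketch below is only an analogy written in Haskell. ShellHandlers, its field names, and mergeShell are hypothetical names introduced here, not part of Dhall or of any library.

```haskell
module Main where

data Shell = Bash | Zsh | Fish

-- Hypothetical record with one handler per constructor, mirroring the
-- record passed as the first argument to Dhall's merge.
data ShellHandlers r = ShellHandlers
  { onBash :: r
  , onZsh  :: r
  , onFish :: r
  }

-- Select the handler whose field corresponds to the constructor.
mergeShell :: ShellHandlers r -> Shell -> r
mergeShell handlers Bash = onBash handlers
mergeShell handlers Zsh  = onZsh handlers
mergeShell handlers Fish = onFish handlers

-- The same shebang function, written in the "record of handlers" style.
shebang :: Shell -> String
shebang = mergeShell (ShellHandlers
  { onBash = "#!/bin/bash"
  , onZsh  = "#!/bin/zsh"
  , onFish = "#!/bin/fish"
  })

main :: IO ()
main = putStrLn (shebang Zsh)
```

One difference worth keeping in mind: in this Haskell sketch a forgotten record field is only a compiler warning, whereas Dhall's type checker rejects a merge whose record is missing a handler for one of the alternatives.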

Matching more interesting union types in Dhall

Let us suppose we now want to model different environments and their levels, e.g. dev, qa, prod.

Going back to Haskell, we can model each environment as a sum type with a level and, possibly, a tag:

newtype Tag = MkTag Text
data Environment = Prod | QA Tag | Dev Tag

In our case, we do not need a tag for Prod because there is only one production environment, but we dynamically create QA environments per story, and each developer works locally in their own development environment. Because logs, metrics, and stack traces are reported from these environments, we need to be able to filter on the tag.

Example values of Environment might be:

-- >>> Prod

-- >>> QA (MkTag "CLUB-12345")

-- >>> Dev (MkTag "mbbx6spp@nixos0")

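To pattern match on this type, Haskell allows us to write the following:

-- Assumes OverloadedStrings and that Hostname is a synonym for Text.
import qualified Data.Text as Text

logServer :: Environment -> Hostname
logServer Prod = "logs.prod.example.com"
logServer (QA _) = "logs.qa.example.com"
logServer (Dev (MkTag tag)) = "logs." <> username <> ".local"
  -- Keep the part of the tag before '@', i.e. the username.
  where username = Text.takeWhile (/= '@') tag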