Sums of products (and carbs)

Fun with datatypes and representations

If we take any data, there's many ways we can represent that information - bullet points on a notebook, in a PostgreSQL table, a JSON object and so on. There's going to be all sorts of syntactic wrappings that make things look different from one representation to another, but underneath it's pretty much, well, data. So if there is a path to convert between representations, then we can say they are the same. (ie. up to isomorphism)

In Haskell, lists are homogeneous, that is, they can only be instantiated to some given type. ["foo", "bar"] :: [String] is ok, but ["foo", 3] is not. We can however represent the latter as a datatype,

datatype Foo = MkFoo String Int

foo :: Foo
foo = MkFoo "foo" 3

But both encode the same information! (or at least that was what preoccupied my thoughts over dinner) In other words, there's a path from one to the other. Some searching on Hackage brought up the generics-sop library, which describes an interesting way of representing datatypes: "the choice between constructors is represented using an n-ary sum, and the arguments of each constructor are represented using an n-ary product"

That sounded like a nice after dinner scribble, so I decided to explore it. Some preliminaries,

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TypeApplications #-}

import qualified GHC.Generics as GHC
import           Generics.SOP

Next, let's think about some trivial data to work with. If we want to encode some information about our meals today, we could do it as [food, drink], so today I might feel like ["Rice", "Water"]. Putting that into some trivial datatype definitions,

-- Sum
data Drink
  = Soda
  | Water
  deriving (Read, Show, GHC.Generic)

-- Sum
data Chow
  = Rice
  | Bread
  | Noodles
  deriving (Read, Show, GHC.Generic)

-- Sum of products
data Meal
  = Upsize Chow Drink
  | Normal Chow
  | Diet
  deriving (Show, GHC.Generic)

instance Generic Drink
instance Generic Meal

So we can think of the choice between constructors as sums, and they are represented via Peano numbers, and the arguments of each constructor via a list like structure. We can then write down the representation of this in the type signature: Rep Meal ~ SOP I '[ '[ Chow, Drink ], '[Chow], '[] ],

chooseMeal :: [String] -> Rep Meal
chooseMeal [x, y] = SOP (Z (I (read x) :* I (read y) :* Nil))
chooseMeal [x] = SOP (S (Z (I (read x) :* Nil)))
chooseMeal [] = SOP (S (S (Z Nil)))
chooseMeal _ = error "Fail to parse input"

-- Making use of that fancy TypeApplications extension
mealFromOpt :: [String] -> Meal
mealFromOpt s = to @Meal (chooseMeal s)

And we convert back to a list of strings. Here, SOP I (Code Meal) ≅ datatype Meal and we can switch this into a homogeneous structure (a list of strings) with the constant functor `K` (SOP (K String) (Code Meal) ≅ [String])

hmeal :: Rep Meal
hmeal = SOP (Z (I Bread :* I Soda :* Nil))

printMeal :: Rep Meal -> [String]
printMeal = hcollapse . hcmap (Proxy :: Proxy Show) (mapIK show)

In ghci,

λ> mealFromList ["Rice", "Soda"]
Upsize Rice Soda
λ> mealFromList ["Bread"]
Normal Bread
λ> mealFromList []
λ> printMeal hmeal

-- back and forth
λ> printMeal . chooseMeal $ ["Bread"]
λ> mealFromList . printMeal . from $ Normal Bread
Normal Bread

That was fun! The library is pretty huge - there are some examples there but I suspect I'll need to take some time to mow down the concepts.

<< back