Haskell/Existentially quantified types

Existential types, or 'existentials' for short, are a way of 'squashing' a group of types into one, single type.

Existentials are part of GHC's type system extensions. They aren't part of Haskell98, and as such you'll have to either compile any code that contains them with an extra command-line parameter of -XExistentialQuantification, or put {-# LANGUAGE ExistentialQuantification #-} at the top of your sources that use existentials.

The forall keyword

edit

The forall keyword is used to explicitly bring fresh type variables into scope. For example, consider something you've innocuously seen written a hundred times so far:

Example: A polymorphic function

map :: (a -> b) -> [a] -> [b]

But what are these a and b? Well, they're type variables, you answer. The compiler sees that they begin with a lowercase letter and as such allows any type to fill that role. Another way of putting this is that those variables are 'universally quantified'. If you've studied formal logic, you will have undoubtedly come across the quantifiers: 'for all' (or  ) and 'exists' (or  ). They 'quantify' whatever comes after them: for example,   means that whatever follows is true for at least one value of x.   means that what follows is true for every possible value of x you can imagine. For example,   and  .

The forall keyword quantifies types in a similar way. We would rewrite map's type as follows:

Example: Explicitly quantifying the type variables

map :: forall a b. (a -> b) -> [a] -> [b]

So we see that for any combination of types a and b we can imagine, map takes the type (a -> b) -> [a] -> [b]. For example, we might choose a = Int and b = String. Then it's valid to say that map has the type (Int -> String) -> [Int] -> [String]. Here we are instantiating the general type of map to a more specific type.

However, in Haskell, any introduction of a lowercase type parameter implicitly begins with a forall keyword, so those two previous type declarations for map are equivalent, as are the declarations below:

Example: Two equivalent type statements

id :: a -> a
id :: forall a . a -> a

What makes life really interesting and the forall so useful is that you can apply additional constraints on the type variables it introduces. Such constraints,  , serve to guarantee certain properties of the type variable,  , as a kind of ad-hoc interface restriction, (similar to   or   stipulations).

Let's dive right into the deep end of this by seeing an example of the power of existential types in action.

Example: heterogeneous lists

edit

The premise behind Haskell's type class system is grouping types that all share a common property. So if you know a type that is a member of some class C, you know certain things about that type. For example, Int is a member of class Eq, so we know that elements of Int can be compared for equality.

Suppose we have a group of values. We don't know if they are all the same type, but we do know they are all members of some class (and, by extension, that all the values have a certain property). It might be useful to throw all these values into a list. We can't do this normally because lists elements must be of the same type (homogeneous with respect to types). However, existential types allow us to loosen this requirement by defining a 'type hider' or 'type box':

Example: Constructing a heterogeneous list

 data ShowBox = forall s. Show s => SB s
 
 heteroList :: [ShowBox]
 heteroList = [SB (), SB 5, SB True]

We won't explain precisely what we mean by that data type definition, but its meaning should be clear to your intuition. The important thing is that we're calling the constructor on three values of different types, [SB (), SB 5, SB True], yet we are able to place them all into a single list, so we must somehow have the same type for each one. Essentially, yes. This is because our use of the forall keyword gives our constructor the type SB :: forall s. Show s => s -> ShowBox. If we were now writing a function to which we intend to pass heteroList, we couldn't apply a function such as not to the values inside the SB because their type might not be Bool. But we do know something about each of the elements: they can be converted to a string via show. In fact, that's pretty much the only thing we know about them.

Example: Using our heterogeneous list

 instance Show ShowBox where
  show (SB s) = show s        -- (*) see the comment in the text below
 
 f :: [ShowBox] -> IO ()
 f xs = mapM_ print xs

 main = f heteroList

Let's expand on this a bit more. In the definition of show for ShowBox – the line marked with (*) see the comment in the text below – we don't know the type of s. But as we mentioned, we do know that the type is an instance of Show due to the constraint on the SB constructor. Therefore, it's legal to use the function show on s, as seen in the right-hand side of the function definition.

As for f, recall the type of print:

Example: Types of the functions involved

 print :: Show s => s -> IO () -- print x = putStrLn (show x)
 mapM_ :: (a -> m b) -> [a] -> m ()
 mapM_ print :: Show s => [s] -> IO ()

As we just declared ShowBox an instance of Show, we can print the values in the list.

A Further Explanation

edit

One way to think about forall is to think about types as a set of possible values. For example, Bool is the set {True, False, ⊥} (remember that bottom, ⊥, is a member of every type!), Integer is the set of integers (and bottom), String is the set of all possible strings (and bottom), and so on. forall serves as a way to assert a commonality or intersection of the specified types (i.e. sets of values). For example, forall a. a is the intersection of all types. This subset turns out to be the set whose sole element is bottom, {⊥}, since it is an implicit value in every type. That is, the type whose only available value is bottom. However, since every Haskell type includes bottom, {⊥}, this quantification in fact stipulates all Haskell types. However, the only permissible operations on it are those available to a type whose only element is bottom.

A few more examples:

  1. The list, [forall a. a], is the type of a list whose elements all have the type forall a. a, i.e. a list of bottoms.
  2. The list, [forall a. Show a => a], is the type of a list whose elements all have the type forall a. Show a => a. The Show class constraint requires the possible types to also be a member of the class, Show. However, ⊥ is still the only value common to all these types, so this too is a list of bottoms.
  3. The list, [forall a. Num a => a], requires each element to be a member of the class, Num. Consequently, the possible values include numeric literals, which have the specific type, forall a. Num a => a, as well as bottom.
  4. forall a. [a] is the type of the list whose elements all have the same type a. Since we cannot presume any particular type at all, this too is a list of bottoms.

We see that most intersections over types just lead to bottoms because types generally don't have any values in common and so presumptions cannot be made about a union of their values.

However, recall that in the last section, we developed a heterogeneous list using a 'type hider'. This 'type hider' functions as a wrapper type which guarantees certain facilities by implying a predicate or constraint on the permissible types. In that case it was that they must be a member of the type class Show. In general, that seems to be the purpose of forall, to impose type constraint on the permissible types within a type declaration and thereby guaranteeing certain facilities with such types.

Let's declare one.

Example: An existential datatype

 data T = forall a. MkT a

This means that:

Example: This defines a polymorphic constructor, or a family of constructors for T

MkT :: forall a. (a -> T)

So we can pass any type, a, we want to MkT and it will create a T. So what happens when we deconstruct a T value with pattern matching...?

Example: Pattern matching on our existential constructor

 foo (MkT x) = ... -- what is the type of x?

As we've just stated, x could be of any type. That means it's a member of some arbitrary type, so has the type forall a. a. In other words the set whose only available value is bottom, ⊥.

However, we can make a heterogeneous list:

Example: Constructing the hetereogeneous list

 heteroList = [MkT 5, MkT (), MkT True, MkT map]

Of course, when we pattern match on heteroList we cannot presume any features about its elements[1]. So technically, we can't do anything useful with its elements, except reduce them to WHNF.[2] However, if we introduce class constraints:

Example: A new existential data type, with a class constraint

 data T' = forall a. Show a => MkT' a

The class constraint serves to limit the types we are intersecting over, such that we now have values inside a T' which are elements of some arbitrary type that are members of Show. The implication of this is that we can apply show to a value of type a upon deconstruction. It doesn't matter exactly which type it turns out to be.

Example: Using our new heterogenous setup

 heteroList' = [MkT' 5, MkT' (), MkT' True, MkT' "Sartre"]
 main = mapM_ (\(MkT' x) -> print x) heteroList'

 {- prints:
 5
 ()
 True
 "Sartre"
 -}

To summarize, the interaction of the universal quantifier with data types produces a qualified subset of types guaranteeing certain facilities as described by one or more class constraints.

Example: runST

edit

One monad that you may not have come across so far is the ST monad. This is essentially a more powerful version of the State monad: it has a much more complicated structure and involves some more advanced topics. It was originally written to provide Haskell with IO. As we mentioned in the Understanding monads chapter, IO is basically just a State monad with an environment of all the information about the real world. In fact, inside GHC at least, ST is used, and the environment is a type called RealWorld.

To get out of the State monad, you can use runState. The analogous function for ST is called runST, and it has a rather particular type:

Example: The runST function

runST :: forall a. (forall s. ST s a) -> a

This is actually an example of a more complicated language feature called rank-2 polymorphism, which we don't go into detail here. It's important to notice that there is no parameter for the initial state. Indeed, ST uses a different notion of state to State; while State allows you to get and put the current state, ST provides an interface to references. You create references, which have type STRef, with newSTRef :: a -> ST s (STRef s a), providing an initial value, then you can use readSTRef :: STRef s a -> ST s a and writeSTRef :: STRef s a -> a -> ST s () to manipulate them. As such, the internal environment of a ST computation is not one specific value, but a mapping from references to values. Therefore, you don't need to provide an initial state to runST, as the initial state is just the empty mapping containing no references.

However, things aren't quite as simple as this. What stops you creating a reference in one ST computation, then using it in another? We don't want to allow this because (for reasons of thread-safety) no ST computation should be allowed to assume that the initial internal environment contains any specific references. More concretely, we want the following code to be invalid:

Example: Bad ST code

 let v = runST (newSTRef True)
 in runST (readSTRef v)

What would prevent this? The effect of the rank-2 polymorphism in runST's type is to constrain the scope of the type variable s to be within the first parameter. In other words, if the type variable s appears in the first parameter it cannot also appear in the second. Let's take a look at how exactly this is done. Say we have some code like the following:

Example: Briefer bad ST code

... runST (newSTRef True) ...

The compiler tries to fit the types together:

Example: The compiler's typechecking stage

newSTRef True :: forall s. ST s (STRef s Bool)
runST :: forall a. (forall s. ST s a) -> a
together, (forall s. ST s (STRef s Bool)) -> STRef s Bool

The importance of the forall in the first bracket is that we can change the name of the s. That is, we could write:

Example: A type mismatch!

together, (forall s'. ST s' (STRef s' Bool)) -> STRef s Bool

This makes sense: in mathematics, saying   is precisely the same as saying  ; you're just giving the variable a different label. However, we have a problem with our above code. Notice that as the forall does not scope over the return type of runST, we don't rename the s there as well. But suddenly, we've got a type mismatch! The result type of the ST computation in the first parameter must match the result type of runST, but now it doesn't!

The key feature of the existential is that it allows the compiler to generalise the type of the state in the first parameter, and so the result type cannot depend on it. This neatly sidesteps our dependence problems, and 'compartmentalises' each call to runST into its own little heap, with references not being able to be shared between different calls.

Quantified Types as Products and Sums

edit

A universally quantified type may be interpreted as an infinite product of types. For instance, a polymorphic function like:

Example: Identity function

 id :: forall a. a -> a
 id a = a

can be understood as a product, or a tuple, of individual functions, one per every possible type a. To construct a value of such type, we have to provide all the components of the tuple at once. In case of function types, this is possible because of polymorphism -- one formula generating an infinity of functions.

In case of numeric types, one numeric constant may be used to initialize multiple types at once. For instance, in:

Example: Polymorphic value

 x :: forall a. Num a => a
 x = 0 

x may be conceptualized as a tuple consisting of an Int value, a Double value, etc.

Similarly, an existentially quantified type may be interpreted as an infinite sum. For instance,

Example: Existential type

 data ShowBox = forall s. Show s => SB s

may be conceptualized as a sum:

Example: Sum type

 data ShowBox = SBUnit | SBInt Int | SBBool Bool | SBIntList [Int] | ...

To construct a value of this type, we only have to pick one of the constructors. A polymorphic constructor SB combines all those constructors into one.

Quantification as a primitive

edit

Universal quantification is useful for defining data types that aren't already defined. Suppose there was no such thing as pairs built into haskell. Quantification could be used to define them.

{-# LANGUAGE ExistentialQuantification, RankNTypes #-}

newtype Pair a b = Pair (forall c. (a -> b -> c) -> c)

makePair :: a -> b -> Pair a b
makePair a b = Pair $ \f -> f a b

In GHCi:

 λ> :set -XExistentialQuantification
 λ> :set -XRankNTypes
 λ> newtype Pair a b = Pair {runPair :: forall c. (a -> b -> c) -> c}
 λ> makePair a b = Pair $ \f -> f a b
 λ> pair = makePair "a" 'b' 
 
 λ> :t pair
 pair :: Pair [Char] Char
 
 λ> runPair pair (\x y -> x)
 "a"
 
 λ> runPair pair (\x y -> y)
 'b'

Notes

  1. However, we can apply them to functions whose type is forall a. a -> R, for some arbitrary R, as these accept values of any type as a parameter. Examples of such functions: id, const k for any k, seq
  2. because all we know is that they have some arbitrary type.

Further reading

edit