Question

What does it mean in Haskell when a function doesn't handle every constructor of a data type?

Consider

data Pair = Pair (Headers, Builder)
          | CompoundPair (Headers, [Pair])

showBoundPart :: Boundary -> Pair -> Builder
showBoundPart (Boundary b) (Pair (headers, content)) = mconcat
    [ fromByteString "--"
    , fromText b
    , fromByteString "\n"
    , mconcat $ map showHeader headers
    , fromByteString "\n"
    , content
    ]

Observe that Pair is a sum type, but that showBoundPart does not handle CompoundPair, only Pair.

I come from another language where this is a compile-time error, but this code is lifted from the mime-mail library, which is presumably in production somewhere: https://hackage.haskell.org/package/mime-mail-0.5.1/docs/src/Network.Mail.Mime.html#Part

1 Jan 1970

Solution

 6

Exactly the same thing happens as when you write a case expression without branches that cover every possible pattern: the compiler inserts a catch-all match which throws a runtime exception (PatternMatchFail, from Control.Exception). So it's memory safe; there's no way for this to result in the memory holding the [Pair] in a CompoundPair constructor being treated as if it were the Builder in a Pair constructor, for example.
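To make the inserted catch-all concrete, here is a minimal sketch. The Shape type and the area and safeArea functions are hypothetical stand-ins, not part of mime-mail; only try, evaluate and PatternMatchFail come from Control.Exception in base.

```haskell
import Control.Exception (PatternMatchFail, evaluate, try)

-- Hypothetical two-constructor type, standing in for Pair in the question.
data Shape = Circle Double | Square Double

-- Only Circle is handled. GHC still compiles this and adds a fallback
-- branch that throws PatternMatchFail when a Square arrives.
area :: Shape -> Double
area (Circle r) = pi * r * r

-- Force the result inside IO so the failed match can be caught and inspected.
safeArea :: Shape -> IO (Either PatternMatchFail Double)
safeArea = try . evaluate . area

main :: IO ()
main = do
  ok  <- safeArea (Circle 1.0)
  bad <- safeArea (Square 2.0)
  putStrLn (either (const "match failed") show ok)
  putStrLn (either (\e -> "match failed: " ++ show e) show bad)
```

The second call yields a Left carrying the exception, whose message names the source location and the function with the missing case.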

GHC will warn about this at compile time if you have -Wall on, but it doesn't by default. The more specific flag for just this warning is -Wincomplete-patterns. Combined with -Werror, you can turn it into a compile-time error if you wish (which I normally would do).

Essentially this creates an invariant that callers need to know about and respect to safely use the function. Generally I think this should be avoided as much as possible in the public API of a package (and should be clearly documented if it can't be avoided), but from a quick look showBoundPart doesn't seem to be exported. It's more reasonable to do this in internal functions, where only the package author is responsible for knowing about the invariant.
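As a rough illustration of the difference between a hidden invariant and one that is visible to callers, here is a sketch with a hypothetical Part type and hypothetical functions leafText and leafTextMaybe (none of these names come from mime-mail). Returning Maybe is just one way to make such a function total.

```haskell
-- Hypothetical type with the same shape as Pair in the question.
data Part = Leaf String | Compound [Part]

-- Partial: callers must know never to pass Compound, and nothing in the
-- type signature tells them so.
leafText :: Part -> String
leafText (Leaf s) = s

-- Total alternative: the "only Leaf has text" rule is visible in the
-- result type, so callers are forced to handle the other case.
leafTextMaybe :: Part -> Maybe String
leafTextMaybe (Leaf s)     = Just s
leafTextMaybe (Compound _) = Nothing

main :: IO ()
main = do
  print (leafTextMaybe (Leaf "body"))
  print (leafTextMaybe (Compound []))
```

Running this prints Just "body" and then Nothing, whereas leafText (Compound []) would throw at runtime.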

2024-07-20
Ben

Solution

 3

In Haskell, non-exhaustive patterns can be used in code: when pattern matching we do not need to match against all cases. If at runtime we do fall into an unhandled case, an exception will be raised, likely crashing the whole program.

Personally, I disagree with this design choice, and I'd rather patterns always had to be exhaustive. Fortunately, we can ask GHC to enforce that, even if it's not the default. Here are some options:

  • By using the GHC flag -Wincomplete-patterns, GHC will warn when patterns do not cover all the cases.

  • Using -Wall we enable -Wincomplete-patterns and many other useful warnings. This is my go-to choice, since it's easy to turn on.

  • One could be even more strict and use -Wall -Werror=incomplete-patterns which acts like -Wall but treats incomplete patterns like an error, not just a warning. On serious code, I like to turn that on in the .cabal file.
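The same flags can also be set per module with an OPTIONS_GHC pragma, which is a convenient way to try them out before touching the .cabal file. The sketch below uses a hypothetical Part type and render function (not taken from mime-mail).

```haskell
{-# OPTIONS_GHC -Wall -Werror=incomplete-patterns #-}

-- Hypothetical type with the same shape as Pair in the question.
data Part = Leaf String | Compound [Part]

-- Both constructors are handled, so this compiles under the pragma above.
-- Deleting the Compound equation would make the build fail instead of
-- deferring the problem to runtime.
render :: Part -> String
render (Leaf s)      = s
render (Compound ps) = concatMap render ps

main :: IO ()
main = putStrLn (render (Compound [Leaf "a", Compound [Leaf "b", Leaf "c"]]))
```

For a whole package, the equivalent is putting the same flags in the ghc-options field of the .cabal file.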


Finally, I see you added the gadt tag. Note that when using GADTs we do not necessarily need to handle all the constructors, but we only need to handle those that are compatible with the type information at hand.

{-# LANGUAGE GADTs #-}

data T a where
  K1 :: Int -> T Int
  K2 :: String -> T String

-- Only K2 can build a T String, so this single equation covers every case.
foo :: T String -> String
foo (K2 s) = s

Above, foo is exhaustive even though K1 is not handled, since K1 can never produce a T String. Modern GHCs are aware of this and do not trigger warnings/errors for this code.

2024-07-20
chi