Why Monad Transformers Matter

July 01, 2023

Summary

In this post, I provide a glimpse into what makes monad transformers powerful and useful, along with a sample of their practical usages. We see how we can easily extend existing libraries with new functionality without the library author needing to specifically provide extensibility for us, and how that extended functionality can be written in an isolated manner, without forking the library, while remaining open for other users to recombine and extend in the same way we have.

This post is adapted from the chapter on monad transformers from my book, Abstractions in Context.

Last time we looked at what monad transformers are. Hopefully now you understand a little bit about how to use them, but you might still be skeptical about how practical they actually are. Maybe the examples seem a little contrived and academic. What do monad transformers actually give us in terms of expressive power?

When we went over the fundamentals of monad stacks, it may have looked like we were just improving our code with small syntactic conveniences. Is that all monad transformers are? I would argue instead that monad transformers give you a profound way of structuring your code, one that gives you an incredible amount of composability and modularity. You can make your code infinitely extensible by other people, without ever needing to provide specific support for doing so.

I don't expect you to take me at my word. So let's dive into some examples.


As monad transformers form such a core part of Haskell fundamentals, it's perhaps not surprising that many libraries have taken the approach of structuring their public API as a monad transformer. And while each of those libraries is useful on its own, the real magic happens when we start using them together. Some such libraries that we'll be making use of in this post are megaparsec (for parsing input, usually text), monad-logger (for... logging), haskeline (for reading user input), conduit (for streaming data), hedgehog (for property testing), and resourcet (for making sure limited resources are cleaned up properly), along with the "standard" transformers provided in transformers.

I've split the examples into different sections based on problem domain: testing, parsing, and an "output" section on rendering HTML and images. Each section is meant to be self-contained, and ramps up from easy, simple examples to more complex monad stacks involving multiple, powerful monad transformers. This isn't meant to be a comprehensive treatment of everything monad transformers are capable of, but my hope is that this will serve as an inspiration, a glimpse into how powerful monad transformers really are.

In order to produce working code, some of the examples will require defining typeclass instances (like the mtl-style MonadState, MonadReader, etc. typeclasses that we saw last time), or require some usage of the mmorph library. I've made sure to point out when that's the case.

Testing (with monad transformers)

The hedgehog library provides property-based testing à la QuickCheck. The main differences are that it provides nice quality-of-life tooling, like built-in functionality for testing state machines or recursive data structures, and that its data generators are free-floating values rather than being attached to a typeclass.

hedgehog provides two monad transformers as part of its public API: GenT representing random data generators, and PropertyT, representing individual properties to test. A test suite might consist of many PropertyTs, usually arranged in a tree.

We're mostly going to focus on PropertyT, as stacking transformers under or over GenT seems more niche. PropertyT itself gives the "capability" of generating random data for a property out of a GenT, defining the property, and automatically checking that property hundreds or thousands of times against said random data. What happens if we mix other transformers with it?

Property tests that do IO

someProp :: PropertyT IO ()
someProp = do
  testData <- forAll <generator>
  result <- liftIO $ <make some API call with testData>
  ...
  result === <...>

Generally it's frowned upon to do IO in property tests: things like reading from a file or sending data over the network are slow, a property test might get run hundreds or thousands of times, and no one likes slow test suites or slow CI builds. But this can be useful for one-and-done scripts; for instance, I've done this when first working with a third-party API, to ensure that it behaves the way I think it does, even in edge cases.

PropertyT IO () is actually Hedgehog's default type for property tests, so if you use Hedgehog, you can already write tests like this, no extra code needed.
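
As a concrete sketch of the IO variant, here is roughly what such a property could look like. reverseRemotely is a hypothetical network call standing in for whatever third-party API you're exercising; the generators come from Hedgehog.Gen and Hedgehog.Range, imported qualified as Gen and Range.

-- a minimal sketch; reverseRemotely is a made-up stand-in for a real API call
prop_reverseRemotely :: Property
prop_reverseRemotely = property $ do
  xs <- forAll $ Gen.list (Range.linear 0 100) (Gen.int (Range.linear 0 1000))
  result <- liftIO $ reverseRemotely xs  -- hypothetical call returning IO [Int]
  result === reverse xs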

Keeping test-case local state

someProp :: PropertyT (StateT s Identity) ()
someProp = do
  testData <- forAll <generator>
  modify' <some update fn>
  ...
  testData === <...>

Note the state is per each test case run, not per each property. So if the property is set to run 10,000 times before it succeeds, each of those 10,000 runs will have an isolated state.

Useful if the property has multiple "steps" that each change some intermediate value, and you don't want to constantly rename it.

To use:

-- hoist and generalize from Control.Monad.Morph
run :: s -> PropertyT (StateT s Identity) () -> PropertyT IO ()
run init = hoist (generalize . flip evalStateT init)
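
As a usage sketch, the hoisted property then slots straight into an ordinary Hedgehog Property (here assuming the state is an Int counter starting at 0):

prop_counted :: Property
prop_counted = property $ run (0 :: Int) someProp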

Debugging a broken test case (using logging)

someProp :: PropertyT (LoggingT IO) ()
someProp = do
  testData <- forAll <generator>
  lift $ logDebugN $ Text.pack $ show testData
  ...

Pretty self-explanatory. The log output can end up interleaved with Hedgehog's own diff output, so I wouldn't recommend leaving log statements in your properties permanently. Still, it can be useful for quickly isolating a problem.

To use:

-- hoist from Control.Monad.Morph
run :: PropertyT (LoggingT IO) () -> PropertyT IO ()
run = hoist runStderrLoggingT  -- or wherever you want the log output to go

Putting it all together

Who says we can only have one of these pieces of functionality at a time? Monad transformers stack infinitely.

someProp :: PropertyT (StateT s (LoggingT IO)) ()
someProp = do
  ...

To use:

run :: s -> PropertyT (StateT s (LoggingT IO)) () -> PropertyT IO ()
run init = hoist (runStderrLoggingT . flip evalStateT init)

Parsing (with monad transformers)

The megaparsec parser combinator library should be your default choice for any kind of text parsing or processing.1 Its primary type is the ParsecT monad transformer. As per the documentation:

data ParsecT e s m a

-- ParsecT e s m a is a parser with custom data component of error e,
-- stream type s, underlying monad m and return type a.

So a ParsecT Void Text m Int would take as input a Text (i.e. a stream of characters), produce an Int, run in some monad m, and have no custom error component (that's the Void).

Values of this type can then be glued together with combinators like many to produce parsers for larger, more complicated texts.
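
As a small sketch, here is a parser for comma-separated integers built from those pieces (decimal comes from Text.Megaparsec.Char.Lexer):

-- parses input like "1,2,3" into [1, 2, 3]
intList :: Monad m => ParsecT Void Text m [Int]
intList = decimal `sepBy` single ','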

Parsing while keeping around state

someParser :: ParsecT Void Text (StateT Int Identity) a
someParser = do
  single '('
  modify' (+1)  -- count how many parenthesized things we find
  ...
  single ')'
  ...

Note that the state remains even if the parser backtracks.

Run using runParserT.
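
A sketch of what running it looks like: runParserT peels off the ParsecT layer, and the usual State runners handle the rest (the "<input>" string is just the name used in error messages).

runCounting :: Text -> (Either (ParseErrorBundle Text Void) a, Int)
runCounting input =
  runIdentity $ runStateT (runParserT someParser "<input>" input) 0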

Parsing with logging

someParser :: ParsecT Void Text (LoggingT IO) a
someParser = do
  single '('
  lift $ logDebugN "found something parenthesized!"
  ...
  single ')'
  ...

Run using runParserT.

Take user input while parsing (like passwords)

The haskeline library provides readline-like capabilities, packaged as the InputT monad transformer. It's a lot more featureful compared to simply calling getLine, as it gives the user things like up/down arrows to cycle through previous input, an input history file, and asterisk-masked password inputs.

You could, say, build a tool to interactively parse JSON documents where some fields are encrypted.

someParser :: ParsecT Void Text (InputT IO) a
someParser = do
  chunk "BEGIN ENCRYPTED DATA\n"
  mpassword <- lift $ getPassword (Just '*') "Enter decryption password: "
  case mpassword of
    Nothing -> fail "no password provided"
    Just password -> <decrypt the data...>

Adding a StateT could let you only ask the user once if multiple such fields exist, and using the IO in the stack could let you cache credentials locally.

Run using runParserT.

Write simple interpreters

The "canonical" way of writing an interpreter for a programming language is to write a parser that parses the source code into an AST, then an interpreter that traverses the AST to "do what it says." For simple languages this might be overkill. By using ParsecT with other monad transformers, we can interleave the parsing and the doing, removing the need for an intermediate AST at all.

-- a simple language consisting of only two types of statements:
--  variable assignments (x := some text;)
--  show the value of a variable (show x;)

assignment :: ParsecT Void Text (StateT (Map Text Text) IO) ()
assignment = do
  identifier <- Text.pack <$> some alphaNumChar
  hspace
  chunk ":="
  hspace
  value <- takeWhile1P (Just "value") (\c -> c /= ';' && c /= '\n' && c /= '\r')
  single ';'
  eol
  modify' (Map.insert identifier value)

show :: ParsecT Void Text (StateT (Map Text Text) IO) ()
show = do
  _ <- chunk "show"
  hspace1
  identifier <- Text.pack <$> some alphaNumChar
  single ';'
  eol
  mvalue <- gets (Map.lookup identifier)
  case mvalue of
    Nothing -> do
      liftIO $ Text.putStrLn $ "ERR: unbound variable `" <> identifier <> "'"
      fail "unbound variable"
    Just value ->
      liftIO $ Text.putStrLn $ identifier <> ": " <> value

interpreter :: ParsecT Void Text (StateT (Map Text Text) IO) ()
interpreter = do
  many $ choice [try assignment, show]
  eof

If the language becomes more featureful, or the grammar more complicated, this wouldn't scale. For one thing, you'd have to be very careful with backtracking to avoid "phantom" results: effects like state updates or IO still happen in branches that the parser later abandons. But it can be useful for prototyping ideas.
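
For completeness, a sketch of running the interpreter over some source text, starting from an empty variable environment:

-- e.g. runProgram "x := hello;\nshow x;\n"
runProgram :: Text -> IO (Either (ParseErrorBundle Text Void) ())
runProgram source =
  evalStateT (runParserT interpreter "<program>" source) Map.empty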

Rendering output (with monad transformers)

The cairo and blaze-html libraries do similar things for different types of output: they provide a monadic DSL for rendering images or writing HTML documents, respectively. For instance, a cairo program to draw a circle with a line through it might look like so:

circleLine :: Double -> Render ()
circleLine radius = do
  setSourceRGB 0 0 0
  arc 50 50 radius 0 (2 * pi)
  stroke
  newPath
  moveTo 50 0
  lineTo 50 100
  setLineWidth 2
  stroke
  newPath

And an HTML document written using blaze-html might look like so:

document :: Html
document = do
  docType
  html $ do
    head $ do
      meta ! charset "UTF-8"
      title "Sample document"
    body $ do
      -- ...

By themselves these libraries are plenty useful, though note that Cairo and blaze-html only provide monads, not monad transformers. Still, even though they're "only" monads, we can get even more out of them by using them as the base of a monad stack!

Conduit

One of the transformers we can stack on top is ConduitT, from the conduit library, which provides a compositional way to process streaming data. Think of how Unix pipes can be glued together to build up complex data processing: conduits are like those pipes, but typed and with better composition guarantees.
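
As a tiny self-contained sketch of that composition, here is a source, a transformer, and a sink glued together with (.|), with the combinators imported qualified from Data.Conduit.Combinators as Conduit (the same convention the later examples use):

doubled :: Monad m => ConduitT () Void m [Int]
doubled = Conduit.yieldMany [1 .. 10 :: Int] .| Conduit.map (* 2) .| Conduit.sinkList

-- runConduit doubled produces [2, 4 .. 20] in any monad m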

Logging while processing streaming data

streaming :: ConduitT input output (LoggingT IO) ()
streaming = do
  mnext <- await
  case mnext of
    Nothing -> pure ()
    Just next -> do
      lift $ logDebugN "processing next item..."
      -- do something with `next'

Run with runConduit. Note that depending on where you stream input from or sink output to, you may need the monad underneath LoggingT to be ResourceT IO instead of just IO. You'd need this if you sink the output to a file, for instance.
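
Here's a sketch of the file case, with the logging layer left out for brevity: sourceFile and sinkFile need a MonadResource base monad, which runConduitRes (runResourceT composed with runConduit) supplies.

-- copies in.txt to out.txt, uppercasing it along the way; ResourceT ensures
-- the file handles are released promptly even if something throws
copyUppercased :: IO ()
copyUppercased = runConduitRes $
  Conduit.sourceFile "in.txt"
    .| Conduit.decodeUtf8
    .| Conduit.map Text.toUpper
    .| Conduit.encodeUtf8
    .| Conduit.sinkFile "out.txt"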

Cairo

Render an image while keeping track of some state

For instance, perhaps there's a sequence of colors to step through for each circle that gets rendered.

data Color = ...

nextColor :: Color -> Color
nextColor c = ...

circle :: (Double, Double) -> Double -> StateT Color Render ()
circle (x, y) radius = do
  color <- get
  -- set fill color appropriately
  lift $ arc x y radius 0 (2 * pi)
  lift fill
  lift newPath
  modify' nextColor
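
To hand this off to cairo, peel the StateT back off with evalStateT. A sketch, assuming Red is one of the Color constructors:

circles :: Render ()
circles = flip evalStateT Red $ do
  circle (50, 50) 25    -- drawn in Red
  circle (75, 100) 25   -- drawn in nextColor Red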

Render an image based on the result of some parse

A fixed sequence is no longer adequate. Instead, you want to be able to supply a file with a list of colors, and the rendering will use that list.

The conventional way to do this would be to parse the file into a list of Colors, then try to write a fold or traverse over that list. But circle from above might be called in a lot of different places, and you don't want all the callers to have to add an extra color parameter just so they can eventually pass it to circle.

Instead, what if we directly used ParsecT and Render together?

parseColor :: (Ord e, Monad m) => ParsecT e Text m Color
parseColor = ...

circle :: Ord e => (Double, Double) -> Double -> ParsecT e Text Render ()
circle (x, y) radius = do
  color <- parseColor
  -- set fill color appropriately
  lift $ arc x y radius 0 (2 * pi)
  lift fill
  lift newPath

foo :: Ord e => ParsecT e Text Render ()
foo = do
  circle (50, 50) 25
  circle (75, 100) 25

Notice how foo doesn't need to mention anything related to Colors at all. We only have to specify that a color is needed right where it's needed, in circle.
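
Running it is just runParserT at a concrete error type (Void here, as a sketch): the circles are drawn as parsing proceeds, and the caller gets back a Render action along with the parse result.

drawFromSpec :: Text -> Render (Either (ParseErrorBundle Text Void) ())
drawFromSpec colorSpec = runParserT foo "<colors>" colorSpec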

Render an image based on some input stream

Extending the previous example even further: why tie ourselves to any concrete input stream of colors? Can we abstract our code such that we could plug our rendering code into any stream?

circle :: (Double, Double) -> Double -> ConduitT Color o Render ()
circle (x, y) radius = do
  mcolor <- await
  case mcolor of
    Nothing -> pure ()
    Just color -> do
      -- set fill color appropriately
      lift $ arc x y radius 0 (2 * pi)
      lift fill
      lift newPath

foo :: ConduitT Color o Render ()
foo = do
  circle (50, 50) 25
  circle (75, 100) 25

alwaysRed :: Monad m => ConduitT i Color m ()
alwaysRed = do
  yield Red
  alwaysRed

fixedSequence :: Monad m => ConduitT i Color m ()
fixedSequence = do
  yield Red
  yield Blue
  yield Green
  fixedSequence

Now we can pipe any input stream into foo; both alwaysRed .| foo and fixedSequence .| foo will work. By putting the Render inside WriterT (Render ()), like we'll do with blaze-html, we could even put IO at the bottom of our stack, which would allow us to stream in colors from user input, directly from a file, or even incrementally from a network socket.

blaze-html

blaze-html's monadic DSL works through the MarkupM monad, and Blaze defines a convenient type synonym type Html = MarkupM (). It might seem like we can work with this exactly the same way we worked with Cairo: just stack monad transformers directly on top of this monad.

The main complication is that Blaze also provides a bunch of combinators. For instance, div is a function for producing (what else?) an HTML <div>...</div>. Take a look at its type:

div :: Html -> Html

If you attempt to directly stack transformers on top of Html/MarkupM, you won't be able to get it to compile.

greeting :: Text -> StateT s MarkupM ()
greeting name =
  lift $ span $ text $ "Hello, " <> name <> "!"

page :: StateT s MarkupM ()
page = lift $ do
  docType
  html $ do
    div $ do
      h1 "Welcome to my page!"
      greeting "William"  -- compile error, StateT s MarkupM != MarkupM

Which makes sense, since div expects an unadorned Html type, but by putting a monad transformer on it we've changed its type.

The way around this is simple but somewhat tedious. We can treat the Html as if it's a normal value and accumulate it inside a WriterT, taking advantage of the fact that Html has a Monoid instance. The downside of this approach is that we now have to write our own versions of div and all the functions like it to handle our new type.

text' :: MonadWriter Html m => Text -> m ()
text' t = tell $ text t

span' :: MonadWriter Html m => m () -> m ()
span' contents = pass $ fmap (, span) contents

div' :: MonadWriter Html m => m () -> m ()
div' contents = pass $ fmap (, div) contents

-- etc. for docType', html', h1'

greeting :: Text -> StateT s (WriterT Html Identity) ()
greeting name =
  span' $ text' $ "Hello, " <> name <> "!"

page' :: StateT s (WriterT Html Identity) ()
page' = do
  docType'
  html' $ do
    div' $ do
      h1' $ text' "Welcome to my page!"
      greeting "William"

The definitions of span', div', et al. are all the same other than which "base" combinator we use. So we might as well write a helper so we don't have to write them all by hand:

liftTag :: MonadWriter Html m => (Html -> Html) -> m () -> m ()
liftTag tag contents =
  pass $ fmap (, tag) contents

span' = liftTag span
div' = liftTag div

Similar lifting of the (!) function would be required for HTML attributes.
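
A sketch of what that lifting could look like, reusing liftTag; the (!.) operator name is made up for this example.

(!.) :: MonadWriter Html m => (Html -> Html) -> Attribute -> (m () -> m ())
tag !. attr = liftTag (tag ! attr)

-- usage sketch, with blaze's raw combinators imported qualified:
-- sidebar' = Html5.div !. Attr.class_ "sidebar"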

In the following examples, span, div, etc. will refer to the lifted versions we just defined, unless stated otherwise.

Take user input while outputting HTML

As in the parsing section, we'll use the haskeline library's InputT monad transformer for readline-style input: up/down arrows to cycle through previous input, an input history file, and asterisk-masked password inputs.

Using InputT, we can quickly set up HTML "templates" that prompt the user to input information like article title, article author, publish date, and so on.

inputField :: String -> WriterT Html (InputT IO) ()
inputField prompt = do
  muserInput <- lift $ getInputLine ("Enter " <> prompt <> ": ")
  case muserInput of
    Nothing -> span $ tell mempty
    Just input -> span $ text $ Text.pack input

document :: WriterT Html (InputT IO) ()
document = html $ do
  body $ do
    h2 $ inputField "title"
    p $ do
      text "Authored by "
      inputField "author"
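
A sketch of running this template: runInputT drives the interactive prompts, execWriterT collects the accumulated Html, and a blaze renderer (the String one here) turns it into text.

runTemplate :: IO String
runTemplate = runInputT defaultSettings $ do
  htmlOut <- execWriterT document
  pure (renderHtml htmlOut)  -- renderHtml from Text.Blaze.Html.Renderer.String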

Outputting HTML based on some input stream

For instance, say we want to number each section in our document (Section 1, Section 2, Section 3, and so on). Instead of manually writing in those numbers, we can use Conduit to automatically generate them for us.

sectionHeading :: ConduitT Int o (WriterT Html Identity) ()
sectionHeading = do
  msectionNum <- await
  case msectionNum of
    Nothing -> pure ()  -- or signal an error
    Just num -> h2 $ text $ "§ Section " <> Text.pack (show num)

sequenceNumber :: Monad m => Int -> ConduitT i Int m ()
sequenceNumber init =
  Conduit.iterate (+1) init

document :: ConduitT Int o (WriterT Html Identity) ()
document = ...

Then you can write sequenceNumber 1 .| document, and any usages of sectionHeading anywhere in or below the definition of document will automatically number themselves.
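
Extracting the final Html is then a matter of running the pipeline and unwrapping the Writer; a sketch:

renderedDocument :: Html
renderedDocument =
  runIdentity $ execWriterT $ runConduit $ sequenceNumber 1 .| document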

Output HTML based on an incremental input stream

Conduit is a general tool for working with streaming data, meaning there's no reason the input stream couldn't come from a file, or a network socket, or a database request. That means that with a type like so

document :: ConduitT i o (WriterT Html (ResourceT IO)) ()
document = ...

you can incrementally generate an HTML document based on data that isn't all available at once.


Hopefully with that you've gotten a taste of the power of monad transformers. Look, for instance, at how megaparsec let us implement interpreters directly: we've managed to produce something far larger than the sum of its parts!

To close, I'd like to point out two final things.

Firstly, that we were able to augment the libraries we used with a bunch of orthogonal functionality without the library authors needing to provide specific support for extensibility. There was no clearly delineated "plugin" system, and the library authors didn't have to try to predict how their code would be used. Just by making the library API a monad transformer the library becomes infinitely extensible for free, which also lets individual libraries be more independent of each other.

Secondly, that the examples shown here, with extra monad transformers stacked onto the libraries, retain the ability to be recombined with other transformers, just as we did ourselves. All the bits of code here are still monads, and we could happily package them up into libraries of their own, for other developers to stack further transformers on top to suit their needs, or to slice and dice the contents of the monad stack using the mmorph package.

This, in my mind, is one of the reasons to use Haskell: the abstractions and concepts that Haskell gives you are defined so generically, and because of that they often recombine in beautiful, powerful, and unexpected ways. Even the mundane code you write to solve practical problems ends up full of glittering possibility, ready to be extended and reused in countless directions. Everything builds and builds on each other, becoming more than the sum of each individual abstraction.

I hope that you've found this short tour through monad transformers useful, illuminating, and perhaps even a little mind-blowing.

As ever, I would love to hear what you thought about this post.

↥1 You might have heard of other alternatives in the parser combinator space, like parsec, which is bundled with GHC, or attoparsec. The general rule is to use megaparsec for human-readable text, and attoparsec for binary formats. parsec is mostly legacy code that proved the feasibility of parser combinators.

Interestingly, parsec represents its core parser type as a monad transformer, but attoparsec does not.