Matt Parsons wrote an article the other day called Return a Function to Avoid Effects. The article is short and to the point, so I’d recommend just reading it, but the gist is this: sometimes a library wants you to produce a value of type t, but you’d like to have a value of type e available when you do that and the API doesn’t offer a way to inject an e in the right place. Parsons shows that you can produce a value of type e -> t instead and then, after the library returns it to you, pass an e to the function to get the t.

In this article I’m going to talk about another application of this technique: making extra state available while parsing JSON with the Aeson library. I learned how to do this from Stack Overflow user Benjamin Hodgson, who answered my question on the topic. Parsons’s article inspired me to write up this use of the technique too.

The code samples will assume that these extensions and imports are in play:
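As a minimal sketch, a preamble along these lines would be enough for everything that follows. The exact import lists are an assumption; the `HM` qualifier is taken from the GHCi sessions later in the article, and `FlexibleInstances` is what GHC asks for once we start writing instances for function types.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}

-- Assumed preamble: the specific import lists are a guess at what the
-- samples need, not a record of the original listing.
import Data.Aeson (FromJSON (..), decode, withObject, (.:), (.:?))
import qualified Data.HashMap.Strict as HM
import Data.Text (Text)
import Data.Time (UTCTime)
```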

It’s worth noting that everything I’m going to say about parsing JSON with Aeson also applies to parsing YAML with the yaml library, since the yaml library wraps Aeson and they share the same types under the hood.

When JSON parsing is straightforward

Suppose we’re building a static site generator that will take Markdown files and convert them into HTML files. Each Markdown file starts with a block of metadata in JSON format [1]:

We want to parse this JSON into the following record type:
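A plausible shape for the record, reconstructed from the field names that GHCi prints later on. The field types (`Text` for the strings, `UTCTime` for the date) are assumptions that fit that output.

```haskell
import Data.Text (Text)
import Data.Time (UTCTime)

-- Field names follow the GHCi output in this article; the types are assumed.
data Headers = Headers
  { title    :: Text
  , date     :: UTCTime
  , location :: Text
  } deriving (Show)
```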

The Aeson library defines a class, FromJSON, that indicates that a type can be deserialized from JSON. Our instance of FromJSON looks like this:
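Here is a sketch of such an instance, written applicative-style with `withObject` and `(.:)`. The record is repeated so the block stands on its own; its field types are assumed as before.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), decode, withObject, (.:))
import Data.Text (Text)
import Data.Time (UTCTime)

data Headers = Headers
  { title    :: Text
  , date     :: UTCTime
  , location :: Text
  } deriving (Show)

-- Pull each key out of the JSON object and hand the results to the
-- Headers constructor. Parsing fails if a key is missing or has the
-- wrong type.
instance FromJSON Headers where
  parseJSON = withObject "Headers" $ \o ->
    Headers
      <$> o .: "title"
      <*> o .: "date"
      <*> o .: "location"
```

With this loaded, `decode` can be asked for a `Maybe Headers` directly.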

Let’s test this out in GHCi. (I’ve added some indentation and line breaks for clarity.)

> let jsonString = "{\"title\": \"Parsing JSON with more context\",
                     \"date\": \"2019-03-31T14:14:39-07:00\",
                     \"location\": \"Mill Valley, California\"}"
> (decode jsonString) :: Maybe Headers
Just (Headers {title = "Parsing JSON with more context",
               date = 2019-03-31 21:14:39 UTC,
               location = "Mill Valley, California"})

Great. Now let’s add another field to our Headers—one that will require some information that isn’t present in the JSON header itself.

Motivation

Suppose we want to add a “featured image” to some of our articles. We’ll define a new type to hold some image information:
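One way the type might look, using the field names from the GHCi output further down. Representing the filename as a `FilePath` and the dimensions as `Int` is an assumption.

```haskell
-- Field names match the GHCi output in this article; the field types
-- are assumed.
data Image = Image
  { filename :: FilePath
  , width    :: Int
  , height   :: Int
  } deriving (Show)
```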

We want the image’s width and height so that we can include these values directly in the <img> tag we create. This avoids a flash of unstyled content effect: if we didn’t include the image size, the browser would have to lay out the page twice—once before and once after learning the image’s dimensions—and the page would jump unpleasantly when the layout was recalculated.

The obvious way to change the JSON metadata would be to add an “image” object with “filename,” “width,” and “height” keys. But wouldn’t it be nice if we could just specify the filename and have our program fill in the size?

This is perfectly possible, as it turns out. What’s more, we can add this information as part of the JSON parsing process, without defining multiple data formats or adding unsemantic Maybes or anything else. Let’s define a type that will map from image filenames to sizes, and a convenience function for doing the lookup:
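A sketch of both pieces, assuming the table is a `HashMap` from `FilePath` to a `(width, height)` pair (which matches how the table is built in the GHCi sessions below) and assuming the helper takes the table first. The name `makeImage` is the one used later in the article.

```haskell
import qualified Data.HashMap.Strict as HM

data Image = Image
  { filename :: FilePath
  , width    :: Int
  , height   :: Int
  } deriving (Show)

-- Assumed representation: filename -> (width, height).
type ImageSizeTable = HM.HashMap FilePath (Int, Int)

-- Deliberately partial for now: (HM.!) calls error when the key is absent.
makeImage :: ImageSizeTable -> FilePath -> Image
makeImage table path = Image path w h
  where
    (w, h) = table HM.! path
```

Given a table containing `("flower.jpeg", (640, 360))`, `makeImage table "flower.jpeg"` builds an `Image` with width 640 and height 360.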

(This function will crash at runtime if the given path isn’t present in the map. We’ll fix this later.)

Our updated Headers record looks like this:
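Under the same assumptions about field types as before, the extended record could be written as follows; only the `image` field is new.

```haskell
import Data.Text (Text)
import Data.Time (UTCTime)

data Image = Image
  { filename :: FilePath
  , width    :: Int
  , height   :: Int
  } deriving (Show)

data Headers = Headers
  { title    :: Text
  , date     :: UTCTime
  , location :: Text
  , image    :: Maybe Image  -- Nothing means "this article has no image"
  } deriving (Show)
```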

The image is wrapped in a Maybe because some articles might not have images. This Maybe is not here to represent any kind of runtime failure.

Return a function

For our updated FromJSON instance we’re going to do something weird. Instead of defining an instance for Headers itself, like you would expect, we’re going to define an instance for ImageSizeTable -> Headers:

This says that given some JSON, we can’t immediately decode a Headers from it, but we can decode a function which will take an ImageSizeTable and then give us our Headers. The type ImageSizeTable -> Headers is a computation in the reader monad, sometimes called the environment functor: a Headers that depends on an ImageSizeTable environment. You can also write it like (->) ImageSizeTable Headers if you want to prevent newbies from understanding your code.

Let’s look at the FromJSON instance for this type.
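The following is a sketch of how the instance and its helper might fit together. The type definitions and the partial `makeImage` are repeated so the block stands alone. Treating the "image" key as optional via `(.:?)` is an assumption, based on the field being a `Maybe Image`.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), decode, withObject, (.:), (.:?))
import qualified Data.HashMap.Strict as HM
import Data.Text (Text)
import Data.Time (UTCTime)

data Image = Image
  { filename :: FilePath
  , width    :: Int
  , height   :: Int
  } deriving (Show)

data Headers = Headers
  { title    :: Text
  , date     :: UTCTime
  , location :: Text
  , image    :: Maybe Image
  } deriving (Show)

type ImageSizeTable = HM.HashMap FilePath (Int, Int)

-- Still partial at this stage: crashes on an unknown filename.
makeImage :: ImageSizeTable -> FilePath -> Image
makeImage table path = Image path w h
  where
    (w, h) = table HM.! path

-- Five arguments; the parser below supplies only the first four.
makeLookupFn
  :: Text -> UTCTime -> Text -> Maybe FilePath -> ImageSizeTable -> Headers
makeLookupFn t d l mPath table =
  Headers t d l (makeImage table <$> mPath)

-- The parser yields a function that is still waiting for the table.
instance FromJSON (ImageSizeTable -> Headers) where
  parseJSON = withObject "Headers" $ \o ->
    makeLookupFn
      <$> o .: "title"
      <*> o .: "date"
      <*> o .: "location"
      <*> o .:? "image"
```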

Our original implementation used the Aeson machinery to pull the title, date, and location out of the JSON object and pass these to the Headers constructor. Now we’re pulling out the title, date, location, and image filename and passing these to makeLookupFn. (I’ve written this as a separate function for clarity; this kind of helper function would usually be defined in a where clause within parseJSON.) The makeLookupFn function is being partially applied: it takes five arguments (title, date, location, image path, and image size table) but we’re only giving it four (title, date, location, and image path).

The result is a function that still needs to be given an image size table before it will return the Headers we want. That point bears repeating: instead of Aeson’s decode function giving us a Headers object (wrapped in Maybe to represent the possibility of failure), it’s giving us a function (still wrapped in Maybe) that we have to call before we actually get a Headers.

We can use it like this:

> let table = (HM.fromList [("flower.jpeg", (640, 360))]) :: ImageSizeTable
> let jsonString = "{\"title\": \"Parsing JSON with more context\",
                     \"date\": \"2019-03-31T14:14:39-07:00\",
                     \"location\": \"Mill Valley, California\",
                     \"image\": \"flower.jpeg\"}"
> let makeHeaders = (decode jsonString) :: Maybe (ImageSizeTable -> Headers)
> makeHeaders <*> Just table
Just (Headers {title = "Parsing JSON with more context",
               date = 2019-03-31 21:14:39 UTC,
               location = "Mill Valley, California",
               image = Just (Image {filename = "flower.jpeg",
                                    width = 640,
                                    height = 360})})

What we get back from Aeson’s decode function is a function wrapped in a Maybe. This accounts for the possibility that the JSON parsing might fail, and it leads to our using the awkward makeHeaders <*> Just table syntax for calling the function. (You can read more about why the <*> operator is necessary in this article by David Tchepak.) This is also why the final result is a Maybe Headers instead of a Headers.

[Image: a flower.]

This flower has nothing to do with this article, but after all this talk about images it would be silly not to include one.

Making it total

There’s one more improvement we should make: let’s keep the makeImage function from bringing our program down when it’s given an unknown filename. The first version looked like this:

It’s simple to make it safer:
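A sketch of the total version, under the same assumed table representation as before: swapping `(HM.!)` for `HM.lookup` moves the failure into the return type.

```haskell
import qualified Data.HashMap.Strict as HM

data Image = Image
  { filename :: FilePath
  , width    :: Int
  , height   :: Int
  } deriving (Show)

type ImageSizeTable = HM.HashMap FilePath (Int, Int)

-- Total: an unknown filename produces Nothing instead of a crash.
makeImage :: ImageSizeTable -> FilePath -> Maybe Image
makeImage table path = do
  (w, h) <- HM.lookup path table
  pure (Image path w h)
```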

We could use this function directly to populate the “image” field of Headers, which already has the type Maybe Image, but that wouldn’t be semantically correct—the maybe-ness of that field indicates whether there is an image for that article, not whether we were able to look up the image metadata successfully. What we can do instead is to write a FromJSON instance for an even more elaborate type:
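One way this instance might be written, with the helper spelled out case by case. The handling of a missing "image" key (succeed with no image) is an assumption that follows from the meaning given to the `Maybe Image` field.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), decode, withObject, (.:), (.:?))
import qualified Data.HashMap.Strict as HM
import Data.Text (Text)
import Data.Time (UTCTime)

data Image = Image
  { filename :: FilePath
  , width    :: Int
  , height   :: Int
  } deriving (Show)

data Headers = Headers
  { title    :: Text
  , date     :: UTCTime
  , location :: Text
  , image    :: Maybe Image
  } deriving (Show)

type ImageSizeTable = HM.HashMap FilePath (Int, Int)

makeImage :: ImageSizeTable -> FilePath -> Maybe Image
makeImage table path = do
  (w, h) <- HM.lookup path table
  pure (Image path w h)

makeLookupFn
  :: Text -> UTCTime -> Text -> Maybe FilePath -> ImageSizeTable -> Maybe Headers
-- No image requested: succeed with an empty image field.
makeLookupFn t d l Nothing _ = Just (Headers t d l Nothing)
-- Image requested: the whole result fails if the lookup fails.
makeLookupFn t d l (Just path) table =
  case makeImage table path of
    Nothing  -> Nothing
    Just img -> Just (Headers t d l (Just img))

instance FromJSON (ImageSizeTable -> Maybe Headers) where
  parseJSON = withObject "Headers" $ \o ->
    makeLookupFn
      <$> o .: "title"
      <*> o .: "date"
      <*> o .: "location"
      <*> o .:? "image"
```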

We can make this a little more concise—although not necessarily more readable—by moving the definition of makeLookupFn within the definition of parseJSON and by moving some pattern matching outward.
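As one possible condensed form, the helper can move into a `where` clause. Note that this sketch swaps the explicit pattern match for `traverse`, which is a different route to brevity than the one described above; it should behave the same way for all three cases (no image, known image, unknown image).

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), decode, withObject, (.:), (.:?))
import qualified Data.HashMap.Strict as HM
import Data.Text (Text)
import Data.Time (UTCTime)

data Image = Image
  { filename :: FilePath
  , width    :: Int
  , height   :: Int
  } deriving (Show)

data Headers = Headers
  { title    :: Text
  , date     :: UTCTime
  , location :: Text
  , image    :: Maybe Image
  } deriving (Show)

type ImageSizeTable = HM.HashMap FilePath (Int, Int)

makeImage :: ImageSizeTable -> FilePath -> Maybe Image
makeImage table path = do
  (w, h) <- HM.lookup path table
  pure (Image path w h)

instance FromJSON (ImageSizeTable -> Maybe Headers) where
  parseJSON = withObject "Headers" $ \o ->
    makeLookupFn
      <$> o .: "title"
      <*> o .: "date"
      <*> o .: "location"
      <*> o .:? "image"
    where
      -- traverse over the optional path: Nothing stays Just Nothing (no
      -- image requested), while a failed lookup collapses to Nothing.
      makeLookupFn t d l mPath table =
        Headers t d l <$> traverse (makeImage table) mPath
```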

Let’s try out this new instance.

> let table = (HM.fromList [("flower.jpeg", (640, 360))]) :: ImageSizeTable
> let jsonString1 = "{\"title\": \"Parsing JSON with more context\",
                      \"date\": \"2019-03-31T14:14:39-07:00\",
                      \"location\": \"Mill Valley, California\",
                      \"image\": \"flower.jpeg\"}"
> let makeHeaders1 = (decode jsonString1)
                         :: Maybe (ImageSizeTable -> Maybe Headers)
> makeHeaders1 <*> Just table
Just (Just (Headers {title = "Parsing JSON with more context",
                     date = 2019-03-31 21:14:39 UTC,
                     location = "Mill Valley, California",
                     image = Just (Image {filename = "flower.jpeg",
                                          width = 640,
                                          height = 360})}))

This is our expected Headers, now wrapped within an extra Maybe layer to represent the possibility that the image couldn’t be looked up. (Remember, the outermost Maybe comes from the fact that the JSON parsing might fail.) If we try this again with an unrecognized image file,

> let jsonString2 = "{\"title\": \"Parsing JSON with more context\",
                      \"date\": \"2019-03-31T14:14:39-07:00\",
                      \"location\": \"Mill Valley, California\",
                      \"image\": \"invalid.jpeg\"}"
> let makeHeaders2 = (decode jsonString2)
                         :: Maybe (ImageSizeTable -> Maybe Headers)
> makeHeaders2 <*> Just table
Just Nothing

Again, the outermost Maybe is a Just because the JSON parsing was successful; within that, we have Nothing because “invalid.jpeg” wasn’t in the list of known images.

Nested Maybes are usually better than crashing at runtime, but they are a bit awkward to work with. Passing this result through the join function from Control.Monad would convert our Maybe (Maybe Headers) to a Maybe Headers, getting rid of the nesting. Some do notation might also help to flatten the types out.
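Both options can be sketched generically. The helper names `runWithJoin` and `runWithDo` are hypothetical, and the types are kept polymorphic so the block doesn't depend on Aeson: `env` stands in for ImageSizeTable and `a` for Headers.

```haskell
import Control.Monad (join)

-- Hypothetical helper: apply the decoded function, then collapse the
-- two Maybe layers into one with join.
runWithJoin :: env -> Maybe (env -> Maybe a) -> Maybe a
runWithJoin env decoded = join (decoded <*> Just env)

-- The same thing in do notation: each step can fail independently.
runWithDo :: env -> Maybe (env -> Maybe a) -> Maybe a
runWithDo env decoded = do
  f <- decoded  -- Nothing here means the JSON didn't parse
  f env         -- Nothing here means the lookup failed
```

The cost of flattening is that the caller can no longer tell a parse failure from a lookup failure, so it's worth doing only once that distinction has stopped mattering.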

The broader point is that types don’t have to be “simple” to be instances of FromJSON. By defining instances for more complex types you can add extra context—and even error handling—while keeping the code pure, explicit about its data dependencies, and confined to the JSON-parsing layer of your application.


  1. Okay, realistically, everyone uses YAML for this, not JSON. Since everything I say will be equally applicable to both formats, though, I might as well write about JSON, which I think is more widely used.