Parsing and emitting JSON is a necessity for basically any program that wants to talk to the internet. Haskell's most widely used JSON package, aeson, is the de-facto standard choice here. Its basic usage is easy enough: define your Haskell types for some JSON data, derive FromJSON and ToJSON instances, and you're ready to encode and decode. Of course, real-life JSON data and API responses are never quite that simple; work with aeson long enough, and you'll likely run into more complicated use cases. For instance, what if you need to:
- parse JSON data where the attribute names are different from your Haskell property names? e.g. where the attribute names are snake_case and your code is camelCase?
- have several different JSON parsers for the same Haskell data? If you use typeclasses, there has to be a single 'canonical' parser.
- parse a string enum which can only be one of a few alternatives? e.g. your JSON "user_type" can only be one of "user", "admin", or "customer_support".
- use one field to determine how to parse the rest of the document? e.g. for Slack's Web API, you need to check the "ok" field first to determine whether the document contains data, or whether it only has an error message.
- deal with strange JSON formats, like collections sent as dictionaries with ordered keys instead of as a list?
- parse JSON fields without needing to create a new datatype?
The autoderived parsers that aeson gives you won't cut it for these sorts of situations. Fortunately, it is possible to use aeson for any complicated JSON parsing solution you need, including the ones listed above, but it can be surprisingly nonobvious how to do so. So here's a cheatsheet for some common operations using aeson.
Note that the later examples will make heavy use of monadic code; for the more complicated use cases of aeson, there's really no way around it. You should be able to understand the logic behind what's going on even if you're not 100% on monads, but if you're not comfortable with reading monadic code and do-syntax, take a look at my introduction to monads.
All examples were tested on aeson 1.4.4.0.
Importing and using
Add to your package.yaml/cabal file:
dependencies:
- aeson
In modules where you need to manipulate JSON:
import Data.Aeson
import Data.Aeson.Types
While the import of Data.Aeson.Types isn't strictly necessary, it's so often useful that it's usually worth importing.
Basics
Autoderiving parsers and serializers
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DeriveAnyClass #-}
import Data.Aeson
import GHC.Generics
data Foo = Foo
  { field1 :: Int
  , field2 :: String
  } deriving (Show, Generic, ToJSON, FromJSON)
-- ToJSON so that we can encode *to* a JSON string,
-- FromJSON so that we can parse *from* a JSON string
This is the 'preferred' way to automatically derive JSON parsers and encoders for your types, using Generic. Essentially, this gives aeson enough information to introspect on the structure of Foo and figure out all the fields it needs and what they're named, and thus how to construct one using a JSON string. Don't forget to enable the language extensions.
As mentioned before, the autoderived instances aren't particularly flexible, and won't cut it for even slightly complicated formats. They will only parse JSON where the attribute names are exactly the same as the Haskell field names, including case. In practice, I find that I never write my instances this way; I prefer manually implementing the ToJSON and FromJSON instances for more control.
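One caveat worth knowing: when the only mismatch is key naming, the generic machinery can be handed custom Options instead of being replaced entirely. The following is a minimal sketch, not a recommendation; the Account type and its field names are hypothetical, while defaultOptions, fieldLabelModifier, camelTo2, genericToJSON and genericParseJSON come from Data.Aeson.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import GHC.Generics (Generic)

-- Hypothetical type, used only for illustration.
data Account = Account
  { accountId :: Int
  , displayName :: String
  } deriving (Show, Eq, Generic)

-- Turn camelCase record fields into snake_case JSON keys.
snakeOptions :: Options
snakeOptions = defaultOptions { fieldLabelModifier = camelTo2 '_' }

instance ToJSON Account where
  toJSON = genericToJSON snakeOptions

instance FromJSON Account where
  parseJSON = genericParseJSON snakeOptions

-- Decodes snake_case keys into the camelCase record.
roundTrip :: Maybe Account
roundTrip = decode "{\"account_id\": 7, \"display_name\": \"ada\"}"
```

Anything beyond renaming (conditional parsing, multiple formats, odd shapes) still calls for the hand-written instances covered in the rest of this post.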
Decoding from JSON strings
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy as LB
jsonString :: LB.ByteString
jsonString = "{ \"field1\": 27, \"field2\": \"hello!\" }"
maybeFoo :: Maybe Foo
maybeFoo = decode jsonString
λ> maybeFoo
>>> Just (Foo {field1 = 27, field2 = "hello!"})
decode :: FromJSON a => LB.ByteString -> Maybe a
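Note that decode takes a lazy ByteString. If you have a strict ByteString instead, aeson also provides decodeStrict, and the eitherDecode variants return an error message instead of Nothing.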
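For completeness, here is a small sketch of two sibling functions that Data.Aeson exports next to decode: eitherDecode, which keeps the error message, and decodeStrict, which accepts a strict ByteString. The wrapper names parseOrExplain and parseStrict are made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value, decodeStrict, eitherDecode)
import qualified Data.ByteString as SB
import qualified Data.ByteString.Lazy as LB

-- eitherDecode reports why parsing failed instead of returning Nothing.
parseOrExplain :: LB.ByteString -> Either String Value
parseOrExplain = eitherDecode

-- decodeStrict accepts a strict ByteString directly.
parseStrict :: SB.ByteString -> Maybe Value
parseStrict = decodeStrict
```

The Left case of parseOrExplain carries a human-readable description of the failure, which tends to be more useful than a bare Nothing when debugging a payload.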
Encoding to JSON strings
myFoo :: Foo
myFoo = Foo
  { field1 = 909
  , field2 = "take your time"
  }
λ> encode myFoo
>>> "{\"field1\":909,\"field2\":\"take your time\"}"
encode :: ToJSON a => a -> LB.ByteString
Basically, if you're working with aeson, use lazy ByteStrings.
Core JSON types
-- Any possible JSON value.
data Value
= Object Object
| Array Array
| String Text
| Number Scientific
| Bool Bool
| Null
-- Just JSON objects, e.g. things constructed using {...}
type Object = HashMap Text Value
-- Just JSON arrays, e.g. things constructed using [...]
type Array = Vector Value
Text, Bool, HashMap and Vector are as you'd expect. Scientific is from the scientific package, and represents arbitrary-precision numbers.
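Because Number wraps a Scientific rather than an Int or Double, pulling a machine number out of a raw Value takes one conversion step. A minimal sketch, assuming the scientific package is available; valueToInt and valueToDouble are hypothetical helper names, while toBoundedInteger and toRealFloat are functions from Data.Scientific.

```haskell
import Data.Aeson (Value (..))
import Data.Scientific (toBoundedInteger, toRealFloat)

-- Extract an Int only when the number is integral and fits in an Int.
valueToInt :: Value -> Maybe Int
valueToInt (Number n) = toBoundedInteger n
valueToInt _ = Nothing

-- Conversion to Double; may lose precision for very large or precise numbers.
valueToDouble :: Value -> Maybe Double
valueToDouble (Number n) = Just (toRealFloat n)
valueToDouble _ = Nothing
```

In ordinary FromJSON code you rarely need this, since parsing a field at type Int or Double performs the conversion for you.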
Constructing JSON values directly
While the data definition above is already enough to construct any valid JSON value you want, there are some convenience functions for constructing JSON objects. (By the way, pay attention to the difference between JSON values and JSON objects.)
{-# LANGUAGE OverloadedStrings #-}
customValue :: Value
customValue = object
  [ "list_price" .= (150000 :: Int)
  , "sale_price" .= (143000 :: Int)
  , "description" .= ("2-bedroom townhouse" :: String)
  ]
λ> customValue
>>> Object
      (fromList
         [ ( "sale_price" , Number 143000.0 )
         , ( "list_price" , Number 150000.0 )
         , ( "description"
           , String "2-bedroom townhouse"
           )
         ])
object :: [Pair] -> Value
(.=) :: ToJSON v => (strict) Text -> v -> Pair
Common use cases
Implementing a custom parser
Eventually the default parsers won't cut it. If you need custom parsing behavior, you can always write your ToJSON and FromJSON instances by hand.
For writing ToJSON instances, you can use the convenience functions from before to build aeson Values.
{-# LANGUAGE OverloadedStrings #-}
data Person = Person
  { firstName :: String
  , lastName :: String
  } deriving (Show)
-- our fields are snake_case instead
instance ToJSON Person where
  toJSON (Person { firstName = firstName, lastName = lastName }) =
    object [ "first_name" .= firstName
           , "last_name" .= lastName
           ]
λ> encode (Person "Karl" "Popper")
>>> "{\"first_name\":\"Karl\",\"last_name\":\"Popper\"}"
For writing FromJSON instances, the main functions you'll want are withObject and (.:).
-- our fields are snake_case instead
instance FromJSON Person where
  -- note that the typeclass function is parseJSON, not fromJSON
  parseJSON = withObject "Person" $ \obj -> do
    firstName <- obj .: "first_name"
    lastName <- obj .: "last_name"
    return (Person { firstName = firstName, lastName = lastName })
karlJSON :: LB.ByteString
karlJSON = "{\"first_name\":\"Karl\",\"last_name\":\"Popper\"}"
λ> decode karlJSON :: Maybe Person
>>> Just (Person {firstName = "Karl", lastName = "Popper"})
If you have an optional field, use (.:?) instead of (.:).
data Item = Item
  { name :: String
  , description :: Maybe String
  } deriving (Show)
instance FromJSON Item where
  parseJSON = withObject "Item" $ \obj -> do
    name <- obj .: "name"
    description <- obj .:? "description"
    return (Item { name = name, description = description })
λ> decode "{\"name\": \"Very Evil Artifact\"}" :: Maybe Item
>>> Just (Item {name = "Very Evil Artifact", description = Nothing})
withObject :: String -> (Object -> Parser a) -> Value -> Parser a
(.:) :: FromJSON a => Object -> Text -> Parser a
(.:?) :: FromJSON a => Object -> Text -> Parser (Maybe a)
Note that Parser implements Monad and Alternative. So if you need to do more complex things like take in both snake_case and camelCase keys for the same field, or conditionally parse one field based on the value of another, you can use the normal applicative/monadic tools for doing so. We'll see some examples of doing that later.
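As a quick illustration of the Alternative instance mentioned above, here is a sketch that accepts either a snake_case or a camelCase key for the same field, and also uses aeson's (.!=) operator to supply a default for a missing optional key. The Profile type and its key names are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative ((<|>))
import Data.Aeson

-- Hypothetical type, used only for illustration.
data Profile = Profile
  { userName :: String
  , nickname :: String
  } deriving (Show, Eq)

instance FromJSON Profile where
  parseJSON = withObject "Profile" $ \obj -> do
    -- Try the snake_case key first, then fall back to camelCase.
    name <- obj .: "user_name" <|> obj .: "userName"
    -- (.!=) supplies a default when the optional key is absent.
    nick <- obj .:? "nickname" .!= "anonymous"
    return (Profile name nick)
```

If both alternatives fail, the parser as a whole fails, so decode still returns Nothing for documents that carry neither key.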
Parsing enum datatypes
Autoderiving works for simple enum types as well.
data UserType = User | Admin | CustomerSupport
  deriving (Generic, ToJSON, FromJSON)
λ> encode CustomerSupport
>>> "\"CustomerSupport\""
But the output is, once again, exactly the same case as the Haskell code. So if you're trying to parse some API enum you'll need to write custom instances once again.
The ToJSON instance should be fairly obvious. Writing the FromJSON instance is a little bit trickier.
instance FromJSON UserType where
  parseJSON = withText "UserType" $ \text ->
    case text of
      "user" -> return User
      "admin" -> return Admin
      "customer_support" -> return CustomerSupport
      _ -> fail "string is not one of known enum values"
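For the ToJSON direction described as fairly obvious above, one possible sketch looks like this. The helper userTypeText is a hypothetical name; the instance simply wraps the resulting Text in aeson's String constructor.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import Data.Text (Text)

data UserType = User | Admin | CustomerSupport
  deriving (Show, Eq)

-- Single place that defines the wire names for each constructor.
userTypeText :: UserType -> Text
userTypeText User = "user"
userTypeText Admin = "admin"
userTypeText CustomerSupport = "customer_support"

instance ToJSON UserType where
  toJSON = String . userTypeText
```

Keeping the names in one function makes it harder for the encoder and a hand-written parser to drift apart over time.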
Parsing weird JSON formats
Since the Parser type is a monad, we can write as complicated conditional logic as we want inside our parser code.
For instance, let's say an API we're working with can either send us some data or an error message; we need to check the "ok" attribute first to see which way to parse it. We might represent this with a sum type on the Haskell side. How do we write our FromJSON instance?
import Data.Text (Text)
data APIResult
  = JSONData Value
  | Error Text
  deriving (Show)
instance FromJSON APIResult where
  parseJSON = withObject "APIResult" $ \obj -> do
    ok <- obj .: "ok"
    if ok
      then fmap JSONData (obj .: "data")
      else fmap Error (obj .: "error_msg")
goodData :: LB.ByteString
goodData = "{\"ok\":true,\"data\":{\"foo\":2}}"
badData :: LB.ByteString
badData = "{\"ok\":false,\"error_msg\":\"no_credentials\"}"
λ> decode goodData :: Maybe APIResult
>>> Just (JSONData (Object (fromList [("foo",Number 2.0)])))
λ> decode badData :: Maybe APIResult
>>> Just (Error "no_credentials")
Another annoying situation might be if collections are sent as dictionaries with ordered keys instead of as JSON lists. But again, we can handle this:
-- e.g. our API sends us data like
--
-- {
-- "element1": 42,
-- "element2": -20,
-- "element3": 1000
-- }
--
-- instead of [42, -20, 1000]
import qualified Data.List as L
import qualified Data.HashMap.Strict as HM
data JSONHashList a = HashList [a]
  deriving (Show)
instance FromJSON a => FromJSON (JSONHashList a) where
  parseJSON = withObject "JSONHashList" $ \obj ->
    let kvs = HM.toList obj
        sorted = L.sortOn (\(key, _) -> key) kvs
        vals = map (\(_, val) -> val) sorted
        parsed = mapM parseJSON vals
    in fmap HashList parsed
weirdListData :: LB.ByteString
weirdListData = "{\"element1\":42,\"element2\":-20,\"element3\":1000}"
λ> decode weirdListData :: Maybe (JSONHashList Int)
>>> Just (HashList [42,-20,1000])
Parse a type directly from a Value
Right now we can parse from a ByteString, but what if we already have a Value?
The simplest way is to use fromJSON:
fromJSON :: FromJSON a => Value -> Result a
value :: Value
value = object [ "first_name" .= ("Juniper" :: String), "last_name" .= ("Lerrad" :: String) ]
λ> fromJSON value :: Result Person
>>> Success (Person {firstName = "Juniper", lastName = "Lerrad"})
However, fromJSON returns aeson's own custom Result type, which is all fine and dandy, but probably not what you're passing around in the rest of your application.
Thankfully, Data.Aeson.Types provides the parseMaybe and parseEither functions, which return values of types rather more compatible with the rest of the Haskell ecosystem:
parseMaybe :: (a -> Parser b) -> a -> Maybe b
parseEither :: (a -> Parser b) -> a -> Either String b
Since parseJSON from the FromJSON typeclass already has the type Value -> Parser a, we can use it to define useful utility functions. Plugging it in as the first argument of parseMaybe gives us:
fromJSONValue :: FromJSON a => Value -> Maybe a
fromJSONValue = parseMaybe parseJSON
λ> fromJSONValue value :: Maybe Person
>>> Just (Person {firstName = "Juniper", lastName = "Lerrad"})
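The same trick works with parseEither if you would rather keep the error message. The sketch below shows that variant, plus a hand-written conversion from aeson's Result for code that already calls fromJSON. The names fromJSONValueE and resultToEither are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import Data.Aeson.Types (parseEither)

-- Like fromJSONValue, but keeps aeson's error message on failure.
fromJSONValueE :: FromJSON a => Value -> Either String a
fromJSONValueE = parseEither parseJSON

-- Convert aeson's Result into Either by pattern matching.
resultToEither :: Result a -> Either String a
resultToEither (Success a) = Right a
resultToEither (Error msg) = Left msg
```

Either String composes with the rest of an application's error handling more readily than Result does, which is the motivation given above for reaching for the parseX functions.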
Have multiple parsing functions for a single type
Sometimes you might have several different JSON formats for the same object, and then the typeclass solution won't cut it. But we just saw that parseMaybe and parseEither take a parser function as their first argument. We used the function that the FromJSON typeclass provides before, but there's nothing stopping us from putting something else there.
data Person = Person
  { firstName :: String
  , lastName :: String
  } deriving (Show)
snakeCaseParser :: Value -> Parser Person
snakeCaseParser = withObject "Person" $ \obj -> do
  firstName <- obj .: "first_name"
  lastName <- obj .: "last_name"
  pure (Person { firstName = firstName, lastName = lastName })
pascalCaseParser :: Value -> Parser Person
pascalCaseParser = withObject "Person" $ \obj -> do
  firstName <- obj .: "FirstName"
  lastName <- obj .: "LastName"
  pure (Person { firstName = firstName, lastName = lastName })
snakeCasePerson :: Value
snakeCasePerson = object
  [ "first_name" .= ("Dimitri" :: String)
  , "last_name" .= ("Blaiddyd" :: String)
  ]
pascalCasePerson :: Value
pascalCasePerson = object
  [ "FirstName" .= ("Dimitri" :: String)
  , "LastName" .= ("Blaiddyd" :: String)
  ]
λ> parseMaybe snakeCaseParser snakeCasePerson :: Maybe Person
>>> Just (Person {firstName = "Dimitri", lastName = "Blaiddyd"})
λ> parseMaybe snakeCaseParser pascalCasePerson :: Maybe Person
>>> Nothing
λ> parseMaybe pascalCaseParser snakeCasePerson :: Maybe Person
>>> Nothing
λ> parseMaybe pascalCaseParser pascalCasePerson :: Maybe Person
>>> Just (Person {firstName = "Dimitri", lastName = "Blaiddyd"})
Parse a type directly from an Object
Sometimes we already know that we have a JSON Object and don't need the full generality of Value. But all of our functions thus far either parse from a ByteString or a Value.
However, since withObject takes in a parser that takes in an Object and turns it into a parser that takes a Value, we can get to what we want by just removing the withObject wrapping and defining the parser separately. So instead of defining FromJSON instances the way we did above, we can do it like this:
{-# LANGUAGE OverloadedLists #-}
personParser :: Object -> Parser Person
personParser obj = do
  firstName <- obj .: "first_name"
  lastName <- obj .: "last_name"
  return (Person { firstName = firstName, lastName = lastName })
instance FromJSON Person where
  parseJSON = withObject "Person" personParser
personObject :: Object
personObject = [("first_name", "Anthony"), ("last_name", "Yoon")]
λ> parseMaybe personParser personObject
>>> Just (Person {firstName = "Anthony", lastName = "Yoon"})
Parsing without a new datatype
Since the parseX family of functions takes in a parser directly, there's no need to define a new datatype for one-off or bespoke parses.
tupleizeFields :: Value -> Either String (Int, Bool)
tupleizeFields = parseEither $
  withObject "<fields>" $ \obj -> do
    field1 <- obj .: "field1"
    field2 <- obj .: "field2"
    return (field1, field2)
tupleJSON :: Value
tupleJSON = object
  [ "field1" .= (955 :: Int)
  , "field2" .= True
  ]
λ> tupleizeFields tupleJSON
>>> Right (955,True)
Putting it all together, we can go straight from a ByteString to parsed data without having to define a datatype at all.
import Data.ByteString.Lazy (ByteString)
tupleizeFieldsBS :: ByteString -> Either String (Int, Bool)
tupleizeFieldsBS input = do
  object <- eitherDecode input
  let parser = (\obj -> do
        field1 <- obj .: "field1"
        field2 <- obj .: "field2"
        return (field1, field2))
  parseEither parser object
λ> tupleizeFieldsBS "{\"field1\":955,\"field2\":true}"
>>> Right (955,True)
Less common stuff
Parsing nested fields
While being able to write parsers to access nested fields is a natural consequence of the monad instance for Parser, it may not immediately spring to mind the first time you need to do it.
-- { contact_info: { email: <string> } }
nested :: Value -> Parser String
nested = withObject "ContactInfo" $ \obj -> do
  contact <- obj .: "contact_info"
  contact .: "email"
λ> parseMaybe nested $ object
 | [ "contact_info" .=
 |     object [ "email" .= "williamyaoh@gmail.com" ]
 | ]
>>> Just "williamyaoh@gmail.com"
If you find yourself doing this a lot, it might even be worth it to define a new operator specifically for this use case:
import Data.Text (Text)
(.->) :: FromJSON a => Parser Object -> Text -> Parser a
(.->) parser key = do
  obj <- parser
  obj .: key
nested' :: Value -> Parser (String, String)
nested' = withObject "ContactInfo" $ \obj -> do
  email <- obj .: "contact_info" .-> "email"
  state <- obj .: "contact_info" .-> "address" .-> "state"
  return (email, state)
λ> parseMaybe nested' $ object
 | [ "contact_info" .= object
 |   [ "email" .= "williamyaoh@gmail.com"
 |   , "address" .= object
 |     [ "state" .= "OK"
 |     , "zip_code" .= "74008"
 |     ]
 |   ]
 | ]
>>> Just ("williamyaoh@gmail.com","OK")
Parsing multiple JSON values in the same string
The default functions that aeson provides don't allow you to inspect what's left over in the input string after parsing a Value, so if you need to do things like parse a complicated file format where JSON is somewhere in the format, or parse multiple JSON values appended to the same file, the functions we've looked at up to now won't cut it.
Thankfully, aeson also exposes its attoparsec parsers, so we can use all the tools we have for manipulating parser combinators to handle JSON input as well.
You'll likely want to import parser-combinators as well.
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.Combinators -- from parser-combinators
import Data.Aeson
import Data.ByteString.Lazy (ByteString)
import qualified Data.Attoparsec.ByteString.Lazy as Atto
jsonStr :: ByteString
jsonStr = "{ \"foo\": 555 }"
input :: ByteString
input = jsonStr `mappend` jsonStr
λ> Atto.parse (many json) input
>>> Done
      ""
      [ Object (fromList [ ( "foo" , Number 555.0 ) ])
      , Object (fromList [ ( "foo" , Number 555.0 ) ])
      ]
-- attoparsec Parser, not aeson Parser
json :: Parser Value
Useful auxiliary libraries
Since aeson is so widely used, there are a fair number of libraries in the ecosystem that provide extra functionality on top of what is provided in aeson itself. You don't need any of these libraries to work with JSON, but you might find them useful.
Pretty-printing
Aeson doesn't provide a way to pretty-print the encoded JSON strings by default, but the aeson-pretty package does.
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Data.Aeson.Encode.Pretty
import qualified Data.ByteString.Lazy.Char8 as B
encodePretty :: ToJSON a => a -> (lazy) ByteString
λ> json = object
     [ "period" .= ("yearly" :: String)
     , "metadata" .= object
         [ "created_at" .= ("2019-05-01" :: String)
         , "views" .= 0
         ]
     ]
λ> B.putStrLn $ encodePretty json
>>> {
>>>     "period": "yearly",
>>>     "metadata": {
>>>         "views": 0,
>>>         "created_at": "2019-05-01"
>>>     }
>>> }
If you need more control over how the output is formatted, aeson-pretty also provides encodePretty':
encodePretty' :: ToJSON a => Config -> a -> (lazy) ByteString
data Config = Config
  { confIndent :: Indent
    -- how to sort object keys
  , confCompare :: Text -> Text -> Ordering
    -- how to output numeric types
  , confNumFormat :: NumberFormat
  , confTrailingNewline :: Bool
  }
defConfig :: Config
-- * 4 spaces per indent
-- * don't sort keys
-- * don't add trailing newline
data Indent = Spaces Int | Tab
data NumberFormat
  = Generic
  | Scientific
  | Decimal
  | Custom (Scientific -> Data.Text.Lazy.Builder)
λ> config = defConfig
     { confIndent = Spaces 2
     , confCompare = compare
     }
λ> B.putStrLn $ encodePretty' config json
>>> {
>>>   "metadata": {
>>>     "created_at": "2019-05-01",
>>>     "views": 0
>>>   },
>>>   "period": "yearly"
>>> }
Embedding literal JSON values in code
The aeson-qq package provides a quasiquoter to allow you to directly write JSON strings into your code and have them converted into Values.
{-# LANGUAGE QuasiQuotes #-}
import Data.Aeson
import Data.Aeson.QQ
users :: Value
users = [aesonQQ|
  {
    "users": [
      {
        "username": "michael.oakeshott",
        "id": 1
      },
      {
        "username": "miguel.de.cervantes",
        "id": 4
      }
    ]
  }
|]
This quasiquoter also allows you to interpolate in any values that implement ToJSON by enclosing them with #{...}.
Newer versions of aeson provide a simple version of this quasiquoter in Data.Aeson.QQ.Simple, but without the ability to interpolate values.
Doing data access directly on Values
If you need to directly grab data from within a Value, you can always just use pattern matching. However, this quickly gets pretty tedious if you need to do anything more complicated than grabbing a single, surface-depth attribute.
The lens-aeson package provides (what else?) lenses for accessing JSON data.
{-# LANGUAGE QuasiQuotes #-}
import Control.Lens -- from lens
import Data.Aeson
import Data.Aeson.QQ
import Data.Aeson.Lens -- from lens-aeson
someJSON :: Value
someJSON = [aesonQQ|
  {
    "data": {
      "timestamps": {
        "created_at": "2019-05-11 17:53:21"
      }
    }
  }
|]
λ> someJSON ^? key "data" . key "timestamps" . key "created_at"
>>> Just (String "2019-05-11 17:53:21")
A full explanation of lenses is outside the scope of this article. Here are some more examples of using lens-aeson, as well as exercises for learning lenses themselves.
JSON seems to be the de-facto standard interchange format for the internet for the time being. So if you're doing anything online with Haskell, you'll likely end up working with aeson, one way or another. With this, you should be equipped to handle most of the JSON-related situations you come across.
If there's one thing to take away from this post, keep in mind that the Parser type (and its Applicative/Monad instances) is what drives all the fancy JSON ingesting. If you're having trouble parsing something, it's likely in how you're constructing values of this type. Pay attention to all the functions that produce Parser values.
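To make the Applicative side of that concrete, here is a sketch of the Person parser from earlier written without do-notation. It assumes the same snake_case keys used throughout this post; the name example is hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson
import Data.Aeson.Types (Parser, parseMaybe)

data Person = Person
  { firstName :: String
  , lastName :: String
  } deriving (Show, Eq)

-- Applicative style: each (.:) produces a Parser, and (<$>)/(<*>)
-- feed the results to the constructor in field order.
personParser :: Value -> Parser Person
personParser = withObject "Person" $ \obj ->
  Person <$> obj .: "first_name" <*> obj .: "last_name"

example :: Maybe Person
example = parseMaybe personParser $
  object [ "first_name" .= ("Karl" :: String), "last_name" .= ("Popper" :: String) ]
```

The applicative form is enough whenever no field's parsing depends on another field's value; once it does (as in the "ok" example), the monadic form is the one to reach for.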
While aeson is the de-facto standard JSON library, waargonaut is a recent addition to the ecosystem with a focus on supporting JSON parsing through term-level parsers rather than typeclasses. I haven't actually used it and can't comment on its usefulness, but if you need more flexible parsing, it may be worth taking a look at.
Come across any particularly hairy JSON and having trouble wrangling it in Haskell? Got a comment? Talk to me!