So you've started writing type-safe database queries using Beam or another Haskell database library. No more writing raw SQL strings into your code and praying that your integration tests don't let things slip through the cracks. But maybe having just your queries be type-safe isn't good enough for you. Long-lived application are going to have to change their database schema eventually. Can you manage those changes in Haskell as well? Make them type-safe and reliable? As it turns out, you can, with Beam's built-in migration functionality.
If you don't know about it, Beam is one of a number of Haskell libraries for providing type-safe, composable database access. Beam also provides migration functionality for managing your database schema. Unfortunately, there's basically no documentation about how to use it. Rest assured, it is completely functional, if difficult to start using.
This post collects everything I learned while trying to get up and running with Beam's migrations, and should help you get started as well.
There are actually a lot of different Haskell tools for managing migrations directly in pure Haskell, from the most barebones like postgresql-simple-migration
and drifter
up to the more heavyweight migration frameworks provided by Beam, Squeal, and Hasql. Some of these options will only work for specific DB backends, predominantly Postgres. If you're looking to make a decision between the various options, this post won't go into detailed comparisons between any of these options, but should give you enough information about Beam's migrations to allow you to make your own comparison.
Before we dive into exploring the migration framework proper, we'll need an example schema that we're interested in. Let's assume we're working with Postgres. We'll need the beam-core
, beam-migrate
, and beam-postgres
packages. These examples were tested against beam-migrate-0.5.1.2
.
Since we're working with migrations, we'll build up the schema step by step. Let's say that we're trying to set up a database to store orders for a flower shop. For now, we only care about storing information about each flower we stock, and what quantity of each flower is in each order. So we'll have a table for flowers
, a table for orders
, and a many-to-many join table between orders and flowers.
When writing migrations in Beam, you first need Haskell datatypes for your entities and database. So let's write those.
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE TypeFamilies #-}
import Data.Int ( Int32, Int64 )
import Data.Text ( Text )
import Data.Scientific ( Scientific )
import Data.Time ( UTCTime )
import Database.Beam
data FlowerT f = Flower
flowerID :: Columnar f Int32
{ flowerName :: Columnar f Text
, flowerPrice :: Columnar f Scientific
,
}deriving (Generic, Beamable)
data OrderT f = Order
orderID :: Columnar f Int32
{ orderTime :: Columnar f UTCTime
,
}deriving (Generic, Beamable)
data LineItemT f = LineItem
lineItemOrderID :: PrimaryKey OrderT f
{ lineItemFlowerID :: PrimaryKey FlowerT f
, lineItemQuantity :: Columnar f Int64
,
}deriving (Generic, Beamable)
data FlowerDB f = FlowerDB
dbFlowers :: f (TableEntity FlowerT)
{ dbOrders :: f (TableEntity OrderT)
, dbLineItems :: f (TableEntity LineItemT)
,
}deriving (Generic, Database be)
instance Table FlowerT where
data PrimaryKey FlowerT f = FlowerID (Columnar f Int32)
deriving (Generic, Beamable)
= FlowerID . flowerID
primaryKey
instance Table OrderT where
data PrimaryKey OrderT f = OrderID (Columnar f Int32)
deriving (Generic, Beamable)
= OrderID . orderID
primaryKey
instance Table LineItemT where
data PrimaryKey LineItemT f =
LineItemID (PrimaryKey OrderT f) (PrimaryKey FlowerT f)
deriving (Generic, Beamable)
= LineItemID
primaryKey <$> lineItemOrderID
<*> lineItemFlowerID
Unfortunately, Beam's migration framework seems to require these types to be written first, because they need to be used in the migration code as well. It would be nice to be able to autogenerate the datatype code as well, but alas.
Once we have this, we can write our first migration. All the major functions you'll need to actually write migrations are in Database.Beam.Migrate.SQL.Tables
.
{-# LANGUAGE OverloadedStrings #-}
import Database.Beam.Migrate
import Database.Beam.Backend
import Database.Beam.Postgres
-- It's unfortunate that we have to define this ourselves.
utctime :: BeamSqlBackend be => DataType be UTCTime
= DataType (timestampType Nothing True)
utctime
initialSetup :: Migration Postgres
CheckedDatabaseSettings Postgres FlowerDB)
(= FlowerDB
initialSetup <$> (createTable "flowers" $ Flower
= field "id"
{ flowerID
int notNull unique= field "name"
, flowerName Just 20)) notNull unique
(nationalVarchar (= field "price"
, flowerPrice Just (20, Just (2)))) notNull
(numeric (
})<*> (createTable "orders" $ Order
= field "id"
{ orderID
int notNull unique= field "order_time"
, orderTime
utctime notNull (defaultTo_ (cast_ currentTimestamp_ utctime))
})<*> (createTable "line_items" $ LineItem
= OrderID $
{ lineItemOrderID "order_id" int notNull
field = FlowerID $
, lineItemFlowerID "flower_id" int notNull
field = field "quantity"
, lineItemQuantity
bigint notNull
})
initialSetupStep :: MigrationSteps Postgres
()CheckedDatabaseSettings Postgres FlowerDB)
(= migrationStep
initialSetupStep "initial_setup"
const initialSetup) (
Here we can see why we needed to write the datatypes before writing the migration itself; Beam uses the "shape" of the datatypes to store information about which field/table name corresponds to which attribute/record type. This pattern should be familiar if you've already used Beam's query interface, where Beam uses the polymorphism of the datatype fields to store either literal data or expressions to compute said data. Here, it's used to store table mapping information.
The various database types that can be used for columns are defined in Database.Beam.Query.DataTypes
.
With this, we have enough to actually run this migration using the simple implementation Beam has defined in Database.Beam.Migrate.Simple
.
import Control.Monad.Fail ( MonadFail )
import Database.Beam.Migrate.Simple
import qualified Database.Beam.Postgres.Migrate as PG
-- Beam's simple runner doesn't run destructive migrations
-- by default, so we have to override that.
allowDestructive :: (Monad m, MonadFail m) => BringUpToDateHooks m
= defaultUpToDateHooks
allowDestructive = pure True }
{ runIrreversibleHook
migrateDB :: Connection
-> IO (Maybe (CheckedDatabaseSettings Postgres FlowerDB))
= runBeamPostgresDebug putStrLn conn $
migrateDB conn
bringUpToDateWithHooks
allowDestructive
PG.migrationBackend initialSetupStep
Once we have our working migration, we can also produce the DatabaseSettings
that we need to actually run any queries. CheckedDatabaseSettings
contains all the information we need, we just have to throw some of it away.
flowerDB :: DatabaseSettings Postgres FlowerDB
= unCheckDatabase $ evaluateDatabase initialSetupStep
flowerDB
exampleQuery :: Connection -> IO [FlowerT Identity]
= runBeamPostgres conn $
exampleQuery conn $
runSelectReturningList select (all_ (dbFlowers flowerDB))
If you've used Beam before, you might notice that we don't need to handwrite DatabaseSettings
to query the database, since we can just derive it from our migration. Which is... nice, except that we still have to define our custom datatypes before we can write our migrations.
We've got our initial database setup done, but the point of a migration framework is to manage changes over time. What kinds of things might change in our quaint little flower shop?
Perhaps the shop owner comes to us and says that they also want to display scientific names for their flowers alongside the common names. So we'll need an extra column in our database, and we'll want to rename our original name
column to avoid confusion.
The idea is simple enough, but how do we implement it in Beam?
Once you start actually changing the database schema, Beam's preferred way of representing those changes is as functions from one CheckedDatabaseSettings
to another. To see what I mean, we'll change the name of the original name
column first.
import Control.Arrow
changeCommonName :: CheckedDatabaseSettings Postgres FlowerDB
-> Migration Postgres (CheckedDatabaseSettings Postgres FlowerDB)
= do
changeCommonName settings <- alterTable (dbFlowers settings) $
withColumnChange -> do
\dbFlowers' <- renameColumnTo "common_name" (flowerName dbFlowers')
commonName pure $ dbFlowers' { flowerName = commonName }
pure $ settings { dbFlowers = withColumnChange }
changeCommonNameStep :: MigrationSteps Postgres
CheckedDatabaseSettings Postgres FlowerDB)
(CheckedDatabaseSettings Postgres FlowerDB)
(= migrationStep
changeCommonNameStep "change_common_name"
changeCommonName
fullMigration :: MigrationSteps Postgres
()CheckedDatabaseSettings Postgres FlowerDB)
(= initialSetupStep >>> changeCommonNameStep fullMigration
You can see that changeCommonName
is the actual code, which we lift into a MigrationStep
so that we can combine it with the setup step and run them. This is the basic pattern for each migration you'll write: create a function that takes in an existing schema and returns a Migration
, add the migration code in its body, and then lift and combine it.1
So far so good. Unfortunately, adding the scientific name column is a little harder. After all, our FlowerT
datatype doesn't have an attribute for it. We can't add it to our existing datatype, since that would mess up our existing migrations; they wouldn't have a database field to put into that attrtibute! We'll need to create an entirely new datatype, which also means creating a new database type to hold our new table type.
data FlowerT' f = Flower'
flowerID' :: Columnar f Int32
{ flowerCommonName' :: Columnar f Text
, flowerScientificName' :: Columnar f Text
, flowerPrice' :: Columnar f Scientific
,
}deriving (Generic, Beamable)
data FlowerDB' f = FlowerDB'
dbFlowers' :: f (TableEntity FlowerT')
{ dbOrders' :: f (TableEntity OrderT)
, dbLineItems' :: f (TableEntity LineItemT)
,
}deriving (Generic, Database be)
instance Table FlowerT' where
data PrimaryKey FlowerT' f = FlowerID' (Columnar f Int32)
deriving (Generic, Beamable)
= FlowerID' . flowerID' primaryKey
At this point you can probably see why Beam's migration tooling can get very annoying; if we want to actually use the migration incrementally, we'll have to keep creating new types, and keep updating which DB type is our most current one throughout our code. And we still haven't written our migration yet.
retable :: Table tbl
=> CheckedDatabaseEntity Postgres db (TableEntity tbl)
-> Migration Postgres (CheckedDatabaseEntity Postgres db' (TableEntity tbl))
= alterTable t pure
retable t
addScientificName :: CheckedDatabaseSettings Postgres FlowerDB
-> Migration Postgres (CheckedDatabaseSettings Postgres FlowerDB')
= do
addScientificName settings <- alterTable (dbFlowers settings) $
withColumnChange -> do
\dbFlowers' <- addColumn (field "scientific_name"
scientificName Just 20)) unique)
(nationalVarchar (pure $ Flower'
= flowerID dbFlowers'
{ flowerID' = flowerName dbFlowers'
, flowerCommonName' = scientificName
, flowerScientificName' = flowerPrice dbFlowers'
, flowerPrice'
}<- retable (dbOrders settings)
orders' <- retable (dbLineItems settings)
lineItems' pure $ FlowerDB'
= withColumnChange
{ dbFlowers' = orders'
, dbOrders' = lineItems'
, dbLineItems'
}
addScientificNameStep :: MigrationSteps Postgres
CheckedDatabaseSettings Postgres FlowerDB)
(CheckedDatabaseSettings Postgres FlowerDB')
(= migrationStep
addScientificNameStep "add_scientific_name"
addScientificName
fullMigration :: MigrationSteps Postgres
()CheckedDatabaseSettings Postgres FlowerDB')
(= initialSetupStep
fullMigration >>> changeCommonNameStep
>>> addScientificNameStep
We can't even reuse the tables from the original FlowerDB
directly; we have to write our own retable
function to get the type tetris to line up correctly. Frankly, once you have to, y'know, do actual migrations using Beam, it gets incredibly unwieldy.
Let's say there's one last change that we want to make. We want to reward loyal customers, so we want to give frequent customers a loyalty account, and add points for each order they make. For simplicity's sake, we'll just say there's a single point per order.
I'm going to skip the datatype updates necessary; they're pretty mechanical and not very interesting. We'll have two new datatypes, OrderT'
and LoyaltyAccountT
, and a new DB type, FlowerDB''
. (I never said Beam's migration framework was good.) The only thing of note is that the loyalty ID needs to be of type PrimaryKey LoyaltyAccount (Nullable f)
, since not every order made might have an associated loyalty account; after all, not everyone will sign up before buying something.
addLoyaltyAccounts :: CheckedDatabaseSettings Postgres FlowerDB'
-> Migration Postgres (CheckedDatabaseSettings Postgres FlowerDB'')
= do
addLoyaltyAccounts settings <- alterTable (dbOrders' settings) $
withLoyaltyID -> do
\dbOrders' <- addColumn (field "loyalty_id" (maybeType int))
loyaltyID pure $ Order'
= orderID dbOrders'
{ orderID' = LoyaltyAccountID loyaltyID
, orderLoyaltyID' = orderTime dbOrders'
, orderTime'
}<- createTable "loyalty_accounts" $
loyaltyAccounts LoyaltyAccount
= field "id"
{ loyaltyAccountID
int notNull unique= field "customer_name"
, loyaltyAccountCustomerName Just 30)) notNull
(nationalVarchar (
}<- retable (dbFlowers' settings)
flowers'' <- retable (dbLineItems' settings)
lineItems'' pure $ FlowerDB''
= flowers''
{ dbFlowers'' = withLoyaltyID
, dbOrders'' = lineItems''
, dbLineItems'' = loyaltyAccounts
, dbLoyaltyAccounts''
}
addLoyaltyAccountsStep :: MigrationSteps Postgres
CheckedDatabaseSettings Postgres FlowerDB')
(CheckedDatabaseSettings Postgres FlowerDB'')
(= migrationStep
addLoyaltyAccountsStep "add_loyalty_accounts"
addLoyaltyAccounts
fullMigration :: MigrationSteps Postgres
()CheckedDatabaseSettings Postgres FlowerDB'')
(= initialSetupStep
fullMigration >>> changeCommonNameStep
>>> addScientificNameStep
>>> addLoyaltyAccountsStep
And with that, we have a few examples of doing the most common schema updates. But phew, these migrations are rather a mouthful for how little we're actually doing in them.
If you're currently evaluating different options for managing your database schema, hopefully this post has helped you form your own opinion of Beam. My own personal opinion is that Beam's current migration tooling isn't very good, and I'll probably either stick to doing migrations manually or look into some other tools.
The full migration code in this post is available here, for reference.
Found this useful? Got a comment to make? Talk to me!
Before you close that tab...
Want to write practical, production-ready Haskell? Tired of broken libraries, barebones documentation, and endless type-theory papers only a postdoc could understand? I want to help. Subscribe below and you'll get useful techniques for writing real, useful programs straight in your inbox.
Absolutely no spam, ever. I respect your email privacy. Unsubscribe anytime.
Footnotes
↥1 One footgun to be aware of is that Beam expects the columns and tables created in migrations to be used in a linear way; that is, once you rename a column or drop a table, you shouldn’t use the old value anymore. Since Haskell doesn’t have linear types yet, doing this resource management is on you. For instance, we created the commonName
column by renaming the old column, passing in flowerName dbFlowers'
. But what if we accidentally reused the old column instead of returning the updated table with commonName
inside?
...
-> do
\dbFlowers' <- renameColumnTo "common_name" (flowerName dbFlowers')
commonName pure $ dbFlowers' { flowerName = flowerName dbFlowers' }
...
This would still typecheck, but would produce a migration that wouldn’t do anything.