Abstracting out common columns in Opaleye

December 28, 2019

As a follow-up to my last post on which type-safe DB library to use, let’s go over something that might have caught your attention about Opaleye. Unlike a lot of the other library choices available, Opaleye makes it possible to abstract out common columns, cutting down on having to redefine table mapping code. How?

Let’s say we have the following datatypes, set up the way Opaleye likes them:

as well as the following Opaleye table mappings:

This works. It’s a little redundant though. If you’re like me, having to rewrite the exact same code for the created_at and updated_at columns hurts your heart. Even the id column could potentially be factored out. Could we do better?


Fortunately, Opaleye already gives you enough functionality to do so, as it bases its table definitions around product-profunctors. I can’t claim to fully understand the abstraction myself, but the gist of it in this case is that it allows you to take smaller components, like individual column mappings or groups of columns, and lay them out “side-by-side” to create a combined table mapping. For instance, we can create the mappings for just our timestamps like so:

After removing these fields from our datatypes, we can then redefine our tables like so:

We’ve removed a source of repetiveness and potential errors. Success!

Notice that the types of our tables have changed. Now when we write queries or do inserts, we’ll get back a tuple with our datatype inside. There’s not really a way to avoid this, since it’s not like this is Ruby or JavaScript; we can’t dynamically add new fields to our types after they’ve been defined. Your queries will have to be adjusted accordingly.


We’re not done abstracting this out yet. There are two issues that we can see from this definition:

  • Having to match on a tuple every time we query is sort of annoying. What if we end up with lots of common columns that we want to abstract? Then every query will start becoming a mess of underscores when we don’t need everything:

  • Truthfully, we don’t want developers to need to provide values for these timestamps every time they run an insert or update. In fact, for the most part we don’t want them to be able to manually change them at all. We’d rather have Opaleye just pass DEFAULT as the value for updated_at on every update, for instance.

Avoiding tuple blindness

Using a tuple for this isn’t just annoying, it’s also somewhat dangerous, since there’s no information about which part of the tuple signifies what information. What if you mixed up the order of updated_at and created_at in your pattern match?

Thankfully, Opaleye’s table definitions are flexible. One thing you could do is to create a single entity type that “wraps” common fields. Then you could create type aliases for this for your different types.

Once we define a table mapping for this, we now have the best of both worlds: our common fields are factored out, and we no longer have to pattern match on fields we don’t want to use. We also have meaningful accessors that tell us what each piece of information signifies.

Note the use of unit to stub out the owner field on orders. Maybe lots of the entities in our system have an internal admin who is responsible for triaging bugs or tickets arising from said entity, but orders don’t. This way we can accomodate even columns that are only “mostly common,” but which also don’t seem like they belong with the entity itself.

Once we have all this, we just need to create a way to “lift” column mappings for our “base” entity types into our Entity wrapper with everything stubbed out:

You could go even further and write lenses for Entity to make it even more convenient to use.

Restricting updates/inserts

You might have noticed that the type signatures of our tables duplicate their parameters. You might reasonably wonder what this duplication is for.

As it just so happens, the fact that the Table type has two parameters instead of just one is the key to resolving our update situation.

The first parameter is what data you need to provide to insert into that table. The second parameter is what data you get back when querying.1 I’ll refer to these as the write type and read type.

Up until now, we’ve been using required to define the column names. As you might expect, this signifies to Opaleye that we have to provide a value for that column on writes. Opaleye provides a similar function called optional that leaves the read type of said column the same as required, but allows the write type to be Maybe. Updating our timestamps function gives us:

Passing a Nothing on writes will result in Opaleye passing DEFAULT to the database, while still allowing you to pass through the created_at unchanged on an update.

We can go one step further and prevent a value from being passed for updated_at at all, by taking advantage of the fact that Table and TableFields are both instances of Profunctor.

Now that the write type of the updated_at column has been stubbed out to unit, we don’t have to worry about it at all on writes; assuming a DEFAULT value has been specified in our database, it’ll automatically update for us.

Wrapping up

After working through all these problems and abstracting them out, reflecting on how we did it reveals that we were just constructing normal functions and values; outside of using optional, we didn’t have to lean on any Opaleye-specific functionality at all. All we did was build up larger table mappings from smaller table mappings.

This, I think, is part of the beauty of Haskell; everything is built from the same basic components, and if you can grasp a small amount of extra concepts (in this case, how product-profunctors fit together) you already have everything you need to contort libraries in ways their authors didn’t expect.

Found this useful? Still have questions? Talk to me!


You might also like


Footnotes

↥1 This also makes it possible to make a table read-only on the Haskell side. Use the Profunctor instance on Table to make the write type Void.


Before you close that tab...