Things software engineers trip up on when learning Haskell

April 12, 2020

You've worked in the software industry for a while, or you're otherwise coming from experience developing in another language. Most likely you've worked in an imperative language, and now you want to find out what all the fuss about functional programming is.

Let's say you asked me whether I thought that learning Haskell was difficult, coming from this background. My answer would be, "probably not," as long as you approach things calmly, accept that many of your skills and concepts won't transfer over, and are prepared to start from a blank slate.

Still, even though the intrinsic difficulty of learning the language is probably exaggerated, there are definitely impedance mismatches and areas where you'll likely find preconceptions getting in your way. So, here are some things that you might find surprising, unintuitive, or otherwise different from what you're used to:

  • No, Haskell is not "just another programming language."

  • No, you don't need to learn category theory or other mathematics to use Haskell.

  • Haskell is fast, but getting C-level performance is not trivial. You will often still need to sacrifice readability to shave off cycles. Additionally, profiling and testing are mandatory if you care about performance. See this comment for a more in-depth discussion of this point.

    (Thanks for pointing this out, Chris Smith!)

  • Haskell is not a magic bullet. Using the language will not magically make all your technical problems go away.

  • Haskell will not magically make your code bug-free. It makes it easier to achieve that, but not trivial.

  • Haskell will not magically make your code parallel. You will still need to design/modify your code for it. It makes parallelism easier, but not trivial.

  • Yes, Haskell still requires runtime checks. You need fewer of them, and the compiler can even help you not forget to do them.

  • Yes, you still need to test your code.

  • It is still possible to write bad code in Haskell. Thankfully, the language makes it easier to prevent this from happening.

  • Yes, Haskell/functional programming do actually have mutation and state. It's just not used as often. In Haskell, you need to be working in IO (or something similar) to make use of it, since mutation is a side effect.

  • Yes, Haskell/functional programming do actually have side effects like I/O. The language just forces you to cleanly separate them from your pure code, the way you would normally anyways.

  • Yes, you're going to encounter lots of weird operators like <>, >>=, <$>, <*>, and >=>. Yes, you're going to have to get used to them. I promise they're there for a reason; they're not just there to confuse you.

  • No, you technically don't need to write type signatures for your functions. Yes, you should still write type signatures for your functions. If the compiler knows what you were trying to write, it can give you better type errors. (Thanks for pointing this out, Greg Travis!)

  • True and False are uppercase. Pythonistas rejoice.

  • "Not-equals" is '/=', not '!='.

  • Negating a boolean is 'not' instead of '!'.

  • Always wrap negative numbers in parentheses: (-1) instead of -1, (-x) instead of -x, and so on. Otherwise you'll likely get a compile error, because in an expression like f -1 the minus sign is parsed as binary subtraction (f - 1), not as unary negation.

  • No, unlike a lot of other languages, Haskell does not automatically convert numeric types. If you do a lot of work with numerical data, get used to converting. See Stephen Diehl's What I Wish I Knew When Learning Haskell for a guide on numeric conversions.

  • Statements don't exist, only expressions.

  • Lazy evaluation is about composability and refactoring, not performance. Occasionally getting better asymptotics is more a happy coincidence than something to rely on.

  • You're going to, in one instance or another, run into a brick wall of type errors. You will gnash your teeth, tear your hair out, curse the soul of every conniving son of a coyote who ever decided to write a compiler. This is normal. I know that this is going to be frustrating, but at the very least, remember that the compiler is being pedantic about your types for a reason. Restrictiveness is helpful, because we want to make bugs impossible.

  • Monads are not some mystical, otherworldly concept. Using them is actually pretty mundane. If you approach them as if you need to have your mind blown, you're going to miss the point.

  • You will probably run into a situation where you open up a library, know that there's a specific function that must exist and you know you need, but can't find it within the library. Typically this is because instead of defining concrete, named functions to do certain things, Haskell makes code more reusable by relying on offloading core functionality to typeclasses. But this also has the effect of making that functionality harder to find for newcomers. An example is parser combinator libraries; you'll look at the documentation and not find a function to glue two parsers 'X' and 'Y' together to parse 'either X or Y'. Where is that function? Well, it's the (<|>) function, because parsers implement the Alternative typeclass.

    Some typical typeclasses that you need to keep an eye out for if it seems like there's some functionality "missing": Monad, Applicative, Monoid, Semigroup, Alternative, MonadPlus, Foldable, Traversable, Random.

    (Thanks for pointing this out, Daniel Brice!)

  • You will probably not understand how to design large programs in Haskell for a while.

  • No, you cannot just read a book and expect to know Haskell. Get out there and write real code.

  • "Safe" does not mean the same thing that it means in other programming contexts. When Haskellers talk about some code being "safe," it's not just about avoiding undefined behavior. Haskell is "safe" in that sense as well, but in this context "safe" typically means some code throws no exceptions and handles all possible edge cases. In short, that it can't crash.

  • You cannot "unwrap" an 'IO a' to get an 'a'. If you start working in IO, you have to keep working in IO.

  • Yes, the string situation is a bit of a mess. Bottom line: don't use the String type unless you have to, and default to strict Text instead. Use ByteString when your data really is an arbitrary sequence of bytes.

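  • (x -> y) -> z and x -> (y -> z) are very different types. The first takes a function as its argument; the second takes an x and returns a function, which is exactly what x -> y -> z means.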

  • Unlike in other languages, what seem like extremely basic things often don't have a consensus on what best practice is, and you'll probably find that weird. Error and exception handling, for instance, or accessing a database. Don't worry too much about picking the "best" option; any choice will get you 90% of the way there, and all the discussion about pros and cons is quibbling about the remaining 10%. Pick whichever seems to have the best documentation and can solve your problem, and worry about optimizing your choice of solution later.

  • Error handling is not hard, but will take more up-front time to understand, because Haskell has so many different ways to do it.

  • You will find logging difficult the first time you do it. This is normal, but it may seem weird to you coming from another language, where adding logging is as easy as just shoving a logging statement wherever you need it.

  • On the other hand, it's actually quite easy to insert printf-style debugging statements if you can't figure out what's going wrong with some code. Check out Debug.Trace. It won't suffice for things like logging to an external service, but it's good for short debugging sessions.

  • Accessing a database can be... complicated, mainly due to the proliferation of different viable libraries. Save yourself some headaches and just use postgresql-simple¹ until you find your patience abrading against its limitations. Once you do, I've written a comparison of Haskell DB libraries to help you choose.

  • Unfortunately, there isn't a default/standard way of storing test data files e.g. if you need example images to run your neural network on.

  • Write your code permissively at first, then restrict it, not the other way around. For instance, there's nothing wrong with having every function in IO, passing around strings and raw JSON values, etc. when prototyping. It's generally easier to then lock things down and remove IO from functions that don't need it, make the types more precise, rather than to try to design things perfectly from the start, realise that you're missing some crucial functionality, and have to redesign everything.

  • Similarly, after learning about designing types for your domain to make bugs impossible, you will be tempted to lock down your types as much as possible. You don't need to lock down your types as much as possible.

  • Also, because of this, you don't need dynamic types to deal with weird JSON. Even if an API can return data in different formats, you can easily deal with that at runtime in a strongly typed language. Just pass around raw JSON values instead of trying to define a new type.

  • Each time you start a new project, it's likely that you'll have very long compile times. Thankfully this is less of a problem after you compile once, but it's still a problem.

  • GHCi has a working debugger, but you may find it easier to just give up on finding something equivalent to Eclipse or similar. Debug things either by calling them from the interpreter, through property-based tests, or by using Debug.Trace. Since your core functions are pure, their return values are what matters anyways. (You are making your core logic pure, right?)

  • Unfortunately, the tooling situation in general is a little suspect. If you're comfortable working from the command line, then the situation is pretty decent, with ghcid for automatic recompilation, hlint and stylish-haskell/brittany for linting and code formatting, and so on. But editor support is more anemic if you're not using Emacs. You may have to do without things like autocompletion, error highlighting, and automatic refactorings.

  • Deploying binaries/compiled code is not fundamentally different from deploying Python/Ruby/JS/PHP/etc. It's just bundling files together and copying them over to your server.

  • You will probably not understand how Stack and Cabal work for a while, or how they differ from each other. If you can get them to build your code, great! Don't worry about how they work too much. If you can't, ask someone to help you out, then move on with your life for now.

  • Compiler messages are going to make no sense to you for a while. Often the message will say something totally unrelated to what the actual problem is. You can at least count on them to point to the right place in your code, so look closely at that line, even if the message itself seems like gibberish.

  • Documentation is bad. There's really no getting around this. You will also need to be comfortable reading a more terse, academic style of documentation.

  • No, there is no syntax for optional or named parameters. All function arguments are required and positional. While not nearly as syntactically convenient, you can use data structures like records or Maybe for this instead.

  • No, language extensions are not a misfeature. Use them.
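
To make a few of the smaller syntax and type points from the list concrete (/=, not, parenthesized negatives, explicit numeric conversion, and the two function-type shapes), here is a minimal sketch. Every function name in it is made up for illustration, and it uses nothing beyond the Prelude.

```haskell
-- "Not-equals" is (/=), and boolean negation is the function `not`.
isDifferent :: Int -> Int -> Bool
isDifferent a b = a /= b

isSame :: Int -> Int -> Bool
isSame a b = not (isDifferent a b)

-- Without the parentheses, `abs -5` would parse as `abs - 5`
-- (subtracting 5 from the function `abs`) and be rejected by the compiler.
absOfNegativeFive :: Int
absOfNegativeFive = abs (-5)

-- No implicit numeric conversions: `total / count` would not compile here,
-- because (/) is not defined for Int. `fromIntegral` converts explicitly.
average :: Int -> Int -> Double
average total count = fromIntegral total / fromIntegral count

-- Converting back has to be spelled out too, and you pick the rounding function.
roundedAverage :: Int -> Int -> Int
roundedAverage total count = round (average total count)

-- x -> (y -> z): takes an Int and returns a function.
-- This is the same type as Int -> Int -> Int.
add :: Int -> (Int -> Int)
add x y = x + y

-- (x -> y) -> z: takes a function as its argument.
applyToTen :: (Int -> Int) -> Int
applyToTen f = f 10
```

Loading this into GHCi and evaluating something like applyToTen (add 5) is a quick way to convince yourself that the two arrow shapes really are different things.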

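The point about "missing" functions living in typeclasses is easier to see with a small example. This is only a sketch: it uses Maybe from base as a stand-in for a parser type, since Maybe also implements Alternative, and the settings tables and function names are hypothetical.

```haskell
import Control.Applicative ((<|>))

-- Hypothetical settings tables, used only for illustration.
userSettings :: [(String, String)]
userSettings = [("theme", "dark")]

defaultSettings :: [(String, String)]
defaultSettings = [("theme", "light"), ("font", "monospace")]

-- (<|>) comes from the Alternative typeclass. For Maybe it means
-- "use the left result if there is one, otherwise the right one".
lookupSetting :: String -> Maybe String
lookupSetting key = lookup key userSettings <|> lookup key defaultSettings

-- (<>) comes from Semigroup; for Strings (lists) it is concatenation.
greeting :: String
greeting = "Hello, " <> "world"

-- (<$>) is fmap written as an operator: apply a function inside the Maybe.
settingLength :: String -> Maybe Int
settingLength key = length <$> lookupSetting key
```

There is no dedicated "lookup with fallback" function anywhere in this code; the "either this or that" behavior comes entirely from the Alternative instance, which is the same reason a parser library may not document a named "either X or Y" combinator.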

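Finally, a rough sketch of two of the points about effects: mutation exists but lives in IO, and Debug.Trace gives you printf-style debugging inside pure code. The function names are invented for this example; newIORef, modifyIORef', readIORef, and trace all come from base.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import Debug.Trace (trace)

-- Mutation is available, but only inside IO: this counts the elements of a
-- list by bumping a mutable counter once per element.
countWithRef :: [a] -> IO Int
countWithRef xs = do
  counter <- newIORef (0 :: Int)
  mapM_ (\_ -> modifyIORef' counter (+ 1)) xs
  readIORef counter

-- A pure function with a temporary debugging message. `trace` writes its
-- message to stderr when the result is evaluated, then returns its second
-- argument unchanged, so the type of `factorial` stays pure.
factorial :: Integer -> Integer
factorial n
  | n <= 1 = 1
  | otherwise = trace ("factorial " <> show n) (n * factorial (n - 1))
```

Note that countWithRef returns an IO Int, not an Int: a caller has to bind the result with <- inside its own IO code, which is the "if you start working in IO, you keep working in IO" rule in practice.
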
Whew, that's a lot of stuff. It's a lot of negative stuff. It's enough negative stuff that you might reasonably wonder if it's even worth putting up with Haskell's bullshit.

Well, in my (extremely biased) opinion, the answer is yes. Do I still get confused popping open a library and finding zero useful documentation? Do I still run up against walls of impenetrable type errors and have no idea what to do at first? Do I miss conveniences from other languages, like anonymous records in Ruby/JS/etc., or being able to test module-internal functions in Rust? Absolutely, but in return I get the most rock-solid language I've ever used.² I don't know a single other language in which I would be comfortable writing something, leaving it alone for 6 months or a year, then coming back and immediately making changes without fear of breaking something. I don't know any other language where I can have zero knowledge of a codebase, make a change deep in the stack in 10 minutes, and if the compiler fact-checks me, be 99% confident that it works correctly. I don't know any other language where a library can be functionally 'done.'

This list is necessarily incomplete, as even though there are certain things that come up again and again as roadblocks for industry developers learning Haskell, I can't predict everything you might have trouble with.

If you have experience learning Haskell, and think I missed something that would fit, or otherwise if you found this post helpful or have questions, talk to me!

Footnotes

1. Or sqlite-simple, mysql-simple, etc.

2. To actually write production programs. You could make a case for Idris or Agda being more reliable, but as of now I’m not writing web servers in those.