Of Ideas and Men
I
You know when you're writing imperative code, and you start swearing because you just realized you need a variable that you haven't had in scope for like ten frames on the call-stack? We've all been there, and we've all come up with some expressions worthy of a sailor.
Let's say maybe you're writing the drawing code for a game, and have lots of methods like this:
void Scene::draw(RenderTarget& target) const {
    // draw() is const, so the children are iterated by const reference
    for (const Drawable &child : children_) {
        child.draw(target);
    }
}

void Character::draw(RenderTarget& target) const {
    head_.draw(target);
    torso_.draw(target);
    feet_.draw(target);
    fancyParticleSystem_.draw(target);
}
And everything is going fine and dandy, until you realize that your fancy particle system needs the time elapsed since the last frame in order to do fancy particle things.
I'm speaking from experience, here. This exact situation has happened to me, at least twice; you'd think I'd learn.
You stop for a second to think about how you're going to fix this. Continuing to curse isn't going to help anything, so otherwise you have two options going forwards: you can either change every call-site of draw to ensure that you can get the elapsed time to the particle system, or you can phone it in and make the damn thing global.
You feel drawn towards the making-it-global solution, but you'd feel bad about it the next day, and besides, we both know it wouldn't pass code review. With a sigh, you resign yourself to updating the method signature:
void Drawable::draw(RenderTarget& target, float delta) const;
In our small example here, there are seven lines you need to change in order to make it happen. Just imagine how much refactoring would be necessary in a real codebase!
Two hours later, you realize you need another parameter. You go through everything again, plumbing it through. It's tedious, and you can't help but feel like maybe there's a better way of doing this.
Or perhaps you're writing some JavaScript, and the seemingly innocuous line
var cash = bank.currentUser.account.getValue();
crashes in a fiery wreck with the demonstrably-evil error undefined is not a function. Apparently one of those properties doesn't exist, but you're not sure which. Because you are a seasoned veteran, you know what the fix is; you need to make sure that each of those properties exists before indexing into it. With celerity (and some indignation that the API guys don't seem to be able to do their job properly), you hammer out the solution:
var cash;
if (bank && bank.currentUser && bank.currentUser.account) {
    cash = bank.currentUser.account.getValue();
} else {
    return null;
}
Great. You've gone from one line to six, but hey, at least it will work. So you hit F5 in your browser, and your heart sinks. Now the call-site is missing a null check, so you go to fix that too, this time carefully crawling upwards making sure nothing else will explode.
You find yourself writing dozens of snippets along the lines of "check if it's null, if it is, return null, otherwise do what you were doing anyway". It's not a fun experience, and somewhere deep down you feel like this should be the kind of problem the runtime should be able to solve for you; after all, it's a purely mechanical change requiring no thought whatsoever.
Isn't that the kind of thing computers are really good at?
Maybe instead of being a hip JavaScript developer, you're a crusty, suit-wearing and fun-hating programmer who works for a bank. The Man has gotten on your case because sometimes the code that is supposed to transfer money from one bank account to another crashes halfway through, and one account ends up not being credited. The bank is losing money, and it's your ass on the line.
After some inspired grepping, you track down the offending function:
void transfer(int amount, int src, int dst) {
    getAccount(dst).addFunds(amount);
    getAccount(src).addFunds(-amount);
}
The code itself was written years ago, and nobody has really looked at it since. Since you're on a deadline (and a maintenance programmer at a bank), you don't have the time or patience to familiarize yourself with every last detail. You decide to just throw everything into a transaction and be done with it.
Unfortunately, your code-base is in C++[1], which means you don't have any transaction primitives. You slap together some RAII voodoo that will reset a variable back to its old value unless the commit method is called on it before the end of scope. It's a good enough solution, and you're happy with it.
And then you go to actually use your new voodoo scope guard, and realize that it's going to take a lot of refactoring to get it to do what you want. You need to add two lines for every variable that could change. It's going to be a maintenance nightmare, but hey, that's what the other guys are for.
Your code ends up looking like this:
void transfer(int amount, int src, int dst) {
    Account &debit = getAccount(src);
    Account &credit = getAccount(dst);

    {
        Transaction _debit(debit, &debit);
        Transaction _credit(credit, &credit);

        credit.addFunds(amount);
        debit.addFunds(-amount);

        _debit.commit();
        _credit.commit();
    }
}
The two lines of actual code have ballooned into eight, and six of them do nothing but book-keeping for the logic you actually want. That's a lot of boilerplate that had to be written, and it's going to be hard to maintain if anyone ever wants to expand the function.
Despite being a dull corporate drone, you can't help but wonder, since the compiler is keeping track of which variables are changing anyway, why it can't also automatically generate these scope guards for you?
Are you noticing a trend yet? There's a common theme to these examples: you find yourself writing lots of boilerplate that is mindless and seems like the compiler should be able to do for you. In every case it's a solved problem: there's some particular functionality you're trying to plug in that doesn't really depend on the existing code (accessing the environment, doing error handling, performing transactions).
Like, in principle you could write code that would make these changes for you, but where would you put it? The language doesn't support it, and everyone is going to hate you if you go mucking with the build system trying to inject it yourself. For all intents and purposes, you seem to be SOL.
And so you find yourself in a pickle. Here you are, a programmer (someone who gets paid to automate tedious things), and you can't figure out a means of automating the most tedious part of your day-to-day programming. There is something very wrong here.
II
It's probably not going to surprise you when I say that there is an abstraction that fixes all of these issues for us. Sadly, it's widely misunderstood and mostly feared by any who are (un)lucky enough to hear its name spoken aloud.
If you've been paying attention, you might have noticed I've been talking about Haskell lately, and you've probably guessed what abstraction I'm alluding to.
That's right.
Monads.
No! Stop! Do not close the tab. You know the immense frustration that you have with non-technical people when they see a math problem and give up immediately without thinking about it for even two seconds? Those are the same people who close browser tabs when they see the word "monad". Don't be one of those people. You're better than that.
Monads are poorly understood somewhat because they're weird and abstract, but mostly because everyone who knows about this stuff is notoriously bad at pedagogy. They say things like "a monad is just a monoid in the category of endofunctors". Thanks for that, Captain Helpful!
Monads have been described as many things, notably burritos, boxes that you can put something into, and descriptions of computation. But that's all bullshit. What a monad actually is is something we'll get to in a little bit, but here's what a monad does for you, insofar as you care right now:
Monads are reusable, modular pieces which provide invisible plumbing for you. After a monad has been written once, it can be used as library-code to provide drop-in support for idioms.
Need error handling? There's a monad for that. Non-determinism? There's a monad for that too! Don't want to write your own undo/redo system? You guessed it: there's a monad for that.
I could keep going for a while, but I'm sure you get the picture.
III
Let's go through an example, shall we? The Maybe monad has traditionally been the "hello world" of monads, so we'll start there, because who am I to break tradition? The Maybe monad provides drop-in support for null checks and propagation (i.e. error handling). Consider the following Haskell function, which performs division:
divide :: Int -> Int -> Int
divide x y = x `div` y
In the unfortunate case of $y = 0$, this seemingly innocuous function will crash your Haskell program with an Exception: divide by zero. That's to be expected, but it would be nice if the whole thing wouldn't explode. Let's change our function to be a little safer:
safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide x y = Just (x `div` y)
Aside: You'll notice we have two definitions for safeDivide here. The first one is specialized (think template specialization in C++) for the case of $y = 0$, and the second is the general case. In Haskell, this is known as pattern matching.
Anyway, the most salient change here is the new return type: Maybe Int. Though it's not technically true, we can think of this type as meaning "we have a computation of type Int that we want to use in the Maybe monad". In general, we will use (Monad m) => m a to describe a computation of an a with associated monadic functionality m.
And so if the return type of safeDivide is Maybe Int, it logically follows that Maybe Int must be a type, which implies it has values. But what are these values? They can't be integers (1, 5, -64), since those are of type Int, and remember, Haskell is very picky about its types.
Instead, values of Maybe Int are either something (Just 1, Just 5, Just (-64)), or Nothing. You can think of Nothing as being essentially a null, except that it is not a bottom value (it only exists in the context of Maybe a; it can't be assigned to arbitrary reference-types).
Our safeDivide function can now be understood thusly: if $y = 0$, the result of our division is Nothing (a failed computation), otherwise the result is Just z, where $z = x \div y$. Remember, the Just bit is there to specify we have a value of type Maybe Int (which can be Nothing), and not an Int (which can't).
Another way to think of this is that we've transformed divide, which isn't a total function (it's undefined for $y = 0$), into a total function (defined everywhere).
At first it doesn't seem like we've really gained much, besides annotating that our safeDivide function can fail. But let's see what happens when we use our new function:
divPlusFive :: Int -> Int -> Maybe Int
divPlusFive x y = do divided <- safeDivide x y
                     return (divided + 5)
If you squint and look at <- as an =, this looks a lot like imperative code. We first compute the result of safeDivide x y, and then "return"[2] that plus five. There is seemingly nothing of interest here, until you look at the type of divPlusFive: it also returns a Maybe Int, but we didn't have to write any code to explicitly make that happen. It looks like we're just dealing with Ints, since we can explicitly add an Int to it! But indeed, the result of divPlusFive 5 0 is actually Nothing. What gives?
Here's the idea: when you're working inside a (Monad m) => m a, the code you write deals only with the a bit, and Haskell silently transforms it into a monadic context (read: provides plumbing) for you. In the case of Maybe a, you can picture a dependency graph over the expressions you write: if the result of any subexpression is Nothing, that will propagate downstream throughout the graph (and can explicitly be handled, if you want to provide sensible defaults, or something).
divPlusFive 5 0 first computes safeDivide 5 0, which is Nothing, and since divided + 5 depends on this, it too becomes Nothing. What we've done here is captured the semantics of
var divided = safeDivide(x, y);
if (divided == null) return null;
return divided + 5;
except that we don't need to explicitly write that null check anywhere. The Maybe monad handles all of that plumbing for us (including upwards in the call-stack, if that too is in the Maybe monad)!
And just like that, we never need to write another null check ever again.
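To tie this back to the JavaScript example from earlier, here is a minimal sketch of that same property chain written in the Maybe monad. The Bank, User and Account records (and their field names) are hypothetical stand-ins invented for illustration; only Maybe, do-notation and fromMaybe come from Haskell itself.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical records mirroring bank.currentUser.account; any link may be missing.
newtype Account = Account { accountValue :: Int }
newtype User    = User    { userAccount :: Maybe Account }
newtype Bank    = Bank    { currentUser :: Maybe User }

-- Each <- yields Nothing for the whole block if its right-hand side is Nothing.
cash :: Maybe Bank -> Maybe Int
cash maybeBank = do
  bank    <- maybeBank
  user    <- currentUser bank
  account <- userAccount user
  return (accountValue account)

main :: IO ()
main = do
  let fullBank  = Just (Bank (Just (User (Just (Account 100)))))
      noAccount = Just (Bank (Just (User Nothing)))
  print (cash fullBank)                 -- Just 100
  print (cash noAccount)                -- Nothing
  print (fromMaybe 0 (cash noAccount))  -- 0, an explicit "sensible default"
```

None of the three lookups mentions the failure case; the only place that has to care is the caller that finally picks a default with fromMaybe.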
IV
The obvious question to be asked here is "how the hell did it do that?", and that's a fantastic question. I'm glad you asked. The secret is in that little do keyword, which looks like it drops you into imperative mode.
But it doesn't.
The do block in Haskell is actually just syntactic sugar for an environment which transforms your implicit semicolons into a user-defined operator called >>= (pronounced "bind"). This transformation turns
divPlusFive x y = do divided <- safeDivide x y
                     return (divided + 5)
into
divPlusFive x y = (safeDivide x y) >>= (\divided -> return (divided + 5))
where \x -> y is an anonymous function taking x and returning y. The magic, it would seem, is all in this >>= operator, but what is it? As usual, we will begin with looking at its type (specialized on the Maybe monad), and seeing what we can deduce from it.
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
So, >>= takes two parameters, a Maybe a (type a wrapped in the Maybe monad), and a function a -> Maybe b. From last time, we know that since there are no constraints on a, we can't construct one out of thin air, so >>= must be somehow feeding the a from Maybe a into this function.
In effect, this is why we were able to pretend we were computing an Int in our function that was supposed to return a Maybe Int. Here, >>= has been silently composing together our functions which might individually fail (safeDivide), into a single function (divPlusFive) that might fail.
Perhaps you are beginning to see why monads are an abstraction capable of solving all of our original problems: they are silently adding the notion of a particular context to computations that didn't originally care.
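As a hedged illustration of "adding a context", here is a sketch of the frame-delta problem from section I. It leans on the fact that plain functions from a fixed argument, (->) r, already have a Monad instance in Haskell's base library; the drawing functions are hypothetical stand-ins that merely describe what they would draw, not anything from a real engine.

```haskell
-- Assumption: the elapsed frame time is the context we want threaded around.
type Delta = Float

drawHead :: Delta -> String
drawHead _ = "head"

drawParticles :: Delta -> String
drawParticles delta = "particles advanced by " ++ show delta

-- Inside the function monad ((->) Delta), each <- feeds the same delta
-- to its right-hand side; drawCharacter itself never names it.
drawCharacter :: Delta -> [String]
drawCharacter = do
  h <- drawHead
  p <- drawParticles
  return [h, p]

main :: IO ()
main = mapM_ putStrLn (drawCharacter 0.5)
```

The point of the sketch is only that drawCharacter passes the delta along without mentioning it, which is the plumbing we were writing by hand in the C++ version.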
Let's look at the implementation for >>=, which turns out to be surprisingly simple for Maybe:
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
(>>=) Nothing  _ = Nothing
(>>=) (Just x) f = f x
Again, this function is pattern-matched, first for the case of trying to bind Nothing with any function (it doesn't matter what the function is; our computation has already failed). The other option for a Maybe a is to be Just x, the result of which is applying the function f to x.
Because the Nothing case never even looks at its function argument, this operator definition behaves much like the short-circuiting && operator in C++: as soon as you get a Nothing, it will stop processing and give you back a Nothing. Cool!
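If you want to poke at that short-circuiting yourself, here is a small sketch that re-implements bind on a look-alike type. Option, None, Some, bindOption and safeDivideO are made-up names, chosen only so the sketch does not clash with the real Maybe in the Prelude.

```haskell
-- A stand-in for Maybe, renamed to avoid clashing with the Prelude.
data Option a = None | Some a
  deriving (Show, Eq)

-- Same shape as (>>=) specialized to Maybe.
bindOption :: Option a -> (a -> Option b) -> Option b
bindOption None     _ = None
bindOption (Some x) f = f x

safeDivideO :: Int -> Int -> Option Int
safeDivideO _ 0 = None
safeDivideO x y = Some (x `div` y)

main :: IO ()
main = do
  -- The continuation is never called once a None shows up,
  -- so even one that would crash is harmless here.
  print (safeDivideO 5 0 `bindOption` \_ -> (error "never called" :: Option Int))
  print (safeDivideO 10 2 `bindOption` \d -> Some (d + 5))
```

The first line prints None without ever touching the error, and the second prints Some 10.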
To reiterate: after a monad has been written once (essentially, given a suitable implementation of >>=), it becomes library-code. If you annotate your return type as (Monad m) => m a, anything inside of a do block will be transformed to use the monad you asked for. Haskell essentially begins injecting user-defined code for you after each semicolon.
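To make the "code injected after each semicolon" idea concrete, here is a sketch with two steps instead of one, written once with do and once with the binds spelled out by hand. divTwice and divTwice' are hypothetical helpers, not functions from earlier in the post.

```haskell
safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide x y = Just (x `div` y)

-- Two fallible steps in do-notation.
divTwice :: Int -> Int -> Int -> Maybe Int
divTwice x y z = do
  first  <- safeDivide x y
  second <- safeDivide first z
  return (second + 5)

-- The same function with each line break replaced by an explicit >>=.
divTwice' :: Int -> Int -> Int -> Maybe Int
divTwice' x y z =
  safeDivide x y >>= \first ->
    safeDivide first z >>= \second ->
      return (second + 5)

main :: IO ()
main = do
  print (divTwice 100 5 2, divTwice' 100 5 2)  -- (Just 15,Just 15)
  print (divTwice 100 0 2, divTwice' 100 0 2)  -- (Nothing,Nothing)
  print (divTwice 100 5 0, divTwice' 100 5 0)  -- (Nothing,Nothing)
```

Both versions agree on every input, because the first is just a nicer spelling of the second.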
V
OK, this is all really cool stuff. But what is a monad, actually?
Well, just like a right-angled triangle is any object that satisfies the constraint $a^2 + b^2 = c^2$, a monad is any object subject to some constraints (essentially some rules about what cancels out) that ensure it behaves sensibly in monadic contexts.
Concretely, a monad is a type m together with two operations, return and >>=. m is the monadic type itself, with return being a primitive to "get into"[3] the monad[4], and >>= being the bind operator from the previous section. Together with the constraints above, these things formally define a monad.
It's important to note that anything which has this signature that follows the laws is a monad, regardless of whether it appeals to our (naive) intuitions about "programmable semicolons". In fact, in a few posts we will discuss the "free monad" which is a monad that does absolutely nothing of the sort.
However, as a first look at monads, our thusly developed intuition will be good enough for the time being: monads provide invisible plumbing in a modular and reusable fashion.
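For the curious, the "rules about what cancels out" are usually called the monad laws: left identity, right identity and associativity. The sketch below spot-checks them for Maybe on a handful of values. It is a sanity check rather than a proof, and f and g are arbitrary functions picked purely for illustration.

```haskell
safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide x y = Just (x `div` y)

-- Two arbitrary functions of the shape a -> Maybe b to test with.
f :: Int -> Maybe Int
f = safeDivide 100

g :: Int -> Maybe Int
g x = Just (x + 5)

-- return followed by a bind is the same as just calling the function.
leftIdentity :: Int -> Bool
leftIdentity a = (return a >>= f) == f a

-- Binding into return changes nothing.
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m >>= return) == m

-- It does not matter how a chain of binds is parenthesized.
associativity :: Maybe Int -> Bool
associativity m = ((m >>= f) >>= g) == (m >>= (\x -> f x >>= g))

main :: IO ()
main = do
  print (all leftIdentity  [0, 1, 7])
  print (all rightIdentity [Nothing, Just 0, Just 7])
  print (all associativity [Nothing, Just 0, Just 7])
```

All three lines print True for Maybe, including the inputs where f fails by dividing by zero.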
But getting back to the definition, what is this return thing? We haven't talked about that yet. As usual, we'll look at its type to see what we can suss out.
return :: (Monad m) => a -> m a
Well, that's actually really easy; it just takes an a and puts it into a monad. For Maybe, this is defined as:
return :: a -> Maybe a
return x = Just x
Pretty basic, huh? All this does is, given an a, transforms it into a Maybe a by saying it's not Nothing, but is in fact Just the value that it is.
Earlier, when squinting at divPlusFive as imperative code, I put "return" in scare quotes. You can see why, now: it's not doing at all what we thought it was! To refresh your memory, here's the function again:
divPlusFive :: Int -> Int -> Maybe Int
divPlusFive x y = do divided <- safeDivide x y
                     return (divided + 5)
Here, return isn't the thing you're used to in imperative code; it's actually just this function that's part of the monad. Let's go through why it's necessary really quickly.
Notice that divided + 5 must be of type Int, since addition is over two parameters of the same type, and divided is most definitely an Int. But divPlusFive is supposed to result in a value of type Maybe Int! In order to clear up this discrepancy, we just return it, the types work out, and Haskell doesn't yell at us. Awesome!
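One quick way to convince yourself that return is not an early exit is to put one in the middle of a do block. The sketch below uses a made-up name, notAnEarlyExit, purely for illustration.

```haskell
-- return only wraps a value; it does not leave the do block early.
notAnEarlyExit :: Maybe Int
notAnEarlyExit = do
  _ <- return 1      -- wraps 1 as Just 1, and the block carries on
  x <- Just 2
  return (x + 5)

main :: IO ()
main = do
  print notAnEarlyExit           -- Just 7
  print (return 3 :: Maybe Int)  -- Just 3
```

If return behaved like its imperative namesake, the first result would be Just 1; instead the block runs to its last line.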
We should probably go through a few more examples of monads and their implementations in order to get a solid conceptual grasp of what's going on, but this post is already long enough. We'll save a few for next time to ensure we don't get burned out.
That being said, hopefully this post has served as motivation for "why we want monads" and given some idea of what the hell they are: modular, reusable abstractions which provide plumbing. They work by silently transforming your computation of a into a computation of (Monad m) => m a, stitching together individual subexpressions with a user-defined operator >>=. In the case of Maybe, this operator computes as usual until it gets a Nothing, and then just fast-forwards that to the final result.
Despite their bad reputation, monads really aren't all that scary! Now aren't you glad you didn't close the tab?
[1] Here is where the allegory breaks down because if you're working at a bank you'd be writing in COBOL, not C++, but just go with me on this one, OK?
[2] We will talk more about these scare quotes in section V.
[3] There is conspicuously no way of "getting out" of a monad; many monads do provide this functionality, but it is not part of the definition of a monad, since it doesn't always make sense. For example, if you could get "out" of a transaction, it no longer provides a transaction, now does it?
[4] Amusingly, comonads (the "opposite" of a monad) provide no way of getting into the comonad; you can only get out.