Abusing Constraints for Automatic Documentation
Constraints
At work I was recently tasked with figuring out what API calls our program makes, and more interestingly, which code-paths lead to those API calls. Determining this by hand is tedious and error-prone, and worse, doesn’t stay up to date with code changes. Instead, let’s see how we can use the type system to eliminate the pain.
The existing code was organized around a class HasAPI
that looks something like this:
type HasAPI :: Service -> Symbol -> Constraint
class HasAPI srv name where
type APICall srv name
callAPI :: APICall srv name
Here, HasAPI
is a type class with an associated type family APICall
which gives the type for making the call. For example, there might be an instance:
instance HasAPI ShoutService "shout" where
type APICall ShoutService "shout" = String -> IO String
= pure $ fmap toUpper str callAPI str
This is a silly example — the real codebase makes actual API calls — but it serves for demonstration.
Our goal is to document every codepath that makes any use of callAPI
, in some sense, “infecting” every path with some marker of that fact. This is a common experience to Haskell programmers; in fact, IO
has this same pattern of infectiousness. Whenever you make a function perform IO, every type in the callstack needs to document the fact it performs IO
. This is the inspiration we will take, except that changing types is extremely expensive. What if we pushed a constraint around instead?
Propagating Constraints
The trick is to define a new class, of the same shape as HasAPI
:
type CallsAPI :: Service -> Symbol -> Constraint
class CallsAPI srv name
but crucially, we give CallsAPI
no instances. On first blush, this seems insane: why introduce a class with no methods and no instances? Having no methods means it can’t do anything useful. Having no instances means GHC can never eliminate the constraint, and thus must propagate it upwards. This is the infectiousness we want; any function which makes an API call must document that fact in its type — failure to do so will result in GHC failing to compile with the message No instance for (CallsAPI srv name)
.
The trick now is to ensure that callsAPI
produces a CallsAPI
constraint. The easy way to do this is a little renaming to ensure existing polymorphic code continues work:
type UnsafeHasAPI :: Service -> Symbol -> Constraint
class UnsafeHasAPI srv name where
type APICall srv name
unsafeCallAPI :: APICall srv name
type HasAPI :: Service -> Symbol -> Constraint
type HasAPI = (UnsafeHasAPI srv name, CallsAPI srv name)
callAPI :: forall srv name
. HasAPI srv name
=> APICall srv name
= unsafeCallAPI callAPI
Any code written against the old HasAPI
constraint will continue to work (modulo the instance definitions,) but concrete calls to callAPI
now result in a dangling, unsatisfiable CallsAPI
constraint. You’ll need to go through the codebase now, and document every transitive call to the API with matching CallsAPI
constraints. Thankfully, HLS can help with this task: it will underline the missing cases, and suggest a code action that will automatically add these constraints to the type. Rinse and repeat, until every code path is documented.
Great success! We have automatically found every codepath that makes an API call, and forced them to document that fact. Better yet, we have solved the problem once and for all; our coworkers also must document any new API calls they make, lest their code not compile. It seems like we’re done!
Except for one fact: GHC will rudely refuse to compile our project, even if we correctly track all of our API calls. The problem of course, is that all we have managed to do is force main
to collect every CallsAPI
constraint. But GHC will still complain No instance for (CallsAPI srv name)
. Of course, you could just give an orphan instance in the same module that defines main
, which would work, but this doesn’t give you any sort of external documentation. It’s nice when you read the code, but it doesn’t help the business people.
Solving The Unsolvable
A better approach here is to selectively solve the CallsAPI
constraints, which we can do with some Haskell dark magic. The Dict
type captures a constraint, giving us a convenient way to manipulate constraints:
type Dict :: Constraint -> Type
data Dict c where
Dict :: c => Dict c
We can write an eliminator to bring the c
from a Dict c
into scope, which, importantly, allows us to solve otherwise-unsolved constraints:
:: (c => r) -> Dict c -> r
(\\)Dict = f f \\
If we can get our hands on a Dict (CallsAPI Srv Name)
, we can use (\\)
to convince GHC to compile our program.
GHC is happy to give us dictionaries for constraints it knows about:
showIntDict :: Dict (Show Int)
= Dict showIntDict
but unfortunately, refuses to give us dictionaries for unsolved constraints:
callsAPIDict :: forall srv name. Dict (CallsAPI srv name)
= Dict
callsAPIDict
-- Error: No instance for (CallsAPI srv name)
It seems like we’re just as stuck, but we have a trick up our sleeve. The first step is to define another class with an instance in scope. GHC will happily give us a dictionary for such a thing:
class Trivial
instance Trivial
trivialDict :: Dict Trivial
= Dict trivialDict
and now for something naughty:
callsAPIDict :: forall srv name. Dict (CallsAPI srv name)
= unsafeCoerce trivialDict callsAPIDict
Behind the scenes, GHC compiles classes into records, instances into values of these records, and replaces wanted constraints with function arguments taking those records. By ensuring that Trivial
and CallsAPI
are both empty classes, with no methods or super-classes, we can be certain the generated records for these classes will be identical, and thus that it is OK to coerce one into the other.
Armed with withDict
and callsAPIDict
, we can play the part of the constraint solver and satisfy constraints ourself. GHC will happily compile the following example:
ex :: HasAPI ShoutService "shout" => IO String
= callAPI @ShoutService @"shout" "hello world"
ex
-- Look ma, no HasAPI constraint!
test :: IO String
= ex \\ callsAPIDict @ShoutService @"shout" test
So that’s the rough technique. But how do we actually use it in anger?
Automatically Documenting the Server
Our actual use case at work is to add these API calls to our swagger documentation. Swagger is this automatically generated manifest of an API surface; we want to document the fact that some API calls might call other ones. Our server is one big servant application, and servant is extensible. So the real technique is to build a servant combinator that eliminates HasAPI
constraints when you document them in the API definition.
Getting into the nitty gritty bits of servant is beyond the scope of this post, but we can sketch the idea. Servant APIs use the type-level (:>)
operator to combine information about an endpoint. For example, we might expose another service:
type ServantAPI = "api" :>
"echo"
:> ReqBody '[JSON] String
:> Get '[JSON] String
This definition states that we have a REST server with a single route, api/echo
which responds to POST
requests, returning a JSON-encoded string, which takes a JSON-encoded string as the request body.
A servant server for ServantAPI
would have type Server ServantAPI
, where Server
is a type family given by HasServer
. Evaluating the type family results in String -> Handler String
, so in order to implement this server, we would need to provide a function of that type.
Let’s implement our server endpoint:
echo :: CallsAPI ShoutService "shout"
=> String
-> Handler String
= liftIO $ callAPI @ShoutService @"shout" str echo str
Unfortunately, due to our earlier work, we can’t eliminate the CallsAPI
constraint, and thus we can’t actually use echo
as the handler for our endpoint.
It’s important to note that servant’s DSL is extensible, and we can add our own machinery here. The first step is to build a type that we can use in servant:
type MakesAPICall :: Service -> Symbol -> Type
data MakesAPICall srv name
We can now build a second version of ServantAPI
:
type ServantAPI = "api" :>
"echo"
:> MakesAPICall ShoutService "shout"
:> ReqBody '[JSON] String
:> Get '[JSON] String
In order to actually run our endpoint, we need to give an instance of HasServer
for our new MakesAPICall
combinator:
instance HasServer api ctx
=> HasServer (MakesAPICall srv name :> api) ctx
where
type ServerT (MakesAPICall srv name :> api) m =
Dict (CallsFed srv name) -> ServerT api m
=
route _ ctx f Proxy @api) ctx $ fmap ($ callsAPIDict @srv @name) f route (
The ServerT
instance here adds a Dict (CallsFed srv name)
to the type of the handler required to satisfy this endpoint, while route
automatically fills in the dictionary whenever the handler needs to be run. In an ideal world, we could give our ServerT
instance as:
type ServerT (MakesAPICall srv name :> api) m =
CallsFed srv name => ServerT api m
but GHC doesn’t let us use quantified types on the right-hand sides of type families, so this is unfortunately a no-go. Playing games with Dict
instead is the best approach I’ve found here, but I’d love to hear if anyone has a better idea.
We still can’t use echo
as a handler, but we can use makesCall echo
as one, where makesCall
is given as:
makesCall :: (c => r) -> Dict c -> r
= (\\) makesCall
Servers that document their API calls via MakesAPICall
and which wrap their handlers with makesCall
can now eliminate CallsFed
constraints. Since this is the only way of eliminating CallsFed
constraints, we can be sure that every API call is correctly documented in the servant DSL!
The final step here is to add an instance of HasSwagger (MakesAPICall srv name :> api)
, but the details are gory and devoid of educational value. Suffice it to say that this instance was written, and now we have automatically generated JSON documentation describing which server endpoints make which other API calls. This documentation is guaranteed to be correct, because updating it is the only way to convince GHC to compile your code.