Wouldn't it be Simpler Without a Router?

While I’ve been a proponent of writing web software in Go for years, using “routers” (or “muxers” or “dispatchers”) is something I’ve long been disenchanted with and have actively avoided in my own code for quite a while. Not infrequently I mention this to someone and they look at me like I have something stuck in my teeth or really bad body odor. After briefly verifying that neither is the case and ascertaining that their consternation is due to my lack of a router, I then get to explain what I mean. Here’s a brief but hopefully explanatory article to give the background and an alternative approach.

Anatomy of a Router

The basic definition of a router is a software component which allows a developer to describe the properties of an HTTP request which should be matched in order to run a piece of code to respond to that request. There are many different forms and implementation details but the essence is “if the request is like this, run this code”.

While I’m not a software historian, from what I can tell routers have their roots in web application frameworks. In the case of Ruby on Rails, routes are configured in config/routes.rb.

Some older frameworks used configuration files for setting up routes. The good old Java Servlet environment uses an XML file to describe what URLs match which Java classes.

It seems the trend in recent years is to define these sorts of routes in code as it is more flexible, but I get ahead of myself.

On GitHub you can find quite a few examples of Go routers, as well as Go frameworks which include routers. A typical route example using the Go standard library ServeMux looks like this:

mux := http.NewServeMux()
mux.Handle("/some-path", someHandler)
mux.HandleFunc("/some-other-path", func(w http.ResponseWriter, r *http.Request) {
	// other handler code or call out to it
})
// ...

// mux can then be used as a Handler to dispatch accordingly

What Routers Do

Now if we dissect what’s happening here, we can glean some architectural wisdom from this code:

  • The router separates the decision of when a handler should be called from the handler code itself.
  • In more complex cases, the router might perform processing on the request and extract information from it, and then call the code with this additional information. I won’t include an example because every framework does it differently but if you’re familiar with routers you’ve seen examples of routes for /post/:postid and that :postid ends up getting passed to the code that handles the route.
  • Going further, some routers support various shorthands for responding to a request, including things like converting a string into a response body, or marshaling a response as JSON, or whatever. (Not shown in the example but if you’re familiar with routers you’ve probably seen things like a string return value that magically becomes the response body.)

So while I won’t expound on the types of features that various routers include, some of them are quite sophisticated and have all kinds of shortcuts and abbreviations for extracting and transforming data from the request and swizzling your return information into a response.

In contrast, it is interesting to note that at the end of the day, the example above (or any of its framework-specific variations) can essentially be rewritten as:

h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	if r.URL.Path == "/some-path" {
		someHandler.ServeHTTP(w, r)
		return
	}
	if r.URL.Path == "/some-other-path" {
		// other handler code or call out to it;
		// somewhere in here there are some w.Write() calls
		return
	}
})

Slightly longer, but also very obvious and completely customizable (that if statement can match on any aspect of the request, and no assumptions are made about the response).

What Routers Do Not

Conversely, this means that routers require you to specify a pattern to match separate from the handler code. This ends up being an abstraction around what is in essence an if statement. To be clear, with a router you write:

mux.Handle("/some-path", someHandler)

Instead of:

if r.URL.Path == "/some-path" { someHandler.ServeHTTP(w, r) }

The handler code for routers also usually cannot modify the request or response without fully handling it - i.e. each handler must produce the complete response itself, rather than acting as one step in a chain that performs a specific function along the way toward a response being generated. (There’s a note below on “middleware”, I’m getting there…)

It also seems to be a bit of a myth that routers must perform some highly optimized search through your routes to determine which code to run. While this could be true for an application with thousands of individual routes, virtually every application I’ve ever even heard of has far fewer - often only a few dozen. The cost of performing a few dozen if statements, each of which is a couple of string comparisons, is quite small. Any single disk hit or database query is likely to take orders of magnitude more time than all of your request routing logic without any optimization. So while it’s true it is possible to get into performance problems due to unoptimized route handling, in the real world this is very unlikely to become a problem for most applications. (And if it does become a problem, converting things to use some sort of map lookup is probably a simple fix anyway. My point is that the emphasis I see in some routing packages on how optimized they are seems to be out of sync with the requirements of most applications.)

Routers also, depending on how fancy they are, tend to add code dependencies on whatever framework they are a part of. Every time you do something that isn’t just http.ResponseWriter and *http.Request your code now has to be aware of and depend on whatever else just got introduced. Whether or not this dependency is worth it is your decision, but it’s certainly worth your consideration.

Handler Chains to the Rescue

That said, let’s look at an alternate approach.

Most libraries providing routing also have some sort of “middleware” concept. (n.b. I dislike the term “middleware” simply because it’s horribly unspecific. If you don’t know what I mean when I say middleware, that just proves my point. The definition I’m using is something that can examine and modify the request and/or response as part of a sequence or pipeline for request processing. A “handler” can respond to a request but “middleware” can make changes and pass it on to be handled by something else.)

So what happens if we take the basic http.Handler interface and transform it into something that can either handle requests or modify them?

The Handler interface looks like this:

type Handler interface {
	ServeHTTP(w http.ResponseWriter, r *http.Request)
}

With one simple change, we can arrive at an interface that is very similar but allows for w and r (including the context.Context on r) to be modified or replaced entirely:

type ChainHandler interface {
	ServeHTTPChain(w http.ResponseWriter, r *http.Request) (http.ResponseWriter, *http.Request)
}

And ServeHTTPChain of course works exactly like ServeHTTP except that implementations can modify w or r or replace them entirely, forming a chain where the next Handler/ChainHandler can work on this new (w, r) pair.

With some helping glue code we arrive at something that looks like this (note that the guts of these handler functions are included inline but they could just as easily be moved out to separate types each implementing Handler or ChainHandler as applicable):

var hl HandlerList

// imply context cancel on w.Write() - really just book keeping for handler chain
hl = append(hl, NewContextCancelHandler())

// you can modify the context like this
hl = append(hl, ChainHandlerFunc(func(w http.ResponseWriter, r *http.Request) (http.ResponseWriter, *http.Request) {
	u := &struct{}{} // TODO: load currently logged in user
	return w, r.WithContext(context.WithValue(r.Context(), "user", u))
}))

// regular http.Handlers can also do things like set response headers
hl = append(hl, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	switch path.Ext(r.URL.Path) {
	case `.html`:
		w.Header().Set("cache-control", "no-store") // tell browsers not to store html pages
	case `.jpg`, `.css`, `.js`:
		w.Header().Set("cache-control", "max-age=3600") // but static assets they should keep for an hour
	}
}))

// if you want to respond to a request intended for this handler, you just do so - when you w.Write() (or w.WriteHeader()) it stops the chain
hl = append(hl, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	if r.URL.Path == "/some-place.html" {
		fmt.Fprintf(w, `<html><body>This is some place.</body></html>`)
		return
	}
}))

// common Go handler patterns work well, here we serve a static asset
staticFileServer := http.FileServer(http.Dir("/path/to/static"))
hl = append(hl, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	switch path.Ext(r.URL.Path) {
	case `.jpg`, `.css`, `.js`:
		staticFileServer.ServeHTTP(w, r)
		return
	}
}))

// anything else is not found
hl = append(hl, http.NotFoundHandler())

// this becomes our handler that we would normally provide to an http.Server, this also ensures that Close() gets called
h := hl.WithCloseHandler()

To be fair, there are some details that need to be taken care of in order to make this approach work. Making it well-defined and clear when a request has been successfully handled is one of them, which I address by making the first item in the chain replace w with one that cancels the request context when w.Write() is called. We also want a way to allow handlers to use either ChainHandler or Handler in cases where that helps compatibility with other existing code. Another detail is that we add a Close method to w to ensure that things that need to know when a response is written - gzipping is the most common example - have an opportunity to properly complete. (Although in retrospect, Flush() can probably be used for this purpose.)

Have a look at the full example on the Go Playground.

So while perhaps having sacrificed some beauty (in the eye of certain beholders, myself not included) we’ve nonetheless, in relatively few lines of code, created a pattern that:

  • Allows requests and responses to be easily modified as needed during the pipeline. This enables many kinds of behavior modifications including common requirements like gzipped responses, setting Cache-Control headers, etc.
  • Supports passing request-scoped information via r.Context().
  • Can easily make use of existing Handler implementations. (HandlerList can accept either http.Handler or ChainHandler)
  • Does not introduce dependencies on a third-party library into your code. Anything can add a ServeHTTPChain method without having to know anything else about the environment in which it’s called. This leads to more idiomatic Go code: proper use of interfaces and minimizing knowledge of other components wherever possible. (Contrast this with the many Go frameworks that require you to depend on their code for all of your handlers with argument types like whateverframework.Context and anotherflamework.Request.)
  • Is relatively easy to understand and support. I realize that code readability is often subjective (and strongly influenced by the developers’ previous experiences and familiarity), however the fact that the decision of whether or not a request should be responded to is moved into the handler itself means there is less “magic” happening.

Summary

I’ve used variations of this approach on a number of projects and it has worked very well. I’m currently involved in writing a Go library that utilizes this approach and I’ll add the link here when it’s ready. In any case, hopefully this article helps explain the rationale.

Love this idea? Hate it? Let me know in the comments!
