Perceptual consistency in RavenDB

September 2, 2013

In "UI design for eventual consistency", I wrote about how we commonly got bug reports like this:

I went to the "add machine" page, entered the machine information, clicked save, but when I went to the environments page, my machine wasn't listed. A few seconds later I hit refresh and there it was.

This was a result of Raven's use of asynchronous indexes. As a developer using RavenDB, you're supposed to embrace eventual consistency. There are common patterns to deal with this, for example:

  • you could accept the 'add machine' request as a command, and queue it to be processed, then redirect the user when the processing is done; or,
  • you could store the new machine in an intermediate cache, and include it when querying the list

And that makes sense when your operations are "submit a request to refund this bank transaction". But when you have a dozen different document types and you need a bit of CRUD and you aren't web scale, these patterns are, frankly, overkill.

So, in my naivety, I scattered Octopus 1.0 with:

.Customize(x => x.WaitForNonStaleResultsAsOfNow())

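This of course had problems: each of those queries blocks until indexing catches up, so reads slow down (or time out) whenever the indexes fall behind. For Octopus 2.0 I resolved to embrace the eventually consistent nature of Raven.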

At first, I went down the path of having our REST API return a flag indicating whether results of a query were stale. Our UI would then render a message to notify the user, or automatically refresh after a second to try and get the non-stale results. While it made the UI feel faster, it was a bit jarring.
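The refresh behaviour can be sketched in a few lines. This is only an illustration in Python, not the actual Octopus UI code: the `isStale` field name, the `fetch` callable and the retry limits are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict


def fetch_with_refresh(
    fetch: Callable[[], Dict[str, Any]],
    max_refreshes: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call `fetch` and re-fetch while the response is flagged as stale.

    `fetch` is a hypothetical callable returning a dict such as
    {"items": [...], "isStale": bool}. The last response is returned even if
    it is still stale, so the caller can show a "results may be out of date"
    message instead of waiting forever.
    """
    result = fetch()
    refreshes = 0
    while result["isStale"] and refreshes < max_refreshes:
        sleep(delay_seconds)  # give the index a moment to catch up
        result = fetch()
        refreshes += 1
    return result
```

The visible jump when the second response replaces the first is exactly the jarring part: the user sees a list without the new item, then a list with it.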

However, when building some command-line tools to use the API, I realised that most of the time these tools would rather wait for non-stale results. So that became a flag (e.g., /api/environments?nonStale=true) that was provided as part of a URI template. This caused some debate; after all, should clients even be able to request such a thing?
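As a rough illustration, a client and a server might agree on that flag as follows. The helper names are made up for this sketch (written in Python, not taken from Octopus); only the `/api/environments?nonStale=true` form comes from the API described above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def environments_url(non_stale: bool = False) -> str:
    """Client side: build the URL, adding the flag only when asked to wait."""
    base = "/api/environments"
    if non_stale:
        return base + "?" + urlencode({"nonStale": "true"})
    return base


def wants_non_stale(url: str) -> bool:
    """Server side: read the flag, defaulting to "don't wait" when absent."""
    query = parse_qs(urlsplit(url).query)
    return query.get("nonStale", ["false"])[0].lower() == "true"
```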

Perceptual consistency

I wasn't happy with either solution, and I felt like there was no good compromise. Then I stumbled upon this thread where Chris Marisic made an ingenious suggestion (emphasis mine):

You want to write this record, then in the same request query that index with WaitForNonStaleResultsAsOfLastWrite, then redirect the user to the list after the index is synchronized. This avoids making the common operation, reads, from always waiting, and makes the uncommon operations, writes, to have to wait.

I'd never seen this suggestion before, but I think it's brilliant. I'm thinking about it as "perceptual consistency", or perhaps, "per-user consistency".

What it translates to is that in the Octopus REST API, when you perform a PUT/POST/DELETE, we'll wait for the indexes to be non-stale before returning success (for a little while at least). So you may see a slight pause when you hit "create machine". But when you are redirected and perform a GET, that request won't wait for non-stale results. And you won't care, because by the time you're redirected, the index is up-to-date enough to include the machine you just added. Brilliant!
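To make the asymmetry concrete, here is a small self-contained simulation in Python. It does not use the RavenDB client at all: `SimulatedStore`, `handle_post` and `handle_get` are hypothetical names, the index is a plain list updated on a background thread, and the two-second wait is an arbitrary stand-in for "a little while".

```python
import threading
import time
from typing import List, Tuple


class SimulatedStore:
    """Toy document store whose index is updated on a background thread."""

    def __init__(self, indexing_delay: float = 0.05) -> None:
        self._indexing_delay = indexing_delay
        self._cond = threading.Condition()
        self._next_seq = 0
        self._pending = set()        # writes the index has not caught up with
        self._index: List[str] = []  # what queries can see

    def write(self, doc: str) -> int:
        """Save a document; indexing happens later, as with async indexes."""
        with self._cond:
            self._next_seq += 1
            seq = self._next_seq
            self._pending.add(seq)
        threading.Thread(
            target=self._index_later, args=(seq, doc), daemon=True
        ).start()
        return seq

    def _index_later(self, seq: int, doc: str) -> None:
        time.sleep(self._indexing_delay)
        with self._cond:
            self._index.append(doc)
            self._pending.discard(seq)
            self._cond.notify_all()

    def wait_until_indexed(self, seq: int, timeout: float) -> bool:
        """Block until every write up to `seq` is indexed, or `timeout` passes."""
        with self._cond:
            return self._cond.wait_for(
                lambda: all(p > seq for p in self._pending), timeout=timeout
            )

    def query(self) -> Tuple[List[str], bool]:
        """Return the indexed documents and whether the index is stale."""
        with self._cond:
            return list(self._index), bool(self._pending)


def handle_post(store: SimulatedStore, doc: str, wait_seconds: float = 2.0) -> dict:
    """Write path: save, then wait (for a little while) for the index."""
    seq = store.write(doc)
    store.wait_until_indexed(seq, timeout=wait_seconds)  # succeed either way
    return {"status": 201}


def handle_get(store: SimulatedStore) -> dict:
    """Read path: never waits, but reports whether the results are stale."""
    items, is_stale = store.query()
    return {"items": items, "isStale": is_stale}
```

Calling `handle_post(store, "web-01")` and then `handle_get(store)` returns a list that already contains `web-01`, because the write absorbed the indexing delay. Calling `store.write(...)` directly and querying straight away would typically still come back flagged as stale.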

We'll still provide the stale flag on query results to indicate whether the results of your GET request are stale, but there's no longer any way to tell the API to wait for non-stale results. And from a UI point of view, you'll always see results that look consistent to you.

Tagged with: Architecture