
I am very interested in the internal coding practices and standards used within Stack Exchange, particularly around caching and data access: how does the team write code that falls back from a local cache to a central cache like Redis, and ultimately to SQL Server? Are there any helpers that encapsulate the boilerplate, or is it 'if cache 1 not null... if cache 2 not null... get from db' all over the place, with variations on a case-by-case basis?

1 Answer


Basically, we have multiple cache implementations that all conform to a single interface; let's hugely simplify, and call it:

interface ICache { // grossly simplified to the point of being unrecognisable
    string Get(string key);
    void Set(string key, string value, int duration, bool broadcast = false);
}

("broadcast" indicates whether we should use pub/sub to make all servers aware of a change - essentially, we use this to perform eager cache invalidation)

We then have:

  • an in-memory cache implementation
  • a redis cache implementation
  • a null-cache implementation (always returns negative answers, etc) - sketched just below
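
For a feel of how small one of those can be, here is a hypothetical sketch of the null cache against the simplified ICache above:

// Hypothetical sketch: never stores anything, always misses.
class NullCache : ICache {
    public string Get(string key) => null;   // always a cache miss
    public void Set(string key, string value, int duration, bool broadcast = false) {
        // deliberately a no-op
    }
}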

We then simply daisy-chain them; so the implementation of the redis cache might be:

string ICache.Get(string key) {
    // L1: try the local in-memory cache first
    var l1 = tail.Get(key);
    if(l1 != null) return l1;

    // L2: fall back to redis; on a hit, populate the local cache on the way out
    var l2 = redis.GetWithExpiry(key);
    if(l2 != null) {
        tail.Set(key, l2.Value, l2.Duration);
        return l2.Value;
    }
    return null;
}
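
The composition itself isn't shown in the answer; wiring the chain up could look something like the following, where the class names and constructor shapes are hypothetical:

// Hypothetical wiring: the redis-backed cache holds the in-memory cache as its "tail",
// so consuming code only ever sees a single ICache instance.
ICache BuildCache(bool cachingEnabled) {
    if(!cachingEnabled) return new NullCache();             // e.g. tests / local dev

    ICache local = new LocalMemoryCache();                  // hypothetical in-memory implementation
    return new RedisCache("localhost:6379", tail: local);   // hypothetical redis-backed implementation
}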

Then the consuming code can either do:

var val = cache.Get(key);
if(val == null) {
    // whatever we need to get the value
    // val = ...

    cache.Set(key, val, duration, false);
}
// use val

which means most of the code doesn't need to care about the various cache implementations - it just has access to a single cache instance. Calling code can also use a helper extension method we have that wraps that, but uses a few other tricks too:

var val = cache.GetSet(key, ctx => {
    return ctx.DB.DoStuff();
}, numbers);

This does a few extra things, such as background refresh: depending on the numbers, it can usually return data without executing the lambda at all. It will also return stale data (i.e. "expire after 5 minutes, consider stale after 3 minutes") while scheduling a refresh in the background, so the value never actually expires, yet the work happens away from the main request-servicing threads - with some signals etc. to prevent the same refresh happening more than once.
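
GetSet itself isn't shown, but the "serve stale, refresh in the background" idea can be sketched roughly as follows. Everything below - the CacheNumbers type, the extension method, the dictionaries acting as the single-flight signal - is an illustration built on the simplified ICache, not SE's actual helper, and for brevity the lambda takes no context argument:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical "expire after X, consider stale after Y" settings.
class CacheNumbers {
    public TimeSpan Expiry { get; set; }
    public TimeSpan StaleAfter { get; set; }
}

static class CacheExtensions {
    // when each key was last refreshed, and which keys are mid-refresh
    // (the "signals" that stop the same refresh running more than once)
    static readonly ConcurrentDictionary<string, DateTime> fetchedAt = new ConcurrentDictionary<string, DateTime>();
    static readonly ConcurrentDictionary<string, bool> refreshing = new ConcurrentDictionary<string, bool>();

    public static string GetSet(this ICache cache, string key, Func<string> fetch, CacheNumbers numbers) {
        var val = cache.Get(key);
        if(val == null) {
            // cold miss: fetch on the calling thread and populate the cache
            val = fetch();
            Store(cache, key, val, numbers);
            return val;
        }

        // hit: if the entry is past its "stale" threshold, hand the stale value back
        // immediately and refresh it away from the request-servicing thread
        var fetched = fetchedAt.GetOrAdd(key, DateTime.UtcNow);
        if(DateTime.UtcNow - fetched > numbers.StaleAfter && refreshing.TryAdd(key, true)) {
            Task.Run(() => {
                try { Store(cache, key, fetch(), numbers); }
                finally { refreshing.TryRemove(key, out _); }
            });
        }
        return val; // possibly stale, but the caller never blocks on the refresh
    }

    static void Store(ICache cache, string key, string value, CacheNumbers numbers) {
        cache.Set(key, value, (int)numbers.Expiry.TotalSeconds);
        fetchedAt[key] = DateTime.UtcNow;
    }
}

The real helper evidently passes a context into the lambda (the ctx.DB.DoStuff() shape above) and tracks staleness alongside the cached entry itself, but the control flow is the same idea: return quickly, refresh away from the request thread, and let only one refresh per key run at a time.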

That's pretty much the main parts. Obviously I'm missing lots out, and over-simplifying the few bits that are there.

  • Thanks for your insight, you've made my week twice now! Now, say you broadcast a message to invalidate some cached item -- assuming that happens via Redis Pub/Sub, are you basically broadcasting the key to invalidate and then having your Redis implementations invalidate that key from your "L1"/HttpContext.Cache? Commented Sep 17, 2014 at 13:32
  • @tuespetre yes, precisely that Commented Sep 17, 2014 at 13:36
  • This is great information. Thanks again! Commented Sep 18, 2014 at 1:14
