
Cache in the Attic

Thursday, September 13, 2007 - 04:35 PM

On Cache

The servers have been under a lot of stress pretty much since Project Wonderful took up residence. The web servers tend to be ok-ish, but the database is under a lot of load. Serving ads requires a fair amount of work: you need to update the database for every ad viewed and clicked on, and use that data to decide which ad to show. For PW, you also need to keep track of the ever-changing bids and such associated with the ad campaigns.

Ryan North always seems to be tinkering with the underlying code, trying to optimize the database pieces to be friendlier, but there's only so much (even) he can do until we can afford some heftier database boxes.

While I leave him to his own devices, I'm stuck trying to make sure that my clients' sites are still reasonably responsive even when the database is slow.

The solution, obviously, is to cache that data locally, so I have to re-fetch as little as possible, as infrequently as possible. There are different ways to do this, and I use a combination of solutions.

While there are solutions out there to pseudo-automate this, layers that sit between your code and the database and try to store the results of queries for you, I tend not to trust these. I have a lot of intertwined tables, and most of my queries end up getting data from multiple places. I feel I would end up spending more time telling this intermediate layer when to expire old/stale data than it is worth.

I use a (sort of) MVC setup on the servers. Most "real" work is done in Perl libraries, and then display is left to HTML::Mason. Those are my two best options for caching.

In the past, I've almost exclusively used Mason's built-in caching. It's great because it makes it easy to store rendered HTML, which means that I get two wins for one: reduced load on the database, and less computation on the web server (assuming that the HTML requires more resources to render than the cache takes to retrieve it, which it generally does). I use this for most elements of clients' homepages, for things like "today's comic", the list of news posts, &c.
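
To make the idea concrete, here is a minimal sketch of caching a rendered HTML fragment with an expiry time. It is written in Python rather than Perl, it is not Mason's API, and every name in it (FragmentCache, fetch, the "news" key) is made up for illustration. The only point is the shape: on a hit you skip both the database query and the rendering.

```python
import time


class FragmentCache:
    """Tiny in-memory cache for rendered HTML fragments (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}     # key -> (expires_at, html)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def fetch(self, key, ttl_seconds, render):
        """Return the cached HTML for key; call render() only on a miss or after expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: no database work, no rendering
        html = render()      # miss: do the expensive work once
        self._store[key] = (now + ttl_seconds, html)
        return html
```

A homepage element would then call something like cache.fetch("news", 300, render_news_list), where render_news_list is whatever hypothetical function queries the database and builds the HTML. The trade-off is the usual one: readers may see a fragment that is up to ttl_seconds old.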

The downside is that the reason I have a dynamic site is that it's dynamic. There are a lot of places where it just doesn't make sense to cache the HTML. Take the Goats Forums, for example. Not only is there a lot of database work there (each comment, its text, info about the user who posted the comment, &c.), but it can be viewed approximately 3.8 zillion different ways. Users can choose a flat, threaded, or nested view, and that view can be ascending or descending over time. Users can jump to specific comments, meaning that previous comments won't get seen. Users can set a "threshold" to block lower rated comments, &c.

The code is loosely based on an old version of Slash. They solve this (or at least they did back when I was looking at the code) by just outputting a rendered version of the entire thread, and ignoring user preferences for old/archive/locked discussions. This probably saves a lot of server resources, but it felt "wrong" to me. I'm weird like that.

So last week, I finally did what I should have done ages ago, and wrote the caching into the back-end libraries. I now cache the fetched "Comment" objects to disk (using Cache::FileCache), and expire the cached version any time there is any kind of update to a discussion. This required some fairly extensive changes to the way the underlying code worked, although I tried to isolate it as much as possible, especially since some of that is pretty old and crufty.
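
The pattern described above can be sketched roughly as follows. This is Python, not the actual Perl code, and it does not use Cache::FileCache; the names (CommentStore, load_from_db, visible) and the comment fields are assumptions made up for the example. The comments for a discussion are fetched once, written to a file, and thrown away whenever the discussion changes. Each user's view (threshold, ordering) is then built from the cached data instead of from a fresh query.

```python
import hashlib
import json
from pathlib import Path


class CommentStore:
    """File-backed cache of a discussion's comments (illustrative sketch)."""

    def __init__(self, cache_dir, load_from_db):
        self._dir = Path(cache_dir)
        self._dir.mkdir(parents=True, exist_ok=True)
        self._load_from_db = load_from_db  # hypothetical: discussion_id -> list of dicts

    def _path(self, discussion_id):
        digest = hashlib.sha1(str(discussion_id).encode("utf-8")).hexdigest()
        return self._dir / (digest + ".json")

    def comments(self, discussion_id):
        """Return the comments from disk if cached, otherwise fetch and cache them."""
        path = self._path(discussion_id)
        try:
            with path.open("r", encoding="utf-8") as fh:
                return json.load(fh)
        except FileNotFoundError:
            pass
        comments = self._load_from_db(discussion_id)
        tmp = path.with_suffix(".tmp")
        with tmp.open("w", encoding="utf-8") as fh:
            json.dump(comments, fh)
        tmp.replace(path)  # rename into place so readers never see a half-written file
        return comments

    def invalidate(self, discussion_id):
        """Call after any update to the discussion (new comment, edit, rating change)."""
        try:
            self._path(discussion_id).unlink()
        except FileNotFoundError:
            pass  # nothing was cached, so nothing to expire


def visible(comments, threshold=0, descending=False):
    """Apply one user's preferences to the cached comments without touching the database."""
    kept = [c for c in comments if c["score"] >= threshold]
    return sorted(kept, key=lambda c: c["id"], reverse=descending)
```

The design choice is to cache the data rather than the HTML: one cache entry per discussion serves every combination of view settings, at the cost of still rendering on each request.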

I expected this to break immediately and spectacularly, and cause me major headaches, but so far no one has commented. Hopefully that's a good thing, and not bad. Changes like this are always interesting, because if they work, no one notices, and then you just get emails complaining that you wasted a week not working.

Probably, they're just bitching and moaning in the forums, and I'm just not updating the cache with those comments. I choose to think that not seeing those is a feature, not a bug.
