Introducing Rack::Cache

October 24, 2008

I’m pleased to finally release a side-project I’ve been whittling away at for a few months: Rack::Cache is a piece of Rack middleware that implements most of RFC 2616’s caching features with a basic set of storage options (disk, heap, and memcached) and a configuration system for tweaking cache policy. It should work equally well with any Rack-enabled Ruby web framework and is tested on Ruby 1.8.6 and 1.8.7.

[Figure: high-level rack-cache diagram]

Rack::Cache relies entirely on standard HTTP headers produced by your application. It has no application-level caching API.
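
To make "standard HTTP headers" concrete, here is a minimal sketch of a Rack-style application that announces its own cacheability. The class name HelloApp, the five-minute lifetime, and the fixed modification date are assumptions for illustration; nothing in it touches Rack::Cache itself.

```ruby
require 'time'

# Hypothetical Rack-style app: any object whose #call(env) returns
# [status, headers, body]. Caching is described purely by response headers.
class HelloApp
  LAST_CHANGE = Time.utc(2008, 10, 24) # assumed modification time

  def call(env)
    headers = {
      'Content-Type'  => 'text/plain',
      # Expiration: any cache may reuse this response for 300 seconds.
      'Cache-Control' => 'public, max-age=300',
      # Validation: lets a cache revalidate once the response goes stale.
      'Last-Modified' => LAST_CHANGE.httpdate
    }
    [200, headers, ["Hello from #{env['PATH_INFO']}\n"]]
  end
end
```

Any HTTP cache that sees these headers, whether a browser, a proxy, or a middleware, can act on them without knowing anything about the framework behind the response.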

You need to understand HTTP caching.

The middleware piece sits near the front of each backend process and acts like a gateway proxy cache (e.g., Varnish, Squid) but without requiring the infrastructure investment of a separate daemon process. This approach is quite different from the integrated caching systems built into most Ruby web frameworks. There are pros and cons, which I plan to explore in some depth in future posts here.
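
The idea of a cache that lives inside the backend process can be sketched with a toy middleware. This is not Rack::Cache's implementation: ToyGatewayCache, its in-memory Hash store, and the injectable clock are all assumptions made for illustration, and it ignores query strings, Vary, validators, and everything else a real cache has to handle.

```ruby
# Hypothetical toy middleware showing where a gateway-style cache sits:
# it wraps the application and answers fresh GET requests itself.
class ToyGatewayCache
  def initialize(app, clock = lambda { Time.now })
    @app   = app
    @clock = clock # injectable so freshness can be tested without sleeping
    @store = {}    # heap storage: path => [stored_at, response]
  end

  def call(env)
    # Only GET requests are considered; everything else passes through.
    return @app.call(env) unless env['REQUEST_METHOD'] == 'GET'

    key = env['PATH_INFO']
    stored_at, response = @store[key]
    if response && @clock.call - stored_at < max_age(response[1])
      return response # fresh hit: the backend application is never called
    end

    status, headers, body = @app.call(env)
    chunks = []
    body.each { |chunk| chunks << chunk } # buffer the body so it can be replayed
    response = [status, headers, chunks]
    @store[key] = [@clock.call, response] if status == 200 && max_age(headers) > 0
    response
  end

  private

  # Freshness lifetime in seconds taken from Cache-Control: max-age=N (0 if absent).
  def max_age(headers)
    headers['Cache-Control'].to_s[/max-age=(\d+)/, 1].to_i
  end
end
```

In a rackup file, a middleware like this would be placed with `use ToyGatewayCache` ahead of the `run` line, which is the same position a real caching middleware occupies.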

The basic goal is standards-based HTTP caching that scales down to the early stages of a project, development environments, light- to medium-traffic sites, stuff like that. HTTP’s caching model is wildly under-appreciated in the Ruby web app community and my hope is that making its benefits more accessible will lead to wider understanding and acceptance.

Rack::Cache is still very much a work in progress but is far enough along to be useable (it’s serving this site, in fact). The following HTTP caching features are currently supported:

Expiration-based caching. Responses are served from cache while fresh without consulting the backend application. The Cache-Control: max-age=N and Expires response headers control a response’s freshness lifetime.
Validation. The cache stores responses even if no freshness information is present, so long as there’s a cache validator (Last-Modified or ETag). Subsequent requests result in a conditional GET request to the application, and if the stored response is unmodified, it’s served from cache. Your backend should never generate the same response twice.
Vary support.
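
The validation case depends on the application answering conditional GETs cheaply. A minimal sketch of the backend side, assuming a hypothetical ArticleApp whose ETag is derived from a version number and whose expensive rendering lives in a block, might look like this. It only handles a single exact If-None-Match value; real requests can carry a list of tags or `*`.

```ruby
# Hypothetical backend that supports validation with an ETag.
# The render block stands in for expensive work (templates, queries).
class ArticleApp
  def initialize(version, &render)
    @version = version
    @render  = render
  end

  def call(env)
    etag = %("v#{@version}")
    if env['HTTP_IF_NONE_MATCH'] == etag
      # The cache's stored copy is still valid: no body, no rendering.
      [304, { 'ETag' => etag }, []]
    else
      [200, { 'ETag' => etag, 'Content-Type' => 'text/plain' }, [@render.call]]
    end
  end
end
```

When the cache revalidates with the ETag it stored, the app returns 304 without running the render block, and the cache serves its stored body.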

There are a few notable things still missing.

Explicit purge / manual cache invalidation. There is currently no supported method for manually invalidating a cache entry.
Multi-threading. There’s nothing fundamentally hard about this, I just haven’t got around to testing in a threaded environment and working out the kinks.
Performance. The current focus is on nailing down a solid implementation of HTTP’s basic caching features and providing good documentation. I have not profiled the system or performed any benchmarks. Anecdotal evidence suggests that the performance situation is basically swell.

Finally, I’d like to note that Rack::Cache is based largely on the work of the internet standards community and that portions of its design were inspired by Django’s cache framework and Varnish’s configuration language (VCL). Huge thanks to everyone involved with any of this stuff.