Rails and Scaling with Multiple Databases

Joe points out the following from this interview with Twitter’s Alex Payne:

The problem is that more instances of Rails (running as part of a Mongrel cluster, in our case) means more requests to your database. At this point in time there’s no facility in Rails to talk to more than one database at a time.

(David and Rafe weigh in also. Good stuff)

We’ve run against multiple (five now) separate PostgreSQL servers for a long time. To be clear, that’s five separate “databases” in the sense that PostgreSQL uses the term: not a replicating or mirrored setup, but separate databases with different data and similar structure.

Each of our clients gets a schema (again, in the postgres sense of the word) on one of the database boxes. There are multiple client schemas per database. When a box becomes overutilized, we get another box and move schemas around.

Sidebar: this whole schema-per-client thing might seem like overkill, but when we say client, we mean a company that processes health claims for 50 to 1000 self-funded employers. Each client is fairly massive in the volume of data they load into the system, and our entire market consists of maybe 400 total prospects. Those 400 companies process something like 60% of the covered US population’s health claims every year.

In addition to the client databases, we have a master-slave balanced and replicated shared schema that stores user account information and other data applicable to all clients.

We run multiple fastcgi dispatchers on two severely underutilized web boxes. Our database load dwarfs our web load as we’re doing live data-warehouse style queries, which are insanely intensive when compared to generating the HTML or PDFs to display the results.

When an HTTP request comes in, it can go to any of the fastcgi dispatchers. We practice shared nothing. We establish, from the shared schema, which client the user belongs to and can then determine the server and schema that houses their data. Each fastcgi dispatcher has a pool of connections: each dispatcher holds one connection to each postgres server. At the beginning of the request, we scope the connection using a set of Rails extensions similar to ActiveRecord::Base::with_scope.
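The request flow above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not our actual extensions: ClientLocation, FakeConnection and ConnectionRouter are made-up names, the directory hash stands in for the lookup against the shared schema, and switching schemas with SET search_path is just one plausible way to do it on PostgreSQL.

```ruby
# Hypothetical sketch; none of these names come from Rails or from our code.

# Where a client's data lives: which server, and which schema on it.
ClientLocation = Struct.new(:server, :schema)

# Stand-in for a real database connection; it only records statements.
class FakeConnection
  attr_reader :server, :statements

  def initialize(server)
    @server = server
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end
end

class ConnectionRouter
  # directory: client name => ClientLocation (assumed to come from the shared schema)
  # pool:      server name => connection (one per postgres server, per dispatcher)
  def initialize(directory, pool)
    @directory = directory
    @pool = pool
  end

  # Scope a block of work to the server and schema holding the client's data.
  # Real code should quote the schema identifier before interpolating it.
  def with_client(client)
    location = @directory.fetch(client)
    connection = @pool.fetch(location.server)
    connection.execute("SET search_path TO #{location.schema}")
    yield connection
  ensure
    # Reset so the next request on this connection starts clean.
    connection.execute("SET search_path TO public") if connection
  end
end

pool = {
  "db1" => FakeConnection.new("db1"),
  "db2" => FakeConnection.new("db2")
}
directory = { "acme" => ClientLocation.new("db2", "acme_claims") }
router = ConnectionRouter.new(directory, pool)

server_used = router.with_client("acme") do |conn|
  conn.execute("SELECT count(*) FROM claims")
  conn.server
end
```

The point of the block form is the same as with_scope: the choice of server and schema is made once at the edge of the request, and everything inside the block just uses the connection it is handed.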

This situation is surely very different from Twitter’s but I hope it shows the difference between the statement made by Alex, “there’s no facility in Rails to talk to more than one database at a time,” and the significantly more problematic, “you cannot talk to more than one database at a time.” While the former may be true (and I’ll argue in a moment that it’s not), the latter is clearly false.

Talking about this on the level of having “multi-database connectivity” come out of the box in Rails is simplifying the problem to an unreasonable level, I think. Would the multi-database connectivity features meet my needs or Twitter’s? They are different problems with minor overlap and we’re talking about the minor overlapping piece like it’s the biggest part of the problem.

Most of the time spent getting our setup running was in the conceptual and data wrangling phases. The amount of time it took to implement the multi-database connectivity was negligible compared to the amount of time it took to devise a method of splitting things out at the data level. When all was said and done, the Ruby/Rails related bits were implemented in no more than 40-50 lines of code.

In my case, ActiveRecord provided exactly the right level of functionality. I can have multiple database connections established and write code to manage when each should be used. Control of which connection is used is managed at the model level. Connections cascade up inheritance chains and I can specify that one model use the connection specified on another model using a simple delegate statement:

class A < ActiveRecord::Base
end

class B < ActiveRecord::Base
  class << self
    delegate :connection, :to => A
  end
end

Changing A’s connection changes B’s without affecting the connection used by any other model. We have a simple macro (uses_connection_of) that brings this down to a one-liner for each top-level model class:

class C < ActiveRecord::Base
  uses_connection_of B
end
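For the curious, here is roughly how a macro like uses_connection_of could work. This is a sketch rather than our actual implementation: FakeBase is a made-up stand-in for ActiveRecord::Base so the example runs without Rails, and it uses define_singleton_method where the real thing uses delegate.

```ruby
# Hypothetical stand-in for ActiveRecord::Base; not a Rails class.
class FakeBase
  class << self
    attr_writer :connection

    # Mimics connections cascading up the inheritance chain: use our own
    # connection if one is set, otherwise ask the superclass.
    def connection
      return @connection if @connection
      return superclass.connection if superclass.respond_to?(:connection)
      nil
    end

    # Hypothetical macro: always answer with whatever connection `other`
    # has at the moment of the call, so later changes to `other` follow.
    def uses_connection_of(other)
      define_singleton_method(:connection) { other.connection }
    end
  end
end

class A < FakeBase
end

class B < FakeBase
  uses_connection_of A
end

class C < FakeBase
  uses_connection_of B
end

# D is unrelated and manages its own connection.
class D < FakeBase
end

A.connection = :server_one
D.connection = :shared_server
```

Because the lookup happens on every call, pointing A at a different server later moves B and C along with it while D stays where it was.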

This is only the tip of the framework level customizations we’ve made to Rails over almost two years of development. In most cases, I find the base functionality well balanced for the general (80%) case. We expect to write additional framework code when we get into special case territory, which our multi-database/schema setup clearly is, and which Twitter’s seems to also be.

When I consider what contributed to the unraveling of J2EE, one thing that stands out is that it tried to do too much. The promise was that of infinite scalability based on tooling, which assumes that designing scalable systems is a general case problem. I now firmly believe that this is flawed reasoning. Frameworks don’t solve scalability problems; design solves scalability problems.

I picked up a word from Joe a few years back and find myself using it a lot: “friction.” When referring to framework and tooling, “friction” is a (subjective) measure of how much the tooling gets in your way when trying to solve a specific-case problem. I’ve come to evaluate frameworks based on two rough metrics: how far the framework goes in solving the general case problem out of the box and how little friction the framework creates when you have to solve the specific-case problem yourself. When a framework finds a balance between these two areas, we call it “well designed.”

Measured along these lines, there are portions of Rails that have a less than perfect balance, but I don’t think multi-database connectivity is one of them. It seems to me that moving too far in one direction on this would cause lots of friction for moving in other directions. There just doesn’t seem to be a lot of general case to solve here when you dig into the details.

Bottom line for me is that Twitter’s scaling and multi-database connection issues seem to be just that: Twitter’s issues. David’s response seems to indicate that he believes Rails could probably do more here but how far could framework level support really go and how much friction would be created?