IBM Poopheads: "LAMP Users Need to Grow Up"

May 27, 2005

Let’s do it:

According to Daniel Sabbah, general manager of IBM’s Rational division, LAMP – the popular Web development stack – works well for basic applications but lacks the ability to scale.

Nope. We call bullshit. After wasting years of our lives trying to implement physical three tier architectures that “scale” and failing miserably time after time, we’re going with something that actually works.

“If you look at the history of LAMP development, they’re really primitive tools … the so-called good enough model. The type of businesses being created around those particular business models are essentially going to have to grow up at some point.”

No. The LAMP stack is a properly constructed piece of software. Features are added when an actual person has an actual need that arises in the actual field, not when some group of highly qualified architecture astronauts and marketing splash-seekers get together to compete for who can come up with the most grown-up piece of useless new crap to throw in the product.

The LAMP model works because it was built to work for and by people building real stuff. The big vendor / big tools model failed because it was built to work for Gartner, Forrester, and Upper Management whose idea of “work” turned out to be completely wrong.

Now you’re saying that the primitive yet successful LAMP model should adopt the traits of the sophisticated yet failing big vendor model.

“I believe that in the same way that some of those simple solutions are good enough to start with, eventually, they are going to have to come up against scalability,” Sabbah said during a press conference at the IBM Rational User Conference in Las Vegas.

We can’t scale? Really? Are you insane?

Alright, that last jab may have been a bit unfair. I think what Sabbah is really talking about is PHP. I can’t be sure but none of ~~Yahoo!~~, Amazon, eBay, or Google seem to be using PHP widely on their public sites. But then again, they aren’t using WebSphere/J2EE, .NET, or other scalable physical three tier architectures either.

UPDATE: See comments for interesting notes on PHP usage at Yahoo!.

While we’re talking about architectures, I’d like to jump into a brief commentary on what’s really at the root of the debate here.

There ~~are~~ were two widely accepted but competing general web systems architectures: the Physical Three Tier Architecture and the Logical Three Tier Architecture. IBM (and all the other big tool vendors) have been championing one of them and LAMP is a good framework for the other (although you’ll rarely hear anyone admit that LAMP provides an overall architecture).

The Physical Three Tier Architecture

Many large enterprise web applications tried really hard to implement a Physical Three Tier Architecture, or they did in the beginning. The idea is that you have a physical presentation tier (usually JSP, ASP, or some other *SP) that talks to a physical app tier via some form of remote method invocation (usually EJB/RMI, CORBA, DCOM) that talks to a physical database tier (usually Oracle, DB2, MS-SQL Server). The proposed benefit of this approach is that you can scale out (i.e. add more boxes) to any of the physical tiers as needed.

Great, right? Well, no. It turns out this is a horrible, horrible, horrible way of building large applications and no one has ever actually implemented it successfully. If anyone has implemented it successfully, they immediately shat their pants when they realized how much surface area and how many moving parts they would then be keeping an eye on.

The main problem with this architecture is the physical app box in the middle. We call it the remote object circle of hell. This is where the tool vendors solve all kinds of interesting “what if” type problems using extremely sophisticated techniques, which introduce one thousand actual real world problems, which the tool vendors happily solve, which introduces one thousand more real problems, ad infinitum…

It’s hard to develop, deploy, test, maintain, evolve; it eats souls, kills kittens, and hates freedom and democracy.

Over the past two years, every enterprise developer on the planet has been scurrying to move away from this architecture. This can be witnessed most clearly in the Java community by observing the absolute failure of EJB and the rise of lightweight frameworks like Hibernate, Spring, WebWork, Struts, etc. This has been a bottom up movement by pissed off developers in retaliation against the crap that was pushed on them by the sophisticated tool vendors early in the century.

Which brings us nicely to an architecture that actually works sometimes and loves freedom.

The Logical Three Tier Architecture

More specifically, the Shared Nothing variant of the Logical Three Tier Architecture says that the simplest and best way to build large web-based systems that scale (and this includes enterprise systems goddammit) is to first bring the presentation and app code together into a single physical tier. This avoids remote object hell because the presentation code and the business logic / domain code are close to each other.

But the most important aspect of this approach is that you want to push all state down to the database. Each request that comes into the presentation + app tier results in loading state for a set of objects from the database, operating on them, pushing their state back down into the database (if needed), writing the response, and then getting the hell out of there (i.e. releasing all references to objects loaded for this request, leaving them for gc).

That’s the rule.
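Here is a minimal sketch of what that request cycle can look like, using Python and SQLite purely for illustration. The `accounts` table, `init_db`, and `handle_request` are hypothetical names made up for this example, not part of any particular framework, and a real app would wrap the read and the write in a proper transaction.

```python
import sqlite3


def init_db(db_path):
    """Create a tiny hypothetical schema so the sketch can run on its own."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS accounts "
            "(id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        conn.execute("INSERT OR REPLACE INTO accounts (id, balance) VALUES (1, 100)")
        conn.commit()
    finally:
        conn.close()


def handle_request(db_path, account_id, amount):
    """One request: load state, operate on it, push it back, respond, forget."""
    conn = sqlite3.connect(db_path)
    try:
        # 1. Load the state this request needs from the database.
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            return "404 no such account"

        # 2. Operate on it (the "business logic" lives right here, in-process).
        balance = row[0] + amount

        # 3. Push the new state back down into the database.
        conn.execute(
            "UPDATE accounts SET balance = ? WHERE id = ?", (balance, account_id)
        )
        conn.commit()

        # 4. Write the response.
        return f"200 balance={balance}"
    finally:
        # 5. Get out: nothing from this request survives in the app tier.
        conn.close()
```

Note that nothing is kept in module-level variables between calls. Any box running this code can serve any request, which is what lets you add boxes to this tier without coordinating them.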

So the physical database tier and the physical presentation + app tier make up our logical three tier architecture but I’d like to talk about one other latch-on piece of this setup because it’s interesting to contrast it with how the Physical Three Tier purists deal with the same problem.

Fine Grained Caching

Some mechanism for caching becomes really important when you decide that you are spending too much money on hardware (note that both of these architectures will scale up and out, on each physical tier independently, for as far and wide as you can pay for hardware). Adding some form of caching reduces the amount of hardware needed dramatically because you’ve reduced utilization somewhere.

In the physical three tier architecture, there are generally a lot of sophisticated mechanisms for caching and sharing object state at a very granular level in the app tier to reduce utilization on the database and improve response time. This is cool and all but it increases utilization on the app tier dramatically because so much time is now spent managing this new state.

The introduction of state (even just a little state for caching objects) forces the app tier to take on a lot of the traits of the database. You have to worry about object consistency and be fairly aware of transactions. When that’s not fast enough what ends up happening is that more fine grained caching is added at the presentation tier to reduce round trips with the app tier.

Now you have three places that are maintaining pretty much the same state and that means you have three manageability problems. But this is, you know, cool because it’s really complex and sophisticated and the whiteboard looks interesting and there’s lots of arm waving now.
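To make the consistency problem concrete, here is a toy sketch. `AppNode` is a hypothetical class invented for this example, and a plain dict stands in for the database. Each node keeps its own fine grained object cache, and nothing tells one node that another has written.

```python
class AppNode:
    """A hypothetical app tier box with its own per-object cache."""

    def __init__(self, database):
        self._db = database   # shared "database" (a dict, for illustration)
        self._objects = {}    # this node's private object cache

    def read(self, key):
        # Serve from the local cache when possible, else go to the database.
        if key not in self._objects:
            self._objects[key] = self._db[key]
        return self._objects[key]

    def write(self, key, value):
        # Update the local cache and the database, but no other node's cache.
        self._objects[key] = value
        self._db[key] = value


database = {"price": 10}
node_a = AppNode(database)
node_b = AppNode(database)

node_b.read("price")           # node B caches the current value
node_a.write("price", 12)      # node A changes it
stale = node_b.read("price")   # node B still answers from its old copy
```

Fixing `stale` means adding invalidation messages, versioning, or locking between nodes, which is exactly the database-like machinery described above.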

Screw Fine Grained Caching

Shared Nothing says, screw that - the database is the only thing managing fine grained state because that’s its job, and then throws up caching HTTP proxy server(s) in a separate (and optional) top physical tier. Cached state is maintained on a much simpler, coarse grained level with relatively simple rules for invalidation and other issues.

When the Shared Nothing cache hits, it provides unmatched performance because the response is ready to go immediately without having to consult the lower tiers at all. When it misses, it misses worse than the fine grained approach because chances are good you’ll be going all the way to the database and back. But it turns out that it usually doesn’t matter. My experience says that you get as good or better performance with the coarse grained approach as you do with the fine grained approach for much less cost, although it’s hard to measure because the savings are distributed in very different ways.
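A rough sketch of the coarse grained idea, with the caveat that a real deployment would put an off-the-shelf caching HTTP proxy in front rather than hand-rolling one. `CachingProxy` and the `origin` callable are hypothetical stand-ins for the proxy tier and the presentation + app tier. The cache holds whole responses keyed by URL, and the only rules are a time-to-live and "a write to a URL drops its cached copy".

```python
import time


class CachingProxy:
    """Toy whole-response cache keyed by URL (illustration only)."""

    def __init__(self, origin, ttl_seconds=60.0, clock=time.monotonic):
        self._origin = origin      # callable(url, body=None) -> response
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry is easy to test
        self._cache = {}           # url -> (expires_at, response)
        self.hits = 0
        self.misses = 0

    def get(self, url):
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and entry[0] > now:
            # Hit: answer immediately, the lower tiers never see the request.
            self.hits += 1
            return entry[1]
        # Miss: go all the way down to the origin (and its database) and back.
        self.misses += 1
        response = self._origin(url)
        self._cache[url] = (now + self._ttl, response)
        return response

    def post(self, url, body):
        # Writes always pass through, and invalidate the cached page.
        self._cache.pop(url, None)
        return self._origin(url, body)
```

The whole invalidation story fits in two lines of `post`, and the app tier behind it stays stateless.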

The Shared Nothing + Caching Proxy setup scales like mad and I don’t just mean that it scales to really massive user populations. It scales low too. It’s easy to work with when you’re developing and testing on a single machine. It’s easy to have a simple prototype app done in a day. It’s easy to teach someone enough that they can go play and figure stuff out as they go. It’s easy to write tests because the entire system is bounded by the request and there’s no weird magic going on in the background.

The big vendor / big tool architectures sacrificed simplicity and the ability to scale low because they decided that every application was going to have one million users and require five 9’s from the first day of development.

As I write this, Bill de hÓra postulates: All successful large systems were successful small systems. I believe him and what that means to us right now in this article is that it is exceedingly hard to build large systems with the big vendor / big tool approach because it is exceedingly hard to build small systems with the same.

Let’s get back to the woodshed

While Sabbah was critical of LAMP’s capabilities, he said IBM is going to ensure companies which started with that model will be able to “grow and change with the rest of the world”.

He believes most businesses want technology that is stable, evolutionary, historical and has support.

L A M P = (S)table (E)volutionary (H)istorical (S)upport

“What we are trying to do is make sure businesses who start there [with LAMP] have a model, to not only start there but evolve into more complex situations in order to actually be able to grow,” he said.

This is where I really wanted to jump in because I think this mentality is holding back adoption of very simple yet extremely capable technology based purely on poor reasoning. This view of systems design says that complexity is required if growth is desirable and that complex situations can only be the result of complex systems.

There’s a guy who just spent 50 years or something locked in a room writing a 1200 page book proving that this is just wrong. It would appear that there is very little relationship between the complexity of a program and the complexity of the situation it produces.

The complexity for complexity mindset is the bane of a few potentially great technologies right now:

Static vs. Dynamic Languages
J2EE vs. LAMP
WS-* vs. HTTP

I like to complain when someone calls Python a scripting language because the connotation is that it is simple. But it is simple, right? So there shouldn’t be any complaining. I’m not objecting to someone calling Python simple, I’m objecting to then saying that because it is simple, it must only be capable of simple things.

The Need For Complex Systems

“You’ve seen us do a lot with PHP and Zend and you’ll see us do more. I can’t say more. It [PHP] needs to integrate with enterprise assets but it needs to remember where it came from and where its value was. It can’t start getting too complex as it will lose its audience,” Sabbah said.

The need for complex systems in the enterprise was and still is greatly overestimated. The trick isn’t to make PHP more complex, it’s to make the enterprise less complex. You need to equate complex requirements with complex systems less and start asking “do we really need this?” more.

The funny thing about all this is that my opinion on this matter has formed largely based on concepts that you guys told me, so I’m sure you’ll pull through on this one.