Eric Wong’s mostly pure-Ruby HTTP backend, Unicorn, is an inspiration. I’ve studied this file for a couple of days now and it’s undoubtedly one of the best, most densely packed examples of Unix programming in Ruby I’ve come across.
Unicorn is basically Mongrel (including the fast Ragel/C HTTP parser), minus the threads, and with teh Unix turned up to 11. That means processes. And all the tricks and idioms required to use them reliably.
We’re going to get into how Unicorn uses the OS kernel to balance
connections between backend processes using a shared socket,
fork(2), and accept(2) — the basic Unix prefork model in
100% pure Ruby.
But first …
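For orientation, here is a minimal sketch of that prefork model in plain Ruby. It is not Unicorn's code: the method names `prefork_echo` and `echo_once`, the worker count, and the echo protocol are all made up for illustration. The parent opens one listening socket, forks a few workers that inherit it, and every worker blocks in accept(2) so the kernel decides which one gets each connection.

```ruby
require 'socket'

# Fork `count` workers that all block in accept(2) on one inherited socket.
# Returns the worker pids. The method name is made up for this sketch.
def prefork_echo(server, count)
  Array.new(count) do
    fork do
      trap('TERM') { exit!(0) }          # die quietly when the parent says so
      loop do
        client = server.accept            # the kernel hands each connection to one worker
        client.write("#{Process.pid}: #{client.gets}")
        client.close
      end
    end
  end
end

# Tiny client: send one line, return the echoed line.
def echo_once(port, line)
  TCPSocket.open('127.0.0.1', port) do |sock|
    sock.puts(line)
    sock.gets
  end
end

server = TCPServer.new('127.0.0.1', 0)    # port 0: let the kernel pick a free port
workers = prefork_echo(server, 3)
4.times { puts echo_once(server.addr[1], 'hello') }
workers.each { |pid| Process.kill('TERM', pid) }
Process.waitall
```

Each reply is prefixed with the pid of whichever worker accepted the connection, so running it a few times shows the kernel spreading connections around with no balancing code in the parent.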
A gentle introduction to the world of UNIX IPC. Covers fork, signals, pipes, FIFOs, file locking, POSIX message queues, semaphores, shared memory segments, memory mapped files, UNIX sockets. Not a ton of depth, but that’s okay – you can read all of it in about 15 minutes and have a good feel for the pros and cons of all the different types of IPC.
Check out Beej’s Guide to Network Programming and Beej’s Quick Guide to GDB too.
The technique in a nutshell:
The basic idea of what’s going to happen is that we will create a pair of pipes and then fork(). The child process will hold the pipe that does the writing and the parent the one that does the reading. Now, the parent will exec. This is a bit odd. Normally when you fork, then exec, it’s the child process which does the exec. However, here we really want the new version of the program to have access to all of the old file descriptors. Luckily, execl preserves these. As an added benefit, the program gets the exact same process ID.
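A rough Ruby rendering of that hand-off, as a sketch only. The names `handoff_and_exec`, `run_handoff`, and `NEW_PROGRAM` are hypothetical, and the "state" passed through the pipe is just a string. One Ruby-specific wrinkle: Ruby marks descriptors close-on-exec by default, so unlike the C version the pipe has to be passed through `exec` explicitly with a redirection option. The outer harness exists only so the demo script itself survives the exec.

```ruby
require 'rbconfig'

# Stand-in for the "new version" of the program. It finds the inherited
# read end of the pipe by descriptor number and drains it.
NEW_PROGRAM = <<~'RUBY'
  rd = IO.for_fd(Integer(ARGV[0]))
  puts "pid=#{Process.pid} got=#{rd.read}"
RUBY

# Runs inside the process that is about to be replaced. Never returns.
def handoff_and_exec
  rd, wr = IO.pipe
  old_pid = Process.pid
  if fork
    # Parent: keep the read end and exec. The pid stays the same.
    wr.close
    # Descriptors are close-on-exec by default in Ruby, so pass the pipe along.
    exec(RbConfig.ruby, '-e', NEW_PROGRAM, rd.fileno.to_s, rd.fileno => rd)
  else
    # Child: keep the write end and send over whatever the new program needs.
    rd.close
    wr.write("state from #{old_pid}")
    wr.close
    exit!(0)
  end
end

# Harness (an addition for the demo): do the hand-off in a subprocess and
# capture what the exec'd program prints.
def run_handoff
  out_rd, out_wr = IO.pipe
  pid = fork do
    out_rd.close
    $stdout.reopen(out_wr)
    handoff_and_exec
  end
  out_wr.close
  output = out_rd.read
  Process.wait(pid)
  [pid, output]
end

pid, output = run_handoff
puts "forked #{pid}, new program said: #{output}"
```

The pid printed by the new program matches the pid of the process that called `exec`, which is the "exact same process ID" benefit from the quote.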
Boom. Nice.
Insanely useful when you’re trying to avoid thread and process synchronization primitives — mutexes, flock, etc. — in concurrent code, which should basically be always. Rack::Cache’s file stores use some of these techniques to allow multiple backends to work against the same filesystem without file locks or a separate central writing process.
Jeremy Zawodny takes a look at the * is Unix thing and throws in some additional goodness: more on fork(2), the benefits of copy-on-write, and atomic file operations.
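One common atomic file operation is worth a sketch: write to a private temp file, then rename(2) it over the target. This is an illustration under assumptions, not code from any of the linked pieces or from Rack::Cache. The helper name `atomic_write` and the temp file naming scheme are made up. Because rename(2) replaces the target in a single step on the same filesystem, readers see either the old file or the new one, never a half-written mix, and no lock is needed.

```ruby
require 'tmpdir'

# Write data to a private temp file, then rename(2) it over the target.
def atomic_write(path, data)
  tmp = "#{path}.#{Process.pid}.tmp"   # hypothetical naming scheme; same directory, same filesystem
  File.open(tmp, 'w') do |f|
    f.write(data)
    f.fsync                            # push the bytes to disk before publishing the file
  end
  File.rename(tmp, path)               # atomic replace
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'entry')
  payloads = %w[a b c].map { |c| c * 100_000 }

  # Three processes race to write the same path with no locking at all.
  payloads.each do |data|
    fork do
      atomic_write(path, data)
      exit!(0)
    end
  end
  Process.waitall

  # Whoever renamed last wins, but the file is always one complete payload.
  content = File.read(path)
  puts "complete payload: #{payloads.include?(content)}"
end
```

Which writer wins is not defined. The point is only that the loser's data never gets interleaved with the winner's.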
@paulsmith’s simple preforking echo server in C.
Aristotle Pagaltzis comes through with the simple preforking echo server in Perl.
Jacob Kaplan-Moss does the prefork echo server example from my Unicorn is Unix piece in Python. Awesome. Let’s see some more of these. Where you at, Perl?
It’s important to understand how fork(2), pipe(2), and exec(2) work. I don’t want to hear any more of this “fork is a hack” shit from any of you :)
Sometimes! Or, fork(2) is a very fast operation on legitimate operating systems. I didn’t realize it could be as fast as spawning a thread, though.
Brilliant!