Building on Open Source

Wednesday, 14 January 2009

Kestrel photo by mugley

When we plan new engineering projects at Twitter, we measure our requirements against the capabilities of open source offerings, and prefer to use open source whenever it makes sense. As a result, much of Twitter is now built on open source software.

In some cases, our requirements—in particular, the scalability requirements of our service—lead us to develop projects from the ground up. We develop these projects with an eye toward open source, and are pleased to contribute our projects back to the open source community when there is a clear benefit. Below are two such projects, Kestrel and Cache-Money. Every tweet touches one or both of these key components of the Twitter architecture.

Kestrel’s Wonderful Plumage

Kestrel is a message queue server we use to asynchronously connect many of the services and functions underlying the Twitter product. For example, when a user posts an update, any tweets destined for SMS delivery are queued in Kestrel; our SMS service then reads tweets from this queue and communicates with the SMS carriers for delivery to phones. This implementation isolates the behavior of SMS delivery from the behavior of the rest of our system, making SMS delivery easier to operate, maintain, and scale independently.
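To make the decoupling concrete, here is a minimal sketch in Python. It is not Kestrel and does not use Kestrel's API; it assumes an in-process queue as a stand-in for the queue server, and the names post_update, sms_worker, and delivered are hypothetical, chosen only for illustration.

```python
import queue
import threading

# In-process stand-in for a Kestrel queue (assumption: the real system
# talks to a separate queue server rather than sharing memory).
sms_queue = queue.Queue()
delivered = []  # records what the hypothetical SMS service "sent"


def post_update(user, text, wants_sms):
    """Producer side: enqueue the tweet and return without waiting."""
    if wants_sms:
        sms_queue.put((user, text))


def sms_worker():
    """Consumer side: drains the queue at its own pace."""
    while True:
        item = sms_queue.get()
        if item is None:  # sentinel value: stop the worker
            sms_queue.task_done()
            break
        user, text = item
        delivered.append(f"SMS for followers of {user}: {text}")
        sms_queue.task_done()


worker = threading.Thread(target=sms_worker)
worker.start()

post_update("alice", "hello world", wants_sms=True)
post_update("bob", "web only", wants_sms=False)

sms_queue.put(None)  # ask the worker to shut down
sms_queue.join()     # wait until every queued item has been processed
worker.join()
```

The producer never calls the SMS code directly, so a slow or failing carrier connection would only cause the queue to grow; it would not block the code path that accepts updates.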

Users of the Starling message queue server will find Kestrel familiar, as Kestrel is a port of Starling from Ruby to Scala. In addition to being generally more efficient, Kestrel adds several new features, such as a facility for handling significantly bursty queues.

Robey is the lead developer of Kestrel. You can read his lively journal entry on Kestrel’s latest features. Kestrel is available on GitHub.

As Good as Cache-Money

Cache-Money is an elegant write-through caching plugin for Ruby on Rails. In write-through caching, new or updated data is first written to an efficient cache (such as memcached) and then stored in a database; subsequent requests for this data are then likely to read the data from the faster cache, rather than from the slower database. In addition to the efficiency gains associated with caching, this technique also addresses the risk of short-term replication lag between master and slave databases since data written during the lag time will likely be present in the cache. Cache-Money plugs directly into Rails’s ActiveRecord to transparently provide this functionality.
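The write-through pattern and its effect on replication lag can be sketched in a few lines of Python. This is an illustration of the general technique only, not Cache-Money's implementation or API; the class WriteThroughStore and its methods are hypothetical, and plain dictionaries stand in for memcached and the master and replica databases.

```python
class WriteThroughStore:
    """Toy write-through cache in front of a master/replica database pair."""

    def __init__(self):
        self.cache = {}    # stand-in for memcached
        self.master = {}   # stand-in for the master database
        self.replica = {}  # stand-in for a lagging read replica

    def write(self, key, value):
        # Write-through: update the cache first, then the master database.
        self.cache[key] = value
        self.master[key] = value

    def replicate(self):
        # Simulates replication catching up after some delay.
        self.replica.update(self.master)

    def read(self, key):
        # Prefer the cache; fall back to the replica on a miss.
        if key in self.cache:
            return self.cache[key]
        value = self.replica.get(key)
        if value is not None:
            self.cache[key] = value  # repopulate the cache for later reads
        return value


store = WriteThroughStore()
store.write("tweet:1", "hello world")

stale = store.replica.get("tweet:1")  # replica has not caught up yet
fresh = store.read("tweet:1")         # served from the cache instead
```

Here a direct replica lookup made before replicate() runs finds nothing, while read() still returns the new value because the write already populated the cache.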

Nick is the lead developer of Cache-Money. Check out his blog for an excellent introduction. Cache-Money is available on GitHub.