Timeboxing

Thursday, 1 April 2010

When you build a service that talks to external components, you have to worry about the amount of time that a network call will take. The standard technique for protecting yourself is to use timeouts.

Most network libraries let you set timeouts, but there are few tools to help you with local computation.

Timeboxing is the name we’ve given to a technique for setting timeouts on local computation. The name is borrowed from the handy organizational technique.

Let’s say you have a method that can take an unbounded amount of time to complete. Normally it’s fast but sometimes it’s horribly slow. If you want to ensure that the work doesn’t take too long, you can box the amount of time it will be allowed to take before it’s aborted.

One implementation we use for this in Scala is built on Futures, a Java concurrency feature. A Future lets you compute in a separate thread while your current thread does whatever else it needs to do. When you need the result of the Future, you call its get() method, which blocks until the computation is complete. The trick we use is that you don’t need to do other work in the meantime: you can call get() immediately with a timeout value. If the result isn’t ready when the timeout expires, get() throws a TimeoutException.

Here’s an example (in Scala):

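import java.util.concurrent.{Callable, Executors, Future, TimeUnit, TimeoutException}

val defaultValue = "Not Found"
val executor = Executors.newFixedThreadPool(10)
val future: Future[String] = executor.submit(new Callable[String]() {
  def call(): String = {
    // There's a small chance that this will take longer than you're willing to wait.
    sometimesSlow()
  }
})
val result = try {
  // Wait at most 100 ms for the worker thread to finish.
  future.get(100L, TimeUnit.MILLISECONDS)
} catch {
  case e: TimeoutException =>
    // Timing out does not stop the work by itself. cancel(true) interrupts the
    // worker, which only helps if sometimesSlow() responds to interruption.
    future.cancel(true)
    defaultValue
}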

If returning a default value isn’t appropriate, you can report an error to the user or handle it in some other custom fashion. It entirely depends on the task. We measure and manage these slow computations with Ostrich, our performance statistics library. Code that frequently times out is a candidate for algorithmic improvement.
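
As a rough sketch of how the pattern might be packaged for reuse, the helper below returns an Option so that each caller can decide between a default value and an error. The names Timebox and timeouts are hypothetical, and the AtomicLong is only a stand-in for a real statistics counter; it is not Ostrich's API.

```scala
import java.util.concurrent.{Callable, ExecutorService, Executors, TimeUnit, TimeoutException}
import java.util.concurrent.atomic.AtomicLong

object Timebox {
  // Hypothetical stand-in for a real stats counter; this is not Ostrich's API.
  val timeouts = new AtomicLong(0)

  // Runs `work` on `executor` and waits at most `millis` for the result.
  // Returns Some(result) if it finished in time, None if the deadline passed.
  // Exceptions thrown by `work` surface as an ExecutionException from get().
  def apply[T](executor: ExecutorService, millis: Long)(work: => T): Option[T] = {
    val future = executor.submit(new Callable[T] {
      def call(): T = work
    })
    try {
      Some(future.get(millis, TimeUnit.MILLISECONDS))
    } catch {
      case _: TimeoutException =>
        timeouts.incrementAndGet()
        // cancel(true) only interrupts the worker thread; work that ignores
        // interruption keeps running and keeps occupying a pool thread.
        future.cancel(true)
        None
    }
  }
}

object TimeboxExample {
  def main(args: Array[String]): Unit = {
    val executor = Executors.newFixedThreadPool(2)
    try {
      val fast = Timebox(executor, 1000L) { "found" }
      val slow = Timebox(executor, 50L) { Thread.sleep(5000L); "too late" }
      println(fast.getOrElse("Not Found"))
      println(slow.getOrElse("Not Found"))
      println(Timebox.timeouts.get)
    } finally {
      executor.shutdownNow()
    }
  }
}
```

In this sketch the first call finishes well inside its box and the second is cut off after 50 ms, so the caller falls back to the default and the counter records one timeout. A counter like this is one way to spot the code that frequently times out.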

Even though we’ve described it as a technique for protecting you from runaway local computation, timeboxing can also help with network calls that don’t support timeouts, such as DNS resolution.
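
For instance, java.net.InetAddress.getByName takes no timeout argument, so a lookup could be timeboxed along these lines. This is a minimal sketch, not our production code: TimeboxedDns and resolve are hypothetical names, and the timeout used in the usage line below is an arbitrary illustration.

```scala
import java.net.InetAddress
import java.util.concurrent.{Callable, ExecutorService, Executors, TimeUnit, TimeoutException}

object TimeboxedDns {
  // Hypothetical helper: resolves `host`, giving up after `millis`.
  // Returns None on timeout. A failed lookup (UnknownHostException) is not
  // treated as a timeout; it surfaces as an ExecutionException from get().
  def resolve(executor: ExecutorService, host: String, millis: Long): Option[InetAddress] = {
    val future = executor.submit(new Callable[InetAddress] {
      def call(): InetAddress = InetAddress.getByName(host)
    })
    try {
      Some(future.get(millis, TimeUnit.MILLISECONDS))
    } catch {
      case _: TimeoutException =>
        // The lookup may not respond to interruption, so the worker thread
        // can stay busy until the underlying resolver returns.
        future.cancel(true)
        None
    }
  }
}
```

A caller might write TimeboxedDns.resolve(executor, "localhost", 500L) and treat None as a signal to fall back or report an error. Because a stuck lookup can keep holding its worker thread, the pool should be sized with that in mind.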

Timeboxing is just one of many techniques we use to keep things humming along here at Twitter.

@stevej