Improving performance on twitter.com

By
Tuesday, 29 May 2012

To connect you to information in real time, it’s important for Twitter to be fast. That’s why we’ve been reviewing our entire technology stack to optimize for speed.

When we shipped #NewTwitter in September 2010, we built it around a web application architecture that pushed all of the UI rendering and logic to JavaScript running on our users’ browsers and consumed the Twitter REST API directly, in a similar way to our mobile clients. That architecture broke new ground by offering a number of advantages over a more traditional approach, but it lacked support for various optimizations available only on the server.

To improve the twitter.com experience for everyone, we’ve been working to take back control of our front-end performance by moving the rendering to the server. This has allowed us to drop our initial page load times to 1/5th of what they were previously and reduce differences in performance across browsers.

On top of the rendered pages, we asynchronously bootstrap a new modular JavaScript application to provide the fully-featured interactive experience our users expect. This new framework will help us rapidly develop new Twitter features, take advantage of new browser technology, and ultimately provide the best experience to as many people as possible.

This week, we rolled out the re-architected version of one of our most visited pages, the Tweet permalink page. We’ll continue to roll out this new framework to the rest of the site in the coming weeks, so we’d like to take you on a tour of some of the improvements.

No more #!

The first thing that you might notice is that permalink URLs are now simpler: they no longer use the hashbang (#!). While hashbang-style URLs have a handful of limitations, our primary reason for this change is to improve initial page-load performance.

When you come to twitter.com, we want you to see content as soon as possible. With hashbang URLs, the browser needs to download an HTML page, download and execute some JavaScript, recognize the hashbang path (which is only visible to the browser), then fetch and render the content for that URL. By removing the need to handle routing on the client, we remove many of these steps and reduce the time it takes for you to find out what’s happening on twitter.com.

Reducing time to first tweet

Before starting any of this work we added instrumentation to find the performance pain points and identify which categories of users we could serve better. The most important metric we used was “time to first Tweet”. This is a measurement we took from a sample of users, (using the Navigation Timing API) of the amount of time it takes from navigation (clicking the link) to viewing the first Tweet on each page’s timeline. The metric gives us a good idea of how snappy the site feels.

Looking at the components that make up this measurement, we discovered that the raw parsing and execution of JavaScript caused massive outliers in perceived rendering speed. In our fully client-side architecture, you don’t see anything until our JavaScript is downloaded and executed. The problem is further exacerbated if you do not have a high-specification machine or if you’re running an older browser. The bottom line is that a client-side architecture leads to slower performance because most of the code is being executed on our users’ machines rather than our own.

There are a variety of options for improving the performance of our JavaScript, but we wanted to do even better. We took the execution of JavaScript completely out of our render path. By rendering our page content on the server and deferring all JavaScript execution until well after that content has been rendered, we’ve dropped the time to first Tweet to one-fifth of what it was.

Loading only what we need

Now that we’re delivering page content faster, the next step is to ensure that our JavaScript is loaded and the application is interactive as soon as possible. To do that, we need to minimize the amount of JavaScript we use: smaller payload over the wire, fewer lines of code to parse, faster to execute. To make sure we only download the JavaScript necessary for the page to work, we needed to get a firm grip on our dependencies.

To do this, we opted to arrange all our code as CommonJS modules, delivered via AMD. This means that each piece of our code explicitly declares what it needs to execute which, firstly, is a win for developer productivity. When working on any one module, we can easily understand what dependencies it relies on, rather than the typical browser JavaScript situation in which code depends on an implicit load order and globally accessible properties.

Modules let us separate the loading and the evaluation of our code. This means that we can bundle our code in the most efficient manner for delivery and leave the evaluation order up to the dependency loader. We can tune how we bundle our code, lazily load parts of it, download pieces in parallel, separate it into any number of files, and more — all without the author of the code having to know or care about this. Our JavaScript bundles are built programmatically by a tool, similar to the RequireJS optimizer, that crawls each file to build a dependency tree. This dependency tree lets us design how we bundle our code, and rather than downloading the kitchen sink every time, we only download the code we need — and then only execute that code when required by the application.

What’s next?

We’re currently rolling out this new architecture across the site. Once our pages are running on this new foundation, we will do more to further improve performance. For example, we will implement the History API to allow partial page reloads in browsers that support it, and begin to overhaul the server side of the application.

If you want to know more about these changes, come and see us at the Fluent Conference next week. We’ll speak about the details behind our rebuild of twitter.com and host a JavaScript Happy Hour at Twitter HQ on May 31.



-Dan Webb, Engineering Manager, Web Core team (@danwrong)