The Twitter Engineering Blog

Information from Twitter's engineering team about our technology, tools and events.

Posts from Engineering: research

Evaluating language identification performance

We annotated nearly 200k Tweets from 2014 with their language, covering 68 languages, and selected them carefully so that recall and precision can be measured reliably. We use this data to evaluate and improve our language identification performance. You can download all the annotated Tweets.
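As a rough illustration of the two metrics mentioned above, here is a minimal sketch of per-language precision and recall computed from gold annotations and classifier output. The function name `precision_recall` and the toy labels are assumptions made for this example; they are not taken from the released dataset or from any Twitter tooling.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Return {language: (precision, recall)} for two parallel label lists."""
    gold_count = Counter(gold)        # how often each language is the true label
    pred_count = Counter(predicted)   # how often each language was predicted
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)

    scores = {}
    for lang in set(gold_count) | set(pred_count):
        precision = true_pos[lang] / pred_count[lang] if pred_count[lang] else 0.0
        recall = true_pos[lang] / gold_count[lang] if gold_count[lang] else 0.0
        scores[lang] = (precision, recall)
    return scores


# Hypothetical toy data: five Tweets with gold and predicted language codes.
gold = ["en", "en", "ja", "es", "ja"]
predicted = ["en", "es", "ja", "es", "ja"]
scores = precision_recall(gold, predicted)
```

Precision answers "of the Tweets labeled as language X, how many really are X?", while recall answers "of the Tweets that really are X, how many were labeled X?". An evaluation set has to contain enough examples of each language for both numbers to be meaningful.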

Investing in MIT’s new Laboratory for Social Machines

Today, @MIT announced the creation of the Laboratory for Social Machines, funded by a

All-pairs similarity via DIMSUM

Given a dataset of sparse vector data, we solve the problem of finding all similar vector pairs according to a similarity function.
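To make the problem statement concrete, the following is a minimal brute-force sketch that compares every pair of sparse vectors with cosine similarity. It is a baseline for illustration only, not the DIMSUM algorithm itself; the dictionary-based vector representation, the function names, and the threshold are assumptions chosen for this example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {dimension: weight}."""
    dot = sum(weight * v[dim] for dim, weight in u.items() if dim in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_pairs(vectors, threshold):
    """Return all pairs of vector ids whose cosine similarity meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(vectors), 2)
        if cosine(vectors[a], vectors[b]) >= threshold
    ]


# Hypothetical toy dataset: "a" and "b" overlap fully, "c" shares nothing.
vectors = {
    "a": {"x": 1.0, "y": 1.0},
    "b": {"x": 1.0, "y": 1.0},
    "c": {"z": 1.0},
}
pairs = similar_pairs(vectors, threshold=0.9)
```

The number of comparisons grows quadratically with the number of vectors, which is why a brute-force approach like this becomes impractical on large datasets.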

Using Twitter to measure earthquake impact in almost real time

With the help of researchers at Stanford, we used geo-tagged Tweets from around the world to measure earthquake impact in almost real time.

Twitter #DataGrants selections

Learn more about the six institutions we’ve selected to receive Twitter #DataGrants.

Introducing Twitter Data Grants

Today we’re introducing a pilot project we’re calling Twitter Data Grants, through which we’ll give a handful of research institutions access to our public and historical data.

New Twitter search results

We just shipped a new version of the Twitter app with a brand new search experience that blends the most relevant content (Tweets, user accounts, images, news, related searches, and more) into a single stream of results. This is a major shift from how we have previously partitioned results by type (for instance, Tweet search vs. people search). We think this simplified experience makes it easier to find great content on Twitter using your mobile device.

Dimension Independent Similarity Computation (DISCO)

MapReduce is a programming model for processing large data sets, typically used to do distributed computing on clusters of commodity computers. With a large amount of processing power at hand, it’s very tempting to solve problems by brute force. However, we often combine clever sampling techniques with the power of MapReduce to extend its utility.
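The general idea of pairing sampling with a map and a reduce step can be sketched in a single process. The example below estimates how often two items co-occur by emitting each pair only with probability `p` and rescaling the summed counts by `1 / p`. This uses plain uniform sampling as a stand-in and is not the DISCO sampling scheme; all names and the toy records are assumptions for illustration.

```python
import random
from collections import defaultdict
from itertools import combinations


def map_record(items, p, rng):
    """Mapper: emit each co-occurring pair in one record with probability p."""
    for pair in combinations(sorted(set(items)), 2):
        if rng.random() < p:
            yield pair, 1


def reduce_counts(emitted, p):
    """Reducer: sum the emitted counts per pair and rescale by 1 / p."""
    totals = defaultdict(int)
    for key, value in emitted:
        totals[key] += value
    return {key: total / p for key, total in totals.items()}


def estimate_cooccurrence(records, p, seed=0):
    """Run the map and reduce steps in one process over all records."""
    rng = random.Random(seed)
    emitted = [kv for items in records for kv in map_record(items, p, rng)]
    return reduce_counts(emitted, p)


# Hypothetical toy records; with p=1.0 nothing is dropped and counts are exact.
records = [["a", "b"], ["a", "b", "c"], ["b", "c"]]
exact = estimate_cooccurrence(records, p=1.0)
```

Lowering `p` reduces the number of pairs the mappers emit, trading exactness for less data shuffled to the reducers. The rescaled counts remain unbiased estimates of the true co-occurrence counts.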

Studying rapidly evolving user interests

Twitter is an amazing real-time information dissemination platform. We’ve seen events of historical importance such as the Arab Spring unfold via Tweets. We even know that Twitter is faster than earthquakes! However, can we more scientifically characterize the real-time nature of Twitter?
