Infrastructure Twitter Sparrow tackles data storage challenges of scale

In this blog post, we use Project Sparrow, an initiative that shifted the architecture of our data pipelines from batch event processing to streaming, to illustrate the scale at which Twitter operates.

Insights Over-squashing, Bottlenecks, and Graph Ricci curvature

Over-squashing is a common plight of Graph Neural Networks. In this post, we discuss how this phenomenon can be understood and remedied through the concept of Ricci curvature.

Insights Reconsidering Tweets

In this blog, we share findings from a follow-up analysis in which we seek to understand how prompts cause people to reconsider potentially harmful or offensive content before they hit send.

Infrastructure Scaling data access by moving an exabyte of data to Google Cloud

In this blog, we discuss how we approached migrating an exabyte of data to Google Cloud to make it easier for our Tweeps to analyze and visualize data.

Infrastructure Powering real-time data analytics with Druid at Twitter

In this blog, we share an overview of Twitter’s Druid ecosystem and discuss our work towards a unified ingestion experience.

Insights Understanding Twitter conversations: A Wordle case study

On the Data Science team, part of our job is understanding how conversations happen on Twitter. In this blog, we use Wordle as a case study to show how we approach analyses of this kind.

Insights Graph machine learning with missing node features

This blog post discusses how feature propagation can be an efficient and scalable approach for handling missing features in graph machine learning applications.

Infrastructure Data transfer in Manhattan using RocksDB

In this blog post, we talk about a performance and stability problem we encountered while migrating Manhattan's storage engine to RocksDB and how we solved it.

Insights Next generation data insights using natural language queries

In this blog, we share details about how we built an in-house product called Qurious, which allows our internal customers to get answers to their analytical queries through natural language questions.

Open source Introducing a new Swift package for Apache Thrift

Twitter’s new Thrift library is an open-source, standalone, lightweight data encoding library. In this blog, we share the library so that iOS developers outside Twitter can start using Thrift data.
