Another look at MySQL at Twitter and incubating Mysos

Thursday, 16 April 2015

While we’re at the Percona Live MySQL Conference, we’d like to discuss updates on how Twitter uses MySQL, as well as share our plans to open source Mysos, a new MySQL on Apache Mesos framework.

MySQL at Twitter
Since Twitter was founded, MySQL has been one of our key data storage technologies. We store data in hundreds of schemas, and our largest cluster spans thousands of nodes serving millions of queries per second. At Twitter's scale, we are pushing MySQL to its limits.

At Twitter, MySQL is used in two ways:

  • As part of data services: MySQL instances serve as the storage nodes of a distributed data store within Twitter’s own sharding framework. Here we rely on the reliability and high performance of MySQL on individual storage nodes, while the sharding framework manages the distribution and high availability of our data. Some of our biggest MySQL clusters are over a thousand servers.
  • As relational data stores: We use standard MySQL replication for fault tolerance and read scalability, serving a large volume of reads from replicated clusters. These clusters store a wide variety of data, from commerce and ads to authentication, trends, internal services and more.
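The two usage patterns above can be illustrated with a minimal sketch. It is not Twitter's sharding framework or routing code: the class `ReplicatedCluster`, the function `shard_for`, the host names and the CRC32-based hashing scheme are all hypothetical, chosen only to show the general idea of mapping keys to shards and sending writes to a master while spreading reads over replicas.

```python
import itertools
import zlib


class ReplicatedCluster:
    """One master for writes, replicas for reads (hypothetical helper)."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        # Round-robin over replicas; fall back to the master if there are none.
        self._reads = itertools.cycle(self.replicas or [master])

    def host_for(self, is_write):
        """Writes always go to the master; reads rotate across replicas."""
        return self.master if is_write else next(self._reads)


def shard_for(key, num_shards):
    """Map a key to a shard index with a stable hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


# Host names below are made up for illustration.
cluster = ReplicatedCluster("db-master", ["db-replica-1", "db-replica-2"])
write_host = cluster.host_for(is_write=True)
read_hosts = [cluster.host_for(is_write=False) for _ in range(3)]
```

In a sharded deployment each shard index would typically point at its own replicated cluster, so the two helpers compose: pick the shard first, then pick a host within it.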

As Twitter has grown as a service, so has our need to scale storage technologies like MySQL, PostgreSQL, Vertica and Manhattan in production. To make this possible, we’re actively hiring for positions including database administrators, site reliability engineers and software engineers working on distributed storage technology.

Contributing to the open source community

Twitter has benefited greatly from the MySQL community and we’ve contributed many patches back upstream. Examples of patches contributed to MySQL include:

  • Bug #75298: Purge thread should check purge_sys->state after every batch
  • Bug #74512: excessive split/merge for unique sec index
  • Bug #74511: adaptive_hash_searches_btree not updated
  • Bug #72520: os_event_wait_time_low(): wait time calculation is messed up
  • Bug #71411: buf_flush_LRU() does not return correct number in case of compressed pages

Twitter is also part of the WebScaleSQL initiative, which just won the 2015 MySQL Community Award for Corporate Contributor.

The goal of WebScaleSQL is to enable the scale-oriented members of the MySQL community to work more closely together to build and add features to MySQL that are specific to deployments in large-scale environments. We have contributed a number of patches to WebScaleSQL as well.

Introducing and incubating Mysos

In an effort to improve the scalability and management of our MySQL clusters, we’ve begun work on a new framework called Mysos. The Mysos project leverages Apache Mesos to build a scalable database service for MySQL. Mesos provides primitives that allow Mysos to reliably schedule, monitor and communicate with MySQL instances. As a storage framework, Mysos will be able to use the recently added persistent storage primitives within Mesos. Mysos dramatically simplifies the management of a MySQL cluster and is designed to offer:

  • Efficient hardware utilization through multi-tenancy (in performance-isolated containers)
  • High reliability through preserving the MySQL state during failure and automatic backup to and restore from HDFS
  • An automated self-service option for bringing up new MySQL clusters
  • High availability through automatic MySQL master failover
  • An elastic solution that allows users to easily scale a MySQL cluster up and down by changing the number of slave instances
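To make the failover and elasticity goals above more concrete, here is a minimal sketch of the kind of decisions such a scheduler has to make. It does not use the Mysos or Mesos APIs: the `Instance` type, the `log_position` field, and the rule of promoting the healthy slave with the most advanced replication log position are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool
    log_position: int  # how far this instance has applied the replication log


def elect_master(slaves):
    """Pick the healthy slave with the most advanced log position."""
    candidates = [s for s in slaves if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy slave available for promotion")
    return max(candidates, key=lambda s: s.log_position)


def scaling_delta(current_slaves, desired_slaves):
    """Positive result: slaves to launch; negative result: slaves to stop."""
    if desired_slaves < 0:
        raise ValueError("desired_slaves must be non-negative")
    return desired_slaves - current_slaves


# Instance names and positions below are made up for illustration.
slaves = [
    Instance("mysql-1", healthy=True, log_position=1040),
    Instance("mysql-2", healthy=False, log_position=1100),
    Instance("mysql-3", healthy=True, log_position=1075),
]
new_master = elect_master(slaves)
```

Preferring the most up-to-date healthy slave limits how much data can be lost on promotion; an unhealthy instance is skipped even if its log position is further ahead.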

Having initially developed Mysos, we’re now announcing our plans to open source the project and our intention to build a strong, independent open source community around Mysos. We are still in the early stages and the code isn’t meant for production usage yet. However, we are starting to seed the code and a proposal at the Apache Incubator with engineers from Twitter, Mesosphere and the Apache Mesos community. We invite anyone interested in scaling MySQL on Mesos to reach out to the Mysos community — we’d love to have you involved. At the moment, we’re best reached by visiting the #mysos IRC channel on Freenode.

Acknowledgements

The Mysos project was born out of a collaboration between members of the Cloud Infrastructure team at Twitter (Yan Xu, Chris Lambert, Vinod Kone, Dominic Hamon, Jie Yu, Ben Mahler, Brian Wickman), members of the @TwitterOSS team (Chris Aniszczyk, Dave Lester), members of the @TwitterDBA team (Chuck Sumner, Jonah Berquist, Pascal Borghino), and the MySQL team (Calvin Sun, Inaam Rana, Tugrul Bingol).