As one of the most critical pieces of infrastructure at Twitter, Observability provides highly scalable data collection and visualization services. This blog post gives an overview of our architecture and shares our experience in developing and operating our systems.
As one of the most critical pieces of infrastructure at Twitter, Observability provides highly scalable data collection and visualization services. Our post gives an overview of our architecture and shares our experience in developing and operating our systems.
We explore the lessons we learned while adding strong consistency to Manhattan and describe several problems we had to solve along the way, such as implementing TTLs in a strongly consistent manner and performing distributed log truncation.
Figuring out the minimum number of users one must expose to an experimental treatment to collect actionable data is not a trivial task. We explain how we approach this problem with Twitter’s A/B testing platform (DDG), and how we communicate issues of statistical power to experimenters.
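To give a feel for the sizing question, here is a minimal sketch of a textbook sample-size calculation for comparing two conversion rates with a two-sided z-test (normal approximation). It is an illustration only and is not taken from DDG: the function name `min_users_per_bucket`, its parameters, and the defaults of 5% significance and 80% power are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def min_users_per_bucket(baseline_rate, treatment_rate, alpha=0.05, power=0.8):
    """Approximate users needed in EACH bucket to detect the given difference.

    Hypothetical helper: uses the standard normal-approximation formula for a
    two-sided test of two proportions. Not DDG's actual method.
    """
    if not (0 < baseline_rate < 1 and 0 < treatment_rate < 1):
        raise ValueError("rates must be strictly between 0 and 1")
    if baseline_rate == treatment_rate:
        raise ValueError("rates must differ; a zero effect is never detectable")

    # Critical values for the significance level and the desired power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)

    # Variance terms under the null (pooled rate) and under the alternative.
    pooled = (baseline_rate + treatment_rate) / 2
    null_sd = math.sqrt(2 * pooled * (1 - pooled))
    alt_sd = math.sqrt(
        baseline_rate * (1 - baseline_rate)
        + treatment_rate * (1 - treatment_rate)
    )

    effect = treatment_rate - baseline_rate
    n = (z_alpha * null_sd + z_power * alt_sd) ** 2 / effect ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Example: detecting a lift from a 10% to an 11% conversion rate.
    print(min_users_per_bucket(0.10, 0.11))
```

The required bucket size grows roughly with the inverse square of the effect, so under this approximation halving the difference you want to detect about quadruples the number of users needed.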
We recently learned about, and immediately fixed, a bug that affected our password recovery systems for about 24 hours last week. The bug had the potential to expose the email address and phone number associated with a small number of accounts (fewer than 10,000 active accounts). We’ve notified those account holders today, so if you weren’t notified, you weren’t affected.
We recently released Autograd for Torch, which greatly simplified our workflow when experimenting with complex deep learning architectures. The Twitter Cortex team is continuously investing in better tooling for manipulating our large datasets and distributing training processes across machines in our cluster.
Today we’re open-sourcing four components of our training pipeline, so the community using Torch and/or Autograd can simplify their workflows when it comes to parallelizing training and manipulating large, distributed datasets.