We recently released Autograd for Torch, which has greatly simplified our workflows when experimenting with complex deep learning architectures. The Twitter Cortex team is continuously investing in better tooling for manipulating our large datasets and for distributing training across the machines in our cluster.
Today we’re open-sourcing four components of our training pipeline, so that the community using Torch and/or Autograd can simplify its workflows for parallelizing training and for manipulating large, distributed datasets.
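To give a flavor of why Autograd simplifies experimentation, here is a minimal sketch of its core usage; the toy linear model, parameter names, and learning rate below are our own illustration, not code from the release:

```lua
-- Minimal torch-autograd sketch: write the loss as a plain Lua/Torch
-- function of the parameters, and grad() returns its derivative.
require 'torch'
local grad = require 'autograd'

-- Toy linear model (sizes and names are illustrative only).
local params = {
   W = torch.randn(1, 10),
   b = torch.randn(1),
}

-- Squared-error loss of a linear prediction.
local function loss(params, x, y)
   local yHat = params.W * x + params.b
   return torch.sum(torch.pow(yHat - y, 2))
end

-- grad(loss) builds a function that returns d(loss)/d(params)
-- along with the loss value itself.
local dloss = grad(loss)

local x, y = torch.randn(10), torch.randn(1)
local grads, l = dloss(params, x, y)

-- A plain SGD step using the returned gradient table.
params.W:add(-0.01, grads.W)
params.b:add(-0.01, grads.b)
```

No layer definitions or hand-written backward passes are needed: any numeric Lua function can be differentiated and dropped directly into a training loop.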
Using a second control group can be a tempting way to validate experiment results. We explore the statistics underlying the use of a second control and conclude that this approach is strictly inferior to using a single large control.
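To make the intuition concrete, here is a back-of-the-envelope variance argument (our illustration, not necessarily the post's exact derivation), assuming per-user outcomes with common variance $\sigma^2$, a treatment group of size $n_T$, and a fixed control budget of $n$ users. Compare a single pooled control against two controls of size $n/2$ each:

$$
\mathrm{Var}(\bar{x}_T - \bar{x}_C) = \sigma^2\left(\frac{1}{n_T} + \frac{1}{n}\right)
\qquad\text{vs.}\qquad
\mathrm{Var}(\bar{x}_T - \bar{x}_{C_i}) = \sigma^2\left(\frac{1}{n_T} + \frac{2}{n}\right),\quad i \in \{1, 2\}.
$$

Splitting the control in half doubles the control's contribution to the variance of every treatment-versus-control comparison, so each individual test is noisier than the single test against the pooled control.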
We describe our experimental visual analytics approach to funnel analysis, which helps us explore how users interact with our user interfaces and gain new insights for improving user engagement with Twitter.
We annotated the language of nearly 200k Tweets from 2014, covering 68 languages, taking care to select them in a way that allows both recall and precision to be measured well, so that we can evaluate and improve our language identification performance. You can download all the annotated Tweets.
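For reference (our notation, not taken from the dataset release), per-language precision and recall are the standard quantities:

$$
\mathrm{precision}_\ell = \frac{\#\{\text{Tweets predicted as } \ell \text{ that are truly in } \ell\}}{\#\{\text{Tweets predicted as } \ell\}},
\qquad
\mathrm{recall}_\ell = \frac{\#\{\text{Tweets predicted as } \ell \text{ that are truly in } \ell\}}{\#\{\text{Tweets truly in } \ell\}}.
$$

Measuring recall well requires the sample to contain enough genuine examples of each language, which is why the selection procedure matters as much as the annotation itself.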