Distributed training of sparse ML models — Part 3: Observed speedups

Our customized data- and model-parallel distributed training strategy delivers training speedups of up to 60x over single-node training for sparse machine learning models at Twitter.