Last year we announced the introduction of our new distributed stream computation system, Heron. Today we are excited to announce that we are open sourcing Heron under the permissive Apache v2.0 license. Heron is a proven, production-ready, real-time stream processing engine, which has been powering all of Twitter’s real-time analytics for over two years. Prior to Heron, we used Apache Storm, which we open sourced in 2011. Heron features a wide array of architectural improvements and is backward compatible with the Storm ecosystem for seamless adoption.
Everything that happens in the world happens on Twitter. That generates a huge volume of information in the form of billions of live Tweets and engagements. We need to process this constant stream of data in real-time to capture trending topics and conversations and provide better relevance to our users. This requires a streaming system that continuously examines the data in motion and computes analytics in real-time.
Heron is a streaming system that was born out of the challenges we faced due to increases in volume and diversity of data being processed, as well as the number of use cases for real-time analytics. We needed a system that scaled better, was easier to debug, had better performance, was easier to deploy and manage, and worked in a shared multi-tenant cluster environment.
To address these requirements, we weighed three options: extend Storm, switch to another platform, or develop a new system. Extending Storm would have required an extensive redesign and rewrite of its core components. The next option we considered was adopting an existing open-source solution. However, the open systems we evaluated posed a number of issues when run in their current form at our scale, and none of them is compatible with Storm's API. Rewriting our existing topologies against a different API would have been time consuming, forcing our internal customers through a very long migration process. Furthermore, several libraries, such as Summingbird, have been built on top of the Storm API; changing the underlying API of the streaming platform would have meant rewriting these higher-level components of our stack as well.
We concluded that our best option was to rewrite the system from the ground-up, reusing and building upon some of the existing components within Twitter.
Heron represents a fundamental change in streaming architecture, from a thread-based system to a process-based one. It is written in industry-standard languages (Java/C++/Python) for efficiency, maintainability, and easier community adoption. Heron is also designed for deployment in modern cluster environments, integrating with powerful open-source schedulers such as Apache Mesos, Apache Aurora, Apache REEF, and Slurm.
One of our primary requirements for Heron was ease of debugging and profiling. Heron addresses this by running each task in a process of its own, resulting in increased developer productivity as developers are able to quickly identify errors, profile tasks, and isolate performance issues.
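The process-per-task idea can be illustrated with a toy sketch. This is plain Python multiprocessing, not Heron code, and the names `run_task` and `launch_tasks` are hypothetical; the point is simply that when each task owns its own OS process, it has its own PID that standard debuggers and profilers can attach to in isolation.

```python
# Toy illustration (not Heron code) of the process-per-task model:
# each "task" runs in its own OS process, so it can be identified,
# profiled, or killed independently via its PID.
import os
from multiprocessing import Process, Queue


def run_task(task_id: str, out: Queue) -> None:
    # Each task reports its own PID; in a real system a profiler or
    # debugger could be attached to exactly this process.
    out.put((task_id, os.getpid()))


def launch_tasks(task_ids):
    """Start one process per task and return a {task_id: pid} map."""
    out = Queue()
    procs = [Process(target=run_task, args=(t, out)) for t in task_ids]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return dict(out.get() for _ in procs)


if __name__ == "__main__":
    pids = launch_tasks(["spout-0", "bolt-0", "bolt-1"])
    # Every task has its own distinct PID, none of them the parent's.
    assert len(set(pids.values())) == 3
    assert os.getpid() not in pids.values()
    print(pids)
```

In a thread-based worker, by contrast, all of these tasks would share one PID, so a crash, memory leak, or hot loop in any one task is much harder to attribute.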
To process large amounts of data in real-time, we designed Heron for high scale: a single topology can run on several hundred machines. At that scale, optimal resource utilization is critical. We've seen 2-5x better efficiency with Heron, saving us significant operating and capital expenses. This level of efficiency was made possible by both the custom IPC layer and the simplified architecture of the computational components.
Running at Twitter-scale is not just about speed, it’s also about ease of deployment and management. Heron is designed as a library to simplify deployment. Furthermore, by integrating with off-the-shelf schedulers, Heron topologies safely run alongside critical services in a shared cluster, thereby simplifying management. Heron has proved to be reliable and easy to support, resulting in an order of magnitude reduction of incidents.
We built Heron on the basis of valuable knowledge garnered from our years of experience running Storm at Twitter. We are open sourcing Heron because we would like to share our insights and knowledge and continue to learn from and collaborate with the real-time streaming community.
Our early partners range from Fortune 500 companies, such as Microsoft, to startups that are already using Heron for an expanding set of real-time use cases, including ETL, model enhancement, anomaly/fraud detection, IoT/IoE applications, embedded systems, VR/AR, advertisement bidding, financial services, security, and social media.
“Heron enables organizations to deploy a unique real-time solution proven for the scale and reach of Twitter,” says Raghu Ramakrishnan, Chief Technology Officer (CTO) for the Data Group at Microsoft. “In working with Twitter, we are contributing an implementation of Heron that could be deployed on Apache Hadoop clusters running YARN and thereby opening up this technology to the entire big data ecosystem.”
We are currently considering moving Heron to an independent open source foundation. If you want to join this discussion, see this issue on GitHub. To join the Heron community, we recommend getting started at heronstreaming.io, joining the discussion on Twitter at @heronstreaming and viewing the source on GitHub.
Large projects like Heron would not have been possible without the help of many people.
Thanks to: Maosong Fu, Vikas R. Kedigehalli, Sailesh Mittal, Bill Graham, Neng Lu, Jingwei Wu, Christopher Kellogg, Andrew Jorgensen, Brian Hatfield, Michael Barry, Zhilan Zweiger, Luc Perkins, Sanjeev Kulkarni, Siddharth Taneja, Nikunj Bhagat, Mengdie Hu, Lawrence Yuan, Zuyu Zhang, and Jignesh Patel who worked on architecting, developing, and productionizing Heron.
Thanks to early testers who gave us valuable feedback on deployment and documentation.
Twitter Heron: Stream Processing at Scale, Proceedings of ACM SIGMOD Conference, Melbourne, Australia, June 2015.
Storm@Twitter, Proceedings of ACM SIGMOD Conference, Snowbird, Utah, June 2014.