Using Twitter data to study the world's health

Tuesday, 14 July 2015

The start of web epidemiology
The sheer volume of health-related conversation on Twitter has been a goldmine for epidemiology, because people are often very open and candid about their health issues in Tweets. One of the earliest innovators in this field is Dr. John Brownstein (@johnbrownstein), who has used Twitter data to study a range of interrelated public health issues including chronic disease, disease detection, quality of patient care, gun violence and more.

One of the primary attributes that drew Dr. Brownstein to Twitter data was the real-time nature of the social platform. Traditional research methods involving panels, studies and surveys can take months to gather data. Twitter data was a perfect choice in that it instantly surfaced comparable data on a global scale. The ready availability of Twitter’s Data APIs made the data source even more attractive in that it was easily within reach of researchers with limited financial and human resources.

Another attribute that appealed to Dr. Brownstein is the openness of people when discussing health issues. One study around foodborne illnesses (i.e., food poisoning) especially surprised Dr. Brownstein and his team when they learned that millions of people Tweeted about stomach problems, even potentially embarrassing topics such as diarrhea. His conclusion is that people are so open about health problems on social media because they are looking to commiserate with others, and platforms like Twitter make it very easy to share personal experiences.

Digital traces are these breadcrumbs that people leave behind about their health. On an individual and collective level, they can tell us about people’s health.

Dr. John Brownstein (@johnbrownstein)

Dr. John Brownstein has always been interested in how data streams can provide a collective view of people’s health. He trained at Yale to become an epidemiologist, a scientist who examines the spread of diseases, and ultimately earned his Ph.D. in the field. In most epidemiology studies, he saw that researchers tended to rely on traditionally obtained clinical data. This clinical data often lagged well behind the actual occurrence of an illness or disease and was also geographically limited, and he realized that there had to be a more timely way to collect data. These frustrations with traditional research methods led him to explore the use of social data in epidemiology studies.

Dr. Brownstein’s new study methods further shaped the path to his current role as the Chief Innovation Officer of Boston Children’s Hospital (@BostonChildrens). In addition to this position as a key leader of innovation, Dr. Brownstein is also an Associate Professor at Harvard Medical School, directs the Computational Epidemiology Group within the Informatics Program at Boston Children’s and is the co-founder of HealthMap (@healthmap). HealthMap is a unique project that brings together disparate data sources to show a comprehensive view of the current global state of infectious diseases. It’s little wonder that all of these roles and experiences have helped position him as one of the world’s premier web epidemiologists.

Public health research using Twitter data
Dr. Brownstein’s team of more than 50 researchers has delved into some fairly complex research topics through the use of Twitter data. One of the first studies where the HealthMap team applied Twitter data was tracking cholera after the 2010 earthquake in Haiti. This represented a significant breakthrough: official reports from health agencies lagged two weeks behind the first outbreak, whereas Twitter data offered a potential way to track diseases faster in future events.

Since that time, Dr. Brownstein and his team have demonstrated that Twitter data can provide daily city-level monitoring to track flu outbreaks. Another project studied whether the team could detect drug side effects through the use of Twitter data. They looked at 23 common medications such as antidepressants and over-the-counter medicines, and found 60,000 relevant Tweets during a six-month period, with nearly 5,000 of those messages specifically describing side effects of the medications. The hope is that this type of research can alert the FDA to unexpected side effects of new drugs in years to come.
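
To make the idea of daily city-level monitoring concrete, here is a minimal sketch of how flu-related Tweets might be counted per city per day. It is not the team's actual method: the keyword list, the `daily_city_counts` function and the `text`, `city` and `created_at` fields are all assumptions made for illustration, and a real study would likely use a validated classifier rather than simple keyword matching.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical keyword list, chosen only for illustration.
FLU_TERMS = {"flu", "influenza", "fever"}


def daily_city_counts(tweets):
    """Count flu-related tweets per (city, date).

    `tweets` is an iterable of dicts with the assumed keys
    'text', 'city' and 'created_at' (an ISO 8601 timestamp string).
    """
    counts = Counter()
    for tweet in tweets:
        # Split the text into lowercase words so punctuation is ignored.
        words = set(re.findall(r"[a-z]+", tweet["text"].lower()))
        if words & FLU_TERMS:
            day = datetime.fromisoformat(tweet["created_at"]).date()
            counts[(tweet["city"], day)] += 1
    return counts


# Made-up sample records, not real Tweets.
sample_tweets = [
    {"text": "Stuck in bed with the flu.", "city": "Boston", "created_at": "2015-01-05T08:30:00"},
    {"text": "Fever again tonight", "city": "Boston", "created_at": "2015-01-05T22:10:00"},
    {"text": "Lovely weather today", "city": "Boston", "created_at": "2015-01-05T12:00:00"},
    {"text": "I think I have influenza", "city": "Chicago", "created_at": "2015-01-06T09:00:00"},
]
sample_counts = daily_city_counts(sample_tweets)
```

A sudden rise in one city's daily count relative to its usual level is the kind of signal such monitoring would look for.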

Another of the team’s projects focused on how Twitter data can serve as a sentinel in emergency situations. By looking at the conversations pertaining to the Boston Marathon bombing, they found that Twitter data could help signal current threats to emergency rooms as well as help direct other emergency service efforts.

Dr. Brownstein was one of six recipients of a Twitter #DataGrant and used the data to measure the incidence of foodborne illnesses and the conversations around them. The hope is that this research will produce a new monitoring method for health issues at restaurants and help to identify potentially contaminated food products. His team is currently building a machine learning platform that will listen to Twitter for the next outbreak of food poisoning, all thanks to users’ conversations about their symptoms and the offending restaurants.
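
As a rough illustration of what listening for food poisoning reports could involve, the sketch below flags Tweets that mention a symptom and pulls out any @mentions as candidate venues. This is a simple keyword filter, not machine learning and not the platform the team is building; the symptom list, the `flag_report` function and the example account name are hypothetical.

```python
import re

# Hypothetical symptom phrases, chosen only for illustration.
SYMPTOM_TERMS = {"food poisoning", "vomiting", "diarrhea", "stomach cramps", "nausea"}
MENTION = re.compile(r"@(\w+)")


def flag_report(text):
    """Return matched symptoms and @mentions, or None if no symptom appears."""
    lowered = text.lower()
    symptoms = sorted(term for term in SYMPTOM_TERMS if term in lowered)
    if not symptoms:
        return None
    return {"symptoms": symptoms, "mentions": MENTION.findall(text)}


# Made-up example text, not a real Tweet.
report = flag_report("Got food poisoning after lunch at @ExampleDiner, nausea all night")
no_report = flag_report("Great lunch at @ExampleDiner")
```

A learned classifier would replace the keyword check, but the output shape (symptoms plus a candidate venue) is the kind of record a monitoring system could aggregate.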

Tracking foodborne illnesses with Twitter data

One last application of Twitter data is its use as an indicator of atypical sleeping patterns. The most recent report by Dr. Brownstein’s research team showed that what and when people Tweeted could indicate sleep deprivation. The study helped validate that people with sleep issues expressed more negative sentiment and were more isolated in terms of social interaction. It is also prompting researchers to think about other novel ways of investigating abnormal sleep patterns and, hopefully, to find solutions to otherwise sleepless nights.
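
One simple way to picture the "when" part of this signal is the share of a user's Tweets posted in the middle of the night. The sketch below is only an illustration under stated assumptions: the 1 a.m. to 5 a.m. window, the `night_fraction` function and the use of local-time ISO 8601 timestamps are all hypothetical and are not taken from the team's study.

```python
from datetime import datetime

# Assumed late-night window in local hours: 01:00 up to (not including) 05:00.
NIGHT_START, NIGHT_END = 1, 5


def night_fraction(timestamps):
    """Return the fraction of timestamps that fall inside the night window.

    `timestamps` is a list of ISO 8601 strings assumed to be in the
    user's local time. An empty list yields 0.0.
    """
    times = [datetime.fromisoformat(ts) for ts in timestamps]
    if not times:
        return 0.0
    night = sum(NIGHT_START <= t.hour < NIGHT_END for t in times)
    return night / len(times)


# Made-up timestamps: two of the four fall in the night window.
example_fraction = night_fraction([
    "2015-07-01T02:15:00",
    "2015-07-01T14:00:00",
    "2015-07-02T03:40:00",
    "2015-07-02T09:00:00",
])
```

A measure like this would need to be combined with the content of the Tweets, such as sentiment, before it could say anything about sleep problems.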

Studying sleeping patterns with Twitter data

What’s next in digital epidemiology?
So how can this research help the world? Studies using Twitter data can help reveal overlooked risk factors and shed further light on the societal burden of diseases. Researchers can also identify the places where certain diseases hit harder than elsewhere, making it easier for health professionals to target prevention and treatment.

It is Dr. Brownstein’s firm belief that Twitter data will have a deep impact the next time a global pandemic happens.

Understanding how a disease starts to manifest itself better positions response teams to address the physical symptoms and geographic reach. Twitter data serves as a first source of discussion about diseases, making it easier to find new cases and new patterns. With a clearer understanding of where diseases spread, the hope is that epidemiologists will be able to communicate and address the risks in real time.

We’ll keep you posted on Dr. Brownstein’s future research using Twitter data in digital epidemiology.