Editorial note: This blog was first published on 2 December 2021 and last updated 24 August 2022 to include updates to our approach.
In October 2018, we published the first comprehensive, public archive of data related to state-backed information operations. Since then, we’ve shared 37 datasets of attributed platform manipulation campaigns originating from 17 countries, spanning more than 200 million Tweets and nine terabytes of media. More than 26,000 researchers have accessed these datasets, empowering an unprecedented level of empirical research into state-backed attacks on the integrity of the conversation on Twitter.
We strive to provide timely updates, alongside comprehensive data, whenever our teams identify and remove these campaigns. This year, however, due to technical issues and significant risks to the physical safety of our employees posed by certain disclosures, we have provided only one update. During this time, we’ve been working to identify a sustainable path forward, without compromising on our goals of providing meaningful transparency.
Today, in addition to disclosing eight additional datasets in our archive, we’re sharing an update on what we’ve learned from these efforts and how we intend to advance data-driven transparency in 2022 and beyond.
What we’ve learned so far
Where we’re headed in 2022
With these lessons in mind, as well as the emergent risks we see to the physical safety of our employees around the world tied to potential disclosures, we’re changing our approach in an effort to continue to provide expanded transparency about our content moderation actions. Here’s what you’ll see in the coming months:
In early 2022, we will launch the Twitter Moderation Research Consortium (TMRC) — a global group of experts from across academia, civil society, NGOs, and journalism studying platform governance issues.
- A proven track record of research on content moderation and integrity topics (or affiliation with a group that does such research, such as a university, research lab, or newspaper).
- Appropriate plans and systems for safeguarding the privacy and security of the data provided by the consortium.
Later in 2022, we will for the first time share similarly comprehensive data about other policy areas, including misinformation, coordinated harmful activity, and safety.
As part of this change, we will discontinue our fully public dataset releases, prioritizing release to the consortium. Existing datasets will continue to be available for download indefinitely, and our public data offerings, including free access to our APIs (including the full archive of Tweets), remain available.
Transparency is core to our mission. Our goal with these changes is to provide more transparency about more issues, while grappling with the considerable safety, security, and integrity challenges in this space. We’ll continue to learn and iterate on our approach over time and share those findings publicly along the way.
7 June 2022
Today, we’re opening up the Twitter Moderation Research Consortium to a limited group of researchers. We’ll use this initial period to gather learnings and make adjustments to program design, where needed, ahead of our forthcoming public launch. Feedback from these researchers will help shape and inform our work.
During this period, membership will be open to applicants who were granted access to our information operation data sets during prior disclosures. Researchers with prior access may re-apply for the Consortium during this phase, and will be evaluated in line with the updated criteria below:
Later this year, we’ll open up the application for Consortium membership to the wider public and share key learnings from the beta period.
As we’ve said previously, transparency is core to our work here. Through this updated approach, we aim to share more about what we’re seeing on the Twitter service, while addressing the safety, security, and integrity challenges that accompany these disclosures. Down the line, we’ll disclose data about other policy areas, including misinformation, coordinated harmful activity, and safety. More to come.
24 August 2022
We’re sharing an update on our Twitter Moderation Research Consortium (TMRC) program. In the coming days, our TMRC global partners – the Stanford Internet Observatory, the Australian Strategic Policy Institute, and Cazadores de Fake News – will publish independent research about Twitter’s latest information operation data sets. These 15 data sets include platform manipulation campaigns originating from the Americas, Asia Pacific (APAC), Europe, the Middle East and North Africa (EMEA), and Sub-Saharan Africa (SSA).
As we noted last year (above), we will now prioritize sharing information operations data with the Consortium. While we continue to share data with researchers about the networks we remove – including Twitter's assessment of the presumptive country of origin, based on the technical signals that we observe – we will no longer provide specific attribution information. This means we won’t publicly disclose whether an information operation was carried out by a specific state actor or any other actor.
Our goal is to remain transparent about the activity we identify on Twitter, while addressing the considerable safety, security, and integrity challenges that come with disclosures of this kind. Information operations are also increasingly hard to attribute to specific actors – this change allows us to disclose data from a broader set of coordinated campaigns. Further, this change will allow researchers to piece together operations across multiple platforms and services, beyond what’s possible with just one platform.
We continue to draw learnings from the initial early access period. We’re focused on developing a global group of Consortium members, and to date, have accepted applications from researchers around the world. Later this year, we’ll open the TMRC application process to a broader group of researchers.
Down the line, we intend to share data about a wider range of Safety and Integrity policy areas.