
Investing in privacy-enhancing tech to advance transparency in ML

Thursday, 20 January 2022

As part of our commitment to responsible machine learning (ML), we’re continuously looking for opportunities to advance the field while improving the technology that we rely on at Twitter. Over the past year, we’ve worked to hold ourselves accountable to the people who use our service by publicly sharing internal research. Through this, we’ve experienced first-hand the challenges of sharing sensitive or otherwise protected data. That is why we’re investing in privacy-enhancing technologies (PETs): to pioneer new methods of public accountability and data access that respect and protect the privacy of the people who use our service.

Improving accountability while preserving privacy

PETs are a promising new class of technologies with the potential to allow data analysis without the risk of exposing any underlying personal information to the individuals accessing the data. Where privacy and public accountability are at odds, the emergence of these technologies makes it increasingly possible for us to achieve both.
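To make the idea concrete, here is a minimal sketch of one widely used PET, differential privacy, applied to a simple counting query. It is an illustration only: this post does not say which PETs we will use, and the record fields (`engaged`), the function names, and the `epsilon` value are all hypothetical. The sketch adds Laplace noise to a count so that an analyst sees a useful aggregate without being able to tell whether any one person's record is in the data.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample zero-mean Laplace noise with the given scale.

    The difference of two independent exponential draws with rate 1/scale
    follows a Laplace(0, scale) distribution.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one record changes a count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy
    for this single query. Smaller epsilon means more noise and more privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical, made-up records; not real Twitter data.
    rng = random.Random(2022)
    records = [{"engaged": i % 3 == 0} for i in range(300)]
    noisy = private_count(records, lambda r: r["engaged"], epsilon=1.0, rng=rng)
    print(f"noisy count: {noisy:.2f}")
```

The analyst only ever receives the noisy value, never the underlying records. In practice, repeated queries consume a privacy budget, which this sketch does not track.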

Driven by our ML Ethics, Transparency, and Accountability (META) team, we’re embarking on a journey to develop privacy-preserving methods to (1) enable external researchers to access non-public Twitter data, and (2) internally democratize and scale our Responsible ML Workbench, a series of custom-built ML fairness and ethics tools we use at Twitter.

Partnering with OpenMined

As an initial step, we’re partnering with OpenMined, an open-source nonprofit organization pioneering in this space. OpenMined has successfully launched PETs and privacy-preserving machine learning (PPML) methods to reduce research barriers in health care and the public sector.

This collaboration is intended to test and explore the potential for PETs at Twitter. We share OpenMined’s vision of a healthy ecosystem of algorithmic transparency based on an infrastructure of PETs, which could broadly allow third parties to assess an algorithm’s behavior without exposing either the algorithm or its training data. We hope our support of OpenMined will contribute to open-source solutions that are beneficial to all, not just our service.

When we published our recent study on the amplification of political content on Twitter, the aggregate data we were able to safely share limited the research community’s ability to replicate our findings and conduct their own investigations. That’s because much of the data we used for that study was internal, meaning it included non-public engagement data about how people use Twitter.

Our first milestone with OpenMined is a PET-accessible method of replicating our political amplification findings on a synthetic data set similar to the data used in our own research. After that, we hope to share the actual data, protected through PETs. Over time, we hope our partnership will make it possible for researchers and technology professionals outside of Twitter to go beyond the data that is currently available via our public API and conduct research on the same data used in our own internal analyses. You’ll be hearing more from us as we work to meet this milestone.

What’s next

Meaningful transparency and accountability require dedicated investment and effort. Pursuing these projects openly and supporting organizations like OpenMined are key parts of our Responsible ML initiative. Collaborating with OpenMined will advance Twitter’s commitment to protecting the privacy of people on our service while providing data researchers with the tools they need to do their work and hold us to account. Our investment here is just a first step, and we’ll continue to share what we learn as this technology evolves.
