In October 2020, people on Twitter raised concerns that the saliency model we used to crop images didn’t serve all people equitably. Shortly thereafter, we published our algorithmic bias assessment, which confirmed that the model was not treating all people fairly. We committed to decreasing our reliance on ML-based image cropping, since the decision to crop an image is best made by people. In May 2021, we began rolling out changes to how images appear on Twitter. Now, standard aspect ratio photos appear uncropped on the Home Timeline on mobile, and we are working on further improvements that build on this initial effort.
But our work didn’t stop there. In August 2021, we held the first algorithmic bias bounty challenge and invited the ethical AI hacker community to take apart our algorithm and identify additional bias and other potential harms within it. Their findings confirmed our hypothesis: we can’t solve these challenges alone, and our understanding of bias in AI improves when diverse voices are able to contribute to the conversation.
In this post, we’ll share what we learned through creating and hosting this challenge, what the submissions taught us, and what’s next. We believe it’s critical we start a dialogue and encourage community-led, proactive surfacing and mitigation of algorithmic harms before they reach the public. We are hopeful that bias bounties can be an important tool going forward for companies to solicit feedback and understand potential problems.
Using a community-led approach to build better algorithms
When building machine learning systems, it’s nearly impossible to foresee all potential problems and ensure that a model will serve all groups of people equitably. But beyond that, when designing products that make automatic decisions, upholding the status quo oftentimes leads to reinforcing existing cultural and social biases.
Direct feedback from the communities who are affected by our algorithms helps us design products to serve all people and communities. That’s why we launched this bounty challenge. Our hypothesis was that by creating an opportunity for people who had historically done this sort of work for free to be both recognized and rewarded for their contributions, we would be able to learn from a diverse, global community of ethical AI hackers whose lived experiences allow them to discover unintended consequences that we could not have found on our own.
The most challenging part was creating a grading rubric to judge participants’ submissions, which was inspired by earlier frameworks for assessing risk in privacy and security. The rubric had to be concrete enough to grade and compare submissions, yet broad enough to cover a wide variety of harms and methodologies, and it still had to leave room for participants to be creative in their submissions and in the problems they chose to examine. We wanted to draw attention to issues that have historically received less attention in fair ML research, such as representational harms, so we assigned different point values to different types of harm. We also encouraged qualitative analysis: each submission was graded not only on its code, but also on its explanation of why the approach and perspective were relevant.
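To make the idea of a points-based rubric concrete, here is a minimal sketch in Python. It is not the rubric we used: the harm categories, point values, the `severity` multiplier, and the `justification` bonus are all hypothetical placeholders chosen for illustration. It only shows how different harm types can carry different base points while a qualitative assessment still contributes to the final score.

```python
from dataclasses import dataclass

# Hypothetical base points per harm type. The real rubric's categories and
# values are not reproduced here; these numbers are illustrative only.
BASE_POINTS = {
    "representational": 20,
    "other": 10,
}

# Hypothetical maximum bonus for the written, qualitative justification.
MAX_JUSTIFICATION_BONUS = 10


@dataclass
class Submission:
    harm_type: str        # key into BASE_POINTS
    severity: float       # 0.0 to 1.0, assigned by a human grader (assumed scale)
    justification: float  # 0.0 to 1.0, quality of the qualitative analysis


def score(sub: Submission) -> float:
    """Combine harm-type points, severity, and qualitative analysis."""
    base = BASE_POINTS[sub.harm_type]
    return base * (1 + sub.severity) + MAX_JUSTIFICATION_BONUS * sub.justification


if __name__ == "__main__":
    a = Submission("representational", severity=0.5, justification=1.0)
    b = Submission("other", severity=0.5, justification=1.0)
    print(score(a), score(b))
```

In a scheme like this, two submissions of equal severity and equally strong write-ups still score differently when they address different types of harm, which is the effect of weighting harm types described above.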
The bias bounty challenge helped us uncover a wide range of issues in a short amount of time, from a diverse group of participants. The winning submission used a counterfactual approach to demonstrate that the model tends to encode stereotypical beauty standards, such as a preference for slimmer, younger, feminine, and lighter-skinned faces. The second-place submission confirmed the age bias found by the winning submission by showing that the algorithm rarely chooses people with white hair as the most salient person in a multi-face image. It also studied spatial gaze bias in group photos that include people with disabilities. The third-place submission analyzed linguistic bias, finding that the model favors English over Arabic script in memes.
The prize for the most innovative submission went to an entry that demonstrated that the model prefers emojis with lighter skin. And the prize for the most generalizable submission went to an adversarial approach that proved that adding simple padding around an image can prevent it from being cropped.
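As a rough illustration of the padding idea, the sketch below pads an image until it matches a target aspect ratio. It rests on an assumption that is ours for the sake of the example, not a description of the winning entry: that a cropper only acts when an image's aspect ratio differs from the one it displays. The function names and the 16:9 target are hypothetical.

```python
def padding_for_aspect(width, height, target_w, target_h):
    """Return (left, top, right, bottom) padding in pixels so the padded
    image is at least as wide or tall as the target_w:target_h ratio needs.

    Integer arithmetic is used throughout to avoid rounding surprises.
    """
    if width * target_h < height * target_w:
        # Image is too narrow for the target ratio: pad left and right.
        new_width = -(-height * target_w // target_h)  # ceiling division
        extra = new_width - width
        return (extra // 2, 0, extra - extra // 2, 0)
    # Image is too wide (or already matches): pad top and bottom.
    new_height = -(-width * target_h // target_w)  # ceiling division
    extra = new_height - height
    return (0, extra // 2, 0, extra - extra // 2)


def pad_image(pixels, fill, target_w, target_h):
    """Pad a 2D list of pixel values with `fill` to reach the target ratio."""
    height, width = len(pixels), len(pixels[0])
    left, top, right, bottom = padding_for_aspect(width, height, target_w, target_h)
    new_width = left + width + right
    rows = [[fill] * new_width for _ in range(top)]
    rows += [[fill] * left + list(row) + [fill] * right for row in pixels]
    rows += [[fill] * new_width for _ in range(bottom)]
    return rows


if __name__ == "__main__":
    # A tall 400x800 image padded toward a hypothetical 16:9 display ratio.
    print(padding_for_aspect(400, 800, 16, 9))
```

Under that assumption, an image that already has the displayed ratio would pass through without any crop decision being made, which is why padding generalizes across whatever the image actually contains.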
We learned a lot from this experience
In addition to validating and expanding upon some of our previous findings, we also noticed multiple submissions identifying similar harms. We were pleased to see submissions that recognized the impact bias in ML can have on groups beyond those addressed in our previous work, such as veterans, religious groups, people with disabilities, the elderly, and individuals who communicate in non-Western languages. Often, the conversation around bias in ML is focused on race and gender, but as we saw through this challenge, bias can take many forms. Research in fair machine learning has historically focused on Western and US-centric issues, so we were particularly inspired to see multiple submissions that focused on problems related to the Global South.
Since one of our goals was to gather feedback from a diverse group of people, we were also encouraged to see submissions from a wide array of participants. We had submissions from around the world, ranging from individuals to universities, startups, and enterprise companies. It was also inspiring to see successful submissions from people with a wide range of backgrounds, including people who do not have expertise in machine learning.
Lastly, the results of the bounty suggest that biases are embedded in the core saliency model and that these biases are often learned from the training data*. Our saliency model was trained on open source human eye-tracking data, which poses the risk of embedding conscious and unconscious biases. Since saliency is a commonly used image processing technique and these datasets are open source, we hope others who have used these datasets can leverage the insights surfaced by the bounty to improve their own products.
This is the first step of many that need to come
What we learned through the submissions from this challenge will help inform how we think about similar issues in the future, and how we help educate other teams at Twitter about how to build more responsible models. Moving forward, we encourage others to use our rubric for measuring potential harms and to iterate on it. We will also be implementing the rubric internally as a facet of our model risk assessments to enhance our focus on representational harms.
As we shared in May, we are already working towards no longer using saliency-based cropping on Twitter, and we continue to investigate other algorithms Twitter uses to identify areas where we can use ML more responsibly. However, saliency modeling is not unique to Twitter, and there are likely many places where saliency is still in use today. We hope that by sharing our blueprint for identifying harms and sharing findings openly, other organizations will learn from our insights.
*We do not intend to suggest that bias in machine learning is purely a data problem. See our paper for a discussion of how modeling decisions amplify bias in the system (argmax bias).
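A toy calculation can show why a hard argmax choice can amplify a small difference in scores. This is a simplified sketch, not the analysis from our paper: it assumes two faces whose saliency scores are independent and normally distributed with the same spread, and asks how often the face with the slightly higher average score wins. The function name and the numbers are hypothetical.

```python
import math


def selection_probability(mean_gap, std=1.0):
    """Probability that face A out-scores face B under argmax selection.

    Assumes (for illustration only) independent normal scores with a shared
    standard deviation `std` and means that differ by `mean_gap`.
    The difference A - B is then normal with mean `mean_gap` and standard
    deviation std * sqrt(2), so P(A > B) = Phi(mean_gap / (std * sqrt(2))).
    """
    z = mean_gap / (std * math.sqrt(2))
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


if __name__ == "__main__":
    for gap in (0.0, 0.25, 0.5, 1.0):
        print(f"mean gap {gap:.2f} -> A selected {selection_probability(gap):.1%}")
```

Under these assumptions, a mean gap of half a standard deviation already means one face is chosen in roughly 64% of pairings, because argmax turns every narrow win into the entire crop.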