Bot or not? The facts about platform manipulation on Twitter

Monday, 18 May 2020

Going back a few years, there’s been a lot of discussion of “bots” online. Over time, however, it’s become a loaded and often misunderstood term.

People often use the term “bot” to describe everything from automated account activity to individuals who prefer to be anonymous for personal or safety reasons, or who avoid posting a photo because they have strong privacy concerns. The term is also used to mischaracterize accounts with numerical usernames, which are auto-generated when your preferred username is already taken. More worryingly, it is used as a tool by those in positions of political power to tarnish the views of people who may disagree with them or online public opinion that’s not favorable.

There are also many commercial services that purport to offer insights on bots and their activity online, and frequently their focus is entirely on Twitter due to the free data we provide through our public APIs. Unfortunately, this research is rarely peer-reviewed and often does not hold up to scrutiny, further confusing the public's understanding of what’s really happening.

Let’s break it down and explain the facts.

What’s a bot?

Based on what we’ve described above, there’s a lot of understandable confusion and we need to do a better job of explaining ourselves. In sum, a bot is an automated account — nothing more or less.

Going back a few years, automated accounts were a problem for us. We focused on the problem, made the investments, and have seen significant gains in tackling these accounts across all surfaces of Twitter. That doesn’t mean our work is done.

What’s more important to focus on in 2020 is the holistic behavior of an account, not just whether it’s automated or not. That’s why calls for bot labeling don’t capture the problem we’re trying to solve, or the errors such labels could inflict on real people who need our service to make their voices heard. It’s not just a binary question of bot or not: the gradients in between are what matter.

That’s why we focus our attention on where the most critical work is. We call it platform manipulation.

What’s platform manipulation?

Our proactive work is focused on manipulation in many forms and that includes the malicious use of automation. As discussed, our policies in this area focus on behavior, not content, and are written in a way that targets the spammy tactics different people or groups could use to try to manipulate conversations on Twitter (rather than on what they're specifically saying).

It’s important to note that not all forms of automation are violations of the Twitter Rules. We've seen innovative and creative uses of automation to enrich the Twitter experience, for example in accounts like @pentametron and @tinycarebot.

Automation can also be a powerful tool in customer service interactions, where a conversational bot can help find information about orders or travel reservations automatically. This is incredibly useful and efficient for small businesses, especially at a time of social distancing.

So what’s prohibited?

Malicious use of automation to undermine and disrupt the public conversation, like trying to get something to trend
Artificial amplification of conversations on Twitter, including through creating multiple or overlapping accounts
Generating, soliciting, or purchasing fake engagements
Engaging in bulk or aggressive tweeting, engaging, or following
Using hashtags in a spammy way, including using unrelated hashtags in a tweet (aka "hashtag cramming")

Our technological power to proactively identify and remove these behaviors across our service is more sophisticated than ever. We permanently suspend millions of accounts every month that are automated or spammy, and we do this before they ever reach an eyeball in a Twitter Timeline or Search.

We also publish data on our removals every six months in the Twitter Transparency Report.

What about tools like Botometer & Bot Sentinel?

These tools start with a human looking at an account, often using the same public account information you can see on Twitter, and identifying characteristics that make it a bot. In essence, those characteristics are the account name, the level of Tweeting, the location in the bio, the hashtags used, and so on.

This is an extremely limited approach.

As mentioned, an account with a strange handle is often someone who was automatically recommended that username because their real name was taken at sign-up. An account with no photo or location may be someone who has personal feelings on online privacy or whose use of Twitter may expose them to risk, like an activist or dissident. Don’t like to add much of a bio or your location to your account? Some of us at Twitter don’t either. Even if all of these public details are fed into a machine learning model to probabilistically predict whether an account is a bot, a model that relies on human analysis of public account information carries biases from the start.

Let’s take some other everyday examples for clarity. Someone who Tweets 100 times a day with #SuperBowl could just be an extremely engaged individual who loves football (or hates it if their team is bad). If you care deeply about the environment and Tweet about it at a certain political moment when you and your friends want to make an impact, you’re not “political bots” — you're active citizens organizing an organic online campaign to drive change in your community.
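To make the limitation concrete, here is a minimal and purely illustrative sketch in Python. It is not how Botometer, Bot Sentinel, or Twitter score accounts. The `Account` fields, the point weights, the threshold, and the `naive_bot_score` function are all hypothetical, chosen only to show how a score built from public profile details can flag the kinds of real people described above.

```python
import re
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical public profile fields; not a real Twitter API schema.
    handle: str
    has_photo: bool
    has_location: bool
    has_bio: bool
    tweets_per_day: int


def naive_bot_score(account: Account) -> int:
    """Score an account from 0 to 10 using surface features only.

    The weights are arbitrary assumptions made for this illustration.
    """
    score = 0
    # A handle ending in a long run of digits is often auto-suggested at sign-up.
    if re.search(r"\d{5,}$", account.handle):
        score += 3
    if not account.has_photo:
        score += 2
    if not account.has_location:
        score += 1
    if not account.has_bio:
        score += 1
    if account.tweets_per_day >= 100:
        score += 3
    return score


THRESHOLD = 5  # arbitrary cut-off for a binary "bot or not" call

# A privacy-conscious activist who accepted an auto-suggested handle.
activist = Account("maria48201937", has_photo=False, has_location=False,
                   has_bio=False, tweets_per_day=12)
# An engaged football fan who tweets heavily and keeps a sparse profile.
fan = Account("gridiron_gary", has_photo=True, has_location=False,
              has_bio=False, tweets_per_day=100)

flags = {a.handle: naive_bot_score(a) >= THRESHOLD for a in (activist, fan)}
print(flags)
```

Under these assumed weights, both accounts cross the threshold even though both belong to real people. Nothing in the score reflects how the account actually behaves.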

These tools do not account for these common Twitter use cases, how far we’ve come, and how things have evolved. As a result, nuance can be lost. The outcome? Binary judgments of who’s a “bot or not”, which have real potential to poison our public discourse, particularly when they are pushed out through the media.

That doesn’t mean we’re perfect. Of course not. It’s just that the threat has evolved and the narrative on what’s actually going on is increasingly behind the curve.

If I see suspicious activity, can I report it?

Yes. If you see suspicious activity, report it to us. We add your signal to the hundreds of others we use to inform our technical approach.

If we want to create a healthy information ecosystem, we all have a part to play. We’ve reinforced these points many times and we will get stronger and more direct in our public efforts at this critical moment in the global public conversation. We’re keenly aware of our responsibility in this space. That includes protecting the integrity of our service, continuing to keep platform manipulation off of Twitter, and leading with transparency by sharing regular updates on our progress and learnings.
