The Facebook page that called for Egyptians to take to the streets on January 25, 2011—a day that would prove pivotal to the Arab Spring uprisings—was almost relegated to history. Just a few short months prior, Facebook had taken the page down, citing its policy requiring people to use their real name on the platform.
The Arab Spring marked the high point of the dream that huge social media platforms would bring the world together and place the power of the media in the hands of ordinary people. The uprising seemed a confirmation of the lofty free-speech rhetoric with which Facebook, YouTube, and Twitter all launched within the span of a few years in the aughts. But it’s been clear for a long time that the actions of the large U.S.-based internet platforms do not match that rhetoric.
Amid the COVID-19 pandemic, the censorship that has been a feature of the major platforms since their inception has only increased, and it has become more automated. At the same time, a funny thing has happened: The platforms are now being praised for finally being able and willing to carry out “content moderation”—a euphemism for what is actually private censorship—at scale. This shouldn’t be seen as something new, but as merely the latest turn in how the companies respond to pressure from advertisers, governments, and civil society.
A brief history of content moderation
As the world watched Iranians, then Tunisians, Egyptians, and many others utilize social media to organize protests and to shed light on events ignored by traditional media, social-media companies were understandably proud of how their tools were being used and positioned themselves as champions of free speech. Twitter’s UK general manager famously described the site as “the free speech wing of the free speech party” in 2012, and Facebook’s early mission was to “give people the power to share and make the world more open and connected.” As recently as last fall, Facebook founder Mark Zuckerberg described the company’s values as “inspired by the American tradition, which is more supportive of free expression than anywhere else.”
The U.S. legal frameworks that these platforms were developed around—the First Amendment’s broad protections for editorial discretion and the intermediary immunity afforded by Section 230 of the Communications Decency Act—protected and encouraged curation and empowered the companies to decide both what to publish and, critically, what not to publish.
But the companies’ actions did not follow the lines drawn by the First Amendment between protected and unprotected speech. Most notably, most major platforms banned most forms of nudity from their inception, despite the fact that sexual and nude expression enjoys broad protection in the United States and elsewhere. Seeking to avoid drawing the ire of advertisers and advocacy groups, the companies followed cultural taboo rather than law. And because nude images are relatively easy to detect with automated filters, the ban was also simple to enforce.
The perception that social-media companies are zealous adherents of the First Amendment was driven largely by how the major platforms handled hateful speech. Hate speech, which is difficult to detect and adjudicate in many contexts, is largely protected by law in the United States and subject to varying definitions and protections globally. Most platforms were historically more permissive of it, and even after putting bans in place, came under criticism for enforcing them insufficiently. By shielding the companies from liability for content posted to their platforms, U.S. law gave them cover as they hitched their brands to the First Amendment.
Although their policies are often criticized internationally as being “too American” and domestically as being too restrictive, the major platforms have never adhered to the spirit of the First Amendment, or for that matter, any other codified conception of free expression. Rather, their policies are the result of combining the individual values brought to the table by the authors of corporate policy documents and the pressure placed upon companies by civil society, domestic and foreign government actors, and of course, advertisers.
Centralized content moderation existed prior to the advent of Facebook, YouTube, and Twitter, but with far smaller user bases. MySpace, for instance, had just 75 million users at its peak, and relied on a centralized but rather ad hoc system for moderating content. Other services employed decentralized forms of content moderation, such as community curation. The large-scale outsourced moderation that we see now—dubbed “commercial content moderation” by scholar Sarah T. Roberts—emerged rather quietly in the early 2000s.
The result? A massive experiment in the unofficial regulation of speech: one that asks platform users to snitch on their friends, and content moderators, based largely in the global south, to view and censor all the things we don’t want to see.
Gamergate, the Islamic State, and 2016
The Arab Spring brought some of the first major examples of wrongful content moderation to the fore and sparked internal conversations at the companies about how to deal with certain types of content. In the case of the event page taken down ahead of the 2011 protests in Egypt, the activists behind the page worked with an NGO to restore it and transfer its administration to someone willing to go on the record.
As some of the uprisings throughout the region grew violent, social media became a secondary battlefield as state and quasi-state actors found it a useful tool for repression and propaganda. At the same time, human-rights activists documenting violations found their content removed for transgressing rules against depicting graphic violence. And as the platforms grew, the problem worsened.
More recently, content moderation has become more visible largely because the major platforms started targeting those speakers who—unlike marginalized and vulnerable communities—were not accustomed to having their words erased. Three events drove this development: the coordinated harassment campaign known as “Gamergate” that targeted female video game developers and a media critic, the rise of the Islamic State and its online presence, and the 2016 U.S. presidential election.
Each instance focused negative attention on the major platforms’ decision-making. In the case of Gamergate, platforms mostly refused to act against harassment and hate speech. With the Islamic State, platforms were seen as slow to take down graphic terrorist propaganda. And in the 2016 election, platforms failed to address massive, often state-backed disinformation and allowed white supremacist content to be normalized. In response to pressure from internet users and Western, democratic governments, the major platforms pledged to ramp up content moderation.
In the following years, content moderation expanded steadily with the major platforms racing to develop policies, practices, and expertise for a seemingly endless array of content categories, including various types of misinformation and disinformation, many forms of violent extremism, and posts about regulated goods—often under significant pressure from external forces. The major platforms hired tens of thousands of employees to review content for possible removal and look into complaints filed by other users. And this was typically done with little regard for the human rights of the authors and their intended audiences.
Recognizing the human rights consequences of the major platforms’ widespread censorial treatment of user content, in 2018, several civil society groups and academics jointly drafted the Santa Clara Principles. These principles proposed as an initial step that companies engaged in content moderation should provide meaningful due process to impacted speakers and better ensure that enforcement of their content guidelines is fair, unbiased, proportional, and respectful of users’ rights. The three principles required public transparency about the amount of content flagged, restricted, and removed; adequate notice to users of the platform’s rules and of the treatment of the user’s content; and a system of meaningful appeal of any content moderation decisions.
The major platforms have been slow to fully embrace the Santa Clara Principles. A 2019 Electronic Frontier Foundation report tracked more than a dozen platforms across six categories, including how companies respond to specific demands for transparency, whether they give sufficient notice of government and platform takedown requests, and whether they offer a right to appeal. One category encompasses a demand civil society has made for years: that companies publish quantitative data on how many user appeals for content restoration succeed. Only one company, Reddit, met the requirements in every category. Others, such as Facebook, provide some data but exclude broad categories.
What used to be a mostly human job is becoming increasingly automated. The desire to use automated tools is understandable: Content moderation is a horrible job for human beings, and its scale is so massive that it requires thousands of reviewers with varied language skills and cultural knowledge. But automated tools are not magic, and we are concerned that the tools being developed are not subject to sufficient external auditing or quality control. Most companies have been opaque about their AI tools, and there appears to be a considerable problem with false positives and the resulting over-censorship.
With the global spread of COVID-19, the impossible task of content moderation at scale has only become more impossible. Both moderation itself and the use of AI have increased during the crisis. With content moderators sent home and unable to work remotely, the platforms are leaning more heavily on AI tools that we expect will greatly increase the number of false positives. And without the capacity to manage appeals, the effect of these errors is immeasurable.
While medical disinformation has long been a problem, it is seen as more urgent now. Posts urging defiance of public-health orders are commonly targeted for removal. Facebook has published special revisions to its policies on coordinating harm, regulated goods, bullying and harassment, and incitement. Although these responses may be understandable, they make the platforms’ job harder and highlight the deficiencies in their content moderation systems.
As people stay at home and spend more time online, the impact of content moderation errors takes on a new shape. When the only place to express oneself is online, freedom of expression becomes even more vital.
Jillian C. York is the director for international freedom of expression at the Electronic Frontier Foundation.
David Greene is the civil liberties director and a Senior Staff Attorney at the Electronic Frontier Foundation.
Facebook, Twitter, and Google, the parent company of YouTube, provide financial support to the Brookings Institution, a nonprofit organization devoted to rigorous, independent, in-depth public policy research.