Dylan Moses on transparency and why content moderation and censorship are not the same thing
June 17, 2024 | UGC

Welcome to Part 2 of our interview with Dylan Moses, a veteran Trust and Safety professional with hands-on experience at major tech companies, and a Founding Fellow at the Integrity Institute. You can read Part 1 of our interview here.
In this segment, Dylan delves deeper into the critical role of transparency in content moderation, explaining how it not only builds user trust but also supports broader societal and regulatory frameworks. He also discusses practical steps platforms can take to be more transparent, the distinction between content moderation and censorship, and the nuances of implementing moderation practices across different global markets. Additionally, Dylan shares valuable insights and advice for new professionals entering the trust and safety field, and offers a thought-provoking perspective on the future of content moderation by 2030.
When you talk about transparency in content moderation decisions, is that primarily to build user trust or are there other elements there as well?
Dylan Moses: The primary goal of transparency in content moderation is to build user trust. Platforms with billions of users, like Facebook and YouTube, have become the modern public square, raising concerns about how speech is regulated.
There’s growing concern about platforms picking winners and losers in terms of speech: who gets to speak and who doesn’t, which voices rise to the top and which get deprioritized. People aren’t really sure what the mechanics behind those decisions are. Often, users are left puzzled when their content is removed without a clear understanding of why, receiving only generic responses that their post violated certain terms.
Transparency is crucial not just for explaining individual moderation decisions but also for understanding the broader mechanics of how platforms operate – how they design their algorithmically curated environments that impact what is seen and shared. For instance, there’s concern about how automated, engagement-based curation might contribute to serious societal issues, such as influencing eating disorders among children.
I think transparency also extends to the regulatory framework, allowing for better oversight and more effective interventions from law and policy makers. It’s about ensuring platforms are open about their operations to mitigate potential harms while preserving the benefits they offer. This kind of openness is akin to legal due process, where one is entitled to know the specific charges against them and the reasoning behind them, applied here to the context of platform governance.
How can platforms make it clear that they are doing their best and they’re trying to keep things safe and not silence people? What are some of the actions they can take to be more transparent?
Dylan Moses: That’s an excellent question. We’ve moved beyond the era when platforms could simply state they are working hard on these issues. When I was working at Facebook, I believed there were sincere efforts, or at least the right intentions, when it came to moderation, especially from those within the trust and safety teams who genuinely strive to address significant content challenges.
However, simply claiming to do the right thing isn’t enough. We’re at a point now where we need more than Scout’s honor. We need to show how the process works. For example, YouTube’s VP of Trust and Safety helped create a detailed video explaining their content moderation processes. This kind of transparency is essential and can be highly effective. Platforms can produce informative content – like a brief video or a detailed hour-long explanation – detailing what trust and safety involves, how policies are developed, and how decisions are made on controversial issues.
I think it’s important to educate the public on the collaboration that occurs behind the scenes. Trust and Safety decision makers don’t operate in a vacuum; they work with a range of policy experts to craft environments that are safe for all users, stepping in only when speech veers into hate speech, misinformation, or extremism.
Additionally, the field of trust and safety is becoming more professionalized with organizations like the Trust and Safety Professional Association and the Integrity Institute now providing external expertise and transparency. They publish a wide range of materials and guidelines that help demystify the process of policy creation and enforcement in content moderation.
These days there are also many trust and safety events, such as TrustCon and RiskCon and Marketplace Risk. These conferences are widely attended, and the organizations that attend are sharing their experiences. People are starting to take the lessons they’ve learned out into the public. But I think the platforms need to do it themselves as well.
You wrote in your recent article that content moderation and censorship are absolutely not the same thing. Can you explain the difference between content moderation and censorship?
Dylan Moses: Content moderation, as practiced by professionals at platforms like Facebook, YouTube, or Etsy, is fundamentally different from what we typically understand as censorship.
Censorship often involves a powerful entity using its authority to silence specific viewpoints, potentially with severe consequences like criminal sanctions or financial penalties, and really is more about the government’s ability to stifle expression. In contrast, content moderation on social platforms operates within a set framework where rules are established to maintain inclusivity. If someone violates these rules – say, by promoting hate speech or making threats – they face consequences like temporary bans or permanent suspensions. However, these actions are based on specific violations rather than an attempt to suppress free expression broadly.
Hardly anyone is ever truly censored for speech by the government in the United States. And in the rare cases where speech is punished, it’s likely because you’ve defamed somebody or intentionally inflicted emotional distress on someone else.
Unlike government-imposed censorship, being moderated on one platform doesn’t prevent individuals from speaking elsewhere. Users can often find other platforms to voice their opinions, suggesting a less restrictive form of regulation. In cases where users find themselves repeatedly banned across different platforms, it usually indicates a pattern of rule violations rather than arbitrary censorship.
Content moderation aims to prevent harm that arises from dangerous speech, such as incitement of violence or spreading misinformation that could disrupt peace and order. For instance, assertions that could lead to electoral misinformation or violence are moderated not out of political bias but to prevent real-world harm.
Although there’s no immediate cause for alarm – at least here in the States – we must be judicious about entrusting a single entity with extensive control over speech governance. This concern echoes the caution of America’s founders who feared government overreach could stifle political discourse, a principle central to the formation of our democratic ideals. Generally, platforms like TikTok or Meta are not engaged in suppressing political speech deliberately. However, there are instances where it can appear that way, underscoring the need for vigilance.
In essence, the distinction lies in moderation aiming to safeguard the platform and its users by enforcing rules, contrasting with censorship’s often punitive and broad silencing of speech. Our approach should involve careful oversight and transparency to ensure that the governance of speech on these platforms aligns with our values of free expression.
How do content moderation practices differ across global markets?
Dylan Moses: Content moderation policies are generally consistent globally because they’re often grounded in international human rights laws and principles, as well as foundational legal frameworks like the US Constitution. These include prohibitions against hate speech that calls for harm against protected groups or uses dehumanizing language. However, the implementation of these policies can vary to address specific regional sensitivities or legal requirements.
For example, while hate speech policies broadly prohibit harmful speech against defined groups, some countries require specific protections or adjustments based on local contexts. In India, for instance, the caste system necessitates particular policies to prevent discrimination and hate speech against people based on their caste. Similarly, in South Africa, historical and ongoing racial tensions require targeted measures to protect certain groups like Boer farmers from hate speech.
These nuances mean that while the overarching principles of content moderation may be scalable and consistent, platforms often need to make adjustments. When specific issues or conflicts arise that the platform was previously unaware of, it becomes necessary to engage with local experts and civil society to understand and address these unique challenges effectively (though this happens to varying degrees across platforms, and some critical voices are missed). This approach helps in developing country-specific policies that respect and protect local cultural and social norms.
So while the foundational standards for content moderation are designed to be universally applicable, they are flexible enough to accommodate layers of specificity to tackle particular local issues as they arise.
Reflecting on your journey through the field of trust and safety, what would you say are some of the key lessons you’ve learned and what advice would you give to new professionals entering this space?
Dylan Moses: One key lesson I’ve learned is the challenge of separating your personal feelings from professional responsibilities, especially when confronted with distressing content. For instance, during the Christchurch attack, I had to review an extremely disturbing live stream at least 60 times, as did other content moderators, a number of them Muslim. All of these folks were watching people who practice the same religion or look similar to them being killed just because of who they are and how they look and how they pray. This deeply affected our diverse team of moderators and it taught me the importance of dispassionate review.
It’s important to remember that our legitimacy as moderators comes from our ability to apply policies impartially, after those policies have been thoroughly vetted and understood by all the stakeholders involved.
Another vital lesson is recognizing and respecting cultural differences. For me, it was realizing that the way we do things in the United States is not the same way people do things in Africa, Asia, or Latin America, and learning to respect those cultural differences while still knowing where to draw the line when harm arises. Understanding these nuances is crucial not only for effective moderation but also for preemptive measures that address potential harms before they escalate.
Content moderation should also be integrated into any new product feature from the start, much like cybersecurity or privacy protections. This early integration prevents many problems downstream and ensures that the team is prepared to handle issues as they arise.
In terms of advice for new professionals, the first thing is to take care of yourself. The field of trust and safety can be mentally and emotionally taxing, often feeling like you’re constantly in crisis mode. It’s vital for your mental health to disconnect, spend time with loved ones, and maintain a healthy work-life balance.
Finally, ensure your work aligns with your personal morals and values. If you find a disconnect between your principles and the policies you’re required to enforce, it may be time to reconsider your role. Working in an environment where you believe in what you’re doing is crucial for your mental and emotional health.
What do you foresee for the future of content moderation and trust and safety by 2030?
Dylan Moses: By 2030, I anticipate significant advancements in AI’s role in content moderation. OpenAI, for example, has already achieved impressive accuracy rates for AI-driven hate speech moderation, reaching about 95% accuracy, which surpasses the typical 90-91% accuracy rates for human moderators. To me, this suggests a future where AI models, becoming increasingly sophisticated, handle the bulk of content moderation tasks.
One of the greatest challenges previously faced by AI in this field was understanding context, but as these models evolve, their ability to interpret nuances in conversation is improving markedly. This means that AI could soon handle more complex moderation tasks with less human intervention, though human oversight will remain essential. So I think the biggest trend we’re going to see is less human moderation and more moderation by models, but certainly with human oversight.
However, this shift towards AI-driven moderation will not be without its challenges. Issues of fairness, transparency, contestability, and reliability could become more acute as AI takes on a larger role. It will be crucial for platforms to maintain rigorous standards of transparency about how decisions are made and to ensure that those decisions are fair and just, effectively providing a form of digital due process.
Furthermore, as AI starts to generate more content, not just moderate it, we will face new challenges regarding the integrity of content on social platforms. If bots begin to produce harmful content, we will need to consider the implications and develop robust policies to manage this. The varying levels of sophistication and efficacy among AI models mean that ensuring consistent and fair moderation will become increasingly complex.
In essence, the future of content moderation will likely see a blend of advanced AI-driven processes supplemented by critical human judgment, with an increased focus on ethical standards and transparency to guide these developments.