The evolution of image moderation – from manual review to automated systems
October 23, 2024 | Image Moderation

In the mid-2000s, image moderation on the internet felt like a simpler task, even if it was often thankless. Back then, moderators combed through image uploads, searching for signs of explicit nudity, violence, or crude jokes. The mission was straightforward: keep content safe and appropriate for all users. The internet of that era looks quaint in hindsight. Social media was still a nascent force, memes were a new novelty, and most “offensive” content consisted of pixelated images or text-based obscenities.
Today, image moderation has morphed into something far more complex – a high-stakes game of digital whack-a-mole where AI-generated deepfakes and hyper-realistic simulations pose ethical dilemmas that go beyond a simple thumbs up or down.
Consider this: just over a decade ago, the biggest threats in image moderation were explicit photos and crude Photoshopped pranks. Now, content moderators must distinguish between real and AI-generated faces that can mimic not only the appearance but the expressions and nuances of actual human beings.
Take, for example, actor and director Jordan Peele’s famous deepfake video of Barack Obama in 2018, depicting a world leader seemingly giving a controversial speech. Peele presented the video as an example of the sheer power – and peril – of AI-generated content and how it can potentially shift public opinion in real time. It was a warning that the stakes had risen sharply and that the consequences of moderation failures could be catastrophic.
WebPurify has been on the front line of this shift for 18 years. Starting as a content moderation company when user-generated content (UGC) was mostly about managing forums and chatrooms, our business has evolved to meet the challenges of an ever-changing digital ecosystem. We’ve adapted to countless new threats, whether that means moderating generative AI content and synthetic media, responding to crises in real time, or flagging misinformation campaigns. The key to survival in this space is not just reacting to what’s new but anticipating what’s next. Our team, leveraging nearly two decades of hands-on experience, is constantly iterating on our moderation models to identify emerging trends and respond swiftly when new risks arise.
While it’s easy to focus on the present – on the rise of deepfakes and synthetic content – the journey of image moderation is a story of continuous evolution, marked by key moments that have shaped it into what it is today. It’s a story of technology catching up to human ingenuity, ethical lines being redrawn, and the online world becoming ever more sophisticated. Below, we reflect on the early days of manual review all the way up to today’s automated systems powered by AI and machine learning, and show how image moderation has followed an extraordinary path.
Early days of image moderation – from human eyes to basic algorithms
In the web’s early days, the tools of the trade were human eyes and, if you were lucky, some rudimentary algorithms that could scan for basic patterns – like skin tones or colors commonly associated with explicit content. It was a manual slog. Every upload was a judgment call, made by a person staring at a screen and deciding whether an image crossed the line. The scale was manageable because the internet itself was smaller. Social networks were still evolving, and UGC platforms had not yet ballooned into the behemoths they would soon become. But as platforms like MySpace and YouTube ushered in an era where everyone with a camera phone could be a creator, the volume of content to review exploded.
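To appreciate just how crude those early automated checks were, here is a minimal sketch of the kind of skin-tone heuristic they relied on. The RGB ranges and the cutoff below are illustrative values chosen for this example, not any production filter:

```python
# Illustrative only: a crude skin-tone heuristic in the spirit of early filters.
# The RGB thresholds and the 0.4 cutoff are arbitrary example values.
from PIL import Image
import numpy as np

def skin_pixel_ratio(path: str) -> float:
    """Return the fraction of pixels falling inside a rough 'skin tone' RGB range."""
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.int16)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Reddish pixels where R > G > B with enough spread between red and blue.
    skin = (r > 95) & (g > 40) & (b > 20) & (r > g) & (g > b) & ((r - b) > 15)
    return float(skin.mean())

def flag_for_review(path: str, threshold: float = 0.4) -> bool:
    """Send the image to a human if too much of it looks like skin."""
    return skin_pixel_ratio(path) > threshold
```

A rule like this flags a beach photo and a close-up portrait with equal enthusiasm, which is exactly why human reviewers remained the backbone of moderation.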
Human moderators were overwhelmed. Suddenly, they were tasked with sifting through millions of uploads a day. What had started as a manageable process became a digital assembly line, where each reviewer’s decision determined whether content was published or flagged. The industry began to realize that a purely human approach was not only unsustainable – it was also inefficient.
The solution lay in finding ways to automate what could be automated, while keeping human oversight for the gray areas where judgment was still essential.
The machine learning leap – teaching computers to see
The arrival of machine learning algorithms in the early 2010s changed the game for image moderation. The new systems could be trained to recognize images based on vast datasets of labeled content – essentially teaching computers to “see” patterns in ways that were previously unimaginable.
Before this leap, moderation primarily relied on basic metadata analysis and manual visual review. For instance, an image might be flagged based on its filename or associated text, such as a title like “explicit photo,” rather than the content of the image itself. With the introduction of machine learning, algorithms evolved to analyze the visual elements directly, enabling them to detect problematic content like nudity, violence, or drug paraphernalia with greater speed and accuracy.
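The contrast between the two eras can be boiled down to a short sketch. Everything below is illustrative: `visual_classifier` is a hypothetical stand-in for any trained image model, and the blocked terms and threshold are example values, not part of any real product:

```python
# Illustrative contrast between metadata-era and ML-era checks; not a real pipeline.
BLOCKED_TERMS = {"explicit", "nsfw", "gore"}

def metadata_flag(filename: str, title: str) -> bool:
    """Early-style check: looks only at the text attached to an image."""
    text = f"{filename} {title}".lower()
    return any(term in text for term in BLOCKED_TERMS)

def visual_flag(image_bytes: bytes, visual_classifier, threshold: float = 0.8) -> bool:
    """ML-era check: scores the pixels themselves against learned categories."""
    # visual_classifier is hypothetical; imagine it returns {"nudity": 0.92, "violence": 0.03}
    scores = visual_classifier(image_bytes)
    return any(score >= threshold for score in scores.values())
```

The first function is trivially defeated by renaming a file; the second judges the image itself, which is what made the machine learning leap so significant.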
Yet, these early models were still blunt instruments. They could detect a bare expanse of skin but struggled to understand context. A picture of a swimsuit model on a beach could easily be flagged as inappropriate, while a similarly revealing image in an educational context would pass through undetected. It was a learning curve for both the machines and the companies deploying them.
WebPurify and its peers needed to continuously refine these algorithms, adjusting for cultural sensitivities and context – both of which play crucial roles in determining whether an image is suitable for a platform. At WebPurify, we have always believed in combining the critical thinking of human moderators with the scale and efficiency of AI to achieve the greatest possible accuracy.
Of course, as machine-learning models evolved, so did the threats. New forms of visual manipulation appeared, making it harder for algorithms to keep up. Moderators faced a flood of manipulated images that blurred the lines between what was real and what wasn’t, pushing the limits of what automated systems could reliably identify. This ushered in a new era: one where the rise of synthetic media would reshape the very foundations of what image moderation aimed to achieve.
The deepfake challenge – when reality and fabrication converge
The emergence of deepfakes in the late 2010s signaled a shift from moderation as a reactive measure to a proactive necessity. Deepfake technology, with its ability to fabricate highly realistic but entirely fictional images and videos, introduced a new level of complexity to content moderation. Suddenly, the tools that could detect nudity or violence were not enough; moderators needed to be able to distinguish real from fake.
For WebPurify, adapting to the challenges of generative AI meant developing tools that go beyond surface-level content analysis. By using AI-based detection models trained to identify visual inconsistencies and artifacts left behind by synthetic media, WebPurify can more effectively detect manipulated content. These models are built to recognize subtle irregularities, such as lighting mismatches or distortions, which are common in AI-generated images and videos. However, no single tool can catch everything. That’s why WebPurify uses a layered approach to generative AI content moderation – combining AI-driven detection with manual review by human experts, ensuring more accurate decisions, particularly in complex cases where context and nuance are key.
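As a rough illustration of what a layered flow can look like, consider the sketch below. The score names, thresholds, and routing rules are assumptions made for the example; WebPurify’s actual workflow is considerably more involved:

```python
# A minimal sketch of layered moderation: model scores route content to an
# automatic decision or to a human moderator. All thresholds are example values.
from dataclasses import dataclass

@dataclass
class Decision:
    action: str   # "approve", "reject", or "human_review"
    reason: str

def route_image(synthetic_score: float, policy_score: float) -> Decision:
    """Combine a synthetic-media score with a policy-violation score."""
    if policy_score >= 0.95:
        return Decision("reject", "high-confidence policy violation")
    if synthetic_score >= 0.90:
        return Decision("human_review", "likely AI-generated; context needed")
    if policy_score <= 0.05 and synthetic_score <= 0.10:
        return Decision("approve", "low risk on both signals")
    return Decision("human_review", "ambiguous; defer to a human moderator")
```

The point of the pattern is that automation handles the clear-cut cases at scale, while anything ambiguous or likely synthetic lands in front of a person.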
The deepfake era isn’t just a technical challenge; it presents an ethical minefield. The potential for harm extends far beyond the usual concerns of explicit content. Deepfakes can impersonate public figures, spread disinformation, or harass private individuals by inserting them into compromising scenarios. And as deepfake technology has advanced, so too have the tools and strategies employed to detect it. Image moderation has become a relentless game of cat and mouse, with each advance in synthetic media requiring a countermeasure in moderation.
The ethics of automation – when algorithms decide what we see
As image moderation systems have grown increasingly automated, ethical questions have surfaced alongside them. Algorithms, while efficient, are not impartial; they are shaped by the data used to train them. Biases – both subtle and overt – can seep into moderation decisions, with algorithms potentially flagging content from certain cultural groups more frequently or less accurately. The consequences are real, with marginalized voices sometimes disproportionately silenced or over-policed.
WebPurify, like other content moderation services, has faced the challenge of ensuring our models don’t just replicate existing biases but actively work to correct them. This requires an ongoing process of testing and refining our algorithms, as well as training human moderators to recognize the limits of automation. The solution isn’t to abandon automated moderation but to make it more accountable. Human moderators review edge cases and disputes, injecting a layer of ethical consideration that algorithms alone can’t provide.
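One way to make that accountability concrete is to audit flag rates across groups on labeled evaluation data. The sketch below is only an illustration; the record fields and the disparity threshold are assumptions made for the example, not an industry standard:

```python
# Illustrative bias audit: compare false-positive rates on benign content by group.
from collections import defaultdict

def false_positive_rates(records):
    """records: iterable of dicts with 'group', 'flagged' (bool), 'violates' (bool)."""
    fp, benign = defaultdict(int), defaultdict(int)
    for r in records:
        if not r["violates"]:                 # only benign content can be a false positive
            benign[r["group"]] += 1
            if r["flagged"]:
                fp[r["group"]] += 1
    return {g: fp[g] / benign[g] for g in benign}

def flag_disparities(rates, max_ratio=1.25):
    """Report groups whose false-positive rate exceeds the lowest group's by max_ratio."""
    if not rates:
        return {}
    baseline = min(rates.values())
    if baseline == 0:
        return {g: r for g, r in rates.items() if r > 0}
    return {g: r for g, r in rates.items() if r / baseline > max_ratio}
```

Disparities surfaced this way can then feed back into retraining data and reviewer guidance rather than sitting in a report.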
The balance between speed and accuracy has become a central issue. In a world where content spreads instantaneously – as with the 2019 mass shooting at a mosque in Christchurch, New Zealand, which was live-streamed to the world – delaying moderation can mean real harm, yet moving too quickly with automation might lead to unjust removals. The industry has yet to fully resolve these tensions, but the path forward clearly involves more transparency and layered approaches that allow human oversight to play a meaningful role in decision-making.
What lies ahead for image moderation
The next chapter of image moderation promises to be defined by technologies we’re only beginning to understand. Generative AI is not only creating realistic images but entire worlds, blurring the boundaries between digital and physical experiences. With the potential for hyper-realistic avatars and interactive virtual environments, the nature of the content that needs moderating will fundamentally change.
“We’ve seen this industry get smarter and more intellectual in its approach,” says WebPurify co-founder Josh Buxbaum. “The complexity of the communities and content we moderate means community guidelines are now more intricate. Content moderation is no longer just about flagging offensive words or images; it’s about understanding how people are trying to manipulate AI models, using subtle prompts to bypass moderation filters.
“We’ve recognized that the level of expertise and experience required is significantly higher than it used to be, so we have to be more strategic, agile and adaptive. Content moderation now demands a deeper understanding of how technology works and how it can be manipulated. That’s why we’ve built advanced moderation teams with specialized skill sets, alongside trainers who can create complex workflows to meet these challenges. We’ve built a team of experts who bring a deep level of experience, from GenAI prompt engineering to policy interpretation. We’re well-positioned to tackle these complex issues.”
Throughout all this change, one thing remains certain: the need for vigilant and adaptive image moderation will only grow. Our field will have to navigate not only technical hurdles but also societal expectations around privacy, freedom of expression, and ethical responsibility. The future of image moderation isn’t just about preventing harm; it’s about shaping the kind of digital world we want to live in. And as it has for the past 18 years, WebPurify will continue to adapt, innovate, and lead the charge.
Note: Make sure to check out our photo moderation challenge!