
“AI art currently poses the most challenges for content moderation.” Here’s why…

May 20, 2023 | Image Moderation

Artificial intelligence can be a tool for many great things, including content moderation services, but when it’s applied to creating visual art the implications are far-reaching. “It introduces many new risks for internet users and unique challenges for content moderators,” says Josh Buxbaum, co-founder of WebPurify. “AI-generated content, particularly images and art, is the biggest challenge in moderating content going forward over the next decade. Part of our mission is to adapt quickly to new technologies, as we did with VR moderation – and now, everything is moving in the direction of AI-generated content.”

But what is it, in particular, about AI-generated content – be it deepfake videos or auto-generated images that lift from copyrighted material – that poses such particular danger to brands and unique challenges to moderators?

The challenges AI art poses for moderators

AI has transformed the content moderation industry. AI image detection tools are trained to identify things like violence, weapons, blood, nudity – all of the common NSFW categories that brands and communities want filtered out at scale. But with the advent of AI-generated art, it becomes much easier to evade moderation. Users can prompt AI generators to create a surreal or abstract version of a violent image, for example, and it might slip through.

“As ridiculous as it may sound, someone might prompt an AI art generator to create an image of a man getting his head cut off, but instruct it to make his head much smaller than his body,” says Josh. “The AI moderation tool might let this through because it’s not part of any data set it’s been trained on. And that’s because it’s not real. It’s a surreal exaggeration. In the past, an image like this would require hours of a graphic designer’s time, and most people wouldn’t go to the effort. But with AI art generators, something like this can be prompted in seconds – and it’s accessible to everyone.”

The potential scale of the risk is what concerns Josh and his team. When potential violations of community guidelines aren’t clear-cut enough for AI to make a determination, human moderators step in to make the decision. But if more and more AI-generated content is entering the fray, there may be budget concerns for brands due to the scale of human moderators required. Platforms will need to implement new generative AI content moderation measures to stay ahead of new threats.
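The triage flow described above – automation for clear-cut cases, humans for everything ambiguous – can be sketched in a few lines. This is an illustrative assumption about how such routing might work, not WebPurify’s actual implementation; the threshold values and function names are hypothetical.

```python
# Hypothetical sketch of confidence-based moderation triage: the model's
# violation score decides whether content is auto-rejected, auto-approved,
# or escalated to a human moderator. Thresholds here are illustrative.
from dataclasses import dataclass


@dataclass
class ModerationResult:
    decision: str          # "reject", "approve", or "human_review"
    violation_score: float


def triage(violation_score: float,
           reject_above: float = 0.9,
           approve_below: float = 0.1) -> ModerationResult:
    """Route an image based on the classifier's violation confidence.

    Scores near 1.0 or 0.0 are clear-cut; the wide middle band - where
    surreal or exaggerated AI-generated imagery tends to land - goes to
    a human moderator.
    """
    if violation_score >= reject_above:
        return ModerationResult("reject", violation_score)
    if violation_score <= approve_below:
        return ModerationResult("approve", violation_score)
    return ModerationResult("human_review", violation_score)


print(triage(0.97).decision)  # clear violation  -> "reject"
print(triage(0.03).decision)  # clearly benign   -> "approve"
print(triage(0.55).decision)  # ambiguous        -> "human_review"
```

The budget concern follows directly from this design: the more AI-generated content lands in the ambiguous middle band, the larger the human review queue grows.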

“AI art is here to stay. And as a content moderator, like all new technologies that we need to quickly adapt to, it concerns me,” says Josh. “People can prompt these generators to create literally anything, and they have the ability now to create it faster and whenever they want.”

The other big challenge for content moderators as it relates to AI-generated content is copyright, which introduces all sorts of complicated issues that can be difficult to unravel. The first hurdle comes from the fact that the AI isn’t actually creating anything. It’s drawing on the huge amount of source data it’s been fed, but there’s no practical way to trace that source data back to where it came from. When someone asks an AI art generator to make a portrait of a celebrity or even a simple tropical sunset scene, these images are coming from many disparate places, and they were all ultimately created by someone. This raises some difficult questions when it comes to intellectual property.

While IP decisions will likely be made in the courts, another growing gray area arises when people ask the AI generators to mimic the style of an established artist. When someone asks ChatGPT to mimic the tone of Stephen King or prompts Midjourney to produce a portrait in the style of Annie Leibovitz, it may not be a blatant copyright infringement, but it raises questions. And, perhaps most importantly, it spreads doubts about authenticity, which is antithetical to the principles of user-generated content.


Is generative AI the community killer?

The risks from highly offensive content produced by generative AI are becoming well known, but something many brands haven’t considered is that AI-generated content (as opposed to user-generated content) has the potential to sow widespread doubt and mistrust among communities, especially when that content isn’t offensive and is instead convincing and relevant… but nonetheless untrue and auto-generated by a computer, not a human. When mistrust takes root, it breeds cynicism and drives people to abandon the platform.

As these generative AI models have become more sophisticated, they can produce increasingly convincing content that is often indistinguishable from that created by human users. And this ability is improving with every passing day. Whether it’s a well-written blog post, a comment on social media or even a photorealistic image, it’s becoming more challenging to discern if a human or AI was the creator, which leads to trust issues.

If users can’t be certain that the content they’re interacting with is generated by fellow humans, it can create a sense of alienation, undermining the very essence of the communities companies so carefully and thoughtfully cultivate: human interaction and shared experiences.

The potential harm this poses to such platforms is substantial. At the heart of any successful online community is the trust and engagement of its users, and if users start to doubt the authenticity of the content and the humanity of their interlocutors, engagement will plummet and activity will eventually cease. This could also create an environment ripe for misinformation and manipulation, as AI-generated content could be used to sway opinions or create false narratives. Imagine, for example, the revelation that most product reviews on an app are, in fact, AI-written. Think, similarly, about the implications for fan fiction, online contests and messaging on dating apps. Overall, the advent of advanced generative AI certainly poses new challenges to maintaining authenticity and trust in the digital space, and already WebPurify is hearing these concerns from its clients.

“It’s interesting because many of our clients so far aren’t as worried about the offensive content being produced by AI art generators,” says Josh. “They know we’re all over that, constantly monitoring. What they’re mostly worried about is the proliferation of content that isn’t real, period, especially when it’s non-offensive. They know that AI-generated content defeats the purpose of user-generated content, which is that it should be authentic. In all the discussions so far about AI-generated content, this is one of the most underestimated risks.”

Generative AI, like other artificial intelligence technologies, has raised a myriad of ethical considerations, and at the forefront of these concerns is the question of accountability. As these systems become more sophisticated – and they will – they can produce increasingly realistic and convincing text, images, music or even deepfake videos, which pose significant challenges in misinformation, propaganda, and privacy. Moreover, AI systems can inadvertently generate content that is harmful, offensive, or in violation of various guidelines and laws. This is complicated by the fact that the AI itself cannot be held accountable; thus, the responsibility falls to the developers, users or possibly the legal system to determine culpability.

“These discussions always come back to the fact that the people who develop this technology don’t understand the potential impact yet,” Josh says. “You can build in guardrails, but how do you protect against a question mark?

“What we’re seeing now are new jobs being advertised for roles called ‘prompt engineer,’ and what prompt engineers do is write different prompts to see what the output will be from the generators. This just goes to show that generative AI companies have built something with capabilities they don’t fully understand themselves.”

AI art content moderation solutions

Interestingly, Josh believes it will be AI tools that content moderators use to combat AI in the future.

“AI-generated content is going to get better and smarter,” Josh explains. “Then the AI-powered detection methods are going to fall behind and have to catch up. It’s going to be an ongoing race. We’ve seen it before when using AI to identify violent images. Once you figure out how to determine violence, people get more creative in the way that they submit said violence so that it evades detection. So you adapt the models to catch the new methods. It’s a cat and mouse game.”

With more than 17 years under their belts, fighting offensive content online since the days of Web 2.0, Josh and the WebPurify team have seen how the patterns play out. As tech evolves, people adapt their nefarious use of that tech in kind. In response, content moderation leaders update rules to help their human teams detect violations as best they can, and AI models are fed and trained on new data to catch up. What’s different this time, Josh says, is that AI-generated art is complicated in an especially daunting way, because it is subtle and nuanced to an as-yet-unseen extent, and pulls from source content spanning the entire internet.

In light of these issues, the role of content moderators will need to evolve, but the reassuring thing is that we know this. WebPurify knows this. And we’re working on it.