
Live chat moderation: how WebPurify blocks abusive speech in real-time

June 8, 2023 | Profanity Filter

Being able to chat with other people using text, in real time, is incredibly useful. You might be shopping online and looking for advice on what to buy, or seeking IT support for a product you’re having trouble with. You might be a student in a virtual classroom, or just chatting about a subculture you love with fellow fans. Whatever the context, the ability to type and get an instant response is much more user-friendly and fun than sending an email and waiting ages for a reply.

However, inherent in this spontaneity is a considerable pitfall: the heightened risk of inappropriate or harmful content slipping through the cracks. And that can tarnish a brand’s reputation faster than you can type out an apology.

The solution? Live chat moderation, ideally provided by professionals with ample experience, because the challenges of moderating live chat are as fast-moving as the medium itself.

Protecting a brand’s image requires a diligent, round-the-clock effort to detect and defuse any content that could prove detrimental. It’s an intricate dance of technology and human judgment; a delicate balancing act between maintaining a free and open conversation and ensuring it remains respectful, civil, and on-brand.

At WebPurify, we are well-versed in the risks brands face from unmoderated live chats and the challenges in keeping on top of them. We’ve developed a robust, responsive system capable of keeping brand reputations untarnished, and below we lift the lid on our process.


What are the risks of live chat?

The vast majority of live chat online is, of course, harmless. But companies still need to be vigilant. “Any time you allow one person to communicate with another, there are risks,” points out WebPurify co-founder Josh Buxbaum.

“There’s the risk of offensive content, such as racist or hateful language. There’s the ability to attack someone, bully someone, and hurt someone’s feelings. At the extremes, there’s even the potential to drive someone to kill themselves. You’re interacting with another human being, and words are powerful and influential.”

Three things make live chat particularly risky, Josh explains. First, because it happens in real time, it feels very conversational and can therefore be significantly more impactful than, say, posting a comment on a forum. Second, the rapid-fire, back-and-forth nature of chat can be escalatory. The ability to quickly trade messages often has a snowball effect, leaving few if any pauses for users to rethink something they would hesitate to say in an email, comment or review. And thirdly, the anonymity of most live chat forums emboldens people to post things they’d never say face to face.

“For instance, I’m in my forties and a 16-year-old kid would never look me in the eyes and just fling abuse at me,” Josh explains. “But all of a sudden he’s pretty brave because he’s anonymous. And this means some people can get pretty aggressive, particularly when it comes to bullying.”

So how do you go about moderating live chat, in a way that prevents harm and minimizes abuse, before things spiral out of control?

How WebPurify’s chat moderation works

The most basic approach is keyword filtering, which involves software flagging when certain words are used, such as ethnic slurs. “This angle’s not to be discounted, but it will only get you so far,” says Josh. “At WebPurify, we take a more sophisticated approach by using AI, which doesn’t just look for keywords, but also context and intention, across full sentences and paragraphs.”
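To make that contrast concrete, here is a minimal, purely illustrative sketch of the keyword-only approach (the blocklist, function name and messages are invented; this is not WebPurify’s implementation). A literal match is easy to catch, but a rephrased threat slips straight through, which is exactly the gap a context-aware layer is meant to close.

```python
# Toy keyword filter -- invented blocklist, for illustration only.
BLOCKLIST = {"kill you", "hate you"}  # placeholder terms

def keyword_flag(message: str) -> bool:
    """Return True if the message contains any blocklisted term."""
    text = message.lower()
    return any(term in text for term in BLOCKLIST)

print(keyword_flag("I'm going to kill you"))       # True: literal match is caught
print(keyword_flag("I'm gonna make you unalive"))  # False: rephrased threat slips through
```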

Using this context-aware approach, WebPurify’s advanced software is able to automatically identify bad behavior such as unwanted sexual advances, criminal activity, bigotry and more: phrases that are inappropriate and that you certainly don’t want on your platform, but that may not contain literal “bad words.” It then (optionally, and depending on the client’s use case) complements this with a layer of human experts, who can understand the nuance and subtlety of what’s being discussed, and determine whether action needs to be taken.

These content moderators are also there to check anything flagged by users. Which raises an interesting question: if a platform has a proper trust and safety setup, with good user reporting, why is third-party moderation even needed?

It’s largely a question of scale, says Josh. “Even if users are reporting problems, it’s not going to be 10 million reports. No matter how big your platform is, it’ll be a tiny percentage of all of the user-generated content. So we use the AI to check all of this content, and then it pulls out pieces for our moderators to evaluate.”
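As a rough illustration of that triage flow (the scoring function, thresholds and example messages below are hypothetical placeholders, not WebPurify’s actual system), an AI model scores every message and only the uncertain slice is routed to human moderators:

```python
# Hypothetical AI-first triage sketch: score everything, send only the gray
# area to human review. Thresholds and scorer are invented for illustration.
from dataclasses import dataclass

@dataclass
class Decision:
    action: str   # "allow", "remove", or "human_review"
    score: float  # model's estimated probability that the message violates policy

ALLOW_BELOW = 0.20    # confidently safe
REMOVE_ABOVE = 0.95   # confidently violating

def triage(message: str, score_message) -> Decision:
    """Score a message with a model; route only uncertain cases to moderators."""
    score = score_message(message)  # stand-in for a hosted moderation model call
    if score < ALLOW_BELOW:
        return Decision("allow", score)
    if score > REMOVE_ABOVE:
        return Decision("remove", score)
    # The uncertain slice -- a small fraction of total volume -- is what
    # human moderators actually see.
    return Decision("human_review", score)

# Example with a stand-in scorer:
print(triage("nice goal!", lambda m: 0.02))        # allowed automatically
print(triage("ambiguous message", lambda m: 0.6))  # routed to human review
```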

What’s more, while user reporting is a reactive process, scanning with AI is a proactive measure that seeks to remove content in violation of community guidelines before it reaches the end-user.

“Brands need to consider the reactive nature of user reporting. If your customers are flagging something as offensive, the damage has already been done to their UX,” Josh explains. “It’s of course great that you empower your community to report something they think is wrong – this gives users a sense of agency – but you don’t want them on the front line of content moderation. Retaining a third party that uses AI to check everything in real time, first, gets ahead of most problems.

“Lastly,” continues Josh, “keep in mind that misused solutions can themselves be a problem. In other words, it’s important that you have human staffing sufficient to quickly review user escalations, because sometimes users will falsely flag one another’s accounts as a form of trolling or harassment. Here human moderators can spot a pattern and set things right, allowing appropriate action to be taken before a malicious account does more damage.”

Many platforms also have their own community moderators, who can be useful in raising issues. But they’re no substitute for professional content moderation, says Josh.

“Typically, they’ll have someone working for free who’s really passionate about whatever community it is,” he explains. “And they’ll be watching the chat and saying things like: ‘Hey, watch your language, or I’m gonna kick you off.’ This is certainly one line of defense for a platform, but these community members are not highly trained, so rules are not always consistently enforced. Sometimes they can even be vindictive. But they’re not getting paid, and they’re not really accountable, so what are you going to do?”

What types of content get flagged?

WebPurify works with a diverse group of companies across many industries, and the types of content that are flagged for moderation will vary from platform to platform. But typically with live chat, content moderators will be alert to things such as bullying, hate speech, sexually inappropriate behavior, blackmail and grooming.

Child predators offer a particularly tough challenge, says Josh. “These guys are super-manipulative, and super-strategic,” he explains. “They’ll go out of their way to create a friendship with someone who’s underage. They’ll compliment him or her, laugh, tell stories, and ensnare people with their charm. All with the aim of eventually either meeting up in person, or convincing them to send an inappropriate picture or video.

“They’ll then use that as blackmail, to make the victim send more pictures or send money, or whatever it may be. Basically, a teenager (unfortunately it’s more often girls) is now in a very vulnerable position and can easily be manipulated by some of these guys who have trained for this; this is what they do.”

Thankfully, professional content moderators are trained and committed too. And WebPurify’s moderation and reporting efforts led to the arrest of more than 500 such child sexual predators last year.


Game of cat and mouse

It’s a constant battle, though, because bad actors are getting more and more creative in the ways that they try to evade content moderation.

“They will type the same thing 50 different ways until it isn’t caught,” Josh explains. “For example, instead of saying ‘I’m going to kill you’ they’ll say, ‘I’m gonna make you unalive.’ Instead of stating their age clearly, they’ll say: ‘I’m the square root of 72.’ So it’s a cat-and-mouse game, where people are constantly trying to come up with new and creative ways to evade the systems. Which means we have to constantly adapt our content moderation services to catch them.”
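One way to picture the arms race (with an invented substitution table, not WebPurify’s real ruleset): a simple normalization pass catches character-swap tricks, but coded phrases like “the square root of 72” are untouched, which is why context-aware models and trained humans have to keep adapting.

```python
# Partial countermeasure to one evasion tactic: normalize common character
# substitutions before matching. The mapping is an invented example.
SUBSTITUTIONS = str.maketrans({"0": "o", "1": "i", "3": "e", "@": "a", "$": "s"})

def normalize(text: str) -> str:
    return text.lower().translate(SUBSTITUTIONS)

print(normalize("k1ll y0ur$elf"))              # "kill yourself" -- now matchable
print(normalize("I'm the square root of 72"))  # unchanged; needs semantic understanding
```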

All this being said, no software will ever be 100% effective, and so at WebPurify, human and AI moderation are often used in tandem, and rightly seen as complementary. WebPurify’s employees receive expert training, specific sub-teams specialize in different platforms in order to become bona fide experts, and all moderators continually build their skills and knowledge around things like popular slang and cultural trends.

“What’s nice about content moderation,” Josh adds, “is that we’re all doing this for a common good, so we do share information with moderators in other organizations. So for example, we’ll say, ‘Hey, keep an eye out. We’re starting to see people submitting this racist meme that doesn’t look racist, but when you really dig into it, it’s pretty nasty.'”

How quickly can moderators respond?

Given that live chat is instantaneous, though, how quickly can moderators respond to problematic content?

“That’s essentially a staffing question,” Josh responds. “Sometimes clients will ask: ‘Can you moderate all our content in 30 seconds or less?’ And we actually have a client we do that for. But we’re going to need a certain number of people for that because comments don’t always come in a steady, regular stream. Sometimes you might have 50 come in at once, so you’ll need more moderators online to deal with that quickly.”
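As a back-of-the-envelope illustration of why bursts, not averages, drive staffing (the numbers below are invented, not any client’s figures):

```python
import math

burst_size = 50        # messages landing at once (Josh's example)
review_seconds = 20    # assumed average human review time per message
sla_seconds = 30       # hypothetical promised turnaround

# How many messages can one moderator clear within the SLA window?
per_moderator = max(1, sla_seconds // review_seconds)   # = 1

moderators_needed = math.ceil(burst_size / per_moderator)
print(moderators_needed)   # 50 -- the burst, not the daily average, sets headcount
```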

WebPurify also ensures its live chat moderators are fully trained and equipped to handle situations quickly and efficiently. And that’s largely about understanding context.

“Our moderators need to understand the cultural nuance of whatever language they’re moderating in,” Josh explains. “Because in live chat, people speak in abbreviations, might combine two different languages, or use vernacular, and this is constantly evolving. So moderators need to understand the nuance of what people are saying, so they can catch the bad stuff.”

That partly comes from training, but it’s also about hiring moderators who have a strong understanding of the language they’re working in to begin with, and who specialize in a single platform.

The need for content moderators to be in-house

In summary, the takeaways for organizations wishing to outsource their live chat moderation are pretty clear. “You need AI-driven software,” says Josh. “But you also need trained humans. And you need an experienced company that solely focuses on content moderation because this stuff is extremely complicated.”

Josh adds one more crucial point. “Most importantly, these moderators should be in-house, not crowdsourced, and ideally from the same vendor providing AI moderation. As I mentioned earlier, community moderators are often a nice idea but unaccountable and inconsistent, since they’re largely layperson volunteers; it’s a question of training and accountability. Ideally, you want your moderators all under one roof, with quality control, managerial oversight and extensive training working in lockstep. You need a company that can do this 24/7. It’s not that easy.”

“That way,” he explains, “the humans will know the AI inside out, and understand where the gray areas live. They’ll be able to make quick and informed decisions when issues are flagged. So it’s always more cohesive if the AI and the humans are from a single source.”