Are Text Moderation Services Better Than Profanity Filters?
June 20, 2022 | Profanity Filter, UGC
Have you ever witnessed or experienced harassment online?
Have you ever wondered what tools are available to protect your users from dealing with harassment on your site or app?
Any website where people post or interact requires monitoring. Even those websites that might be considered the safest are vulnerable to users posting offensive content. With so many different moderation services, it can be difficult to choose the correct one.
Here, we’re going to explore the differences between two such services: text moderation and profanity filter services.
Which is better? Let’s find out!
Why do I need a moderation service?
In the age of the internet, consumers can easily research products or services they want to use, and if they find something unsavory, they have plenty of other options. When customers see offensive or harmful content unchecked on your site, they may leave, or even worse, let their network know that you can’t be trusted. This is where pre-moderation (screening all content before it is posted) can help.
A moderation service mitigates the risk that someone will post something offensive for current or prospective users to see, which protects you from losing customers or damaging your brand.
What is a profanity filter service?
Profanity filters were created to block users from posting offensive text such as curse words or racist terms. They filter out a predetermined list of words, and they typically also let the customer create a custom “blocklist” of additional words or phrases specific to their brand’s concerns. An effective profanity filter service is not simply a list of offensive words, but an advanced algorithm designed to catch the various creative ways users attempt to obfuscate those words, such as replacing letters with numbers (leet speak) or using repeated characters. When selecting a profanity filter service, make sure it covers the languages your users speak.
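To make the obfuscation point concrete, here is a minimal sketch of how a filter might normalize leet speak and repeated characters before checking a blocklist. It is an illustration only, not how any particular service works: the `BLOCKLIST` words, the `LEET_MAP` substitutions, and the function names are all hypothetical, and a production filter handles far more cases (multiple languages, spacing tricks, words hidden inside other words).

```python
import re

# Hypothetical blocklist using mild placeholder words; real lists are far larger.
BLOCKLIST = {"darn", "heck"}

# Illustrative leet-speak substitutions (not exhaustive; "1" could also mean "l").
LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)


def normalize(token: str) -> str:
    """Lowercase, undo leet-speak substitutions, and drop non-letter symbols."""
    token = token.lower().translate(LEET_MAP)
    return re.sub(r"[^a-z]", "", token)


def collapse_repeats(token: str) -> str:
    """Reduce every run of a repeated character to a single character."""
    return re.sub(r"(.)\1+", r"\1", token)


# Collapse the blocklist too, so stretched spellings compare equal to the entry.
COLLAPSED_BLOCKLIST = {collapse_repeats(word) for word in BLOCKLIST}


def find_profanity(text: str) -> list[str]:
    """Return the original tokens that match the blocklist after normalization."""
    hits = []
    for raw in text.split():
        cleaned = normalize(raw)
        if cleaned in BLOCKLIST or collapse_repeats(cleaned) in COLLAPSED_BLOCKLIST:
            hits.append(raw)
    return hits


print(find_profanity("well h3ck that is daaarn annoying"))
```

Collapsing repeats on both the input and the blocklist is a deliberate simplification: it catches stretched spellings, at the cost of occasionally matching an innocent word that collapses to the same form.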
The results of the profanity filter can be used in various ways: preventing the post or comment from appearing on the customer’s site, escalating it to an internal team for further review, or replacing the offensive language with asterisks or other symbols. Some profanity filters also let customers create their own “allowlist” of words or phrases they wish to permit, which lets them adjust the strictness of the filter.
Profanity filters are typically easy-to-integrate APIs that run in the background as your users post to your site.
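The three ways of using a filter result described above (block the post, escalate it, or mask the words), together with an allowlist override, can be sketched as follows. Everything here is an assumption made for illustration: the `Action` names, the `moderate` function, the status strings, and the word lists do not correspond to any real API.

```python
import re
from enum import Enum

# Hypothetical word lists; the allowlist overrides the blocklist.
BLOCKLIST = {"darn", "heck"}
ALLOWLIST = {"heck"}

WORD = re.compile(r"[A-Za-z']+")


class Action(Enum):
    REJECT = "reject"      # do not publish the submission
    ESCALATE = "escalate"  # hold it for an internal reviewer
    MASK = "mask"          # publish with offending words replaced by asterisks


def is_blocked(word: str) -> bool:
    """A word is blocked if it is blocklisted and not explicitly allowlisted."""
    lowered = word.lower()
    return lowered in BLOCKLIST and lowered not in ALLOWLIST


def moderate(text: str, action: Action) -> dict:
    """Apply the configured action when the text contains blocked words."""
    found = [m.group(0) for m in WORD.finditer(text) if is_blocked(m.group(0))]
    if not found:
        return {"status": "published", "text": text, "found": []}
    if action is Action.REJECT:
        return {"status": "rejected", "text": None, "found": found}
    if action is Action.ESCALATE:
        return {"status": "pending_review", "text": text, "found": found}
    # Action.MASK: replace each blocked word with asterisks of the same length.
    masked = WORD.sub(
        lambda m: "*" * len(m.group(0)) if is_blocked(m.group(0)) else m.group(0),
        text,
    )
    return {"status": "published", "text": masked, "found": found}


print(moderate("Oh darn, what the heck", Action.MASK))
```

In this sketch "heck" passes because it appears on the allowlist, which is the kind of strictness adjustment an allowlist is meant to provide.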
What is a text moderation service?
Text moderation differs from a profanity filter in that text moderation AI attempts to understand the intent behind what is posted. Instead of matching a predetermined list of words or phrases, text moderation services look for malicious intent in predetermined categories such as bullying or bigotry. A sentence doesn’t need to contain profane language to be offensive, and in these cases a more robust text moderation service is preferable to a stand-alone profanity filter.
Text moderation services are often used in conjunction with a human moderator who can approve or deny the flagged content. AI does its best to determine context, but even the most effective text moderation services will occasionally require humans in the loop. Even for highly trained human moderators, determining the context of a text exchange can be challenging.
Another advantage of a text moderation service is that it provides more detail on user submissions than a profanity filter does. For example, a profanity filter will alert you to profanity and return the words found in a submission, while a text moderation service will describe the context of a submission, such as whether sexual advances or criminal activity were detected.
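As a rough sketch of the difference in detail, the structure below combines the words a profanity filter returns with per-category results from a text moderation service, then routes intent-flagged content to a human moderator. The `ModerationResult` fields, the numeric scores, the `flag_threshold` value, and the decision labels are all hypothetical; the source does not describe how any real service formats its response.

```python
from dataclasses import dataclass, field


@dataclass
class ModerationResult:
    """Hypothetical combined result for a single user submission."""
    text: str
    # What a profanity filter would return: the offending words themselves.
    profanity_found: list[str] = field(default_factory=list)
    # What a text moderation service might add: a score per intent category.
    category_scores: dict[str, float] = field(default_factory=dict)


def route(result: ModerationResult, flag_threshold: float = 0.5) -> tuple[str, list[str]]:
    """Return a decision plus the intent categories that triggered it."""
    flagged = sorted(
        name for name, score in result.category_scores.items() if score >= flag_threshold
    )
    if result.profanity_found:
        return "reject", flagged
    if flagged:
        # No blocklisted words, but possible malicious intent: ask a human.
        return "human_review", flagged
    return "approve", flagged


clean_words_bad_intent = ModerationResult(
    text="nobody here likes you, just quit",
    category_scores={"bullying": 0.8, "criminal_activity": 0.1},
)
print(route(clean_words_bad_intent))
```

The example submission contains no blocklisted word, so a profanity filter alone would let it through; the category score is what sends it to review.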
Best uses and applications for moderation services
Anywhere people interact online, there is an opportunity for them to share offensive content. A handful of use cases in particular call for moderation services. In each of the following, there is often a need for both profanity filtering and a text moderation service, in order to not only block profanities but also flag content that contains no blocklisted words yet still carries malicious intent.
People young and old enjoy video games, often with strangers around the globe. Competitive games can get heated, and unfortunately some players attack one another personally through in-game chat. A proper moderation service should be in place to combat these varying types of offensive content ranging from bullying and hate to criminal activity and more.
From sexually explicit content to the sharing of personal information, dating apps are vulnerable to all kinds of inappropriate content. Personal attacks, bigotry, or sexual advances are quite common as users can be particularly bold and aggressive in these environments. With a moderation service in place, these apps can ensure that their users can’t send or be exposed to this kind of content.
Blog/In-app comment sections
Many blogs and apps have comment sections where people can share their opinions. However, these opinions can be passionate, aggressive, and potentially inappropriate. A strong moderation service is a must to properly monitor these interactions and keep users safe.
It goes without saying that games and apps geared towards children require the strictest moderation approaches to shield children from age-inappropriate content and harmful interactions, such as sexual activity, predators inviting children to meet offline (grooming), mental health issues like self-harm, and cyberbullying.
Why character count matters
If users are simply creating usernames or customizing products with very limited character fields (think initials on a custom tennis racket), then a profanity filter may suffice. The more free-form text a platform allows, the more likely you are to require a text moderation service in addition to a profanity filter. Full user-generated sentences mean you now face risks beyond offensive words, and the overall intention and context of the submission must be evaluated.
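One way to express this rule of thumb in a platform’s configuration is to pick the checks per input field based on how much text the field accepts. The sketch below is only an illustration: the `required_checks` function, the check names, and the 20-character cutoff are assumptions, and the right cutoff depends on your own content and risk tolerance.

```python
def required_checks(max_chars: int, short_field_limit: int = 20) -> list[str]:
    """Pick moderation checks for an input field based on its character limit.

    The 20-character default is an arbitrary, illustrative cutoff.
    """
    checks = ["profanity_filter"]  # every text field gets at least this
    if max_chars > short_field_limit:
        # Room for full sentences means intent and context must be evaluated too.
        checks.append("text_moderation")
    return checks


print(required_checks(3))    # e.g. initials on a custom product
print(required_checks(500))  # e.g. a comment box
```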
A note about allowlists
Some sites or apps (like kids’ games) go so far as to only let users choose from certain allowlisted phrases. However, allowlists cannot screen for context. Benign words can still be combined into inappropriate phrases such as innuendos, which makes allowlists ineffective without an extra layer of text moderation to catch content that is intended to be offensive.
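The limitation is easy to see in a small sketch of a word-level allowlist check. The `ALLOWED_WORDS` set and the `passes_allowlist` function are hypothetical and exist only to illustrate the point: the check confirms that each word is permitted, but it knows nothing about what the words mean together.

```python
# Hypothetical allowlist for a children's game chat.
ALLOWED_WORDS = {"come", "to", "my", "house", "good", "game", "nice", "play", "again"}


def passes_allowlist(message: str) -> bool:
    """Return True only if every word in the message is on the allowlist."""
    words = [word.strip(".,!?") for word in message.lower().split()]
    return bool(words) and all(word in ALLOWED_WORDS for word in words)


print(passes_allowlist("good game!"))        # benign, passes
print(passes_allowlist("come to my house"))  # also passes: context is never checked
```

Both messages pass, even though the second is an invitation to meet offline, which is exactly the kind of content a platform for children would want flagged.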
Budget vs. Risk Tolerance
When evaluating the best moderation plan, a platform must always weigh its budget against its tolerance for risk. Text is just one type of content that platforms must contend with, since many allow image and video submissions as well.
Overall, to best keep a community safe, a combination of a powerful profanity filter and a text moderation service (for determining context) is the best approach. Profanity filter services on their own are less expensive than a comprehensive text moderation service, so some startups or smaller brands begin with a profanity filter service as a first line of defense. As an example, for as little as $15 per month, a platform can quickly and easily implement WebPurify’s profanity filter service and take a large step towards keeping its platform safe. That said, using the profanity filter exclusively, without text moderation, does leave a site with vulnerabilities. Also, remember that even with a profanity filter and a comprehensive text moderation service, you are relying on AI alone, which is never the preferred approach. We always recommend including live moderators in the loop, which comes at an additional cost.
WebPurify’s Profanity Filter
WebPurify offers a profanity filter service as well as an optional Offensive Intent add-on (our text moderation service). Our profanity filter not only blocks profane words in 15 different languages, but also goes above and beyond with your own customizable smart blocklist. With this customization, you can set the filter to catch words that are particularly harmful for your brand, such as your competitors’ brand names. This smart blocklist can also find offensive words that have been hidden inside other words using our Deep Search technology.
Looking for something more robust that can detect offensive content beyond a blocklist? WebPurify also offers an Offensive Intent add-on to our profanity filter that distinguishes harmful content based on its phrasing and context in several categories: sexual advances, criminal activity, bullying, bigotry, mental health issues, external contact, and personal attacks. If there is harmful intent, our AI is designed to catch it.
With so many ways to interact with others online, there is always a chance that someone will try to post offensive text. Several tools are available to help mitigate this problem, but only some of them also focus on the intent of the person posting.
If you are considering moderation software for your website or app and you are looking for something more robust, text moderation, working in conjunction with a profanity filter, is the way to go.