Algorithms are an important software technology with a lot of promise in many areas of our digital lives. But do they have a role in website content moderation?
That’s an aspect of a conversation I had with Wired.com writer Brian Barrett (Twitter: @brbarrett) for his article on a new algorithm that is designed to detect nudity in photos.
When I was asked whether WebPurify would use such an algorithm with our services that help website owners keep their sites free from indecent or unwanted content, my answer was “we would definitely be interested” – if the algorithm were infallible.
But, we’re a long way from that.
What is an algorithm?
By way of background, an algorithm is a “set of rules that precisely defines a sequence of operations.” A computer program will have multiple algorithms for specific tasks, each with steps that must be carried out in a specific order.
Algorithms work great for processing payroll or rendering webpages, but they are now starting to be used in new ways where the rules aren’t as clear. These applications – such as email spam filters – show that there’s still some work to do in this area.
The Wired.com article was about a company called Algorithmia that is trying out a new algorithm to detect nudity in photos. On the company’s website (isitnude.com), users can submit a photo and see if the algorithm can guess whether the photo is G-rated or worse.
Does it work? Here’s the author’s summary of his experimentation with the site:
It’ll also take more than just fine-tuning for the algorithm to become ironclad, or even lightly bronzed. It can’t yet recognize individual body parts, for instance, meaning that an image of someone who’s fully clothed except for, well, the parts you most want clothed, likely would still get through. And because it’s based purely on flesh, otherwise innocent beach pics might find themselves flagged. An even bigger hiccup? Because the algorithm relies on skin tone recognition, it’s powerless against black and white images.
Even if the algorithm were 100% effective in detecting nudity, there’s no way for it to determine whether it’s looking at a piece of art or an offensive photo.
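To see why a detector built on skin tone alone runs into the problems Barrett describes, here is a minimal, purely illustrative sketch – not Algorithmia’s actual method. It uses a classic RGB skin-color heuristic and a hypothetical flagging threshold, and it shows both failure modes: tan beach photos get flagged, while grayscale images sail through untouched.

```python
# Illustrative sketch of a skin-tone-ratio nudity heuristic.
# This is NOT Algorithmia's algorithm; the pixel rule and the 40%
# threshold are assumptions for demonstration only.
# An "image" here is simply a list of (r, g, b) pixel tuples.

def looks_like_skin(r, g, b):
    """Classic RGB skin heuristic: reddish pixels in a narrow band."""
    return (r > 95 and g > 40 and b > 20
            and r > g and r > b
            and (r - min(g, b)) > 15)

def skin_ratio(pixels):
    """Fraction of pixels that match the skin heuristic."""
    skin = sum(1 for r, g, b in pixels if looks_like_skin(r, g, b))
    return skin / len(pixels)

def flag_as_nude(pixels, threshold=0.4):
    # Hypothetical rule: flag when 40% or more of the pixels look like skin.
    return skin_ratio(pixels) >= threshold

# An innocent beach photo full of tan skin tones gets flagged...
beach_photo = [(220, 180, 140)] * 100   # flag_as_nude -> True

# ...while a black-and-white image never matches, because grayscale
# pixels have r == g == b, so the r > g test always fails -- the
# "powerless against black and white images" problem from the article.
grayscale_photo = [(180, 180, 180)] * 100   # flag_as_nude -> False
```

The grayscale blind spot falls directly out of the heuristic: any rule keyed to the relationships between color channels has nothing to work with once those channels are identical.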
One of the main points I make in the article is that nudity is only a small part of what we look for in our content moderation service. Violence, hate crimes, drug paraphernalia, alcohol, bullying – these are all content areas that need to be filtered and are much more complex than nudity. Here’s a simple example I gave of the complexity:
“Violence is complicated. We even get as detailed as ‘violence in a sport that’s not a violent sport’ is violence, but if it’s boxing, which is a violent sport, we allow it.”
So, are algorithms the future of content moderation?
Well, we already utilize highly effective algorithms with our profanity filter solutions and are excited by the prospect of them aiding our live image and video moderation teams someday. But there will always be a need for human intelligence to intercede and make more nuanced judgments. That’s why we have a full team of professionals to ensure that content posted to customer sites is always free from undesirable images, videos and text.