WebPurify co-founder Joshua Buxbaum speaks to the growing complexity of enterprise image moderation and the current limitations of machine learning for managing branded content.
Facebook. YouTube. Google. For several years, all three of them talked about the benefits of artificial intelligence (AI), computer vision and machine learning in moderating the vast quantities of user-generated content (UGC) on their platforms. Facebook alone receives more than 300 million photo uploads per day.
By early this year, all three said they were hiring thousands of human moderators to catch what the computers were missing.
The Reality of AI
While you may have found this a disappointment after all the media hype about smart tech, the need for human intervention shouldn’t come as a surprise. After all, here at WebPurify, we’ve used a hybrid system of AI and live moderation—by people—to scrub UGC for hundreds of companies and organizations over the last 10 years.
My cofounder, Jonathan Freger, often explains that the elimination of live moderators for images and video, especially, is much further away than you might think—even though AI continues to advance and computer vision continues to improve. In fact, a lot of our clients come to us after trying AI alone and being unsatisfied with the results.
AI is a computer. Currently—and for the foreseeable future—AI only works in numbers and probabilities. For image moderation, that makes it a filter, but not a complete solution.
Know Your Stakeholders
The first step toward successful image moderation is to know who you need to please. In the gaming space especially, users get angry fast. Sports platforms allow heated, active debate, even images of fans lighting jerseys on fire.
It’s important to respect their expectations when they interact with your company or organization. But a certain segment of people will always go as low as they can, so you need to work with your moderators to think like the bad guys and prevent problems.
Some people posting UGC draw a fine line between what they perceive to be moderation and censorship.
Make the Wisest Choice
We consulted on a campaign for a top pet food company. It was intended to help kids understand the danger of leaving Halloween candy out where pets could get it, and it involved moderating millions of images over the course of that holiday week.
Parents could upload a photo of their kitchen with the candy out in the open—on a chair or countertop—and then select from a menu of digital dogs to place in front of the candy. The site would quickly generate a realistic video of the dog walking into their kitchen and eating the dangerous treat, which they could show to their kids to reinforce the safety message.
We were concerned about people putting the candy in someone’s lap or staging inappropriate situations with the digital dog. So, we recommended the company not allow humans in the photos. And then our team scrutinized all of the images for potential offenses with the dog.
With kids as the intended viewers of the images, that decision was clear. In this space, you have to be okay with losing content to keep your audience safe. On a children’s site, if you’re not sure, reject it.
Jonathan always points out that there’s a big difference between what’s acceptable in a dating app and what’s acceptable for his kids.
It can be challenging to balance content with branding.
Viewers of UGC apply their perception of the content to the sponsor of the channel or platform where it appears—and that’s your company.
Every Image Needs a Moderator
We’ve even learned that event photo booths at music festivals, corporate conferences, conventions and other gatherings let users automatically send the photo strip, branded with the event logo, to Facebook, Twitter and Instagram—no matter what happens when the curtain is closed.
Theme park rollercoasters have the same problem. People figured out where the cameras were and started flashing them or putting their middle fingers up, so parks started moderating the photos before they go up for sale.
Companies come to us and even we’re surprised; some we’d never have thought needed moderation. But shutting down a photo booth, for example, means lost revenue.
You might think of moderation as a public relations issue, but in reality, it’s insurance.
It’s a bottom-line business issue. We often have clients who develop campaigns by saying, “Wouldn’t it be cool if …?” When we look at the vulnerabilities and risks of opening up that creative idea to the Internet, though, the campaign may not be worth the risk posed by the really upsetting things we’re going to see. Sometimes, it’s necessary to sacrifice user experience for the sake of safety.
That starts with the parameters we set for the AI. Say a UGC campaign asks users to show their favorite ride on a certain brand of lawnmower, or to snap a picture of a particular beverage at the lake. There’s no need for text in those images. So rather than moderate to address text, which is very expensive, we’d suggest making the presence of text a criterion for rejection.
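As a rough illustration (the function name and score format here are hypothetical, not WebPurify’s actual API), a campaign-specific rule like “no text allowed” reduces to a simple check against the AI’s label scores:

```python
# Hypothetical sketch: reject images where the AI reports text with high
# confidence, since this campaign expects no text at all.
TEXT_REJECT_THRESHOLD = 0.5  # assumed cutoff; tune per campaign

def violates_no_text_rule(label_scores: dict) -> bool:
    """label_scores maps label names to probabilities, e.g. {"text": 0.92}."""
    return label_scores.get("text", 0.0) >= TEXT_REJECT_THRESHOLD

print(violates_no_text_rule({"text": 0.92, "nudity": 0.01}))  # True
print(violates_no_text_rule({"nudity": 0.01}))                # False
```

Because the rule never asks what the text says, it stays cheap: there is no need for expensive text recognition or interpretation, only a confidence score that any text is present.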
How AI Really Works
Many people think computers “see,” but as my cofounder Jonathan often explains, they don’t, currently. They analyze pixels in an image for familiar patterns and characteristics.
It’s just like Siri or Alexa playing Taylor Swift—they analyze your voice and if there’s a certain percentage likelihood that you said, “Play Taylor Swift,” then that’s what they play.
AI doesn’t judge or make yes-or-no decisions. It provides data that your organization can use to make decisions.
AI Is a Helping Hand
For teams of professional moderators to screen millions of images for nudity, for example, could take months, while AI could process and rank that volume in seconds or minutes. Still, AI may not catch everything it’s programmed to find—or it may flag images that are benign.
In a perfect world, AI could do everything. But right now, it can only assist or second-guess people.
That’s why our general model is to source lots of images and then define, for example, what nudity is. Images that meet the definition are tagged as nudity, then as partial or full; within full nudity, they’re tagged with sexual positions, private parts, etc.
Those details let you make decisions based on more granular information.
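A sketch of that kind of nested tagging (the category names come from the nudity example above; the structure itself is illustrative, not our production schema):

```python
# Illustrative hierarchical taxonomy: each image carries a path of
# increasingly specific tags rather than a single yes/no label.
taxonomy = {
    "nudity": {
        "partial": {},
        "full": {
            "sexual_positions": {},
            "private_parts": {},
        },
    },
}

def is_valid_tag_path(path, tree=taxonomy):
    """Check that a tag path like ["nudity", "full"] exists in the taxonomy."""
    for tag in path:
        if tag not in tree:
            return False
        tree = tree[tag]
    return True

print(is_valid_tag_path(["nudity", "full", "private_parts"]))  # True
print(is_valid_tag_path(["nudity", "sexual_positions"]))       # False
```

The payoff of the hierarchy is that a client can set policy at whatever depth suits them: reject all nudity, or only full nudity, or only specific subcategories.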
So, to give another example, we can program the AI to identify the likelihood that a gun is present in an image. At this point, it can’t identify whether someone is using a gun in a crime, whether a police officer is wearing a holstered pistol in an official portrait or whether the guard at the Tomb of the Unknown Soldier is patrolling with a shouldered rifle.
People Provide Cognitive Reasoning
AI models today rely on teams of people to scrape, label and feed the data. Then, we rely on AI to reduce the volume of concerning images our teams need to review.
If you don’t want images of dogs in your UGC, our staff have to find 10,000 images of dogs in all kinds of settings—close-ups, partial shots, in cages, on leashes, everything that’s not wanted—and feed that data to the machines as a training set.
Then we find and feed 10,000 images that are completely different as a test set so that we can calibrate the AI. It’s only as good as the images we feed into it.
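Assuming each held-out test image carries a known ground-truth label, that calibration step can be sketched as scoring the model’s verdicts against the test set (function and data here are hypothetical):

```python
# Hypothetical calibration sketch: score the AI's "dog" detector against a
# held-out test set with known ground-truth labels.
def calibrate(predictions, ground_truth, threshold=0.5):
    """predictions: image_id -> probability of "dog"; ground_truth: image_id -> bool."""
    tp = fp = fn = 0
    for image_id, prob in predictions.items():
        predicted = prob >= threshold
        actual = ground_truth[image_id]
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

preds = {"a": 0.9, "b": 0.8, "c": 0.2, "d": 0.6}
truth = {"a": True, "b": False, "c": True, "d": True}
print(calibrate(preds, truth))  # precision and recall both 2/3
```

Low precision means benign images are being flagged; low recall means unwanted images are slipping through. Both feed directly back into what training images get added next.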
Context will likely always be a challenge for machines. This is where our people are essential.
We Follow Your Lead
After all that, you choose the moderation parameters. We might agree to set the AI to reject any image with an 80 percent or higher chance of including nudity, hate, alcohol or drugs, and to approve anything with a 30 percent chance or lower. Everything in between goes to our team for review.
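That routing rule is simple enough to sketch directly (the 80/30 thresholds are the example figures from above, and the function is illustrative, not production code):

```python
# Illustrative triage: high-confidence scores are auto-rejected, low ones
# auto-approved, and everything in between is escalated to human moderators.
REJECT_AT = 0.80   # probability at or above which the image is auto-rejected
APPROVE_AT = 0.30  # probability at or below which the image is auto-approved

def route(scores: dict) -> str:
    """scores maps categories (nudity, hate, alcohol, drugs) to probabilities."""
    worst = max(scores.values())
    if worst >= REJECT_AT:
        return "reject"
    if worst <= APPROVE_AT:
        return "approve"
    return "human_review"

print(route({"nudity": 0.05, "hate": 0.95}))  # reject
print(route({"nudity": 0.10, "hate": 0.02}))  # approve
print(route({"nudity": 0.55, "hate": 0.10}))  # human_review
```

Moving either threshold trades moderator workload against risk: widening the middle band sends more images to humans, narrowing it trusts the AI with more of the call.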
Our professional moderators use similar objective, consistent standards to remove as much personal interpretation as possible from the decision-making process. They train daily in our office, undergo regular quality-control checks and use systems that preserve your users’ privacy.
We often see ad agencies and companies try to add moderation at the last minute on social media campaigns—which have been blowing up for years because of a lack of moderation. They could succeed if moderation were worked into them instead of being an afterthought.
Moderation has to be a line item in the budget.
Moderation Is All About Details
It’s also important to determine the legal, brand and budget parameters for the project. Just one of these, when translated into criteria or timing, can sink all of your best efforts.
So, “Must have a person in the image,” would become, “Must have at least one clear human face (sunglasses allowed).”
With AI, “hate” is too broad. When you say “gore,” or “white supremacy symbol,” or “terrorism,” we’ll ask, “What do you think the visual components of that category are?”
That also helps our teams, who sometimes need to know why we’re screening for something. A swastika is considered a hate symbol in many parts of the world, but in other regions, like parts of Asia, it’s a spiritual, peaceful symbol.
Be as specific as possible.
Nothing Is Omniscient
The real difficulty is in categories where the scope is so large that neither the AI nor our teams can take in all the possible variables. It’s not feasible to train a team to recognize the thousands of landmarks worldwide. Or to recognize every product that competes with one of yours—when your company produces hundreds of products, each of which has hundreds of competitors.
You’re the Expert
With all of these details in play, constant feedback and escalation are essential to the process of setting, calibrating and verifying the work of both our AI and moderation teams.
Early in a project, we escalate some images to you for clarification—for example, if you don’t want watermarks, but find logos okay, we’ll ask you to verify that the AI and our staff are catching the right things.
Our quality control team gets an update after your review and goes over it with the moderation team. Then, we drop test images into the live project periodically to verify that the calibration and training are working.
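One way to picture that periodic check (the structure is a hypothetical sketch, not our internal tooling): known-answer “gold” images are mixed into the live queue, and moderator decisions on them are scored against the expected verdicts.

```python
# Hypothetical QC sketch: "gold" test images with known expected verdicts are
# dropped into the live queue; we measure how often moderators match them.
def audit(decisions, gold_answers):
    """decisions/gold_answers: image_id -> "approve" or "reject"."""
    checked = [img for img in decisions if img in gold_answers]
    correct = sum(decisions[img] == gold_answers[img] for img in checked)
    return correct / len(checked) if checked else 1.0

gold = {"g1": "reject", "g2": "approve"}
live = {"u1": "approve", "g1": "reject", "g2": "reject", "u2": "reject"}
print(audit(live, gold))  # 0.5
```

A falling audit score is a signal to retrain the moderation team, or to revisit the criteria themselves with the client.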
Be prepared to work through the period of calibration.
Initially, images are often over-rejected, until you and the moderators strike the right balance between user experience and the laxity or strictness of moderation.
Know Your Context
Details are also always changing. We constantly monitor the changing context in which people view certain images—whether that’s the legal status of marijuana or the growing perception of the Confederate flag as a hate symbol. It’s especially true of offensive and hate signs, since an image of the Confederate flag in a museum or history book, for example, could have a non-racist intent.
That’s the case with any symbol used at a demonstration. Protesters often make tongue-in-cheek signs about themselves using offensive symbols or words. But they all set off the AI and require rejection or human review.
What’s Next for Image Moderation?
The biggest threat we face, my cofounder Jonathan says, is that for every advancement in AI, there’s a counter-advance. He mentioned recently that MIT has been working on benign images that detect the use of AI moderation.
Once people detect that someone is using AI, they start trying to subvert it.
As a result, refinement and continued learning—both for AI and for moderation teams—are required to keep up with continuing changes.
As AI gets smarter, it can solve additional problems—so people need to make more nuanced rule sets and conduct more in-depth training.
Moderators Know Your Business
As AI takes on more of the basic tasks, people may be able to shift some of their focus toward higher-level strategic needs. We have insights into your audiences, after all.
We see every single image on a dating site, for example.
What if we could tell you that in 20 percent of your images, someone’s using a specific brand of phone?
That might be a business opportunity for your company.