AI that knows hate: Facebook removes offensive posts before users report them

The Reinforced Integrity Optimizer (RIO), unveiled on Thursday, is meant to amplify the decision-making of the human reviewers who handle hate-speech posts and to speed up the process.

Facebook symbol (photo credit: REUTERS)
Social media giant Facebook announced on Thursday new steps meant to tackle hate speech, cyber-bullying and the use of fake profiles to influence real-world events, from crime to elections.
How big is the problem of harmful content on Facebook and Instagram? Comparing the third quarter of the year (Q3) with the second (Q2), removals of child nudity on Instagram more than doubled (1 million in Q3 vs. 481,000 in Q2), and removals of suicide and self-injury content rose nearly fivefold (1.3m. in Q3 vs. 277,400 in Q2). On Facebook, removals of cyber-bullying content rose by nearly half, reaching 3.5m. in Q3 compared with 2.4m. in Q2.
As for hate speech itself, 95% of the 22.1m. pieces Facebook removed were identified proactively, without users needing to report them at all.
This is done using the Reinforced Integrity Optimizer (RIO), unveiled by Facebook on Thursday, which is meant to help human moderators better identify and deal with hate-speech posts, accelerating the process.
In response to a question from the Post, chief technology officer Mike Schroepfer explained that RIO takes a principle from gaming and applies it to a real-world problem.
“In electronic chess, for example,” he said, “you tell the machine what the goal is, to score, and you let it learn how to do that. We tell RIO, find this thing we want to remove,” meaning hate speech.
The result is a real-time tuning system that keeps learning.
For example, when a post is brought to the attention of human Facebook moderators, RIO can learn the principle that guided their decision and apply it to other, similar cases.
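Facebook has not published RIO's internals, but the loop Schroepfer describes, in which each human ruling is fed straight back into the model as a training signal, can be sketched in a few lines of Python. Everything below (the scikit-learn model choice, the function names, the confidence threshold) is an illustrative assumption, not Facebook's actual system.

```python
# A minimal sketch of a real-time tuning loop, assuming scikit-learn is
# available. Hypothetical names throughout; this is not Facebook's RIO code.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**18)  # stateless, so no fitting step
model = SGDClassifier(loss="log_loss")            # supports incremental updates

def moderator_ruled(post_text: str, is_hate_speech: bool) -> None:
    """Fold a human moderator's decision straight back into the model."""
    features = vectorizer.transform([post_text])
    model.partial_fit(features, [1 if is_hate_speech else 0], classes=[0, 1])

def should_flag(post_text: str, threshold: float = 0.9) -> bool:
    """Proactively flag a post when the tuned model is confident enough."""
    features = vectorizer.transform([post_text])
    return model.predict_proba(features)[0, 1] >= threshold

# One ruling immediately changes how the next, similar post is scored.
moderator_ruled("example abusive post", is_hate_speech=True)
print(should_flag("example abusive post"))
```

The point of such a design is the feedback loop: the model is not retrained in periodic batches but nudged with every ruling, which is what would let it generalize a moderator's decision to similar posts in near real time.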
Schroepfer made it clear that the goal is to remove hate speech faster and more accurately, freeing the humans “to focus on the work humans need to do.”
What does that mean? Because Facebook is active in many languages and cultures around the world, the company hires local teams tasked, in some cases, with checking whether a post is authentic or even true by comparing it with local news. But these teams have needs of their own. As vice president of integrity Guy Rosen explained, COVID-19 has forced many of them to work from home, and graphic content is something many people do not want on their personal computers.
“We found 95% of users don’t click on a blurred image that shows a warning,” Rosen said. The warning screens spare both ordinary users of the platform and moderators working from home from unwanted exposure to graphic content or child nudity.
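The mechanics of that warning screen are simple to sketch: the image is heavily blurred before display, and the original is shown only after an explicit click-through. A minimal illustration, assuming the Pillow library and a made-up blur radius; Facebook's actual pipeline is not public.

```python
# A minimal sketch of a warning-screen blur, assuming Pillow is installed.
# The function name and radius are illustrative, not Facebook's implementation.
from PIL import Image, ImageFilter

def blurred_preview(src_path: str, dst_path: str, radius: int = 24) -> None:
    """Save a heavily blurred preview; the original loads only on click-through."""
    Image.open(src_path).convert("RGB").filter(
        ImageFilter.GaussianBlur(radius)
    ).save(dst_path)
```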
Nor is this the end of Facebook's efforts to combat hate speech on the platform.
“We’re not done yet,” Schroepfer told the Post. “We have lots of other ideas.”
Speaking on an audio press conference call, Facebook's vice president of content policy, Monika Bickert, stressed that the firm's data holds lessons beyond the company itself. “We [in Facebook] really learn from these numbers,” she said. “We think policymakers could learn from them as well.”