Why AI isn’t going to solve Facebook’s fake news problem
Facebook has a lot of problems right now, but one that’s definitely not going away any time soon is fake news. As the company’s user base has grown to include more than a quarter of the world’s population, it has (understandably) struggled to control what they all post and share. For Facebook, unwanted content can be anything from mild nudity to serious violence, but what has proved most sensitive and damaging for the company is hoaxes and misinformation, especially when they have a political bent.
So what is Facebook going to do about it? At the moment, the company doesn’t seem to have a clear strategy. Instead, it’s throwing a lot at the wall and seeing what works. It’s hired more human moderators (as of February this year it had around 7,500); it’s giving users more information in-site about news sources; and in a recent interview, Mark Zuckerberg suggested that the company might set up some sort of independent body to rule on what content is kosher. (Which could be seen as democratic, an abandonment of responsibility, or an admission that Facebook is out of its depth, depending on your view.) But one thing experts say Facebook needs to be extremely careful about is giving the whole job over to AI.
So far, the company seems to be just experimenting with this approach. During an interview with The New York Times about the Cambridge Analytica scandal, Zuckerberg revealed that for the special election in Alabama last year, the company “deployed some new AI tools to identify fake accounts and false news.” He specified that these were Macedonian accounts (Macedonia being an established hub in the fake-news-for-profit business), and the company later clarified that it had deployed machine learning to find “suspicious behaviors without assessing the content itself.”
This is smart because when it comes to fake news, AI isn’t up to the job.
AI CAN'T UNDERSTAND FAKE NEWS BECAUSE AI CAN'T UNDERSTAND WRITING
The challenges of building an automated fake news filter with artificial intelligence are numerous. From a technical perspective, AI fails on a number of levels because it just can’t understand human writing the way humans do. It can pull out certain facts and do a crude sentiment analysis (guessing whether a piece of content is “happy” or “angry” based on keywords), but it can’t understand subtleties of tone, consider cultural context, or ring someone up to corroborate information. And even if it could do all this, which would knock out the most obvious misinformation and hoaxes, it would eventually run up against edge cases that confuse even humans. If people on the left and the right can’t agree on what is and is not “fake news,” there’s no way we can teach a machine to make that judgement for us.
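To make the limits of “crude sentiment analysis” concrete, here is a minimal sketch of a keyword-based scorer. The word lists and the function name `keyword_sentiment` are hypothetical, chosen only for illustration; this is not a description of any tool Facebook uses.

```python
import re

# Hypothetical keyword lists, chosen only for this illustration.
POSITIVE = {"love", "great", "happy", "wonderful"}
NEGATIVE = {"hate", "angry", "terrible", "awful"}

def keyword_sentiment(text: str) -> str:
    """Label text as 'happy', 'angry', or 'neutral' by counting keywords."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "happy"
    if score < 0:
        return "angry"
    return "neutral"

sincere = "I love this great article."
sarcastic = "Oh great, I just love being lied to."

# Both sentences get the same label, because the scorer only counts words
# and has no notion of tone or intent.
print(keyword_sentiment(sincere), keyword_sentiment(sarcastic))
```

A human reader sees at once that the second sentence is a complaint, but a keyword count cannot tell the two apart.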
In the past, efforts to deal with fake news using AI have quickly run into problems, as with the Fake News Challenge, a competition held last year to crowdsource machine learning solutions. Dean Pomerleau of Carnegie Mellon University, who helped organize the challenge, says that he and his team soon realized AI couldn’t tackle this alone.
“We actually started out with a more ambitious goal of creating a system that could answer the question ‘Is this fake news, yes or no?’ We quickly realized machine learning just wasn’t up to the task.”
Pomerleau stresses that comprehension was the primary problem. To understand why language can be so nuanced, especially online, we can turn to the example set by Tide Pods, the laundry detergent capsules that look like a tasty snack but are not edible. As Cornell professor James Grimmelmann explained in a recent essay on fake news and platform moderation, the internet’s embrace of irony has made it extremely difficult to judge sincerity and intent. Facebook and YouTube found this out for themselves when they tried to remove Tide Pod Challenge videos in January this year.
As Grimmelmann explains, when it came to deciding which videos to delete, the companies would have been faced with a dilemma. “It’s easy to find videos of people holding up Tide Pods, sympathetically noting how tasty they look, and then giving a finger-wagging speech about not eating them because they’re dangerous,” he says. “Are these sincere anti-pod-eating public service announcements? Or are they surfing the wave of interest in pod-eating by superficially claiming to denounce it? Both at once?”
Considering this complexity, it’s no wonder that Pomerleau’s Fake News Challenge ended up asking teams to complete a simpler task: build an algorithm that can spot articles covering the same topic. That turned out to be something the teams were pretty good at.
With this tool a human could tag a story as fake news (for example, claiming a certain celebrity has died) and then the algorithm would knock out any coverage repeating the lie. “We talked to real-life fact-checkers and realized they would be in the loop for quite some time,” says Pomerleau. “So the best we could do in the machine learning community would be to help them do their jobs.”
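The idea of matching new coverage against a story a human has already flagged can be sketched with a simple bag-of-words cosine similarity. This is an illustrative assumption, not the method the challenge teams actually used; the stopword list, the `same_topic` function, the 0.4 threshold, and the sample headlines are all hypothetical.

```python
import math
import re
from collections import Counter

# Small hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "is", "and", "for", "after"}

def tokens(text: str) -> list[str]:
    """Lowercase the text and keep alphanumeric words that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def same_topic(flagged: str, candidates: list[str], threshold: float = 0.4) -> list[str]:
    """Return the candidates whose wording overlaps strongly with the flagged story."""
    ref = Counter(tokens(flagged))
    return [c for c in candidates if cosine(ref, Counter(tokens(c))) >= threshold]

# A fact-checker flags one hoax headline (invented example data).
flagged = "Actor John Doe dies in car crash"
candidates = [
    "John Doe, actor, dies after car crash in Los Angeles",
    "Local bakery wins national bread award",
    "Fans mourn as reports claim John Doe dies in crash",
]
for headline in same_topic(flagged, candidates):
    print(headline)  # prints the first and third headlines
```

Note that a matcher like this only finds stories on the same topic. It would also match an article debunking the hoax, which is one reason a human still has to decide what is actually false.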
EVEN WITH HUMAN FACT-CHECKERS IN TOW, FACEBOOK RELIES ON ALGORITHMS
This seems to be Facebook’s preferred approach. For the Italian elections this year, for example, the company hired independent fact-checkers to flag fake news and hoaxes. Problematic links weren’t deleted, but when shared by a user they were tagged with the label “Disputed by 3rd Party Fact Checkers.” Unfortunately, even this approach has problems, with a recent report from the Columbia Journalism Review highlighting fact-checkers’ many frustrations with Facebook. The journalists involved said it often wasn’t clear why Facebook’s algorithms were telling them to check certain stories, while sites well-known for spreading lies and conspiracy theories never got checked at all.
However, there’s definitely a role for algorithms in all this. And while AI can’t do any of the heavy lifting in stamping out fake news, it can filter it in the same way spam is filtered out of your inbox. Anything with bad spelling and grammar can be knocked out, for example, as can sites that rely on imitating legitimate outlets to entice readers. And as Facebook has shown with its targeting of Macedonian accounts “that were trying to spread false news” during the special election in Alabama, it can be relatively easy to target fake news when it’s coming from known trouble-spots.
Experts say, though, that this is the limit of AI’s current capabilities. Mor Naaman, an associate professor of information science at Cornell Tech, adds that even these simpler filters can create problems. “Classification is often based on language patterns and other simple signals, which may ‘catch’ honest independent and local publishers together with producers of fake news and misinformation,” says Naaman.
And even here, there is a potential dilemma for Facebook. To avoid accusations of censorship, the social network should be open about the criteria its algorithms use to spot fake news. But if it’s too open, people could game the system and work around its filters.
For Amanda Levendowski, a teaching fellow at NYU Law, this is an example of what she calls the “Valley Fallacy.” Speaking about Facebook’s AI moderation, she suggests this is a common mistake, “where companies start saying, ‘We have a problem, we must do something, this is something, so we must do this,’ without carefully considering whether this could create new or different problems.” Levendowski adds that despite these problems, there are plenty of reasons tech firms will continue to pursue AI moderation, ranging from “improving users’ experiences to mitigating the risks of legal liability.”
These are surely temptations for Zuckerberg, but even then, it seems that leaning too hard on AI to solve its moderation problems would be unwise. And not something he would want to explain to Congress next week.
Originally published on the WeChat official account 灯塔大数据 (DTbigdata)