Is there a right way to assess the impact of fake news?

A new working paper from the National Bureau of Economic Research, titled “Social Media and Fake News in the 2016 Election” (Allcott and Gentzkow, 2017), has been released.

The authors use survey data, web browsing data, and a simple statistical model of voter persuasion to draw five main conclusions:

A. Social media had an impact, but was “not dominant” as a source of election news (14% of survey respondents called it their most important source).

B. Drawing from the sample of fake news stories archived on fact-checking websites, there were 30 million Trump-positive shares and 8 million Clinton-positive shares.

C. An average American from their sample remembered 0.92 Trump-positive fake stories and 0.23 Clinton-positive fake stories.

D. A little over half of those who recalled seeing fake stories believed them.

E. One piece of fake news would have needed the persuasiveness of 36 campaign TV ads to have swayed the election.

I’m really happy to see work like this being done. However, I think fake news may have had more sway — it’s just a question of how to use empirical methods like these to correctly quantify it. Why?

This is not a rigorous critique or an academically sound paper review, but I’d take these conclusions with more than a few grains of salt.

The authors use Alexa and comScore to measure “the share of traffic on news websites that come from social media vs. other sources.” Notably, both “do not include mobile browsers and do not include news articles viewed within social media sites.” These exclusions give a flawed view of the sharing ecosystem.

Mobile browsers

As of Q1 2016, Facebook reported 894 million mobile-only monthly active users, indicating that, theoretically, more than half of the service’s user base isn’t observed in this sample.
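As a rough check on that “more than half” figure, the arithmetic can be sketched as below. The 894 million comes from the figure quoted above; the total of 1.65 billion monthly active users is an assumption taken from Facebook’s reported Q1 2016 numbers, not from the paper, and the function name is purely illustrative.

```python
# Back-of-the-envelope check of the mobile-only share of Facebook users.
MOBILE_ONLY_MAU = 894e6  # mobile-only monthly active users, quoted in the text
TOTAL_MAU = 1.65e9       # ASSUMPTION: total monthly active users for Q1 2016


def mobile_only_share(mobile_only: float, total: float) -> float:
    """Return the fraction of monthly active users who are mobile-only."""
    if total <= 0:
        raise ValueError("total must be positive")
    return mobile_only / total


share = mobile_only_share(MOBILE_ONLY_MAU, TOTAL_MAU)
print(f"Mobile-only share of monthly active users: {share:.1%}")
```

Under that assumed total, the mobile-only share works out to roughly 54%, which is consistent with the claim that more than half of the user base falls outside desktop-only measurement.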

There are also confounding issues, since a lot of external Facebook links are routed through the app’s in-app browser, which can interfere with many existing trackers and browsing metrics.

Unmeasured engagement

Anecdotal evidence from 2014 and 2016 indicates that embedded newsfeed advertisements have a click-through rate in the 1% to 5% range, depending on whether the ad is seen in the mobile or desktop feed (1, 2). Let’s assume that engagement with “real”, non-advertising content is much higher than with ads, even 10x higher. That would put click-through at 10% to 50%, so it’s entirely in the realm of possibility that fewer than half of the fake news links shared on Facebook were clicked through and viewed in a mobile or desktop browser.
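The bound in this argument can be made explicit. This is a minimal sketch, assuming the 1% to 5% ad click-through range quoted above and the hypothetical 10x multiplier for non-advertising content. The names are illustrative, and treating click-through as a simple product is itself a simplification.

```python
# Rough bound on how often shared links are actually clicked through.
AD_CTR_RANGE = (0.01, 0.05)  # 1% to 5%, the anecdotal ad figures from the text
ORGANIC_MULTIPLIER = 10      # HYPOTHETICAL: organic content engages 10x as much as ads


def organic_ctr_range(ad_ctr_range, multiplier):
    """Scale an ad click-through range by a multiplier, capping each end at 100%."""
    low, high = ad_ctr_range
    return (min(low * multiplier, 1.0), min(high * multiplier, 1.0))


low, high = organic_ctr_range(AD_CTR_RANGE, ORGANIC_MULTIPLIER)
print(f"Implied click-through on shared links: {low:.0%} to {high:.0%}")
print(f"Link views that never leave the feed: {1 - high:.0%} to {1 - low:.0%}")
```

Even with the generous multiplier, the implied click-through tops out at about half, so at least half of the exposure to a shared link would never register as a visit to the linked site.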

In other words, a large proportion of engagement with Facebook stays within Facebook’s “walled garden”, making measurement notoriously difficult. The paper’s use of Facebook share data addresses this somewhat, but ignoring the mobile and newsfeed segment still leaves a rather large hole.

Twitter

“…we do not record shares on these other sites because the number of Facebook shares is orders of magnitude larger.”

While it’s hard to map alternative forms of engagement on Twitter onto this link-sharing based approach, it’s important to note that a lot of fake news engagement happens on Twitter.

Here is an example of Clinton-negative fake news which was organically created and shared within the Twitter ecosystem. Its conclusions spread to other sources (perhaps including some of the fake news articles measured and shared on Facebook), but tens of thousands of engagements occurred solely on Twitter.

This also bypasses the issue of emotional and incidental engagement with Trump-positive “troll” accounts and unsourced snippets of fake news which take the form of tweets.

In conclusion: I think this work is a step in the right direction, but the conclusions it draws are too strong for the data underpinning them. Many forms of influence and engagement aren’t likely to be fully captured using desktop Alexa data, comScore, and Facebook-only shares. I think this means the paper might understate the effects of fake news.