When more Covid-19 data doesn’t equal more understanding

Since the start of the Covid-19 pandemic, charts and graphs have helped communicate information about infection rates, deaths, and vaccinations. In some cases, such visualizations can encourage behaviors that reduce virus transmission, like wearing a mask. Indeed, the pandemic has been hailed as the breakthrough moment for data visualization.

But new findings suggest a more complex picture. A study from MIT shows how coronavirus skeptics have marshalled data visualizations online to argue against public health orthodoxy about the benefits of mask mandates. Such “counter-visualizations” are often quite sophisticated, using datasets from official sources and state-of-the-art visualization methods.

The researchers combed through hundreds of thousands of social media posts and found that coronavirus skeptics often deploy counter-visualizations alongside the same “follow-the-data” rhetoric as public health experts, yet the skeptics argue for radically different policies. The researchers conclude that data visualizations aren’t sufficient to convey the urgency of the Covid-19 pandemic, because even the clearest graphs can be interpreted through a variety of belief systems.

“A lot of people think of metrics like infection rates as objective,” says Crystal Lee. “But they’re clearly not, based on how much debate there is on how to think about the pandemic. That’s why we say data visualizations have become a battleground.”

The research will be presented at the ACM Conference on Human Factors in Computing Systems in May. Lee is the study’s lead author and a PhD student in MIT’s History, Anthropology, Science, Technology, and Society (HASTS) program and MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), as well as a fellow at Harvard University’s Berkman Klein Center for Internet and Society. Co-authors include Graham Jones, a Margaret MacVicar Faculty Fellow in Anthropology; Arvind Satyanarayan, the NBX Career Development Assistant Professor in the Department of Electrical Engineering and Computer Science and CSAIL; Tanya Yang, an MIT undergraduate; and Gabrielle Inchoco, a Wellesley College undergraduate.

As data visualizations rose to prominence early in the pandemic, Lee and her colleagues set out to understand how they were being deployed throughout the social media universe. “An initial hypothesis was that if we had more data visualizations, from data collected in a systematic way, then people would be better informed,” says Lee. To test that hypothesis, her team blended computational techniques with innovative ethnographic methods.

They used their computational approach on Twitter, scraping nearly half a million tweets that referred to both “Covid-19” and “data.” With those tweets, the researchers generated a network graph to find out “who’s retweeting whom and who likes whom,” says Lee. “We basically created a network of communities who are interacting with each other.” Clusters included groups like the “American media community” or “antimaskers.” The researchers found that antimask groups were creating and sharing data visualizations as much as, if not more than, other groups.
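The idea of turning retweets and likes into a network of communities can be illustrated with a minimal sketch. It is not the researchers' pipeline: the article does not say which clustering method they used, so this example substitutes plain connected components as a crude stand-in for community detection. The function names and the sample interactions are hypothetical.

```python
from collections import defaultdict


def build_interaction_graph(interactions):
    """Build an undirected adjacency map from (source_user, target_user) pairs."""
    graph = defaultdict(set)
    for source, target in interactions:
        if source == target:
            continue  # ignore users interacting with their own posts
        graph[source].add(target)
        graph[target].add(source)
    return dict(graph)


def find_clusters(graph):
    """Group users into connected components (an assumed stand-in for
    whatever community detection method the study actually used)."""
    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            user = stack.pop()
            if user in seen:
                continue
            seen.add(user)
            component.add(user)
            stack.extend(graph[user] - seen)
        clusters.append(sorted(component))
    return clusters


# Hypothetical retweet/like pairs for illustration; not data from the study.
interactions = [
    ("alice", "newsdesk"),
    ("bob", "newsdesk"),
    ("carol", "chartmaker"),
    ("dave", "chartmaker"),
    ("dave", "carol"),
]

clusters = find_clusters(build_interaction_graph(interactions))
print(clusters)
```

On real data, nearly every account would end up in one giant connected component, so an analysis at this scale would need a proper community detection algorithm that weighs how densely groups interact.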

And those visualizations weren’t sloppy. “They are virtually indistinguishable from those shared by mainstream sources,” says Satyanarayan. “They are often just as polished as graphs you would expect to encounter in data journalism or public health dashboards.”

“It’s a very striking finding,” says Lee. “It shows that characterizing antimask groups as data-illiterate or not engaging with the data is empirically false.”

Lee says this computational approach gave them a broad view of Covid-19 data visualizations. “What is really exciting about this quantitative work is that we’re doing this analysis at a huge scale. There’s no way I could have read half a million tweets.”

But the Twitter analysis had a shortcoming. “I think it misses a lot of the granularity of the conversations that people are having,” says Lee. “You can’t necessarily follow a single thread of conversation as it unfolds.” For that, the researchers turned to a more traditional anthropology research method, with an internet-age twist.

Lee’s team followed and analyzed conversations about data visualizations in antimask Facebook groups, a practice they dubbed “deep lurking,” an online version of the ethnographic technique called “deep hanging out.” Lee says “understanding a culture requires you to observe the day-to-day informal goings-on, not just the big formal events. Deep lurking is a way to transpose these traditional ethnography approaches to digital age.”

The qualitative findings from deep lurking appeared consistent with the quantitative Twitter findings. Antimaskers on Facebook weren’t eschewing data. Rather, they discussed how different kinds of data were collected and why. “Their arguments are really quite nuanced,” says Lee. “It’s often a question of metrics.” For example, antimask groups might argue that visualizations of infection numbers could be misleading, in part because of the wide range of uncertainty in infection rates, compared to measurements like the number of deaths. In response, members of the group would often create their own counter-visualizations, even instructing each other in data visualization techniques.

“I’ve been to livestreams where people screen share and look at the data portal from the state of Georgia,” says Lee. “Then they’ll talk about how to download the data and import it into Excel.”

Jones says the antimask groups’ “idea of science is not listening passively as experts at a place like MIT tell everyone else what to believe.” He adds that this kind of behavior marks a new turn for an old cultural current. “Antimaskers’ use of data literacy reflects deep-seated American values of self-reliance and anti-expertise that date back to the founding of the country, but their online activities push those values into new arenas of public life.”

He adds that “making sense of these complex dynamics would have been impossible” without Lee’s “visionary leadership in masterminding an interdisciplinary collaboration that spanned SHASS and CSAIL.”

The mixed methods research “advances our understanding of data visualizations in shaping public perception of science and politics,” says Jevin West, a data scientist at the University of Washington, who was not involved with the research. Data visualizations “carry a veneer of objectivity and scientific precision. But as this paper shows, data visualizations can be used effectively on opposite sides of an issue,” he says. “It underscores the complexity of the problem: that it is not enough to just teach media literacy. It requires a more nuanced sociopolitical understanding of those creating and interpreting data graphics.”

Combining computational and anthropological insights led the researchers to a more nuanced understanding of data literacy. Lee says their study reveals that, compared to public health orthodoxy, “antimaskers see the pandemic differently, using data that is quite similar. I still think data analysis is important. But it’s certainly not the salve that I thought it was in terms of convincing people who believe that the scientific establishment is not trustworthy.” Lee says their findings point to “a larger rift in how we think about science and expertise in the U.S.” That same rift runs through issues like climate change and vaccination, where similar dynamics often play out in social media discussions.

To make these results accessible to the public, Lee and her collaborator, CSAIL PhD student Jonathan Zong, led a team of seven MIT undergraduate researchers to develop an interactive narrative where readers can explore the visualizations and conversations for themselves.

Lee describes the team’s research as a first step in making sense of the role of data and visualizations in these broader debates. “Data visualization is not objective. It’s not absolute. It is in fact an incredibly social and political endeavor. We have to be attentive to how people interpret them outside of the scientific establishment.”

This research was funded in part by the National Science Foundation and the Social Science Research Council.