Understanding the COVID-19 misinformation flow using artificial intelligence (AI)-based tools and citizen panels

Start date: 1 September 2021
End date: 30 April 2022
Principal investigators: Gary Graham, Serge Sharoff and Neil Winn

Description

During the COVID-19 pandemic, social media has become an open space for people to access and report both true and fake news. This has led to what is often described as an ‘infodemic’, a term that refers to the massive circulation of false information related to COVID-19. Social media users can deliberately share false content with the intention to deceive, which results in disinformation. Equally, amid the uncertainty and changing measures, false content is also shared unintentionally when users are trying to find out what needs to be done. This uninformed act results in the spread of misinformation.

In both cases, false information is found to spread faster on social media than accurate information (Vosoughi et al., 2018), creating confusion and public mistrust in health authorities. More concerning still is the threat it poses to public health, as misinformation is found to promote vaccine hesitancy and expose citizens to the risk of contracting COVID-19.

As is often stated, “understanding the problem is half the solution,” and so this project was born out of the need to counter the spread of misinformation on social media by first understanding the underlying factors behind its spread. We focused on examining both the characteristics of misinformation content and the social media accounts sharing it.

The research hypothesizes that socio-demographic and psychological factors are important indicators that steer the spread of misinformation on social media. To gain a better understanding of these factors, this project explored the link between specific socio-demographic properties (age, gender, ideology) of social media profiles sharing misinformation and the key features of messages shared on different social media platforms (topic, style, true/false information, and sentiment).

To understand the key identifiers influencing the spread of COVID-19 misinformation, a number of variables were predicted from either the text of the message or the account of the user. To do so, an artificial intelligence tool was used to train statistical models. The predictions rely on classifying these properties with a type of machine learning model in which an algorithm learns to detect patterns from a large amount of training data. The accuracy of the results was verified against human annotations, which were also used to fine-tune the model.
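The general idea of training a classifier on labelled messages and checking its predictions against human annotations can be illustrated with a minimal sketch. The project does not specify which models were used, so the multinomial Naive Bayes classifier below is only an assumed stand-in, and the example messages, labels, and names (`NaiveBayesTextClassifier`, `tokenize`, `accuracy`) are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)      # messages per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, human_labels):
    """Share of predictions that agree with the human annotations."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, human_labels))
    return hits / len(texts)


# Invented toy data; real training sets are far larger.
train_texts = [
    "vaccine trial shows strong protection",
    "health agency publishes vaccine safety data",
    "secret plot hides vaccine microchip",
    "miracle cure they hide from you",
]
train_labels = ["true", "true", "false", "false"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)

# Held-out messages with hypothetical human annotations.
check_texts = ["secret microchip plot", "agency publishes safety data"]
check_labels = ["false", "true"]
print(accuracy(model, check_texts, check_labels))
```

The same pattern applies to the other predicted variables (for example topic, sentiment, or an account's age group): only the labels attached to the training examples change.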

A crucial part of verifying the accuracy and relevance of the prediction model was the authentic involvement of citizens throughout the stages of the study. Two discussion panels comprised 40 participants from different areas in the UK. These public engagements stimulated rich discussions about the factors shaping the level of trust in health information. The participants pointed to the influence of communities and social networks, educational level and digital literacy, age, and beliefs on trust in public communication of health information. These factors often foster mistrust in messages coming from the health authorities. Throughout the study, the citizens discussed demographic, social, and political topics that informed the categories used to develop the automatic prediction model.

Findings from the AI classifiers' predictions show that demographic factors play a crucial role in the COVID misinformation flow. First, users aged between 35 and 54 are found to share more false information, whereas those under 25 share less misinformation. Secondly, gender (female, male) was also classified in relation to COVID misinformation messages. This correlation reveals that males share significantly more false information than females. These findings also identify the age groups that are most vulnerable and need assistance on social media platforms. To this end, we urge health communication institutions to concentrate on providing these groups with facts and resources to protect them from falling prey to misinformation.

Furthermore, the classifiers' predictions show that accounts with a far-right political leaning share more COVID misinformation, characterised by conspiracy stories and anti-vaccine messages. In particular, we found that the amount of false content about vaccines shared by these accounts outweighs the amount of true content they disseminate about vaccines. This finding helps explain vaccine hesitancy, which needs considerable attention from health communication organisations. There is a clear knowledge gap regarding COVID vaccines on social media that needs to be filled with factual COVID-related messages. This will equip social media users with the right knowledge, bolster trust, and increase their confidence to make informed decisions.

Another key finding relates to the language of misinformation. We found that the content of misinformation messages tends to trigger fear, while their style resembles academic writing. Notably, COVID misinformation messages that masquerade as academic writing are, surprisingly, more numerous than messages carrying true academic information. This indicates how crucial it is to craft health messages with the right tone and style to counter misleading content.

Funding

This project is funded by the UKRI Policy Support Fund 2020-21 and the EPSRC Impact Accelerator Fund 2022.