Classification aware neural topic model for COVID-19 disinformation categorisation.

Journal: PloS one

Abstract

The explosion of disinformation accompanying the COVID-19 pandemic has overloaded fact-checkers and media worldwide and has brought a major new challenge to government responses. Not only is disinformation creating confusion about medical science amongst citizens, but it is also amplifying distrust in policy makers and governments. To help tackle this, we developed computational methods to categorise COVID-19 disinformation. The resulting categories could be used for a) focusing fact-checking efforts on the most damaging kinds of COVID-19 disinformation; and b) guiding policy makers who are trying to deliver effective public health messages and to counter COVID-19 disinformation effectively. This paper presents: 1) a corpus containing what is currently the largest available set of manually annotated COVID-19 disinformation categories; 2) a classification-aware neural topic model (CANTM) designed for COVID-19 disinformation category classification and topic discovery; and 3) an extensive analysis of COVID-19 disinformation categories with respect to time, volume, false type, media type and origin source.

Authors

  • Xingyi Song
    University of Sheffield, Western Bank, Sheffield, S10 2TN, UK.
  • Johann Petrak
    Department of Computer Science, University of Sheffield, Sheffield, United Kingdom.
  • Ye Jiang
    Department of Computer Science, University of Sheffield, Sheffield, United Kingdom.
  • Iknoor Singh
    Department of Computer Science, University of Sheffield, Sheffield, United Kingdom.
  • Diana Maynard
    Natural Language Processing Group, Department of Computer Science, The University of Sheffield, Sheffield, United Kingdom.
  • Kalina Bontcheva
    Department of Computer Science, University of Sheffield, Sheffield, United Kingdom.