Deep Learning-Based Prediction of Daily COVID-19 Cases Using X (Twitter) Data.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Because controlling COVID-19 remains important, innovative methods for predicting cases from social network data are attracting increasing attention. This study aims to predict confirmed COVID-19 cases using data (tweets) from the X (Twitter) social network and deep learning methods. We prepare data extracted from tweets using natural language processing (NLP) and take the daily G-value (growth rate) of COVID-19, collected from Worldometer, as the target variable. We develop and evaluate a time series mixer (TSMixer) predictive model for multivariate time series. On the test dataset, the mean squared error (MSE) loss was 0.0063 for 24-month G-value prediction when using MinMax normalization with recursive feature elimination (RFE) and the average or min aggregation method. Our findings highlight the potential of integrating social media data to enhance daily COVID-19 case predictions and are also applicable to epidemiological forecasting.
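
The preprocessing and evaluation steps named in the abstract (daily aggregation by average or min, MinMax normalization, and MSE) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the (day, value) input layout, and the handling of a constant series are assumptions made for illustration, and the RFE and TSMixer steps are deliberately left out.

```python
from collections import defaultdict
from statistics import mean


def aggregate_daily(rows, how="mean"):
    """Collapse per-tweet feature values into one value per day.

    rows: iterable of (day, value) pairs (hypothetical layout).
    how:  "mean" (average) or "min", the two aggregations named in the abstract.
    """
    reducers = {"mean": mean, "min": min}
    if how not in reducers:
        raise ValueError(f"unsupported aggregation: {how}")
    by_day = defaultdict(list)
    for day, value in rows:
        by_day[day].append(value)
    # Sort by day so the result is a well-ordered daily series.
    return {day: reducers[how](values) for day, values in sorted(by_day.items())}


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1] (MinMax normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series maps to all zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

In a full pipeline, the scaled daily features would be passed to feature selection and then to the forecasting model, and `mse` would be computed between the predicted and observed G-values on the held-out test period.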

Authors

  • Nourhan Ahmed
    Information Systems and Machine Learning Lab, Department of Mathematics, Natural Science, Economics and Computer Science, Institute of Computer Science, University of Hildesheim.
  • Khansa Saeed
    Information Systems and Machine Learning Lab, Department of Mathematics, Natural Science, Economics and Computer Science, Institute of Computer Science, University of Hildesheim.
  • Jeevitha Lora Rodrigues
    Information Systems and Machine Learning Lab, Department of Mathematics, Natural Science, Economics and Computer Science, Institute of Computer Science, University of Hildesheim.
  • Maha Naeem
    Information Systems and Machine Learning Lab, Department of Mathematics, Natural Science, Economics and Computer Science, Institute of Computer Science, University of Hildesheim.
  • Andrea Correa
    Information Systems and Machine Learning Lab, Department of Mathematics, Natural Science, Economics and Computer Science, Institute of Computer Science, University of Hildesheim.
  • Chairungroj Sanabboon
    Information Systems and Machine Learning Lab, Department of Mathematics, Natural Science, Economics and Computer Science, Institute of Computer Science, University of Hildesheim.
  • Sharareh Rostam Niakan Kalhori
    Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Braunschweig, Germany.
  • Thomas M Deserno
    Department of Medical Informatics, RWTH Aachen University, Pauwelsstr. 30, 52057 Aachen, Germany.