Machine Learning and Probabilistic Approaches for Forecasting Infectious Disease Transmission and Cases

Journal: medRxiv
Published Date:

Abstract

Forecasting the effective reproductive number (Rt) and infection case counts is critical for guiding public health responses. We developed a forecasting framework that uses machine learning to predict Rt and a probabilistic model to predict daily COVID-19 cases across South Carolina counties; the framework can be generalized to other infectious diseases. We first estimated Rt using the EpiNow2 R package, which incorporates Bayesian time-series modeling and accounts for reporting delays and the incubation period. These initial estimates were refined using spatial covariate-adjusted smoothing through the Integrated Nested Laplace Approximation (INLA). We then generated Rt forecasts using an ensemble of linear regression, random forest, and XGBoost models. Daily case forecasts were obtained by linking the forecast Rt trajectories with historical case data via a Poisson model. This ensemble-based approach outperformed EpiNow2 across forecast horizons of 7, 14, and 21 days. In the first forecast period (November 11, 2020 – February 2, 2021), the ensemble achieved a median PA of 96.5% (IQR: 95.4% – 97.1%) for the 7-day-horizon Rt forecast, compared with 87.0% (IQR: 84.4% – 89.4%) for EpiNow2. In the second period (December 11, 2022 – March 4, 2023), the ensemble attained a median PA of 93.0% (IQR: 90.8% – 95.4%) for the Rt forecast, while EpiNow2 reached 86.8% (IQR: 82.5% – 89.2%). Similar trends were observed for case forecasts, where the ensemble model also outperformed EpiNow2. This study presents a flexible forecasting framework that integrates Bayesian estimation, spatial smoothing, and ensemble machine learning to improve the accuracy of COVID-19 transmission and case forecasts. The approach offers scalable tools to support data-driven public health preparedness and response.
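
The abstract does not spell out the Poisson model that links Rt trajectories to case counts. As an illustration only, the following minimal sketch assumes a standard renewal-equation form, in which expected cases on day t equal Rt times a generation-interval-weighted sum of recent cases. The function names, the gamma-shaped generation interval, and its parameters (mean 5 days, SD 2 days, 14-day window) are hypothetical choices for this sketch, not values taken from the study.

```python
import numpy as np

def discretized_generation_interval(mean=5.0, sd=2.0, max_days=14):
    """Hypothetical generation-interval weights from a discretized gamma shape.

    The mean, sd, and max_days defaults are illustrative assumptions.
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    days = np.arange(1, max_days + 1)
    # Unnormalized gamma density evaluated at integer lags, then normalized
    w = days ** (shape - 1) * np.exp(-days / scale)
    return w / w.sum()

def forecast_cases(history, rt_forecast, weights, rng=None):
    """Project daily cases forward with a renewal equation (assumed form).

    Expected cases on day t: lambda_t = R_t * sum_s w_s * I_{t-s}.
    If rng is given, each day is a Poisson draw with mean lambda_t;
    otherwise the mean itself is propagated forward.
    """
    cases = [float(c) for c in history]
    out = []
    for rt in rt_forecast:
        # Most recent cases first: I_{t-1}, I_{t-2}, ...
        recent = np.array(cases[-len(weights):][::-1])
        lam = rt * np.dot(weights[:len(recent)], recent)
        value = float(rng.poisson(lam)) if rng is not None else lam
        cases.append(value)
        out.append(value)
    return np.array(out)

# Usage: a 7-day projection from a made-up history and a made-up Rt trajectory
weights = discretized_generation_interval()
history = [80, 85, 90, 95, 100, 104, 110, 115, 118, 121, 125, 130, 134, 140]
rt_path = [1.10, 1.08, 1.06, 1.04, 1.02, 1.00, 0.98]
mean_path = forecast_cases(history, rt_path, weights)
sampled_path = forecast_cases(history, rt_path, weights,
                              rng=np.random.default_rng(0))
```

Repeating the sampled projection many times with different random draws would give an empirical forecast distribution, from which intervals could be read off. Whether the study builds its intervals this way is not stated in the abstract.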

Authors

  • Md Sakhawat Hossain; Ravi Goyal; Natasha K Martin; Victor DeGruttola; Tanvir Ahammed; Christopher McMahan; Lior Rennert

Categories