Cumulative In-Context Learning versus Simple Historical Weighting for Real-Time Geographic Origin Identification of Ongoing Epidemic Waves: A Comparative Evaluation Using Eight COVID-19 Waves in Japan

Journal: medRxiv
Published Date:

Abstract

Background: Identifying the geographic origin of epidemic waves early is critical for targeted public health responses. Conventional statistical methods for wave origin estimation apply fixed algorithms to case count time-series data and treat each wave independently. Large language models (LLMs) offer an alternative through cumulative learning, that is, the ability to incorporate confirmed epidemiological findings from prior waves into predictions for subsequent waves. Whether this approach outperforms conventional statistical baselines in early detection, and whether the same cumulative learning principle can be implemented in transparent statistical methods, remain unknown.
Methods: We compared three computational approaches across eight COVID-19 epidemic waves in Japan (Waves 2-8, 2020-2023): (1) non-cumulative statistical baselines (B0-B5) treating each wave independently; (2) a cumulative-learning LLM (Claude Haiku) receiving confirmed origins from all prior waves as in-context historical knowledge; and (3) cumulative calculation statistical baselines implementing the same historical weighting mechanism as a transparent arithmetic score. We additionally evaluated a non-cumulative LLM condition, which received only current-wave data, to separate the contribution of intrinsic LLM geographic reasoning from that of accumulated historical knowledge. All approaches were evaluated at 7, 14, 21, and 28 days after wave onset and validated against genomically confirmed wave origins.
Results: Cumulative calculation statistical baselines (B1, B3) achieved mean F1 = 0.51 at 14 days after wave onset, performing comparably to the cumulative-learning LLM (F1 = 0.52) and outperforming all non-cumulative statistical baselines (F1 = 0.41-0.46). Wave 7 (Omicron BA.5) was correctly identified at 14 days by both methods (F1 = 1.00). Wave 6 (Omicron BA.1) was undetectable by all methods (F1 = 0.00), consistent with an origin outside the domestic surveillance system.
Conclusions: The cumulative historical weighting mechanism, rather than LLM reasoning per se, drives the performance improvement, as a transparent arithmetic implementation matches LLM accuracy. However, the non-cumulative LLM achieves F1 = 0.46 without any historical context, suggesting substantial intrinsic geographic reasoning capacity. These findings clarify when and why in-context learning confers an advantage, and they provide a deployable, spreadsheet-implementable method for real-time epidemic origin identification that requires no AI infrastructure.
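
The abstract does not give the formula for the cumulative arithmetic score, so the following is only a minimal sketch of how such a historical weighting could look, not the authors' B1 or B3 baseline. It assumes a per-prefecture current-wave signal scaled to [0, 1], a historical weight equal to the fraction of prior waves in which a prefecture was a confirmed origin, a hypothetical mixing parameter `alpha`, and a top-`k` selection rule. The prefecture names and all numbers are invented for illustration. F1 is computed on the predicted versus confirmed origin sets.

```python
from collections import Counter


def historical_weights(prior_origins):
    """Fraction of prior waves in which each prefecture was a confirmed origin."""
    n = len(prior_origins)
    if n == 0:
        return {}
    counts = Counter(p for origins in prior_origins for p in set(origins))
    return {p: c / n for p, c in counts.items()}


def cumulative_score(current_signal, prior_origins, alpha=0.5):
    """Blend the current-wave signal with historical frequency.

    alpha is a hypothetical mixing weight (assumption, not from the abstract);
    alpha=0 reduces to a non-cumulative baseline.
    """
    hist = historical_weights(prior_origins)
    return {
        p: (1 - alpha) * s + alpha * hist.get(p, 0.0)
        for p, s in current_signal.items()
    }


def predict_origins(scores, k=2):
    """Return the k highest-scoring prefectures (k is an assumption)."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def f1(predicted, confirmed):
    """Set-based F1 between predicted and confirmed origin prefectures."""
    tp = len(predicted & confirmed)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(confirmed)
    return 2 * precision * recall / (precision + recall)


# Invented illustration data, not taken from the study.
prior_origins = [{"Tokyo"}, {"Tokyo", "Osaka"}, {"Okinawa"}]
current_signal = {"Tokyo": 0.4, "Osaka": 0.5, "Hokkaido": 0.6, "Okinawa": 0.1}
confirmed = {"Tokyo"}

cumulative = predict_origins(cumulative_score(current_signal, prior_origins))
non_cumulative = predict_origins(
    cumulative_score(current_signal, prior_origins, alpha=0.0)
)
print(sorted(cumulative), round(f1(cumulative, confirmed), 2))
print(sorted(non_cumulative), round(f1(non_cumulative, confirmed), 2))
```

In this toy case the historical term lifts Tokyo into the top two, whereas the signal-only ranking misses it. Because the score is a weighted sum per prefecture, the same calculation can be laid out as spreadsheet columns.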

Authors

  • Nakagawa, S.
  • Yamamoto, A.