Supervised Learning Methods for Predicting Healthcare Costs: Systematic Literature Review and Empirical Evaluation.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
PMID:

Abstract

Accurately predicting the likely future healthcare costs of individuals is an important informatics tool for controlling healthcare costs. To address this need, we conducted a systematic literature review and identified five methods for predicting healthcare costs. To enable a direct comparison of these approaches, we empirically evaluated the predictive performance of each reported approach, as well as other state-of-the-art supervised learning methods, using data from University of Utah Health Plans for October 2013 through October 2016. The data set consisted of approximately 90,000 individuals, 6.3 million medical claims, and 1.2 million pharmacy claims. In this comparative analysis, gradient boosting had the best predictive performance overall and for low- to medium-cost individuals. For high-cost individuals, artificial neural network (ANN) and ridge regression models, which had not previously been reported for use in healthcare cost prediction, had the highest performance.

Authors

  • Mohammad Amin Morid
    Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, UT, USA.
  • Kensaku Kawamoto
    Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA.
  • Travis Ault
    University of Utah Health Plans, Murray, UT, USA.
  • Josette Dorius
    University of Utah Health Plans, Murray, UT, USA.
  • Samir Abdelrahman
    Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA.