Missing data is poorly handled and reported in prediction model studies using machine learning: a literature review.

Journal: Journal of Clinical Epidemiology
Published Date:

Abstract

OBJECTIVES: Missing data is a common problem during the development, evaluation, and implementation of prediction models. Although machine learning (ML) methods are often said to be capable of circumventing missing data, it is unclear how these methods are used in medical research. We aim to determine whether and how well prediction model studies using ML report on their handling of missing data.

Authors

  • S W J Nijman
    Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands. Electronic address: s.w.j.nijman@umcutrecht.nl.
  • A M Leeuwenberg
    Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands.
  • I Beekers
    Department of Health, Ortec B.V., Zoetermeer, The Netherlands.
  • I Verkouter
    Department of Health, Ortec B.V., Zoetermeer, The Netherlands.
  • J J L Jacobs
    Department of Health, Ortec B.V., Zoetermeer, The Netherlands.
  • M L Bots
    Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands.
  • F W Asselbergs
    Department of Cardiology, University Medical Center Utrecht, Utrecht University, The Netherlands; Institute of Cardiovascular Science, Population Health Sciences, University College London, London, UK; Health Data Research UK, Institute of Health Informatics, University College London, London, UK.
  • K G M Moons
    Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands.
  • T P A Debray
    Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Heidelberglaan 100, Utrecht, 3584 CX, The Netherlands; Health Data Research UK, Institute of Health Informatics, University College London, London, UK.