Does Cohort Selection Affect Machine Learning from Clinical Data?

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
Published Date:

Abstract

This study investigates how cohort selection affects the quality of machine learning (ML) models trained on clinical data, focusing on measurements taken within the first 48 hours of hospital admission. It discusses the potential repercussions of arbitrary decisions made during data processing before ML methods are applied. Experiments are performed using the National COVID Cohort Collaborative (N3C) dataset. The research aims to uncover biases and assess the fairness of ML models used to predict outcomes for hospitalized patients. Detailed discussions cover the data, the decision-making processes, and the resulting impact on model predictions of patient outcomes. In the experiment, four arbitrary decisions are made, resulting in 16 distinct datasets of varying sizes and properties. The findings demonstrate significant differences among the resulting datasets and indicate a high potential for bias arising from inclusion or exclusion decisions. The results also confirm significant differences in the performance of models constructed on different cohorts, especially when models are cross-compared between cohorts based on different inclusion criteria. The study analyzes gender, race, and ethnicity specifically because these social determinants of health played a significant role in COVID-19 outcomes.
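The count of 16 datasets is consistent with four decisions that each have two options (2^4 = 16). The abstract does not name the decisions or state that they are binary, so the following is only a minimal sketch under that assumption; the decision names and the helper `enumerate_cohort_configs` are hypothetical placeholders, not the authors' method.

```python
from itertools import product

# Hypothetical placeholder names: the abstract does not list the four decisions.
DECISIONS = ["decision_a", "decision_b", "decision_c", "decision_d"]


def enumerate_cohort_configs(decisions):
    """Return one dict per combination of binary (False/True) choices.

    Assumes each decision has exactly two options, which is an
    inference from "four decisions, 16 datasets", not a stated fact.
    """
    return [
        dict(zip(decisions, choices))
        for choices in product([False, True], repeat=len(decisions))
    ]


configs = enumerate_cohort_configs(DECISIONS)
print(len(configs))  # 2**4 = 16 cohort definitions
```

Each resulting dictionary could then drive one pass of a cohort-building pipeline, so that every dataset is traceable to the exact combination of choices that produced it.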

Authors

  • Atefehsadat Haghighathoseini
    George Mason University, Fairfax, VA, USA.
  • Janusz Wojtusiak
    Department of Health Administration and Policy, College of Public Health, George Mason University, Fairfax, Virginia, United States.
  • Hua Min
    Department of Health Administration and Policy, College of Health and Human Services, George Mason University, MS: 1J3, 4400 University Drive, Fairfax, VA 22030-4444, USA. E-mail: hmin3@gmu.edu
  • Timothy Leslie
    George Mason University, Fairfax, VA, USA.
  • Cara Frankenfeld
    MaineHealth Institute for Research, Scarborough, ME, USA.
  • Nirup M Menon
    George Mason University, Fairfax, VA, USA.