Stabilizing machine learning for reproducible and explainable results: A novel validation approach to subject-specific insights.

Journal: Computer Methods and Programs in Biomedicine
Published Date:

Abstract

INTRODUCTION: Machine learning (ML) is transforming medical research by enhancing diagnostic accuracy, predicting disease progression, and personalizing treatments. General models trained on large datasets identify broad patterns across populations, but the diversity of human biology, shaped by genetics, environment, and lifestyle, often limits their effectiveness. This has driven a shift towards subject-specific models that incorporate individual biological and clinical data for more precise predictions and personalized care. However, developing these models presents significant practical and financial challenges. Additionally, ML models that are initialized through stochastic processes can suffer from reproducibility issues: changing the random seed can alter both predictive performance and feature importance. To address this, the study introduces a novel validation approach that stabilizes predictive performance and feature importance at both the group and subject-specific levels, enhancing model interpretability.

Authors

  • Gideon Vos
    College of Science and Engineering, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia.
  • Liza van Eijk
    College of Health Care Sciences, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia.
  • Zoltan Sarnyai
    College of Public Health, Medical, and Vet Sciences, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia.
  • Mostafa Rahimi Azghadi
    College of Science and Engineering, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia. Electronic address: mostafa.rahimiazghadi@jcu.edu.au.