Predictive overfitting in immunological applications: Pitfalls and solutions.

Journal: Human Vaccines & Immunotherapeutics

PMID: 37697867

Abstract

Overfitting describes the phenomenon where a model that is highly predictive on the training data generalizes poorly to future observations. It is a common concern when applying machine learning techniques to contemporary medical applications, such as predicting vaccination response and disease status in infectious disease or cancer studies. This review examines the causes of overfitting and offers strategies to counteract it, focusing on model complexity reduction, reliable model evaluation, and harnessing data diversity. Through discussion of the underlying mathematical models and illustrative examples using both synthetic data and published real datasets, our objective is to equip analysts and bioinformaticians with the knowledge and tools necessary to detect and mitigate overfitting in their research.
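
The gap between training and held-out performance described above can be illustrated with a small synthetic experiment. The sketch below is not taken from the review; it assumes a linear model with many more features than samples (a setting common in immunological profiling), and the helper names `simulate`, `fit_linear`, and `mse`, the sample sizes, and the penalty values are all hypothetical choices made for illustration. It compares an unpenalized fit with ridge-penalized fits, which is one simple way of reducing model complexity, and evaluates each on independent test data.

```python
import numpy as np

def simulate(n, p, n_informative, rng, noise_sd=1.0):
    """Draw n samples with p Gaussian features; only the first few carry signal."""
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:n_informative] = 1.0
    y = X @ beta + noise_sd * rng.standard_normal(n)
    return X, y

def fit_linear(X, y, lam):
    """Ridge fit; lam == 0 falls back to the minimum-norm least-squares solution."""
    if lam == 0:
        return np.linalg.lstsq(X, y, rcond=None)[0]
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def mse(X, y, beta):
    """Mean squared prediction error of coefficients beta on (X, y)."""
    return float(np.mean((y - X @ beta) ** 2))

rng = np.random.default_rng(0)
# Assumed setup: 40 training samples, 200 features, 5 of them informative.
X_train, y_train = simulate(40, 200, 5, rng)
X_test, y_test = simulate(1000, 200, 5, rng)

results = {}
for lam in (0.0, 10.0, 100.0):
    beta_hat = fit_linear(X_train, y_train, lam)
    results[lam] = (mse(X_train, y_train, beta_hat), mse(X_test, y_test, beta_hat))
    print(f"lambda={lam:6.1f}  train MSE={results[lam][0]:.3f}  test MSE={results[lam][1]:.3f}")
```

With no penalty and more features than samples, the model can interpolate the training data, so its training error is essentially zero while its error on the independent test set is not. It is this gap, rather than the training error alone, that signals overfitting; in practice the penalty would be chosen by cross-validation on the training data rather than by inspecting the test set.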

Authors

Jeremy P Gygi
Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA.

Steven H Kleinstein
Department of Pathology, Yale School of Medicine, New Haven, CT, USA. steven.kleinstein@yale.edu

Leying Guan
Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA.

Keywords

Machine Learning; Vaccination
