Foundations of Machine Learning-Based Clinical Prediction Modeling: Part IV - A Practical Approach to Binary Classification Problems

Journal: Acta neurochirurgica. Supplement

Published Date: Jan 1, 2022

Abstract

We illustrate the steps required to train and validate a simple, machine learning-based clinical prediction model for any binary outcome, such as the occurrence of a complication, in the statistical programming language R. To illustrate the methods applied, we supply a simulated database of 10,000 glioblastoma patients who underwent microsurgery, and predict 12-month survival. We walk the reader through each step, including import, checking, and splitting of datasets. In terms of pre-processing, we focus on how to practically implement imputation using a k-nearest neighbor algorithm, and how to perform feature selection using recursive feature elimination. When it comes to training models, we apply the theory discussed in Parts I-III. We show how to implement bootstrapping and how to evaluate and select models based on out-of-sample error. Specifically for classification, we discuss how to counteract class imbalance by using upsampling techniques. We stress that reporting, at a minimum, accuracy, area under the curve (AUC), sensitivity, and specificity for discrimination, as well as slope and intercept for calibration (if possible alongside a calibration plot), is paramount. Finally, we explain how to arrive at a measure of variable importance using a universal, AUC-based method. We provide the full, structured code, as well as the complete glioblastoma survival database, so that readers can download both and run the code in parallel to this section.
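The minimum set of discrimination measures named in the abstract can be sketched in a few lines. The authors' own code is written in R and is not reproduced here; the following is a minimal Python sketch under stated assumptions: the function name `discrimination_metrics`, the 0.5 threshold, and the toy labels and probabilities are all hypothetical and chosen only for illustration. AUC is computed as the share of positive/negative pairs in which the positive case receives the higher predicted probability, with ties counted as one half.

```python
from typing import Dict, Sequence


def discrimination_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a threshold, plus rank-based AUC."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")

    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1

    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    # AUC: fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )

    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "auc": wins / (len(pos) * len(neg)),
    }


# Toy example with made-up values (1 = survived 12 months, 0 = did not).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
metrics = discrimination_metrics(toy_labels, toy_probs)
print(metrics)
```

The pairwise AUC loop is quadratic in the number of patients, which is fine for a sketch but would normally be replaced by a rank-based formula or a library routine on a dataset of 10,000 cases. Calibration slope and intercept are not covered here, since they require fitting a logistic recalibration model.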

Authors

Victor E Staartjes

Department of Neurosurgery, Bergman Clinics, Naarden, The Netherlands.

Julius M Kernbach

Department of Neurosurgery, RWTH Aachen University Hospital, Aachen, Germany.

Keywords

Algorithms; Humans; Logistic Models; Machine Learning; Models, Statistical; Prognosis

External Resources

PubMed ID: 34862525
