Larger sample sizes are needed when developing a clinical prediction model using machine learning in oncology: methodological systematic review.

Journal: Journal of clinical epidemiology
PMID:

Abstract

BACKGROUND AND OBJECTIVES: Having a sufficient sample size is crucial when developing a clinical prediction model. We reviewed details of sample size in studies developing prediction models for binary outcomes using machine learning (ML) methods within oncology and compared the sample size used to develop each model with the minimum sample size required when developing a regression-based model (N).

Authors

  • Biruk Tsegaye
    Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK. Electronic address: biruk.tsegaye@csm.ox.ac.uk.
  • Kym I E Snell
    Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK; Institute of Translational Medicine, National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK.
  • Lucinda Archer
    Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK; Institute of Translational Medicine, National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK.
  • Shona Kirtley
    Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK.
  • Richard D Riley
    School of Primary, Community and Social Care, Keele University, Keele, UK.
  • Matthew Sperrin
  • Ben Van Calster
  • Gary S Collins
    Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Windmill Road, Oxford, OX3 7LD UK; Oxford University Hospitals NHS Foundation Trust, Oxford, UK.
  • Paula Dhiman
    Centre for Statistics in Medicine, University of Oxford, Oxford, UK.