Development and evaluation of machine learning training strategies for neonatal mortality prediction using multicountry data.
Journal:
Scientific Reports
Published Date:
Jul 7, 2025
Abstract
Neonatal mortality poses a critical challenge in global health, particularly in low- and middle-income countries. Advances in technology, such as machine learning (ML) algorithms, offer the potential to improve neonatal care by enabling precise prediction and prevention of mortality risks. This study used the Maternal and Neonatal Health Registry (MNHR) dataset from the National Institutes of Health (NIH), which encompasses multicentric neonatal data from various countries, to evaluate the effectiveness of ML in predicting neonatal mortality risk. We compared three training approaches: a generalized model applicable across all countries, country-specific models tailored to local healthcare characteristics, and a model derived from the largest single-country dataset. Using data from 2010 to 2016 for training and data from 2017 to 2019 for validation, we analyzed 575,664 pregnancies and assessed five ML algorithms based on key neonatal health indicators recommended by the World Health Organization. The generalized model demonstrated the highest predictive performance, achieving an Area Under the Receiver Operating Characteristic Curve (AUC-ROC) of 0.816, which highlights the benefit of training on a diverse dataset. Our findings support the integration of generalized ML models into healthcare strategies to improve neonatal health outcomes and emphasize the importance of data diversity in reducing neonatal mortality rates.
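The temporal split and the three training approaches described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the record fields (`country`, `year`, `died`), the toy records, and the helper names `temporal_split`, `training_sets`, and `auc_roc` are all assumptions made for illustration. The AUC-ROC is computed from its pairwise definition (the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case), which is practical only for small samples.

```python
# Hypothetical toy records; field names are assumptions, not the MNHR schema.
records = [
    {"country": "A", "year": 2012, "died": 0},
    {"country": "A", "year": 2015, "died": 1},
    {"country": "A", "year": 2018, "died": 0},
    {"country": "B", "year": 2011, "died": 0},
    {"country": "B", "year": 2019, "died": 1},
]


def temporal_split(rows, train_years=(2010, 2016), valid_years=(2017, 2019)):
    """Split rows by calendar year: earlier years train, later years validate."""
    train = [r for r in rows if train_years[0] <= r["year"] <= train_years[1]]
    valid = [r for r in rows if valid_years[0] <= r["year"] <= valid_years[1]]
    return train, valid


def training_sets(train):
    """Build the training data for each of the three strategies."""
    by_country = {}
    for r in train:
        by_country.setdefault(r["country"], []).append(r)
    # Country contributing the most training records.
    largest = max(by_country, key=lambda c: len(by_country[c]))
    return {
        "generalized": train,                    # one model, all countries pooled
        "country_specific": by_country,          # one model per country
        "largest_country": by_country[largest],  # one model, largest country only
    }


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


train, valid = temporal_split(records)
sets = training_sets(train)

# Three of the four positive/negative pairs are ranked correctly.
example_auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

In a real comparison, each strategy's fitted model would be scored on the same 2017 to 2019 validation rows so that the resulting AUC-ROC values are directly comparable.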