A Preregistered, Open Pipeline for Early Cerebral Palsy Risk Assessment from Infant Videos
Journal: medRxiv
Published Date: Jan 1, 2025
Abstract
Cerebral Palsy (CP), which affects approximately 1 in 500 children and results from abnormal brain development, impairs movement control. Early risk assessment via the General Movements Assessment (GMA) at 3–4 months is highly predictive of CP but relies on trained clinicians. Machine-learning-based approaches for predicting GMA scores from video have shown considerable promise, but they typically rely on dataset-specific preprocessing, custom feature sets, and manually designed model pipelines, which make external benchmarking more difficult. This, combined with strict privacy constraints on sharing data, makes it challenging to train and evaluate models across datasets, which is important for assessing clinical utility. There is therefore a need for approaches that work across different datasets to enable multi-site dataset aggregation and model training. To address this gap, we developed an end-to-end pipeline that uses off-the-shelf pose estimation, general-purpose feature extraction, and automated machine learning (AutoML), none of which are tuned to a specific dataset. We applied this approach to a newly generated large dataset of 1053 infants (with approximately 10–12% positive class for adverse GMA outcome, drawn from a high-risk clinical cohort) within a preregistered study design. Model performance was evaluated on a strict “lock-box” test set, which remained untouched during every phase of model development and preprocessing optimization and was used for evaluation only once the final model and pipeline had been preregistered. The developed model achieved moderate predictive accuracy for clinician-assessed GMA scores (Area Under the Receiver Operating Characteristic Curve, ROC-AUC = 0.77; Area Under the Precision-Recall Curve, PR-AUC = 0.41). This moderate accuracy is noteworthy given the 10–12% positive class prevalence, and ROC-AUC followed power-law scaling with increasing dataset size.
By releasing de-identified feature data and open-source code, and simplifying the training pipeline using AutoML, our work establishes essential groundwork for future robust, globally relevant CP screening tools suitable for low-resource settings.

Introduced an open, accessible video-based pipeline and used it to predict General Movements Assessment (GMA) scores (a key early indicator of Cerebral Palsy [CP] risk).
Rigorously validated this pipeline on a large infant cohort (1053 videos), employing a preregistered design and a “lock-box” test set to ensure robust evaluation and minimize the risk of overly optimistic performance estimates.
Demonstrated that relatively simple movement features derived from handheld camera recordings achieve moderate predictive accuracy for GMA scores (ROC-AUC 0.77, PR-AUC 0.41), with power-law improvement as dataset size increases, even under these stringent training and evaluation conditions.
Designed the pipeline to facilitate broader application and collaborative research, particularly through its use of generalizable pose estimation and by enabling the extraction and sharing of de-identified movement features for aggregated dataset creation across clinical sites.
Released the entire pipeline as open-source (including data processing, feature computation, and AutoML components) to promote transparency and reproducibility.
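
To put the two reported metrics in context, the following is a minimal, dependency-free sketch of how ROC-AUC and PR-AUC (computed here as average precision) behave at a positive class prevalence of roughly 11%. It is not the released pipeline's evaluation code. The function names, the synthetic labels and scores, and the 11% prevalence setting are all illustrative assumptions. The point it illustrates is a general one: a chance-level scorer has an expected ROC-AUC of 0.5 but an expected PR-AUC close to the prevalence, which is the baseline against which a PR-AUC of 0.41 should be read.

```python
import random


def roc_auc(y_true, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formulation.

    Equals the probability that a random positive outscores a random
    negative, with tied scores counted as one half.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(y_true, scores):
    """Average precision: mean of precision at each positive, by descending score.

    Simplification (assumption): tied scores are not grouped, so this is
    only well defined when scores are distinct.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(y_true)
    hits = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def demo(seed=0, n=1000, prevalence=0.11):
    """Compare a toy informative scorer with a chance-level scorer.

    All data here are synthetic; nothing is drawn from the study cohort.
    """
    rng = random.Random(seed)
    y = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    # Toy informative scores: positives are shifted up by one standard deviation.
    informative = [rng.gauss(1.0 if label else 0.0, 1.0) for label in y]
    # Chance-level scores: independent of the label.
    chance = [rng.random() for _ in y]
    return {
        "prevalence": sum(y) / n,
        "informative_roc_auc": roc_auc(y, informative),
        "informative_pr_auc": average_precision(y, informative),
        "chance_roc_auc": roc_auc(y, chance),
        "chance_pr_auc": average_precision(y, chance),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

Running the script prints the observed prevalence next to both metrics for each scorer, which makes it easy to see that PR-AUC has to be judged relative to the positive class rate rather than to 0.5.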