SCOPE-MRI: Bankart Lesion Detection as a Case Study in Data Curation and Deep Learning for Challenging Diagnoses
Journal: arXiv
Published Date: Apr 29, 2025
Abstract
While deep learning has shown strong performance in musculoskeletal imaging,
existing work has largely focused on pathologies where diagnosis is not a
clinical challenge, leaving more difficult problems, such as detecting Bankart
lesions (anterior-inferior glenoid labral tears) on standard MRIs,
underexplored. Diagnosing these lesions is challenging due to their subtle imaging
features, often leading to reliance on invasive MRI arthrograms (MRAs). This
study introduces ScopeMRI, the first publicly available, expert-annotated
dataset for shoulder pathologies, and presents a deep learning (DL) framework
for detecting Bankart lesions on both standard MRIs and MRAs. ScopeMRI includes
586 shoulder MRIs (335 standard, 251 MRAs) from 558 patients who underwent
arthroscopy. Ground truth labels were derived from intraoperative findings, the
gold standard for diagnosis. Separate DL models for MRAs and standard MRIs were
trained using a combination of CNNs and transformers. Predictions from
sagittal, axial, and coronal views were ensembled to optimize performance. The
models were evaluated on a 20% hold-out test set (117 MRIs: 46 MRAs, 71
standard MRIs). The models achieved AUCs of 0.91 and 0.93, sensitivities of 83%
and 94%, and specificities of 91% and 86% for standard MRIs and MRAs,
respectively. Notably, model performance on non-invasive standard MRIs matched
or surpassed that of radiologists interpreting MRAs. External validation demonstrated
initial generalizability across imaging protocols. This study demonstrates that
DL models can achieve radiologist-level diagnostic performance on standard
MRIs, which could reduce the need for invasive MRAs. By releasing ScopeMRI and a modular
codebase for training and evaluating deep learning models on 3D medical imaging
data, we aim to accelerate research in musculoskeletal imaging and support the
development of new datasets for clinically challenging diagnostic tasks.
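
The multi-view ensembling and the reported metrics (AUC, sensitivity, specificity) can be illustrated with a minimal sketch. The abstract does not say how the view-level predictions were combined, so simple averaging of per-view probabilities is an assumption here, as is the 0.5 decision threshold. The function names and the toy numbers are hypothetical and are not taken from the ScopeMRI codebase or results.

```python
from statistics import mean


def ensemble_views(view_probs):
    """Average per-exam probabilities across views (assumed ensembling rule)."""
    per_view = list(view_probs.values())
    n_exams = len(per_view[0])
    if any(len(p) != n_exams for p in per_view):
        raise ValueError("every view must score the same number of exams")
    # zip(*per_view) groups the scores for one exam across all views
    return [mean(exam_scores) for exam_scores in zip(*per_view)]


def sensitivity_specificity(labels, probs, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 1)
    fn = sum(1 for y, q in zip(labels, preds) if y == 1 and q == 0)
    tn = sum(1 for y, q in zip(labels, preds) if y == 0 and q == 0)
    fp = sum(1 for y, q in zip(labels, preds) if y == 0 and q == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, probs):
    """AUC as the probability that a positive exam outscores a negative one.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy, made-up scores for four exams (1 = lesion present at arthroscopy).
    labels = [1, 0, 1, 0]
    view_probs = {
        "sagittal": [0.9, 0.2, 0.4, 0.3],
        "axial": [0.8, 0.1, 0.6, 0.6],
        "coronal": [0.7, 0.3, 0.8, 0.3],
    }
    combined = ensemble_views(view_probs)
    sens, spec = sensitivity_specificity(labels, combined)
    print("ensembled probabilities:", combined)
    print("sensitivity:", sens, "specificity:", spec, "AUC:", auc(labels, combined))
```

In this toy example the axial view alone would misclassify the fourth exam at a 0.5 threshold, while the averaged score does not, which is the kind of error that combining views is meant to reduce. AUC is threshold-independent, whereas sensitivity and specificity depend on the chosen operating point.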