Enhanced CT and MRI Focal Bone Tumor Classification with Machine Learning-based Stratification: A Multicenter Retrospective Study.
Journal: Radiology
PMID: 40261174
Abstract
Background Standardized bone tumor reporting is crucial for consistent, risk-aligned patient management. Current systems are based on expert consensus and/or lack multicenter validation.
Purpose To evaluate a machine learning-based approach for differentiating between benign and malignant focal bone lesions and to propose a Bone Tumor Imaging Reporting and Data System (BTI-RADS) 2.0 for further risk stratification.
Materials and Methods This retrospective multicenter trial included patients with solitary bone tumors undergoing radiography or CT and MRI at 10 centers from November 2009 to March 2022. Patients were divided into training and test datasets. Predefined radioclinical features were extracted. The training dataset was used for bootstrapped χ² feature selection, and extreme gradient boosting (XGBoost) classifiers were optimized using nested cross-validation. Continuous classifier outputs were thresholded to stratify patients into seven malignancy risk classes (BTI-RADS 2.0), and malignancy rates were assessed for the test set. XGBoost and human expert performances were compared using the Wilcoxon signed-rank test with a significance level of .05.
Results In total, 1113 patients (mean age, 39 years ± 22 [SD]; 623 men) were included: 298 in the training and 815 in the test datasets. Twenty-seven of 80 (34%) multimodal features were selected based on χ² analysis. The best classification performance was achieved by an XGBoost model trained on 27 features, with an F1 score of 0.81 (95% CI: 0.78, 0.84). This model performed slightly worse than 28 experienced radiologists, who achieved an F1 score of 0.83 (95% CI: 0.80, 0.85; P < .001). BTI-RADS 2.0 risk grades II-V were associated with malignancy rates of 0% (0 of 102; 95% CI: 0, 0), 8.3% (14 of 168; 95% CI: 4, 13), 45% (121 of 271; 95% CI: 39, 50), and 92% (252 of 274; 95% CI: 89, 95), respectively, identifying malignant lesions with a sensitivity of 96% (373 of 387; 95% CI: 94, 98).
Conclusion A machine learning algorithm and risk stratification system achieved accurate and standardized bone tumor malignancy grading.
Clinical trial registration no. NCT04884048. © RSNA, 2025. See also the editorial by Tordjman and Murphey in this issue.
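The per-grade malignancy rates and the sensitivity reported in the Results can be checked from the stated counts. The following is a minimal sketch, not the authors' code. It assumes that the sensitivity counts grades IV and V as positive, which is inferred only from the arithmetic (121 + 252 = 373 of 387 malignant lesions). The `assign_class` helper and any cut points passed to it are hypothetical; the abstract does not report the thresholds applied to the classifier output.

```python
from bisect import bisect_right

# (malignant, total) per BTI-RADS 2.0 grade, as reported for the test set.
reported_counts = {
    "II": (0, 102),
    "III": (14, 168),
    "IV": (121, 271),
    "V": (252, 274),
}


def malignancy_rate(malignant, total):
    """Fraction of lesions in a grade that were malignant."""
    return malignant / total


def sensitivity(counts, positive_grades):
    """Share of all malignant lesions that fall in the grades treated as positive."""
    flagged = sum(counts[grade][0] for grade in positive_grades)
    all_malignant = sum(malignant for malignant, _ in counts.values())
    return flagged / all_malignant


def assign_class(score, cut_points):
    """Map a continuous classifier output to a class index.

    cut_points must be sorted ascending; the values are hypothetical,
    since the abstract does not state the thresholds used.
    """
    return bisect_right(cut_points, score)


if __name__ == "__main__":
    for grade, (malignant, total) in reported_counts.items():
        rate = 100 * malignancy_rate(malignant, total)
        print(f"Grade {grade}: {malignant} of {total} = {rate:.1f}%")
    # Assumption: grades IV and V are the positive call.
    print(f"Sensitivity: {100 * sensitivity(reported_counts, ['IV', 'V']):.1f}%")
```

The four grade totals sum to 815, the size of the test dataset, so the reported grades II-V account for every test patient.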