Improving Colorectal Cancer Screening and Risk Assessment through Predictive Modeling on Medical Images and Records
Journal: arXiv
Published Date: Oct 13, 2024
Abstract
Colonoscopy screening effectively identifies and removes polyps before they
progress to colorectal cancer (CRC), but current follow-up guidelines rely
primarily on histopathological features, overlooking other important CRC risk
factors. Variability in polyp characterization among pathologists also hinders
consistent surveillance decisions. Advances in digital pathology and deep
learning enable the integration of pathology slides and medical records for
more accurate CRC risk prediction. Using data from the New Hampshire
Colonoscopy Registry, including longitudinal follow-up, we adapted a
transformer-based model for histopathology image analysis to predict 5-year CRC
risk. We further explored multi-modal fusion strategies to combine clinical
records with deep-learning-derived image features. Training the model to
predict intermediate clinical variables improved 5-year CRC risk prediction
(area under the ROC curve, AUC = 0.630) compared with direct prediction
(AUC = 0.615, p = 0.013).
Incorporating both imaging and non-imaging data, without requiring manual slide
review, further improved performance (AUC = 0.674) compared to traditional
features from colonoscopy and microscopy reports (AUC = 0.655, p = 0.001).
These results highlight the value of integrating diverse data modalities with
computational methods to enhance CRC risk stratification.
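
To make the fusion and evaluation ideas above concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest possible fusion (concatenating an image feature vector with clinical-record features) followed by a logistic score, and it computes AUC by pairwise ranking. The abstract does not specify the fusion strategies or model details, so the function names (`fuse`, `predict_risk`, `auc`), the weights, and the toy patient values are all hypothetical.

```python
import math


def fuse(image_features, clinical_features):
    """Early fusion (assumed here): concatenate the two feature vectors."""
    return list(image_features) + list(clinical_features)


def predict_risk(features, weights, bias):
    """Logistic score in (0, 1) from a linear combination of fused features."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy, made-up data: (image features, clinical features, 5-year CRC label).
    patients = [
        ([0.2, 0.9], [1.0], 1),
        ([0.1, 0.3], [0.0], 0),
        ([0.7, 0.4], [1.0], 1),
        ([0.5, 0.6], [0.0], 0),
    ]
    weights, bias = [0.5, 1.0, 0.8], -1.0  # hypothetical, not fitted values

    labels = [label for _, _, label in patients]
    scores = [
        predict_risk(fuse(img, clin), weights, bias)
        for img, clin, _ in patients
    ]
    print("AUC on toy data:", auc(labels, scores))
```

In practice the image features would come from a trained histopathology model and the weights would be learned from data; the pairwise AUC shown here is quadratic in the number of cases and is intended only to illustrate what the reported metric measures.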