A CNN-Transformer for Classification of Longitudinal 3D MRI Images -- A Case Study on Hepatocellular Carcinoma Prediction
Journal: arXiv
Published Date: Jan 18, 2025
Abstract
Longitudinal MRI analysis is crucial for predicting disease outcomes,
particularly in chronic conditions like hepatocellular carcinoma (HCC), where
early detection can significantly influence treatment strategies and patient
prognosis. Yet, owing to limited data availability, subtle parenchymal
changes, and the irregular timing of medical screenings, current approaches
have so far focused on cross-sectional imaging data. To address this gap, we
propose HCCNet, a novel model architecture that integrates a 3D
adaptation of the ConvNeXt CNN architecture with a Transformer encoder,
capturing both the intricate spatial features of 3D MRIs and the complex
temporal dependencies across different time points. HCCNet utilizes a two-stage
pre-training process tailored for longitudinal MRI data. The CNN backbone is
pre-trained using a self-supervised learning framework adapted for 3D MRIs,
while the Transformer encoder is pre-trained with a sequence-order-prediction
task to enhance its understanding of disease progression over time. We
demonstrate the effectiveness of HCCNet by applying it to a cohort of liver
cirrhosis patients undergoing regular MRI screenings for HCC surveillance. Our
results show that HCCNet significantly improves predictive accuracy and
reliability over baseline models, providing a robust tool for personalized HCC
surveillance. The methodological approach presented in this paper is versatile
and can be adapted to various longitudinal MRI screening applications. Its
ability to handle varying patient record lengths and irregular screening
intervals makes it well suited to monitoring chronic diseases, where timely
and accurate prognosis is critical for effective treatment planning.
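
The abstract names two mechanisms without giving their details: a sequence-order-prediction pre-training task and support for patient records of varying length. The following is a minimal sketch of how such inputs might be prepared, not the authors' implementation. The function names, the 50/50 split between ordered and shuffled examples, and the padding-mask convention (True marks a padded slot) are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def make_order_example(n_visits: int, rng: random.Random) -> Tuple[List[int], int]:
    """Build one order-prediction example over visit indices 0..n_visits-1.

    Hypothetical helper: returns (visit_order, label), where label is 1 if
    the visits are kept in chronological order and 0 if they were shuffled.
    """
    order = list(range(n_visits))
    # Records with a single visit cannot be shuffled; keep them as positives.
    if n_visits < 2 or rng.random() < 0.5:
        return order, 1
    shuffled = order[:]
    # Reshuffle until the permutation differs from the chronological order.
    while shuffled == order:
        rng.shuffle(shuffled)
    return shuffled, 0


def pad_records(
    records: Sequence[Sequence[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[List[bool]]]:
    """Pad per-patient sequences to a common length.

    Hypothetical helper: each inner value stands in for one visit's feature
    vector. The mask is True at padded positions so that a downstream
    encoder could ignore them.
    """
    max_len = max(len(r) for r in records)
    padded = [list(r) + [pad_value] * (max_len - len(r)) for r in records]
    mask = [[i >= len(r) for i in range(max_len)] for r in records]
    return padded, mask


if __name__ == "__main__":
    rng = random.Random(0)
    print(make_order_example(4, rng))
    print(pad_records([[1.0, 2.0, 3.0], [4.0]]))
```

In a setup like this, the shuffled or ordered visit indices would select the per-visit CNN embeddings fed to the Transformer encoder, and the mask would keep padded visits from contributing to attention. Irregular screening intervals are not modelled in this sketch.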