Cross-modal attentive fusion network for tri-modal lesion growth prediction.
Journal:
Scientific Reports
Published Date:
Jun 8, 2026
Abstract
Pre-trained LSTM-RNN models with linear kernels fail to capture irregular, non-linear lesion growth patterns. This paper presents a new deep learning model, the Cross-Modal Attention Fusion Network (CMAFN), for effective feature fusion across multiple medical imaging modalities: mammography, magnetic resonance imaging (MRI), and ultrasound (US). The model progressively transforms and aligns features from these modalities into a unified representation space. CMAFN combines three modules: Deep Canonical Correlation Analysis (DCCA), which fuses non-linear features across imaging modalities; a Cross-Modal Attention Mechanism (CMAM), which adaptively combines modalities and aligns their features; and a Radial Basis Function Convolutional Long Short-Term Memory network (RBF-ConvLSTM), which learns both linear and non-linear spatio-temporal growth patterns. Overall, the proposed CMAFN model supports multi-modal medical image analysis and interpretable predictions of lesion progression over time.
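The abstract does not give the CMAM equations, so the following is only a minimal sketch of one common way to realise cross-modal attention: scaled dot-product attention in which each modality's feature tokens query the tokens of the other two. The function names (`cross_modal_attention`, `fuse_three_modalities`), the residual-plus-mean-pooling step, the absence of learned projection matrices, and the random toy features are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, context):
    """Scaled dot-product attention from one modality onto another.

    queries: list of d-dimensional token vectors from the querying modality.
    context: list of d-dimensional token vectors from the other modalities.
    Returns (attended vectors, attention weights), one row per query token.
    """
    dim = len(queries[0])
    attended, weights = [], []
    for q in queries:
        scores = [
            sum(qi * ci for qi, ci in zip(q, c)) / math.sqrt(dim)
            for c in context
        ]
        w = softmax(scores)
        attended.append(
            [sum(wj * c[k] for wj, c in zip(w, context)) for k in range(dim)]
        )
        weights.append(w)
    return attended, weights


def fuse_three_modalities(mammo, mri, us):
    """Hypothetical fusion: each modality attends to the other two.

    The attended vectors are added back to the queries (residual), averaged
    over tokens, and the three pooled vectors are concatenated.
    """
    feats = {"mammo": mammo, "mri": mri, "us": us}
    fused = []
    for name, tokens in feats.items():
        others = [v for other, f in feats.items() if other != name for v in f]
        attended, _ = cross_modal_attention(tokens, others)
        dim = len(tokens[0])
        pooled = [
            sum(t[k] + a[k] for t, a in zip(tokens, attended)) / len(tokens)
            for k in range(dim)
        ]
        fused.extend(pooled)
    return fused


def random_features(n_tokens, dim):
    """Toy stand-in for per-modality encoder outputs (assumption)."""
    return [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tokens)]


random.seed(0)
mammo = random_features(4, 8)
mri = random_features(6, 8)
us = random_features(5, 8)

fused = fuse_three_modalities(mammo, mri, us)
print(len(fused))  # 24: three pooled 8-dimensional vectors, concatenated
```

In a trained network the queries, keys, and values would normally pass through learned linear projections first, and the attention weights could be inspected to see which modality contributes most to a given prediction.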