Unsupervised SAM-guided mixture-of-multimodal-experts fusion network for medical image diagnosis.
Journal:
Neural Networks: The Official Journal of the International Neural Network Society
Published Date:
Nov 22, 2025
Abstract
Accurate diagnosis of cancer from medical images relies on both precise lesion localization and complementary multimodal information. However, current methods suffer from two key limitations: (1) dependence on costly pixel-level annotations for lesion segmentation, and (2) rigid fusion strategies that ignore patient-specific modality contributions. To address these challenges, we propose an Unsupervised SAM-guided Mixture-of-Multimodal-Experts Fusion Network (UnSAM-MoME) for medical image diagnosis. In the first stage, we introduce a dual cross-validation segmentation network that automatically generates high-confidence prompts to guide the Segment Anything Model (SAM), enabling precise lesion localization without manual labels. In the second stage, we design a Mixture-of-Multimodal-Experts (MoME) fusion module that dynamically selects specialized experts to adaptively fuse image and metadata features based on individual patient characteristics. Experiments on three skin cancer datasets and one breast cancer dataset demonstrate that UnSAM-MoME achieves state-of-the-art performance, with significant improvements in accuracy, precision, and generalizability. Extensive ablation studies further validate the effectiveness of each module, underscoring the framework's potential for scalable and personalized cancer diagnosis.
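The abstract gives no architectural details for the MoME module, so the following is only a minimal sketch of one common way a sparse mixture-of-experts fusion can be set up, not the authors' implementation. It assumes a linear gate over concatenated image and metadata features, top-k expert selection per patient, and single-layer tanh experts. The function name `mome_fuse`, all parameter names, and all dimensions are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; entries equal to -inf receive weight 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def mome_fuse(img_feat, meta_feat, gate_w, expert_ws, top_k=2):
    """Illustrative top-k gated fusion of image and metadata features.

    img_feat:  (batch, d_img)   image features, one row per patient
    meta_feat: (batch, d_meta)  metadata features, one row per patient
    gate_w:    (d_img + d_meta, n_experts)         hypothetical gate weights
    expert_ws: (n_experts, d_img + d_meta, d_out)  hypothetical expert weights
    """
    x = np.concatenate([img_feat, meta_feat], axis=1)  # (batch, d_in)
    logits = x @ gate_w                                # (batch, n_experts)

    # Keep only the top_k experts for each patient; mask the rest with -inf.
    top_idx = np.argsort(logits, axis=1)[:, -top_k:]
    mask = np.full_like(logits, -np.inf)
    np.put_along_axis(mask, top_idx, 0.0, axis=1)
    weights = softmax(logits + mask, axis=1)           # zero for unselected experts

    # For simplicity every expert is evaluated densely and then weighted;
    # a real sparse implementation would run only the selected experts.
    expert_out = np.tanh(np.einsum("bi,eio->beo", x, expert_ws))
    fused = np.einsum("be,beo->bo", weights, expert_out)
    return fused, weights


# Toy usage with random data (all sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))
meta = rng.normal(size=(4, 3))
gate_w = rng.normal(size=(11, 5))
expert_ws = rng.normal(size=(5, 11, 6))
fused, weights = mome_fuse(img, meta, gate_w, expert_ws, top_k=2)
```

Because the gate sees each patient's own features, the per-row `weights` differ across patients, which is the sense in which such a module can weight modalities and experts on a patient-specific basis.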