Self-attention fusion and adaptive continual updating for multimodal federated learning with heterogeneous data.
Journal:
Neural Networks: The Official Journal of the International Neural Network Society
PMID:
40090301
Abstract
Federated learning (FL) enables collaborative model training without direct data sharing, facilitating knowledge exchange while ensuring data privacy. Multimodal federated learning (MFL) is particularly advantageous for decentralized multimodal data, effectively managing heterogeneous information across modalities. However, the diversity in environments and data collection methods among participating devices introduces substantial challenges due to non-independent and identically distributed (non-IID) data. Our experiments reveal that, despite the theoretical benefits of multimodal data, MFL under non-IID conditions often performs poorly, even trailing traditional unimodal FL approaches. Additionally, MFL frequently encounters missing-modality issues, which further complicate training. To address these challenges, we propose two improvements: the federated self-attention multimodal (FSM) feature fusion method and the multimodal federated learning adaptive continual update (FedMAC) algorithm. Moreover, we employ a Stable Diffusion model to mitigate the impact of missing image modalities. Extensive experimental results demonstrate that the proposed methods outperform other state-of-the-art FL algorithms, improving both accuracy and robustness in MFL.
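The abstract does not specify how the FSM fusion is implemented; the sketch below is only a minimal illustration of self-attention-based multimodal feature fusion in the general sense. The module name, feature dimension, two-modality setup, and mean pooling are assumptions for the example, not the authors' design.

```python
import torch
import torch.nn as nn

class SelfAttentionFusion(nn.Module):
    """Illustrative self-attention fusion over per-modality feature tokens.

    Assumed setup: each client encodes every available modality into a
    d-dimensional feature vector; the vectors are stacked as a short token
    sequence, mixed by multi-head self-attention, then pooled and classified.
    """

    def __init__(self, feature_dim: int = 256, num_heads: int = 4, num_classes: int = 10):
        super().__init__()
        self.attn = nn.MultiheadAttention(feature_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(feature_dim)
        self.classifier = nn.Linear(feature_dim, num_classes)

    def forward(self, modality_feats: torch.Tensor) -> torch.Tensor:
        # modality_feats: (batch, num_modalities, feature_dim)
        attended, _ = self.attn(modality_feats, modality_feats, modality_feats)
        fused = self.norm(modality_feats + attended).mean(dim=1)  # pool over modalities
        return self.classifier(fused)

# Example: fuse image and text features for a batch of 8 samples.
image_feats = torch.randn(8, 256)
text_feats = torch.randn(8, 256)
model = SelfAttentionFusion()
logits = model(torch.stack([image_feats, text_feats], dim=1))  # shape (8, 10)
```

In a federated setting, a module of this kind would be trained locally on each client and aggregated by the server alongside the modality encoders; the adaptive continual updating and missing-modality generation described in the abstract are separate components not shown here.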