Boosting semi-supervised federated learning by effectively exploiting server-side knowledge and client-side unconfident samples.
Journal:
Neural Networks: The Official Journal of the International Neural Network Society
Published Date:
Apr 4, 2025
Abstract
Semi-supervised federated learning (SSFL) has emerged as a promising paradigm for reducing the need for fully labeled data when training federated learning (FL) models. This paper focuses on the label-at-server scenario, where clients' data are entirely unlabeled and the server possesses only a limited amount of labeled data. In this setting, non-independent and identically distributed (non-IID) local data and incorrect pseudo-labels can introduce bias into the model during local training. Prior works try to alleviate this bias by fine-tuning the global model with clean labeled data, but they do not explicitly leverage server-side knowledge to guide local training. Additionally, existing methods typically discard samples with unconfident pseudo-labels, leaving many samples unused and consequently yielding suboptimal performance and slow convergence. This paper introduces a novel method that enhances SSFL performance by effectively exploiting server-side clean knowledge and client-side unconfident samples. Specifically, we propose a representation alignment module that mitigates the influence of non-IID data by aligning local features with the class proxies of the server's labeled data. Furthermore, we employ a shrink loss to reduce the risk associated with unreliable pseudo-labels, ensuring that the valuable information contained in the entire unlabeled dataset is exploited. Extensive experiments on five benchmark datasets under various settings demonstrate the effectiveness and generality of the proposed method, which not only outperforms existing methods but also reduces the communication cost required to reach the target performance.
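The abstract does not give the concrete formulations, so the following is only a minimal PyTorch sketch of how the two described ideas might look: class proxies built from the server's labeled features, a confidence-weighted ("shrunk") pseudo-label loss that keeps unconfident samples instead of discarding them, and an alignment term pulling local features toward the proxy of the pseudo-label class. The mean-feature proxies, cosine alignment distance, weighting scheme, and the conf_threshold parameter are all assumptions for illustration, not the authors' exact method.

```python
# Illustrative sketch only; the paper's actual proxy construction,
# shrink loss, and alignment objective may differ.
import torch
import torch.nn.functional as F


def compute_class_proxies(features, labels, num_classes):
    """Server side: one proxy per class, here the mean feature of the clean
    labeled samples of that class (assumed definition)."""
    proxies = torch.zeros(num_classes, features.size(1))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            proxies[c] = features[mask].mean(dim=0)
    return F.normalize(proxies, dim=1)  # unit-norm proxies (assumed)


def client_unlabeled_loss(feats, logits, proxies, conf_threshold=0.95):
    """Client side: use every unlabeled sample, weighting its pseudo-label
    loss by confidence, and align features with the server-side proxies."""
    probs = F.softmax(logits, dim=1)
    conf, pseudo = probs.max(dim=1)

    # Confidence-weighted ("shrunk") pseudo-label loss: confident samples
    # contribute fully, unconfident ones with a reduced weight instead of
    # being dropped (assumed weighting scheme).
    ce = F.cross_entropy(logits, pseudo, reduction="none")
    weight = torch.where(conf >= conf_threshold, torch.ones_like(conf), conf)
    shrink_loss = (weight * ce).mean()

    # Representation alignment: pull local features toward the class proxy
    # of the assigned pseudo-label (cosine distance, an assumed choice).
    feats_n = F.normalize(feats, dim=1)
    align_loss = (1.0 - (feats_n * proxies[pseudo]).sum(dim=1)).mean()

    return shrink_loss + align_loss
```

In this reading, the proxies would be computed once per round on the server from its clean labeled data and broadcast with the global model, so each client can regularize its local updates against them without ever seeing the labeled samples themselves.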