FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training
Journal: arXiv
Published Date: Apr 3, 2025
Abstract
Federated Active Learning (FAL) has emerged as a promising framework to
leverage large quantities of unlabeled data across distributed clients while
preserving data privacy. However, real-world deployments remain limited by high
annotation costs and communication-intensive sampling processes, particularly
in a cross-silo setting, where clients possess substantial local datasets. This
paper addresses the crucial question: What is the best practice to reduce
communication costs in human-in-the-loop learning with minimal annotator
effort? Existing FAL methods typically rely on iterative annotation processes
that separate active sampling from federated updates, leading to multiple
rounds of expensive communication and annotation. In response, we introduce
FAST, a two-pass FAL framework that harnesses foundation models for weak
labeling in a preliminary pass, followed by a refinement pass focused
exclusively on the most uncertain samples. By leveraging representation
knowledge from foundation models and integrating refinement steps into a
streamlined workflow, FAST substantially reduces the overhead incurred by
iterative active sampling. Extensive experiments on diverse medical and natural
image benchmarks demonstrate that FAST outperforms existing FAL methods by an
average of 4.36% while reducing communication rounds eightfold under a limited
5% labeling budget.
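
The two-pass idea described above can be illustrated with a minimal sketch. The abstract does not specify which foundation model or uncertainty measure FAST uses, so everything here is an assumption made for illustration: the per-sample class probabilities are taken as given (for example, from a zero-shot classifier), uncertainty is measured with Shannon entropy, and the function name `two_pass_select` is hypothetical. The sketch covers only the selection logic on a single client and does not model federated communication.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def two_pass_select(probs_per_sample, budget_fraction=0.05):
    """Hypothetical helper: weak-label every sample, then pick the most
    uncertain ones for human annotation.

    probs_per_sample: list of class-probability vectors, assumed to come
        from a foundation model (the model itself is not part of this sketch).
    budget_fraction: share of samples a human annotator may label.
    Returns (weak_labels, indices_to_annotate).
    """
    n = len(probs_per_sample)
    if n == 0:
        return [], []

    # Pass 1: weak label = index of the most probable class.
    weak_labels = [max(range(len(p)), key=p.__getitem__) for p in probs_per_sample]

    # Pass 2: rank samples by entropy (assumed uncertainty measure) and
    # keep as many as the budget allows, with at least one sample.
    k = max(1, math.floor(budget_fraction * n))
    ranked = sorted(range(n), key=lambda i: entropy(probs_per_sample[i]), reverse=True)
    return weak_labels, sorted(ranked[:k])


if __name__ == "__main__":
    demo = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.6, 0.4]]
    labels, to_annotate = two_pass_select(demo, budget_fraction=0.5)
    print(labels, to_annotate)
```

In this sketch the weak labels from the first pass are kept for every sample, and only the returned indices would be sent to an annotator, which is one plausible reading of "a refinement pass focused exclusively on the most uncertain samples."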