ActiveSSF: An Active-Learning-Guided Self-Supervised Framework for Long-Tailed Megakaryocyte Classification
Journal: arXiv
Published Date: Feb 12, 2025
Abstract
Precise classification of megakaryocytes is crucial for diagnosing
myelodysplastic syndromes. Although self-supervised learning has shown promise
in medical image analysis, its application to classifying megakaryocytes in
stained slides faces three main challenges: (1) pervasive background noise that
obscures cellular details, (2) a long-tailed distribution that limits data for
rare subtypes, and (3) complex morphological variations leading to high
intra-class variability. To address these issues, we propose the ActiveSSF
framework, which integrates active learning with self-supervised pretraining.
Specifically, our approach employs Gaussian filtering combined with K-means
clustering and HSV analysis (augmented by clinical prior knowledge) for
accurate region-of-interest extraction; an adaptive sample selection mechanism
that dynamically adjusts similarity thresholds to mitigate class imbalance; and
prototype clustering on labeled samples to overcome morphological complexity.
Experimental results on clinical megakaryocyte datasets demonstrate that
ActiveSSF not only achieves state-of-the-art performance but also significantly
improves recognition accuracy for rare subtypes. Together, these components
indicate that ActiveSSF has practical potential in clinical settings.
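
The abstract does not specify how the similarity thresholds are adjusted or how prototypes are used during selection. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that prototypes are per-class mean embeddings of the labeled samples, that similarity is cosine similarity, and that a class's threshold is lowered linearly as the class becomes rarer. The function names and the `base` and `relax` parameters are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def class_prototypes(labeled):
    """Assumption: a prototype is the mean embedding of a class's labeled samples."""
    groups = defaultdict(list)
    for emb, label in labeled:
        groups[label].append(emb)
    return {
        label: [sum(col) / len(embs) for col in zip(*embs)]
        for label, embs in groups.items()
    }


def adaptive_thresholds(counts, base=0.9, relax=0.3):
    """Hypothetical rule: the rarer the class, the lower its similarity threshold."""
    n_max = max(counts.values())
    return {c: base - relax * (1 - n / n_max) for c, n in counts.items()}


def select_samples(unlabeled, prototypes, thresholds):
    """Keep samples whose nearest-prototype similarity clears that class's threshold."""
    selected = []
    for idx, emb in enumerate(unlabeled):
        label, sim = max(
            ((c, cosine(emb, p)) for c, p in prototypes.items()),
            key=lambda pair: pair[1],
        )
        if sim >= thresholds[label]:
            selected.append((idx, label))
    return selected


# Toy 2-D embeddings: three labeled "common" cells and one labeled "rare" cell.
labeled = [
    ([1.0, 0.0], "common"),
    ([0.9, 0.1], "common"),
    ([1.0, 0.1], "common"),
    ([0.0, 1.0], "rare"),
]
unlabeled = [[1.0, 0.05], [0.6, 0.8], [0.7, 0.7]]

prototypes = class_prototypes(labeled)
thresholds = adaptive_thresholds(Counter(label for _, label in labeled))
selected = select_samples(unlabeled, prototypes, thresholds)
print(selected)
```

In this toy setup the rare class receives a threshold of about 0.7 instead of 0.9, so the second unlabeled sample (similarity 0.8 to the rare prototype) is selected, while the ambiguous third sample is rejected because its nearest prototype is the common class and it falls short of that class's stricter threshold.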