Explainable multimodal hematology analysis for white blood cell classification and attribute prediction.
Journal:
Computers in Biology and Medicine
Published Date:
Jul 29, 2025
Abstract
White blood cell (WBC) classification and morphological attribute prediction are critical for automated hematological analysis. To provide detailed and interpretable predictions, this paper proposes a multimodal vision-language embedding learning approach, based on the Contrastive Language-Image Pre-training (CLIP) model, for WBC classification and attribute prediction. First, structured natural language prompts are built around WBC types and morphological attributes to supply rich semantic context for the multimodal inputs. Second, a joint-task optimization strategy aligns the encodings of WBC images with those of their corresponding structured text prompts in a shared semantic space, improving both interpretability and prediction accuracy. Third, a multi-task loss function with an adaptive weighting mechanism addresses class imbalance and balances the classification and attribute prediction tasks. Experimental evaluations on publicly available datasets show that the proposed method achieves state-of-the-art performance in both WBC classification and attribute prediction.
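The three components named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the prompt template, the function names, and the attribute names are hypothetical, and because the abstract does not specify the adaptive weighting mechanism, learned log-variance (uncertainty) weighting is swapped in here as one common stand-in. The contrastive term follows the general symmetric CLIP-style formulation, in which the i-th image embedding should match the i-th text embedding.

```python
import math

def build_prompt(cell_type, attributes):
    """Compose a structured text prompt from a WBC type and its attributes.
    The template is a hypothetical example, not the paper's wording."""
    parts = [f"{name} is {value}" for name, value in sorted(attributes.items())]
    return f"a microscopy image of a {cell_type} whose " + ", ".join(parts)

def _normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def _cross_entropy(logits, target):
    # Numerically stable log-sum-exp minus the logit of the correct index.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings."""
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j]: temperature-scaled cosine similarity of image i and text j.
    sim = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
           for i in imgs]
    image_to_text = sum(_cross_entropy(sim[k], k) for k in range(n)) / n
    text_to_image = sum(
        _cross_entropy([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (image_to_text + text_to_image)

def uncertainty_weighted_total(losses, log_vars):
    """Combine task losses as sum(exp(-s_i) * L_i + s_i).
    Assumption: a stand-in for the paper's unspecified adaptive weighting;
    in training, each s_i would be a learnable parameter."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))
```

In use, `build_prompt("neutrophil", {"nucleus shape": "segmented", "cell size": "small"})` produces one text prompt per training image; the image and prompt encodings would then be passed to `clip_style_loss`, and the resulting classification and attribute losses combined with `uncertainty_weighted_total`.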