Multi-Source COVID-19 Detection via Variance Risk Extrapolation
Journal:
arXiv
Published Date:
Jun 29, 2025
Abstract
We present our solution for the Multi-Source COVID-19 Detection Challenge,
which aims to classify chest CT scans into COVID and Non-COVID categories
across data collected from four distinct hospitals and medical centers. A major
challenge in this task lies in the domain shift caused by variations in imaging
protocols, scanners, and patient populations across institutions. To enhance
the cross-domain generalization of our model, we incorporate Variance Risk
Extrapolation (VREx) into the training process. VREx encourages the model to
maintain consistent performance across multiple source domains by explicitly
minimizing the variance of empirical risks across environments. This
regularization strategy reduces overfitting to center-specific features and
promotes learning of domain-invariant representations. We further apply Mixup
data augmentation to improve generalization and robustness. Mixup interpolates
both the inputs and labels of randomly selected pairs of training samples,
encouraging the model to behave linearly between examples and enhancing its
resilience to noise and limited data. Our method achieves an average macro F1
score of 0.96 across the four sources on the validation set, demonstrating
strong generalization.