Trustworthy image-to-image translation: evaluating uncertainty calibration in unpaired training scenarios
Journal: arXiv
Published Date: Jan 29, 2025
Abstract
Mammographic screening is an effective method for detecting breast cancer,
facilitating early diagnosis. However, the current need to manually inspect
images places a heavy burden on healthcare systems, spurring a desire for
automated diagnostic protocols. Techniques based on deep neural networks have
been shown to be effective in some studies, but their tendency to overfit leaves
a considerable risk of poor generalisation and misdiagnosis, preventing their
widespread adoption in clinical settings. Data augmentation schemes based on
unpaired neural style transfer models have been proposed that improve
generalisability by diversifying the representations of training image features
in the absence of paired training data (images of the same tissue in both
image styles). However, these models are similarly prone to various pathologies,
and evaluating their performance is challenging without ground truths or large
datasets, which are often unavailable in medical imaging. Here, we consider two
architectures: the GAN-based cycleGAN and the more recently developed
diffusion-based SynDiff. We evaluate their performance when trained on image
patches extracted from three open access mammography datasets and one non-medical
image dataset. We consider the use of uncertainty quantification to assess
model trustworthiness, and propose a scheme to evaluate calibration quality in
unpaired training scenarios. Such a scheme ultimately facilitates the trustworthy
use of image-to-image translation models in domains where ground truths are not
typically available.