Enhancing Privacy: The Utility of Stand-Alone Synthetic CT and MRI for Tumor and Bone Segmentation
Journal:
arXiv
Published Date:
Jun 13, 2025
Abstract
AI requires extensive datasets, while medical data is subject to strict data
protection. Anonymization is essential but poses a challenge for some regions,
such as the head, as identifying structures overlap with regions of clinical
interest. Synthetic data offers a potential solution, but studies often lack
rigorous evaluation of realism and utility. Therefore, we investigate to what
extent synthetic data can replace real data in segmentation tasks. We employed
head and neck cancer CT scans and brain glioma MRI scans from two large
datasets. Synthetic data were generated using generative adversarial networks
and diffusion models. We evaluated the quality of the synthetic data using mean
absolute error (MAE), multi-scale structural similarity (MS-SSIM), radiomics,
and a Visual Turing Test (VTT) performed by 5 radiologists,
and their usefulness in segmentation tasks using the Dice similarity
coefficient (DSC). Radiomics indicates high fidelity of synthetic MRIs, but the
generative models fall short in producing highly realistic CT tissue, with
correlation coefficients of 0.8784 and 0.5461 for MRI and CT tumors,
respectively. DSC results indicate limited utility of synthetic data: tumor
segmentation achieved DSC=0.064 on CT and 0.834 on MRI, while bone segmentation
achieved a mean DSC=0.841. A relation between DSC and correlation is observed,
but it is limited by the complexity of the task. VTT results show that
synthetic CTs are useful, although their educational applications are limited. Synthetic
data can be used independently for the segmentation task, although limited by
the complexity of the structures to segment. Advancing generative models to
better tolerate heterogeneous inputs and learn subtle details is essential for
enhancing their realism and expanding their application potential.
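
For readers unfamiliar with two of the metrics named above, the following is a minimal sketch of their standard definitions: DSC = 2|A ∩ B| / (|A| + |B|) for binary masks A and B, and MAE as the mean of absolute intensity differences. It is not the authors' evaluation pipeline. The function names, the flattened-list input format, and the convention that two empty masks score 1.0 are assumptions made for illustration; real volumes would typically be handled with an array library.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_absolute_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute intensity difference between two flattened images."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("images must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Toy inputs (hypothetical values, not data from the study).
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 1, 1]
dsc = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)

real_voxels = [0.0, 10.0, 20.0]
synthetic_voxels = [0.0, 12.0, 17.0]
mae = mean_absolute_error(real_voxels, synthetic_voxels)  # (0 + 2 + 3) / 3
```

A DSC of 1 means the two masks overlap perfectly and 0 means no overlap, which is why the reported value of 0.064 for CT tumors indicates almost no agreement, while 0.834 and 0.841 indicate substantial agreement.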