Mitigating bias in prostate cancer diagnosis using synthetic data for improved AI driven Gleason grading.
Journal: npj Precision Oncology
Published Date: May 23, 2025
Abstract
Prostate cancer (PCa) is a leading cause of cancer-related mortality in men, with Gleason grading critical for prognosis and treatment decisions. Machine learning (ML) models offer potential for automated grading but are limited by dataset biases, staining variability, and data scarcity, reducing their generalizability. This study employs generative adversarial networks (GANs) to generate high-quality synthetic histopathological images to address these challenges. A conditional GAN (dcGAN) was developed and validated using expert pathologist review and Spatial Heterogeneous Recurrence Quantification Analysis (SHRQA), achieving 80% diagnostic quality approval. A convolutional neural network (EfficientNet) was trained on original and synthetic images and validated across TCGA, PANDA Challenge, and MAST trial datasets. Integrating synthetic images improved classification accuracy for Gleason 3 (26%, p = 0.0010), Gleason 4 (15%, p = 0.0274), and Gleason 5 (32%, p < 0.0001), with sensitivity and specificity reaching 81% and 92%, respectively. This study demonstrates that synthetic data significantly enhances ML-based Gleason grading accuracy and improves reproducibility, providing a scalable AI-driven solution for precision oncology.
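The sensitivity and specificity figures quoted above are per-class metrics that can be derived from predicted and true grades. The following is a minimal sketch of a one-vs-rest calculation, not the authors' evaluation code; the function name `sensitivity_specificity` and the toy labels are assumptions made purely for illustration.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the class `positive`."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1  # positive case correctly identified
        elif t == positive:
            fn += 1  # positive case missed
        elif p == positive:
            fp += 1  # other grade wrongly called positive
        else:
            tn += 1  # other grade correctly not called positive
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy Gleason pattern labels (hypothetical, not data from the study)
y_true = [3, 3, 4, 4, 5, 5, 3, 4]
y_pred = [3, 4, 4, 4, 5, 3, 3, 5]

sens, spec = sensitivity_specificity(y_true, y_pred, positive=4)
print(f"Gleason 4: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Treating each Gleason pattern in turn as the positive class gives one sensitivity and specificity pair per grade, which is one common way such multi-class results are summarized.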