Predicting DNA damage response using synthetic cell painting profiles and experimental analysis.
Journal: iScience
Published Date: Feb 11, 2026
Abstract
Detecting DNA damage response (DDR) from cell painting profiles is challenging because sample sizes are limited and class distributions are skewed. We established a robust classification framework that uses synthetic data to improve DDR prediction. Using the idr-0080 dataset, we generated synthetic profiles with the Gaussian copula, CTGAN, VAE, and CopulaGAN algorithms and assessed their quality with fidelity metrics. Four classifiers were evaluated on real data, synthetic data, or both, with class imbalance either preserved or resolved. Among them, an SVM trained on real data augmented with Gaussian copula-generated synthetic data achieved the best performance (F1-score = 0.87, AUROC = 0.94). SHAP analysis highlighted key predictive morphological features. In the external cpg-0012 dataset, the model identified both known and previously unreported DDR inducers, which were experimentally confirmed by γH2AX marker accumulation and reduced cell viability. Overall, our machine learning framework integrating synthetic cell painting profiles effectively predicted DDR, providing a scalable virtual prescreening approach for drug discovery.
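The abstract does not describe how the Gaussian copula generator was implemented, so the following is only a minimal, standard-library sketch of the general idea: estimate the dependence between features on a normal-score scale, sample correlated normal vectors, and map them back through each feature's empirical distribution. The function names (`sample_gaussian_copula` and its helpers), the use of empirical marginals, and the toy values are all assumptions made for illustration; none of them come from the article or from the idr-0080 dataset.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def _normal_scores(column):
    """Map one feature column to standard-normal scores via its ranks."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the uniform value strictly inside (0, 1).
        scores[idx] = _STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def _correlation(a, b):
    """Pearson correlation of two equally long score lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def _cholesky(matrix, jitter=1e-9):
    """Lower-triangular Cholesky factor; jitter guards near-singular input."""
    d = len(matrix)
    lower = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(max(matrix[i][i] - s, 0.0) + jitter)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def _empirical_quantile(sorted_values, u):
    """Linear interpolation of the empirical quantile at probability u."""
    pos = u * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def sample_gaussian_copula(rows, n_samples, seed=0):
    """Draw synthetic rows that mimic the marginals and correlations of rows.

    Assumes at least two rows and purely numeric features (hypothetical helper).
    """
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*rows)]
    d = len(columns)
    scores = [_normal_scores(col) for col in columns]
    corr = [[1.0 if i == j else _correlation(scores[i], scores[j])
             for j in range(d)] for i in range(d)]
    lower = _cholesky(corr)
    sorted_cols = [sorted(col) for col in columns]
    synthetic = []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlated normal vector z = L @ eps.
        z = [sum(lower[i][k] * eps[k] for k in range(i + 1)) for i in range(d)]
        u = [_STD_NORMAL.cdf(v) for v in z]
        synthetic.append([_empirical_quantile(sorted_cols[i], u[i])
                          for i in range(d)])
    return synthetic


# Toy minority-class profiles (made-up numbers, two features per row).
minority = [[0.9, 1.8], [1.1, 2.1], [1.4, 2.9], [1.6, 3.2], [2.0, 3.9]]
synthetic = sample_gaussian_copula(minority, n_samples=20, seed=1)
augmented = minority + synthetic  # real rows plus synthetic rows
```

In a workflow like the one summarized above, rows such as `augmented` would then be passed to a classifier together with the other class. Because the sketch interpolates within the observed values, it never produces features outside the range seen in the real data, which is a deliberate simplification rather than a property of every Gaussian copula implementation.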