Translating lineage-resolved single-cell programs to bulk clinical prognosis: Adversarial generative learning reveals lineage-informed immune-stromal-epithelial risk signatures.

Journal: Computational Biology and Chemistry
Abstract

BACKGROUND: Breast cancer transcriptional programs span malignant epithelial cells and the tumor microenvironment (TME), yet tumor-normal contrasts are often confounded by lineage identity and cell-type composition, which limits the derivation of clinically actionable signatures.
METHODS: Our objective was to construct a lineage-aware model that prioritizes compact, cell-type-specific tumor-normal gene signatures for downstream pathway interpretation and TCGA-BRCA projection, rather than to treat classification performance as the final endpoint. Using GSE268662, we built an integrated single-cell atlas with eight major cell types grouped into four lineages (epithelial, immune, stromal, vascular). For each cell type, we trained a lineage-aware adversarial autoencoder-classifier (WGAN-GP) on within-cell-type tumor and normal differentially expressed gene (DEG) pools, ranked genes by input-to-logit importance, selected the signature size by k-sensitivity (AUROC/AUPRC), and projected the signatures into TCGA-BRCA for PAM50-stratified activity and survival analyses.
RESULTS: Optimal signatures were compact (Endothelial 50; Fibroblast 20; Pericytes/SMC 10; Epithelial 100; Myeloid 20; T cells 20; Mast 30; B cells 30 genes). At the selected signature size (best_k), AUROC ranged from 0.766 to 0.904 and AUPRC from 0.463 to 0.953 (Mast AUROC 0.904; Myeloid AUROC 0.894; B-cell AUPRC 0.953). Top genes were lineage-consistent (e.g., HSPG2/COL4A2/THY1/SPARC; COL1A1/COL1A2/COL3A1; STAT1; LEF1; CXCR4/IGHG3). Enrichment analyses highlighted interferon signaling, TNFα/NFκB signaling, and antigen presentation in immune signatures, in contrast to ECM-receptor interaction and focal adhesion pathways in stromal/vascular signatures; epithelial hallmarks included EMT and apoptosis. In TCGA-BRCA, epithelial and stromal scores increased broadly in tumors, whereas immune and vascular signals were more subtype-dependent. Within subtypes, immune risk scores provided the strongest overall survival separation (C-index up to ∼0.92).
CONCLUSION: A lineage-aware GAN framework yields compact, interpretable signatures that capture ecosystem-level biology (immune activity, ECM remodeling, epithelial plasticity) and can be projected into TCGA-BRCA for subtype-aware prognostic stratification in breast cancer.
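
NOTE: The k-sensitivity step described in METHODS (choosing a signature size from a ranked gene list by discrimination performance) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, for illustration only, that each cell is scored by the mean expression of its top-k genes and that only AUROC is used; the abstract does not state how cells are scored at each k, and the function names, gene names (G1 to G3), and toy values below are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the fraction of (tumor, normal) pairs in which the tumor
    cell scores higher; tied pairs count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_best_k(
    expr: List[Dict[str, float]],   # one dict per cell: gene -> expression
    labels: Sequence[int],          # 1 = tumor, 0 = normal
    importance: Dict[str, float],   # gene -> importance (assumed precomputed)
    candidate_ks: Sequence[int],
) -> Tuple[int, Dict[int, float]]:
    """Evaluate each candidate signature size and return the best one.

    Assumption: a cell's score is the mean expression of the top-k genes.
    Ties in AUROC resolve to the smallest k, favoring compact signatures.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    results: Dict[int, float] = {}
    for k in sorted(candidate_ks):
        top = ranked[:k]
        scores = [sum(cell[g] for g in top) / len(top) for cell in expr]
        results[k] = auroc(scores, labels)
    best_k = max(results, key=results.get)  # first maximum = smallest k
    return best_k, results


# Hypothetical toy data: G1 separates tumor from normal, G3 adds noise.
toy_importance = {"G1": 0.9, "G2": 0.5, "G3": 0.1}
toy_expr = [
    {"G1": 3.0, "G2": 2.0, "G3": 0.0},  # tumor
    {"G1": 2.5, "G2": 0.5, "G3": 0.2},  # tumor
    {"G1": 1.0, "G2": 1.0, "G3": 4.0},  # normal
    {"G1": 0.5, "G2": 1.5, "G3": 5.0},  # normal
]
toy_labels = [1, 1, 0, 0]
toy_best_k, toy_results = select_best_k(
    toy_expr, toy_labels, toy_importance, [1, 2, 3]
)
```

In this toy case, adding the low-importance gene G3 degrades separation, so a smaller k is retained. A fuller analysis would also track AUPRC, as the abstract reports both metrics.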