Genetic analyses of eight complex diseases using predicted continuous representations of disease.
Journal:
Cell Reports Methods
Published Date:
Jul 25, 2025
Abstract
We evaluated whether predicted continuous disease representations could enhance genetic discovery beyond case-control genome-wide association study (GWAS) phenotypes across eight complex diseases in up to 485,448 UK Biobank participants. Predicted phenotypes had high genetic correlations with case-control phenotypes (median r = 0.66) but identified more independent associations (median 306 versus 125). While some predicted phenotype associations were spurious, case-control phenotypes boosted by multi-trait analysis of GWAS (MTAG) identified a median of 46 additional variants per disease, of which a median of 73% replicated in FinnGen, 37% reached genome-wide significance in a UK Biobank/FinnGen meta-analysis, and 45% had supporting evidence. Predicted phenotypes also identified 14 genes targeted by phase I-IV drugs that were not identified by case-control phenotypes, and combined polygenic risk scores (PRSs) using both phenotypes improved prediction performance, with a median 37% increase in Nagelkerke's R². Predicted phenotypes represent composite biomarkers that complement case-control approaches in genetic discovery, drug target prioritization, and risk prediction, though efficacy varies across diseases.
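For readers unfamiliar with the prediction metric, the following is a minimal sketch of how Nagelkerke's R² can be computed from the log-likelihoods of a null and a fitted logistic model, and how a percentage increase between two PRS models could be expressed. The function names and the example values are hypothetical and are not taken from the study. Reading the reported increase as a relative (rather than absolute) change is also an assumption.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke's pseudo-R^2 from null and fitted model log-likelihoods.

    ll_null  : log-likelihood of the baseline (e.g. covariates-only) model
    ll_model : log-likelihood of the model that adds the PRS
    n        : number of samples
    """
    # Cox-Snell R^2: 1 - (L_null / L_model)^(2/n)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Upper bound of Cox-Snell R^2 for a binary outcome: 1 - L_null^(2/n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    # Nagelkerke rescales Cox-Snell so the maximum attainable value is 1
    return cox_snell / max_cox_snell


def relative_increase_pct(r2_base: float, r2_combined: float) -> float:
    """Relative increase (in percent) of a combined-PRS R^2 over a baseline R^2."""
    return 100.0 * (r2_combined - r2_base) / r2_base


# Hypothetical illustration (values are NOT from the paper)
r2_case_control = 0.100   # assumed R^2 of a case-control PRS
r2_combined = 0.137       # assumed R^2 of a combined PRS
increase = relative_increase_pct(r2_case_control, r2_combined)
```

In practice the two log-likelihoods would come from fitted logistic regressions on the same samples, so that the R² values being compared share a common baseline model.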