Genetic association studies using disease liabilities from deep neural networks.
Journal: American journal of human genetics
PMID: 39986278
Abstract
The case-control study is a widely used method for investigating the genetic underpinnings of binary traits. However, long-term, prospective cohort studies often grapple with absent or evolving health-related outcomes. Here, we propose two methods, liability and meta, for conducting genome-wide association studies (GWASs) that leverage disease liabilities calculated from deep patient phenotyping. Analyzing 38 common traits in ∼300,000 UK Biobank participants, we identified more loci than the conventional case-control approach did, with high replication rates in larger external GWASs. Further analyses confirmed the disease specificity of the genetic architecture; the meta method was more robust when phenotypes were imputed with low accuracy. Additionally, polygenic risk scores based on disease liabilities more effectively predicted newly diagnosed cases in the 2022 dataset, that is, individuals who had been controls in the earlier 2019 dataset. Our findings demonstrate that integrating high-dimensional phenotypic data into deep neural networks enhances genetic association studies while capturing disease-relevant genetic architecture.
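The polygenic risk score evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the standard definition of a polygenic risk score as a weighted sum of allele dosages, and it scores discrimination with a rank-based AUC. The function names, toy dosages, effect sizes, and labels below are all hypothetical.

```python
import numpy as np


def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (individuals x variants) by per-variant effects."""
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != effect_sizes.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_variants)")
    return dosages @ effect_sizes


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cases, controls = scores[labels], scores[~labels]
    if cases.size == 0 or controls.size == 0:
        raise ValueError("need at least one case and one control")
    diff = cases[:, None] - controls[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


# Hypothetical toy data: 4 individuals, 3 variants, dosages in {0, 1, 2}.
toy_dosages = [[0, 1, 2], [2, 0, 1], [1, 1, 0], [0, 0, 0]]
# Hypothetical per-variant effect sizes, e.g. taken from GWAS summary statistics.
toy_effects = [0.5, -0.2, 0.1]
# Hypothetical labels: True = control at baseline who was later diagnosed.
newly_diagnosed = [False, True, True, False]

toy_scores = polygenic_risk_score(toy_dosages, toy_effects)
toy_auc = auc(toy_scores, newly_diagnosed)
```

Under these assumptions, comparing `auc` for scores built from case-control effect sizes against scores built from liability-based effect sizes would mirror the comparison the abstract describes, although the paper's actual scoring and evaluation procedure may differ.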