Rapid and accurate multi-phenotype imputation for millions of individuals.

Journal: Nature Communications
PMID:

Abstract

Deep phenotyping can enhance the power of genetic analyses, including genome-wide association studies (GWAS), but missing phenotypes limit the potential of such resources. Although many phenotype imputation methods have been developed, accurate imputation for millions of individuals remains challenging. In this study, we develop a multi-phenotype imputation method based on mixed fast random forest (PIXANT) that leverages efficient machine learning (ML)-based algorithms. Extensive simulations show that PIXANT is reliable, robust and highly resource-efficient. We then apply PIXANT to UK Biobank (UKB) data comprising 277,301 unrelated White British citizens and 425 traits, and perform GWAS on the imputed phenotypes; 18.4% more GWAS loci are identified than before imputation (8,710 vs 7,355). The increased statistical power of GWAS identified additional candidate genes affecting heart rate, such as RNF220, SCN10A, and RGS6, suggesting that imputed phenotype data from a large cohort may lead to the discovery of additional candidate genes for complex traits.

Authors

  • Lin-Lin Gu
    Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs & Fisheries College, Jimei University, Xiamen, Fujian, People's Republic of China.
  • Hong-Shan Wu
    Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs & Fisheries College, Jimei University, Xiamen, Fujian, People's Republic of China.
  • Tian-Yi Liu
    Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs & Fisheries College, Jimei University, Xiamen, Fujian, People's Republic of China.
  • Yong-Jie Zhang
    School of Metallurgy, Northeastern University, Shenyang 110819, China.
  • Jing-Cheng He
    Center for Data Science, School of Mathematical Sciences, Zhejiang University, Hangzhou, Zhejiang, People's Republic of China.
  • Xiao-Lei Liu
    Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei, People's Republic of China.
  • Zhi-Yong Wang
    Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs & Fisheries College, Jimei University, Xiamen, Fujian, People's Republic of China.
  • Guo-Bo Chen
    Center for General Practice Medicine, Department of General Practice Medicine, Clinical Research Institute, Zhejiang Provincial People's Hospital, People's Hospital of Hangzhou Medical College, Hangzhou, Zhejiang, People's Republic of China. chenguobo@gmail.com.
  • Dan Jiang
    Department of Operative Dentistry and Endodontics, The Affiliated Hospital of Stomatology, Chongqing Medical University, Chongqing, China.
  • Ming Fang
    Dalian Medical University Graduate School, Dalian, China.