RefRGim: an intelligent reference panel reconstruction method for genotype imputation with convolutional neural networks.

Journal: Briefings in Bioinformatics
PMID:

Abstract

Genotype imputation is a statistical method for estimating missing genotypes from a denser haplotype reference panel. Existing methods usually perform well on common variants, but they may not be ideal for low-frequency and rare variants. Previous studies have shown that the population similarity between the study and reference panels is one of the key factors influencing imputation accuracy. Here, we developed an imputation reference panel reconstruction method (RefRGim) using convolutional neural networks (CNNs), which generates a study-specific reference panel for each input dataset based on the genetic similarity between individuals in the current study and those in the reference panel. The CNNs were pretrained with single nucleotide polymorphism data from the 1000 Genomes Project. Our evaluations showed that genotype imputation with RefRGim can achieve higher accuracy than imputation with the original reference panel, especially for low-frequency and rare variants. RefRGim will serve as an efficient reference panel reconstruction method for genotype imputation. RefRGim is freely available via GitHub: https://github.com/shishuo16/RefRGim.
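
The general idea of choosing reference individuals by their genetic similarity to the study samples can be illustrated with a small sketch. This is not the RefRGim implementation and does not use a CNN: it substitutes a plain identity-by-state (IBS) score for the learned similarity, and the function names `ibs_similarity` and `select_reference_subset`, the 0/1/2 genotype coding, and the top-`k` selection rule are all assumptions made for illustration only.

```python
from typing import List, Sequence

# Assumed coding: one integer per site giving the alternate-allele count (0, 1 or 2).
Genotype = Sequence[int]


def ibs_similarity(a: Genotype, b: Genotype) -> float:
    """Mean identity-by-state similarity in [0, 1] between two individuals."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    # Per site: 1.0 if both alleles are shared, 0.5 if one, 0.0 if none.
    return sum(1.0 - abs(x - y) / 2.0 for x, y in zip(a, b)) / len(a)


def select_reference_subset(
    study: Sequence[Genotype],
    reference: Sequence[Genotype],
    k: int,
) -> List[int]:
    """Return the indices of the k reference individuals that are, on
    average, most similar to the study individuals (hypothetical rule)."""
    if not study:
        raise ValueError("study panel must contain at least one individual")
    scored = []
    for idx, ref in enumerate(reference):
        mean_sim = sum(ibs_similarity(ref, s) for s in study) / len(study)
        scored.append((mean_sim, idx))
    # Highest similarity first; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return sorted(idx for _, idx in scored[:k])


# Toy data: two study individuals and four reference individuals at four sites.
study_panel = [[0, 1, 2, 0], [0, 1, 2, 1]]
reference_panel = [[0, 1, 2, 0], [2, 1, 0, 2], [0, 1, 1, 0], [2, 2, 0, 2]]
selected = select_reference_subset(study_panel, reference_panel, k=2)
print(selected)  # indices of the retained reference individuals
```

In this toy example the two reference individuals whose genotypes most resemble the study samples are retained, and the reduced panel would then be passed to an imputation tool in place of the full one.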

Authors

  • Shuo Shi
    State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, Hubei 430079, China.
  • Qiheng Qian
    National Genomics Data Center of Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
  • Shuhuan Yu
    National Genomics Data Center of Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
  • Qi Wang
    Biotherapeutics Discovery Research Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China.
  • Jinyue Wang
    National Genomics Data Center of Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
  • Jingyao Zeng
    National Genomics Data Center of Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
  • Zhenglin Du
    Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, PR China.
  • Jingfa Xiao
    BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.