Precise uncertain significance prediction using latent space matrix factorization models: genomics variant and heterogeneous clinical data-driven approaches.

Journal: Briefings in bioinformatics
Published Date:

Abstract

Several studies to date have proposed different types of interpreters for measuring the degree of pathogenicity of variants. However, in predicting the disease type and disease-gene associations, scholars face two essential challenges, namely the vast number of existing variants and the existence of variants which are recognized as variant of uncertain significance (VUS). To tackle these challenges, we propose algorithms to assign a significance to each gene rather than each variant, describing its degree of pathogenicity. Since the interpreters identified most of the variants as VUS, most of the gene scores were identified as uncertain significance. To predict the uncertain significance scores, we design two matrix factorization-based models: the common latent space model uses genomics variant data as well as heterogeneous clinical data, while the single-matrix factorization model can be used when heterogeneous clinical data are unavailable. We have managed to show that the models successfully predict the uncertain significance scores with low error and high accuracy. Moreover, to evaluate the effectiveness of our novel input features, we train five different multi-label classifiers including a feedforward neural network with the same feature set and show they all achieve high accuracy as the main impact of our approach comes from the features. Availability: The source code is freely available at https://github.com/sabdollahi/CoLaSpSMFM.

Authors

  • Sina Abdollahi
    Intelligent Information Retrieval Lab, Department of Computer Science and Information Engineering, National Cheng Kung University.
  • Peng-Chan Lin
    Internal Medicine Department.
  • Meng-Ru Shen
    Department of Obstetrics and Gynecology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan.
  • Jung-Hsien Chiang
    Department of Computer Science and Information Engineering, National Cheng Kung University, 1, University Road, Tainan City, Taiwan.