Disease gene prediction with privileged information and heteroscedastic dropout.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Recently, machine learning models have achieved tremendous success in prioritizing candidate genes for genetic diseases. These models can accurately quantify the similarity among diseases and genes, based on the intuition that similar genes are more likely to be associated with similar diseases. However, the genetic features these methods rely on are often hard to collect because of high experimental cost and various other technical limitations. Existing solutions to this problem significantly increase the risk of overfitting and decrease the generalizability of the models.

Authors

  • Juan Shu
    Department of Statistics, Purdue University, West Lafayette, IN 47906, USA.
  • Yu Li
    Department of Public Health, Shihezi University School of Medicine, 832000, China.
  • Sheng Wang
    Intensive Care Medical Center, Tongji Hospital, School of Medicine, Tongji University, Shanghai, 200065, People's Republic of China.
  • Bowei Xi
    Department of Statistics, Purdue University, West Lafayette, IN 47906, USA.
  • Jianzhu Ma
    Toyota Technological Institute at Chicago, 6045 S. Kenwood Ave., Chicago, Illinois 60637, USA.