Disease gene prediction with privileged information and heteroscedastic dropout.
Journal:
Bioinformatics (Oxford, England)
Published Date:
Jul 12, 2021
Abstract
MOTIVATION: Recently, machine learning models have achieved tremendous success in prioritizing candidate genes for genetic diseases. These models are able to accurately quantify the similarity between diseases and genes based on the intuition that similar genes are more likely to be associated with similar diseases. However, the genetic features these methods rely on are often hard to collect due to high experimental cost and various other technical limitations. Existing solutions to this problem significantly increase the risk of overfitting and decrease the generalizability of the models.