Gene-environment interaction analysis via deep learning.

Journal: Genetic Epidemiology
PMID:

Abstract

Gene-environment (G-E) interaction analysis plays an important role in studying complex diseases. Extensive methodological research has been conducted on G-E interaction analysis, and the existing methods are mostly based on regression techniques. In many fields, including biomedicine and omics, it has been increasingly recognized that deep learning may outperform regression owing to its flexibility (e.g., in accommodating unspecified nonlinear effects) and superior prediction performance. However, deep learning methods for G-E interaction analysis remain underdeveloped. In this article, we fill this important knowledge gap and develop a new analysis approach based on a deep neural network in conjunction with penalization. The proposed approach can simultaneously conduct model estimation and selection (of important main G effects and G-E interactions), while uniquely respecting the "main effects, interactions" variable selection hierarchy. Simulations show that it has superior prediction and feature selection performance. The analysis of data on lung adenocarcinoma and skin cutaneous melanoma overall survival further establishes its practical utility. Overall, this study can advance G-E interaction analysis by delivering a powerful new analysis approach based on modern deep learning.
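
The abstract does not specify the penalty or the network architecture, so the following is only an illustrative sketch of how penalization on a network's input layer could respect a "main effects, interactions" hierarchy. It assumes a sparse-group-lasso-style penalty, which may differ from the penalty actually used in the article. The function names (`build_inputs`, `hierarchical_penalty`, `selected`), the tuning parameters `lam_group` and `lam_inter`, and the weight layout are all hypothetical.

```python
import numpy as np

def build_inputs(G, E):
    """Stack each gene's main effect with its G-E interaction terms.

    G: (n, p) genetic factors; E: (n, q) environmental factors.
    Returns an (n, p, q + 1) array: slot 0 holds the main G effect,
    slots 1..q hold the products G_j * E_k.
    """
    inter = G[:, :, None] * E[:, None, :]              # (n, p, q)
    return np.concatenate([G[:, :, None], inter], axis=2)

def hierarchical_penalty(W, lam_group, lam_inter):
    """Assumed sparse-group-style penalty on first-layer weights.

    W: (p, q + 1, h) weights linking each input term to h hidden units.
    The outer term penalizes each gene's whole group (main effect plus
    its interactions); the inner term penalizes each interaction alone.
    An interaction can therefore be dropped while the main effect stays,
    but a gene's group cannot be zeroed without dropping its interactions.
    """
    group_norms = np.sqrt((W ** 2).sum(axis=(1, 2)))       # (p,)
    inter_norms = np.sqrt((W[:, 1:, :] ** 2).sum(axis=2))  # (p, q)
    return lam_group * group_norms.sum() + lam_inter * inter_norms.sum()

def selected(W, tol=1e-8):
    """Flag main effects and interactions whose input weights are nonzero."""
    main = np.sqrt((W[:, 0, :] ** 2).sum(axis=1)) > tol    # (p,)
    inter = np.sqrt((W[:, 1:, :] ** 2).sum(axis=2)) > tol  # (p, q)
    return main, inter
```

In such a setup, the penalty would be added to the training loss (for survival outcomes, for example, a negative partial log-likelihood), and terms whose first-layer weight norms shrink to zero would be treated as unselected.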

Authors

  • Shuni Wu
    The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China.
  • Yaqing Xu
    Department of Epidemiology and Biostatistics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Qingzhao Zhang
    The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China.
  • Shuangge Ma
    Department of Biostatistics, Yale School of Public Health, New Haven, CT 06511, USA.