Prediction of secondary structure population and intrinsic disorder of proteins using multitask deep learning.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
Published Date:

Abstract

Recent research in predicting protein secondary structure populations (SSP) based on Nuclear Magnetic Resonance (NMR) chemical shifts has helped quantitatively characterise the structural conformational properties of intrinsically disordered proteins and regions (IDP/IDR). Different from protein secondary structure (SS) prediction, the SSP prediction assumes a dynamic assignment of secondary structures that seem correlate with disordered states. In this study, we designed a single-task deep learning framework to predict IDP/IDR and SSP respectively; and multitask deep learning frameworks to allow quantitative predictions of IDP/IDR evidenced by the simultaneously predicted SSP. According to independent test results, single-task deep learning models improve the prediction performance of shallow models for SSP and IDP/IDR. Also, the prediction performance was further improved for IDP/IDR prediction when SSP prediction was simultaneously predicted in multitask models. With p53 as a use case, we demonstrate how predicted SSP is used to explain the IDP/IDR predictions for each functional region.

Authors

  • Xu Ying
    IBM Research Australia, Melbourne, Victoria, Australia.
  • AndrĂ© Leier
    Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, 35294, USA.
  • Tatiana T Marquez-Lago
    Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, 35294, USA.
  • Jue Xie
    Monash University, Melbourne, Victoria, Australia.
  • Antonio Jose Jimeno Yepes
    IBM Research Australia, Melbourne, Victoria, Australia.
  • James C Whisstock
    College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China, Centre for Research in Intelligent Systems, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China, Centre for Research in Intelligent Systems, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia.
  • Campbell Wilson
    Monash University, Melbourne, Victoria, Australia.
  • Jiangning Song
    College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China, Centre for Research in Intelligent Systems, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China, Centre for Research in Intelligent Systems, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China, Centre for Research in Intelligent Systems, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia.