Bastion3: a two-layer ensemble predictor of type III secreted effectors.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Type III secreted effectors (T3SEs) can be injected into host cell cytoplasm via type III secretion systems (T3SSs) to modulate interactions between Gram-negative bacterial pathogens and their hosts. Due to their relevance in pathogen-host interactions, significant computational effort has been put toward identifying T3SEs, and this in turn has stimulated new T3SE discoveries. However, as T3SEs with new characteristics are discovered, the existing computational tools reveal important limitations: (i) most of the trained machine learning models are based on the N-terminus (in some cases also incorporating the C-terminus) rather than the proteins' complete sequences, and (ii) the underlying models (trained with classic algorithms) employed only a few features, most of which were extracted from sequence information alone. To achieve better T3SE prediction, we must identify more powerful, informative features and investigate how to effectively integrate these into a comprehensive model.

Authors

  • Jiawei Wang
    Biomedicine Discovery Institute, Monash University, VIC 3800, Australia.
  • Jiahui Li
    College of Communication Engineering, Jilin University, Changchun, Jilin, China.
  • Bingjiao Yang
    College of Mechanical Engineering, Yanshan University, Qinhuangdao 066004, China.
  • Ruopeng Xie
    School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China.
  • Tatiana T Marquez-Lago
    Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, 35294, USA.
  • André Leier
    Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, 35294, USA.
  • Morihiro Hayashida
    Laboratory of Mathematical Bioinformatics, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, 611-0011, Japan.
  • Tatsuya Akutsu
    Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan.
  • Yanju Zhang
    School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China.
  • Kuo-Chen Chou
    School of Computer Science and Technology and Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong 518055, China, Gordon Life Science Institute, Belmont, MA 02478, USA and Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, 21589, Saudi Arabia.
  • Joel Selkrig
    European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany.
  • Tieli Zhou
    Department of Clinical Laboratory, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
  • Jiangning Song
    College of Information Engineering, Northwest A&F University, Yangling 712100, China, Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia, National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China, Centre for Research in Intelligent Systems, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia and ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC 3800, Australia.
  • Trevor Lithgow
    Infection and Immunity Program, Biomedicine Discovery Institute and Department of Microbiology, Monash University, Melbourne, VIC, 3800, Australia. Trevor.Lithgow@monash.edu.