Prediction of prokaryotic transposases from protein features with machine learning approaches.

Journal: Microbial Genomics
Published Date:

Abstract

Identification of prokaryotic transposases (Tnps) gives insight not only into the spread of antibiotic resistance and virulence but also into the process of DNA movement. This study aimed to develop a classifier for predicting Tnps in bacteria and archaea using machine learning (ML) approaches. We extracted a total of 2751 protein features from a training dataset comprising 14852 Tnps and 14852 controls, and selected 75 features as predictive signatures by combining mutual information with the least absolute shrinkage and selection operator (LASSO) algorithm. By aggregating these signatures, we developed an ensemble classifier that integrates a collection of individual ML-based classifiers to identify Tnps. Further validation showed that this classifier achieved good performance, with an average AUC of 0.955, and matched or exceeded the performance of other common methods. Based on this ensemble classifier, we built a stand-alone command-line tool, TnpDiscovery, to make Tnp prediction convenient for bioinformaticians and experimental researchers. This study demonstrates the effectiveness of ML approaches in identifying Tnps and should facilitate the discovery of novel Tnps in the future.
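
The general shape of the pipeline described in the abstract (filter features by mutual information, combine several classifiers, score by AUC) can be illustrated with a minimal, standard-library-only sketch. This is not the TnpDiscovery implementation: the toy binary features, the three probability lists standing in for base classifiers, and every function name below are hypothetical, and the least absolute shrinkage and selection operator step is omitted entirely. Soft voting (averaging probabilities) is assumed here as one common way to aggregate an ensemble; the abstract does not state which aggregation rule the authors used.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in bits) between a discrete feature and class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def top_k_by_mi(columns, labels, k):
    """Indices of the k feature columns with the highest mutual information."""
    scores = [mutual_information(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)[:k]


def soft_vote(prob_lists):
    """Average the positive-class probabilities of several classifiers per sample."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]


def auc(scores, labels):
    """AUC as the chance a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = Tnp, 0 = control; each column is one binary feature.
labels = [1, 1, 1, 0, 0, 0]
columns = [
    [1, 1, 1, 0, 0, 0],  # perfectly informative about the label
    [1, 0, 0, 1, 0, 0],  # statistically independent of the label
    [1, 1, 0, 1, 0, 0],  # weakly informative
]
selected = top_k_by_mi(columns, labels, k=2)

# Hypothetical positive-class probabilities from three base classifiers.
clf_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
clf_b = [0.7, 0.6, 0.7, 0.4, 0.1, 0.3]
clf_c = [0.8, 0.9, 0.6, 0.5, 0.3, 0.2]
ensemble = soft_vote([clf_a, clf_b, clf_c])

print("selected feature indices:", selected)
print("AUC of clf_a alone:", auc(clf_a, labels))
print("AUC of ensemble:", auc(ensemble, labels))
```

In this toy case the averaged scores rank every positive above every negative even though no single stand-in classifier is forced to, which is the usual motivation for ensembling. With real data, the features would be numeric descriptors computed from protein sequences, and continuous features would need discretization or a different mutual information estimator.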

Authors

  • Qian Wang
    Department of Radiation Oncology, China-Japan Union Hospital of Jilin University, Changchun, China.
  • Jun Ye
    Department of Electrical and Information Engineering, Shaoxing University, 508 Huancheng West Road, Shaoxing, Zhejiang 312000, PR China. Electronic address: yehjun@aliyun.com.
  • Teng Xu
    Department of General Surgery, the Affiliated Hospital of Xuzhou Medical University, Xuzhou 221002, China.
  • Ning Zhou
    Department of General Surgery, Wuxi People's Hospital Affiliated to Nanjing Medical University, Wuxi, China.
  • Zhongqiu Lu
    Department of Emergency, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China.
  • Jianchao Ying
    Wenzhou Key Laboratory of Emergency, Critical Care, and Disaster Medicine, Department of Emergency, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, PR China.