A first perturbome of Pseudomonas aeruginosa: Identification of core genes related to multiple perturbations by a machine learning approach.

Journal: Bio Systems
Published Date:

Abstract

Tolerance to stress conditions is vital for organismal survival, including for bacteria exposed to specific environmental conditions, antibiotics, and other perturbations. Some studies have described common modulation and shared genes in the stress response to different types of disturbances (termed the perturbome), leading to the idea of central control at the molecular level. We implemented a robust machine learning approach to identify and describe genes associated with multiple perturbations, that is, the perturbome, in a Pseudomonas aeruginosa PAO1 model. Using microarray datasets from the Gene Expression Omnibus (GEO), we evaluated six approaches to rank and select genes. These combine two data-partitioning methodologies for the training and testing datasets, a single partition (SP method) or multiple partitions (MP method), with three classification algorithms: Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Random Forest (RF). Gene expression patterns and topological features at the systems level were included to describe the perturbome elements. We selected and described 46 core response genes associated with multiple perturbations in P. aeruginosa PAO1, which can be considered a first report of the P. aeruginosa perturbome. Molecular annotations, patterns in expression levels, and topological features in molecular networks revealed biological functions of biosynthesis, binding, and metabolism, many of them related to DNA damage repair and aerobic respiration in the context of tolerance to stress. We also discuss issues related to the implemented and assessed algorithms, including data partitioning, classification approaches, and metrics. Altogether, this work offers a different and robust framework to select genes using a machine learning approach.
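
The six-approach design described in the abstract (two partitioning methods crossed with three classifiers) can be sketched as follows. This is only an illustrative outline, not the authors' pipeline: the function names, the labels such as "SP-SVM", the 30% test fraction, and the ten repeats for the MP method are all assumptions, since the abstract does not state them. The classifiers themselves are left out; only the experimental grid and the train/test partitioning are shown.

```python
import random
from itertools import product

# Taken from the abstract: two partitioning methods, three classifiers.
PARTITION_METHODS = ("SP", "MP")      # single partition / multiple partitions
CLASSIFIERS = ("SVM", "KNN", "RF")


def approaches():
    """Enumerate the 2 x 3 grid of approaches (labels are hypothetical)."""
    return [f"{p}-{c}" for p, c in product(PARTITION_METHODS, CLASSIFIERS)]


def split_indices(n_samples, test_fraction, rng):
    """Shuffle sample indices and cut them into (train, test) lists."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_test = max(1, round(n_samples * test_fraction))
    return idx[n_test:], idx[:n_test]


def partitions(method, n_samples, test_fraction=0.3, n_repeats=10, seed=0):
    """Return one train/test split for SP, or n_repeats splits for MP.

    test_fraction and n_repeats are assumed values, not from the abstract.
    """
    rng = random.Random(seed)
    repeats = 1 if method == "SP" else n_repeats
    return [split_indices(n_samples, test_fraction, rng) for _ in range(repeats)]
```

Under this sketch, a gene-ranking score would be computed once per split, so the SP method yields a single ranking per classifier while the MP method yields several rankings that would need to be aggregated.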

Authors

  • Jose Arturo Molina Mora
    Centro de Investigación en Enfermedades Tropicales (CIET) and Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica. Electronic address: jose.molinamora@ucr.ac.cr.
  • Pablo Montero-Manso
    Discipline of Business Analytics, University of Sydney, Sydney, Australia. Electronic address: pablo.monteromanso@sydney.edu.au.
  • Raquel García-Batán
    Centro de Investigación en Enfermedades Tropicales (CIET) and Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica. Electronic address: raquel.garcia@ucr.ac.cr.
  • Rebeca Campos-Sánchez
    Centro de Investigación en Biología Celular y Molecular (CIBCM), Universidad de Costa Rica, San José, Costa Rica. Electronic address: rebeca.campos@ucr.ac.cr.
  • Jose Vilar-Fernández
    Department of Mathematics, University of A Coruña, A Coruña, Spain. Electronic address: jose.vilarf@udc.es.
  • Fernando García
    Centro de Investigación en Enfermedades Tropicales (CIET) and Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica. Electronic address: fernando.garcia@ucr.ac.cr.