A machine-learning approach for predicting butyrate production by microbial consortia using metabolic network information.

Journal: PeerJ
Published Date:

Abstract

Understanding the behavior of microbial consortia is crucial for predicting metabolite production by microorganisms. Genome-scale network reconstructions enable the computation of metabolic interactions and specific associations within microbial consortia that underpin the production of different metabolites. In the human gut, butyrate is a central metabolite produced by bacteria, and it plays a key role in the gut microbiome, with an impact on human health. Despite its importance, computational methods capable of predicting its production as a function of consortium composition are lacking. Here, we present a novel machine-learning approach that leverages automatically generated genome-scale metabolic models to address this limitation. Briefly, all consortia of two to 13 members drawn from a pool of 19 bacteria with known genomes, each including at least one butyrate producer from a pool of three known producer species, were built and their maximum butyrate production was simulated. Network-derived descriptors of each bacterium, together with the simulated butyrate production of these consortia, were used as training data for various machine-learning models. The performance of the algorithms was evaluated using k-fold cross-validation and new experimental data, yielding a Pearson correlation coefficient exceeding 0.75 between predicted and observed butyrate production in two-bacteria consortia. Although predictions were generally worse for consortia with more than two bacteria, the best machine-learning models still outperformed predictions from genome-scale metabolic models alone. Overall, this approach provides a valuable tool and framework for probing promising butyrate-producing consortia on a large scale, guiding experimentation and, more importantly, predicting metabolic production by consortia.
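
To give a sense of the scale of the consortium enumeration described in the abstract, the following is a minimal sketch and not the authors' code. It assumes that the three producer species are part of the 19-member pool, which the abstract does not state explicitly. The function names and the toy member labels are hypothetical.

```python
from itertools import combinations
from math import comb


def count_consortia(pool_size, n_producers, min_size, max_size):
    """Count consortia of min_size..max_size members with at least one producer.

    For each size k, subtract the consortia made only of non-producers
    from all possible consortia of that size.
    """
    non_producers = pool_size - n_producers
    return sum(
        comb(pool_size, k) - comb(non_producers, k)
        for k in range(min_size, max_size + 1)
    )


def enumerate_consortia(members, producers, min_size, max_size):
    """Yield each qualifying consortium as a tuple of member names."""
    producer_set = set(producers)
    for k in range(min_size, max_size + 1):
        for combo in combinations(members, k):
            # Keep the consortium only if it contains a producer.
            if producer_set.intersection(combo):
                yield combo


if __name__ == "__main__":
    # Assumed setup: 19 bacteria, 3 of them producers, sizes 2 to 13.
    print(count_consortia(19, 3, 2, 13))

    # Toy pool with hypothetical labels, small enough to list in full.
    toy_pool = ["P1", "A", "B", "C"]
    for consortium in enumerate_consortia(toy_pool, ["P1"], 2, 3):
        print(consortium)
```

The closed-form count avoids materializing every combination, while the generator can feed consortia one at a time into a simulation step.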

Authors

  • Claudia Silva-Andrade
    Programa de Doctorado en Genómica Integrativa, Vicerrectoría de Investigación, Universidad Mayor, Santiago, Chile.
  • Sergio Hernández
    Laboratorio de Redes Biológicas, Centro Científico y Tecnológico de Excelencia Ciencia & Vida, Fundación Ciencia & Vida, Santiago, Chile.
  • Pedro Saa
    Departamento de Ingeniería Química y Bioprocesos, Escuela de Ingeniería, Pontificia Universidad Católica de Chile, Santiago, Chile.
  • Ernesto Perez-Rueda
    Unidad Académica de Yucatán, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Yucatán, Mérida, Mexico.
  • Daniel Garrido
    Department of Chemical and Bioprocess Engineering, School of Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile.
  • Alberto J Martin
    Laboratorio de Redes Biológicas, Centro Científico y Tecnológico de Excelencia Ciencia & Vida, Fundación Ciencia & Vida, Santiago, Chile.