Predicting hydrocarbon presence in marine cold seep sediments using machine learning models trained with benthic bacterial 16S rRNA taxonomy.

Journal: Microbiology Spectrum

Abstract

Hydrocarbon seepage in marine sediments exerts selective pressure on benthic microbiomes. Accordingly, microbial community composition in these sediments can reflect the presence of hydrocarbons, with specific groups being more prolific in association with seepage. Here, we tested machine learning models with large 16S rRNA gene amplicon data sets derived from marine sediments in deep-sea hydrocarbon prospective areas of the Eastern Gulf of Mexico and NW Atlantic Scotian Slope. Using H2O's AutoML machine learning platform, we determined that Gradient Boosting Machines performed best for creating 16S rRNA-based models that successfully predict the presence of hydrocarbons. Feature importance scores from the models revealed that in Gulf of Mexico samples, members of the class (within the phylum) and the genus (within the phylum) were most diagnostic for the presence of low molecular weight hydrocarbon gases. The lineage was also important in Scotian Slope sediments, along with sequences affiliated with the class-level JS1 group (within the phylum) for determining hydrocarbon-positive sites. When these models were tested on the geographically distant seafloor basin, microbial communities differed enough between basins to prevent consistently accurate reciprocal predictions. However, models trained on a combined data set and filtered for important features performed substantially better, supporting the feasibility of generalized models under stringent feature selection. These results highlight the potential of seabed microbial taxonomy-based hydrocarbon seep site prediction when paired with refined sampling and consistent geochemical characterization.
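As a rough illustration of the modelling idea described in the abstract (gradient boosting on taxon abundances, followed by feature-importance filtering), the sketch below hand-rolls a tiny gradient-boosted ensemble of decision stumps in plain Python. It is not the H2O AutoML workflow used in the study. The taxon names, abundance values, labels, and hyperparameters are all invented for illustration, and importance is approximated here as the total squared-error reduction attributed to each feature.

```python
# Toy relative-abundance table: rows are sediment samples, columns are taxa.
# All names and numbers are invented for illustration only.
taxa = ["taxon_A", "taxon_B", "taxon_C"]
X = [
    [0.30, 0.10, 0.05],
    [0.25, 0.20, 0.02],
    [0.28, 0.05, 0.07],
    [0.35, 0.15, 0.03],
    [0.02, 0.12, 0.06],
    [0.01, 0.18, 0.04],
    [0.03, 0.07, 0.05],
    [0.00, 0.16, 0.01],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = hydrocarbon-positive site (hypothetical)


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(X, residuals):
    """Best single-threshold split of the residuals.

    Returns (error, feature, threshold, left_mean, right_mean),
    or None if no feature has two distinct values.
    """
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            error = sse(left) + sse(right)
            if best is None or error < best[0]:
                best = (error, j, thr, sum(left) / len(left),
                        sum(right) / len(right))
    return best


def fit_gbm(X, y, n_rounds=20, learning_rate=0.3):
    """Gradient boosting with squared-error loss and stump base learners."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    importance = [0.0] * len(X[0])
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the plain residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        error, j, thr, left_mean, right_mean = stump
        importance[j] += sse(residuals) - error  # error reduction from split
        stumps.append((j, thr, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if row[j] <= thr else right_mean)
                 for p, row in zip(preds, X)]
    return {"base": base, "rate": learning_rate,
            "stumps": stumps, "importance": importance}


def predict(model, row):
    """Classify one sample by thresholding the boosted score at 0.5."""
    score = model["base"]
    for j, thr, left_mean, right_mean in model["stumps"]:
        score += model["rate"] * (left_mean if row[j] <= thr else right_mean)
    return 1 if score > 0.5 else 0


model = fit_gbm(X, y)
importance = model["importance"]
predicted = [predict(model, row) for row in X]

# Feature filtering: keep only taxa that contributed to any split.
selected = [name for name, imp in zip(taxa, importance) if imp > 0]
```

In a cross-basin setting, the same `fit_gbm` and `predict` helpers could be fitted on samples from one basin and applied to samples from the other, or fitted on a pooled table restricted to the `selected` taxa. How closely this mirrors the study's actual filtering procedure is an assumption.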

Authors

  • Rohan Khan
    Geomicrobiology Group, Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada.
  • Tulika Bhardwaj
    Geomicrobiology Group, Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada.
  • Carmen Li
    Geomicrobiology Group, Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada.
  • Anirban Chakraborty
  • José M Seoane
    Repsol SA, Madrid, Spain.
  • James M Brooks
    TDI-Brooks International, College Station, Texas, USA.
  • Bernie B Bernard
    TDI-Brooks International, College Station, Texas, USA.
  • Adam MacDonald
    Natural Resources Canada, Geological Survey of Canada Atlantic, Dartmouth, Canada.
  • Natasha MacAdam
    Natural Resources Canada, Geological Survey of Canada Atlantic, Dartmouth, Canada.
  • Calvin Campbell
    Nova Scotia Department of Natural Resources and Renewables, Government of Nova Scotia, Halifax, Canada.
  • Martin Fowler
    Applied Petroleum Technology Canada, Calgary, Alberta, Canada.
  • Casey R J Hubert
    Geomicrobiology Group, Department of Biological Sciences, University of Calgary, Calgary, Alberta, Canada.
