Machine learning models reveal how polycyclic aromatic hydrocarbons influence environmental bacterial communities.
Journal: The Science of the total environment
PMID: 39447913
Abstract
Polycyclic aromatic hydrocarbons (PAHs) are harmful, widespread environmental pollutants that pose an ecological threat. However, characterizing how PAHs influence environmental bacterial communities across different habitats (soil, water, and sediment) remains a major challenge. We collected and reanalyzed 1924 16S rRNA sequencing samples to determine the effects of PAHs on bacterial communities in these habitats, and used machine learning to predict potential PAH-degrading bacteria. PAHs substantially affected the bacterial community, and community structure shifted differently in each habitat. PAH contamination decreased the relative abundance of Proteobacteria in soil (by 16.3 %) and sediment (by 10.1 %), whereas the relative abundance of Proteobacteria in water increased by 20.2 %. Among the tested models, the random forest model best identified the effects of PAHs on bacterial groups, with genus-level accuracies of 99.51 % for soil, 97.72 % for sediment, and 100 % for water. Using the random forest model, we identified 70 biomarkers that respond to PAHs, including potential PAH-degrading microorganisms such as A4b, Bacillus, Flavobacterium, and Polynucleobacter. Furthermore, PAH contamination did not significantly alter the functions of bacterial communities in the environment. This study provides a candidate strain set for future screening of PAH-degrading bacteria and contributes to the study of how engineered PAH-degrading bacteria adapt to the environment.
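
The general idea of using a random forest to separate contaminated from uncontaminated samples and then rank genera as candidate biomarkers can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the genus names (`genus_0`, `genus_1`, ...), the sample sizes, the simulated two-fold shift in four genera, the 70/30 train/test split, and the use of scikit-learn with its default impurity-based feature importances are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical placeholder genera; the first four are simulated as PAH-responsive.
genera = [f"genus_{i}" for i in range(50)]
n_responsive = 4
n_per_group = 100


def simulate_counts(n_samples, boost):
    """Draw synthetic per-genus read counts.

    `boost` multiplies the expected counts of the responsive genera
    (an assumed effect size, chosen only for illustration).
    """
    means = np.full(len(genera), 50.0)
    means[:n_responsive] *= boost
    return rng.poisson(means, size=(n_samples, len(genera)))


control = simulate_counts(n_per_group, boost=1.0)
contaminated = simulate_counts(n_per_group, boost=2.0)

counts = np.vstack([control, contaminated])
labels = np.array([0] * n_per_group + [1] * n_per_group)  # 1 = contaminated

# Convert raw counts to relative abundance so each sample sums to 1.
rel_abund = counts / counts.sum(axis=1, keepdims=True)

X_train, X_test, y_train, y_test = train_test_split(
    rel_abund, labels, test_size=0.3, stratify=labels, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Held-out classification accuracy.
accuracy = model.score(X_test, y_test)

# Rank genera by impurity-based importance; the top entries are candidate biomarkers.
ranking = np.argsort(model.feature_importances_)[::-1]
top_genera = [genera[i] for i in ranking[:n_responsive]]

print(f"held-out accuracy: {accuracy:.3f}")
print("top-ranked genera:", top_genera)
```

On real data, a genus ranked highly by such a model responds to contamination but is not necessarily a degrader, which is consistent with the abstract describing the identified taxa as a candidate set for further screening.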