Machine Learning Reveals Proteome-Encoded Growth Predictors of Rhodopseudomonas palustris CGA009 on Lignin Aromatics

Journal: bioRxiv
Published Date:

Abstract

Microbial utilization of lignin-derived aromatics requires extensive metabolic flexibility, yet growth outcomes vary sharply with substrate chemistry and oxygen availability. Whether this variability reflects distinct growth programs or alternative realizations of shared biochemical constraints remains unclear. Here, we combine quantitative proteomics with cross-condition machine learning to test whether growth-rate variation in Rhodopseudomonas palustris can be predicted directly from proteome composition and to identify the proteomic features that consistently encode growth potential across environments. Using OmniProt, a cross-condition neural modeling and interpretive framework, we predicted growth rates across 16 lignin-derived substrate–oxygen combinations, including held-out conditions, demonstrating that growth-relevant information is encoded in the proteome. Interpreting model reliance using Monte Carlo SHAP values and dependence-aware perturbation analyses revealed a compact, hierarchical organization of growth determinants. Despite pronounced oxygen-driven bifurcation in proteome abundance, the features required for accurate growth prediction were largely regime-invariant, defining a conserved biochemical core overlaid by adaptive, condition-specific modulators. These findings reconcile metabolic versatility with constrained growth control and show that diverse lignin-derived substrates converge onto a limited set of proteome-encoded growth bottlenecks accessed through flexible regulatory programs.
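
As an illustration of the Monte Carlo SHAP attribution mentioned above, the following is a minimal sketch of permutation-sampling Shapley estimation for a generic growth-rate predictor. It is not the OmniProt implementation, whose details are not given here. The function name `monte_carlo_shapley`, the `predict` callable, and the `background` matrix of reference proteomes are all hypothetical names introduced for this example.

```python
import numpy as np

def monte_carlo_shapley(predict, x, background, n_permutations=200, seed=0):
    """Estimate Shapley values for one sample by permutation sampling.

    predict    : hypothetical callable mapping an (n, d) array to (n,) predictions
    x          : (d,) feature vector to explain (e.g. protein abundances)
    background : (m, d) reference samples used to "switch off" features
    """
    rng = np.random.default_rng(seed)
    n_features = x.shape[0]
    phi = np.zeros(n_features)

    for _ in range(n_permutations):
        # Random feature ordering and a random reference sample.
        order = rng.permutation(n_features)
        z = background[rng.integers(len(background))].copy()
        prev = predict(z[None, :])[0]

        # Switch features from reference to observed values one at a time;
        # each change in prediction is that feature's marginal contribution.
        for j in order:
            z[j] = x[j]
            curr = predict(z[None, :])[0]
            phi[j] += curr - prev
            prev = curr

    return phi / n_permutations
```

For a linear predictor with weights w and a single background row b, the estimate for feature j reduces to w_j (x_j - b_j), which gives a quick sanity check. Note that this plain sampler treats features as independent; the dependence-aware perturbation analyses described in the abstract would require a different perturbation scheme.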

Authors

  • Abraham Osinuga; Mark Kathol; Rajib Saha