LucaPCycle: Illuminating microbial phosphorus cycling in deep-sea cold seep sediments using protein language models.

Journal: Nature Communications
Published Date:

Abstract

Phosphorus is essential for life and critically influences marine productivity. Despite geochemical evidence of active phosphorus cycling in deep-sea cold seeps, the microbial processes involved remain poorly understood. Traditional sequence-based searches often fail to detect proteins with remote homology. To address this, we developed a deep learning model, LucaPCycle, integrating raw sequences and contextual embeddings based on the protein language model ESM2-3B. LucaPCycle identified 5241 phosphorus-cycling protein families from global cold seep gene and genome catalogs, substantially enhancing our understanding of their diversity, ecology, and function. Among previously unannotated sequences, we discovered three alkaline phosphatase families that feature unique domain organizations and preserved enzymatic capabilities. These results highlight the previously overlooked ecological importance of phosphorus cycling within cold seeps, corroborated by data from porewater geochemistry, metatranscriptomics, and metabolomics. We revealed a previously unrecognized diversity of archaea, including Asgardarchaeota, anaerobic methanotrophic archaea, and Thermoproteota, which contribute to organic phosphorus mineralization and inorganic phosphorus solubilization through various mechanisms. Additionally, auxiliary metabolic genes of cold seep viruses primarily encode the PhoR-PhoB regulatory system and PhnCDE transporter, potentially enhancing their hosts' phosphorus utilization. Overall, LucaPCycle is capable of accessing previously 'hidden' sequence spaces for microbial phosphorus cycling and can be applied to various ecosystems.

Authors

  • Chuwen Zhang
    Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China.
  • Yong He
    College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, 310058, China.
  • Jieni Wang
    Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China.
  • Tengkai Chen
    Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China.
  • Federico Baltar
    Fungal and Biogeochemical Oceanography Group, College of Oceanography and Ecological Science, Shanghai Ocean University, Shanghai, China.
  • Minjie Hu
    Key Laboratory of Humid Sub-tropical Eco-geographical Process of Ministry of Education, Fujian Normal University, Fuzhou, China.
  • Jing Liao
    State Key Laboratory of Respiratory Diseases, National Clinical Research Center for Respiratory Diseases, Guangzhou Institute of Respiratory Health, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.
  • Xi Xiao
    Department of Nephrology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China.
  • Zhao-Rong Li
    Apsara Lab, Alibaba Cloud Intelligence, Alibaba Group, Hangzhou, China. zhaorong.lzr@alibaba-inc.com.
  • Xiyang Dong
    Key Laboratory of Marine Genetic Resources, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen, China. dongxiyang@tio.org.cn.