Advancing the Accuracy of Anti-MRSA Peptide Prediction Through Integrating Multi-Source Protein Language Models.

Journal: Interdisciplinary Sciences: Computational Life Sciences

Published Date: Mar 11, 2025

Abstract

The emergence of methicillin-resistant Staphylococcus aureus (MRSA) as a recognized cause of community-acquired and hospital infections has created a need for efficient and accurate identification of peptides with anti-MRSA properties in drug discovery and development pipelines. However, current experimental methods are often labor- and resource-intensive. There is therefore a pressing need for practical computational methods that identify anti-MRSA peptides from sequence alone. Recently, pre-trained protein language models (pLMs) have emerged as a notable advance for encoding peptide sequences as discriminative feature embeddings, capturing rich protein-level information and repurposing it for in silico peptide property prediction. In this study, we present pLM4MRSA, a pLM-based framework designed to improve the accuracy of anti-MRSA peptide prediction. In this framework, we combine feature embeddings from several pLMs, such as ProtTrans and evolutionary-scale modeling (ESM-2), which provide complementary information for prediction. The embeddings from the individual pLMs are integrated to form hybrid feature embeddings. Next, we apply principal component analysis (PCA) to these hybrid embeddings. The resulting PCA-transformed feature vectors are then used as inputs for constructing the predictive model. Experimental results on the independent test dataset showed that pLM4MRSA achieved a balanced accuracy and Matthews correlation coefficient (MCC) of 0.983 and 0.980, respectively, improving on state-of-the-art methods by 2.53%-4.83% and 7.73%-13.23%, respectively. This indicates that pLM4MRSA is a high-performance prediction model with a broad scope of applicability. Additionally, comparison with well-known hand-crafted features demonstrated that the components of the proposed hybrid feature embeddings complement each other effectively, capturing discriminative patterns for more accurate anti-MRSA peptide prediction. We anticipate that pLM4MRSA will serve as an effective solution for accurate and high-capacity prediction of anti-MRSA peptides from peptide sequences.
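The feature pipeline described in the abstract (concatenate per-pLM embeddings, reduce them with PCA, then score predictions with balanced accuracy and MCC) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the embedding widths (1024 and 1280), the number of retained components, and the random stand-in data are all assumptions, and the abstract does not state which classifier is trained on the PCA-transformed vectors, so that step is left out.

```python
import math
import numpy as np

def hybrid_embeddings(*per_model_embeddings):
    """Concatenate per-peptide embeddings from several pLMs along the feature axis."""
    return np.concatenate(per_model_embeddings, axis=1)

def pca_fit(X, n_components):
    """Fit PCA via SVD of the mean-centred matrix; returns (mean, components)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the retained principal axes."""
    return (X - mean) @ components.T

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = anti-MRSA."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (assumes both classes are present)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Random stand-ins for real pLM outputs (hypothetical shapes: 20 peptides).
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(20, 1024))  # placeholder for one pLM's embeddings
emb_b = rng.normal(size=(20, 1280))  # placeholder for a second pLM's embeddings

hybrid = hybrid_embeddings(emb_a, emb_b)      # shape (20, 2304)
mean, components = pca_fit(hybrid, n_components=8)
reduced = pca_transform(hybrid, mean, components)  # shape (20, 8)

# Toy labels and predictions, only to exercise the two metrics.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
ba = balanced_accuracy(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
```

In practice the PCA parameters would be fitted on the training set only and then applied unchanged to the independent test set, so that no test information leaks into the features.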

Authors

Watshara Shoombuatong
Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.

Pakpoom Mookdarsanit
Faculty of Science, Computer Science and Artificial Intelligence, Chandrakasem Rajabhat University, Bangkok 10900, Thailand.

Lawankorn Mookdarsanit
Business Information System, Faculty of Management Science, Chandrakasem Rajabhat University, Bangkok 10900, Thailand. lawankorn.s@chandra.ac.th

Nalini Schaduangrat
Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.

Saeed Ahmed
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China.

Muhammad Kabir
Department of Computer Science, Abdul Wali Khan University, Mardan, Pakistan.

Pramote Chumnanpuen
Department of Zoology, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; Kasetsart University International College (KUIC), Kasetsart University, Bangkok 10900, Thailand.

Keywords

Anti-Bacterial Agents; Computational Biology; Methicillin-Resistant Staphylococcus aureus; Peptides; Principal Component Analysis

External Resources

PubMed ID (PMID): 40067411
