xBitterT5: an explainable transformer-based framework with multimodal inputs for identifying bitter-taste peptides.

Journal: Journal of Cheminformatics

Abstract

Bitter peptides (BPs), derived from the hydrolysis of food proteins, play a crucial role in both food science and biomedicine by influencing taste perception and participating in various physiological processes. Accurate identification of BPs is essential for understanding food quality and potential health impacts. Traditional machine learning approaches to BP identification have relied on conventional feature descriptors, achieving moderate success but struggling with the complexity of biological sequence data. Recent approaches based on protein language model embeddings and meta-learning have improved accuracy, but they frequently neglect the molecular representations of peptides and lack interpretability. In this study, we propose xBitterT5, a novel multimodal and interpretable framework for BP identification that uses pretrained transformer-based embeddings from BioT5+ computed from both the peptide sequence and its SELFIES molecular representation. By incorporating both peptide sequences and their molecular strings, xBitterT5 demonstrates superior performance compared to previous methods on the same benchmark datasets. Importantly, the model provides residue-level interpretability, highlighting chemically meaningful substructures that contribute significantly to a peptide's bitterness, thus offering mechanistic insights beyond black-box predictions. A user-friendly web server (https://balalab-skku.org/xBitterT5/) and a standalone version (https://github.com/cbbl-skku-org/xBitterT5/) are freely available to support both computational biologists and experimental researchers working on peptide-based food and biomedical applications.
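
To make the idea of a multimodal input concrete, the sketch below pairs a peptide sequence with a SELFIES string and splits each into its basic symbols (one-letter residues and bracketed SELFIES symbols). It is only an illustration under stated assumptions, not the xBitterT5 preprocessing code: the names `PeptideInput`, `split_selfies`, and `make_input` are hypothetical, the SELFIES string is assumed to have been produced elsewhere (for example by a cheminformatics toolkit), and the tokenization actually used by BioT5+ is not reproduced here.

```python
import re
from dataclasses import dataclass

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# A SELFIES string is a concatenation of bracketed symbols such as "[C]" or "[=O]".
SELFIES_SYMBOL = re.compile(r"\[[^\[\]]+\]")


@dataclass(frozen=True)
class PeptideInput:
    """Hypothetical container holding both views of one peptide."""
    residues: tuple          # one-letter residues, in sequence order
    selfies_symbols: tuple   # bracketed SELFIES symbols, in string order


def split_selfies(selfies: str) -> tuple:
    """Split a SELFIES string into its bracketed symbols, rejecting stray text."""
    symbols = SELFIES_SYMBOL.findall(selfies)
    if not symbols or "".join(symbols) != selfies:
        raise ValueError(f"not a well-formed SELFIES string: {selfies!r}")
    return tuple(symbols)


def make_input(sequence: str, selfies: str) -> PeptideInput:
    """Validate the sequence and pair it with its (precomputed) SELFIES string."""
    sequence = sequence.strip().upper()
    unknown = sorted(set(sequence) - AMINO_ACIDS)
    if not sequence or unknown:
        raise ValueError(f"invalid peptide sequence, unknown residues: {unknown}")
    return PeptideInput(tuple(sequence), split_selfies(selfies))


# Toy single-residue example: glycine (SMILES NCC(=O)O) and its SELFIES string.
example = make_input("G", "[N][C][C][=Branch1][C][=O][O]")
print(example.residues)
print(example.selfies_symbols)
```

In a real pipeline the two symbol streams would be passed to the language model's own tokenizer; keeping them side by side in one record simply makes explicit that each peptide is presented to the model in two complementary forms.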

Authors

  • Nguyen Doan Hieu Nguyen
  • Nhat Truong Pham
    Division of Computational Mechatronics, Institute for Computational Science, Ton Duc Thang University, Ho Chi Minh City, Vietnam.
  • Duong Thanh Tran
  • Leyi Wei
    School of Computer Science and Technology, Tianjin University, Tianjin, 30050, China.
  • Adeel Malik
    Department of Microbiology and Molecular Biology, College of Bioscience and Biotechnology, Chungnam National University, Daejeon 34134, Korea. adeel@procarb.org.
  • Balachandran Manavalan
    Department of Physiology, Ajou University School of Medicine, Suwon, Republic of Korea.
