UmamiPreDL: Deep learning model for umami taste prediction of peptides using BERT and CNN.

Journal: Computational Biology and Chemistry
Published Date:

Abstract

Taste is crucial in driving food choice and preference. Umami is one of the basic tastes, defined by the characteristic deliciousness and mouthfulness that it imparts to foods. Identifying ingredients that enhance umami taste is of significant value to the food industry. Various models have been shown to predict umami taste using feature encodings derived from traditional molecular descriptors such as amphiphilic pseudo-amino acid composition, dipeptide composition, and composition-transition-distribution. The highest reported accuracy, 90.5%, was recently achieved through a novel model architecture. Here, we propose the use of biological sequence transformers such as ProtBert and ESM2, trained on the UniRef databases, as the feature-encoder block. Combining 2 encoders with 2 classifiers yielded 4 model architectures. Among the 4 models, the ProtBert-CNN model outperformed the others, with an accuracy of 95% on 5-fold cross-validation data and 94% on independent data.
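
The abstract gives no implementation details, so the following is only a minimal, dependency-free sketch of the scaffolding such a study might use, not the authors' code. It shows three things: formatting a peptide into the space-separated residue string described in the public ProtBert model card (with the rare residues U, Z, O, and B mapped to X), enumerating the 2 x 2 encoder/classifier grid, and generating shuffled 5-fold splits. The helper names `to_protbert_input` and `k_fold_indices`, the example peptide, and the label `other_classifier` (the abstract names only the CNN) are all assumptions made for illustration.

```python
import itertools
import random
import re


def to_protbert_input(peptide: str) -> str:
    """Format a peptide as space-separated upper-case residues.

    Follows the input convention in the public ProtBert model card:
    rare residues U, Z, O, B are replaced by X. Whether the paper
    preprocesses peptides exactly this way is an assumption.
    """
    cleaned = re.sub(r"[UZOB]", "X", peptide.strip().upper())
    return " ".join(cleaned)


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0):
    """Yield (train_idx, val_idx) pairs for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val


# Encoder names come from the abstract. The second classifier is not
# named there, so "other_classifier" is a hypothetical placeholder.
ENCODERS = ["ProtBert", "ESM2"]
CLASSIFIERS = ["CNN", "other_classifier"]
ARCHITECTURES = [
    f"{enc}-{clf}" for enc, clf in itertools.product(ENCODERS, CLASSIFIERS)
]

if __name__ == "__main__":
    # "EEDGK" is an arbitrary example string, not a peptide from the paper.
    print(to_protbert_input("EEDGK"))
    print(ARCHITECTURES)
    for fold, (train, val) in enumerate(k_fold_indices(10, k=5)):
        print(fold, len(train), len(val))
```

In a full pipeline, the formatted strings would be tokenized and passed through the chosen encoder, and the resulting embeddings fed to each classifier inside every fold. Those steps need the pretrained model weights and are left out here.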

Authors

  • Arun Pandiyan Indiran
    ITC Life Sciences and Technology Centre, Peenya Industrial Area, 1st Phase, Bengaluru 560058, India.
  • Humaira Fatima
    ITC Life Sciences and Technology Centre, Peenya Industrial Area, 1st Phase, Bengaluru 560058, India.
  • Sampriti Chattopadhyay
    Centre for Advanced Process Decision Making, Carnegie Mellon University, USA.
  • Sureshkumar Ramadoss
    ITC Life Sciences and Technology Centre, Peenya Industrial Area, 1st Phase, Bengaluru 560058, India; ITC Infotech India Limited, Bengaluru 560005, India.
  • Yashwanth Radhakrishnan
    ITC Life Sciences and Technology Centre, Peenya Industrial Area, 1st Phase, Bengaluru 560058, India. Electronic address: yashwanth.radhakrishnan@itc.in.