Machine learning application to predict binding affinity between peptide containing non-canonical amino acids and HLA-A0201.
Journal:
PloS one
Published Date:
Jun 27, 2025
Abstract
Class I major histocompatibility complexes (MHC-I), encoded in humans by the highly polymorphic HLA-A, HLA-B, and HLA-C genes, are expressed on all nucleated cells. Both self and foreign proteins are processed into peptides of 8-10 amino acids, loaded onto MHC-I within the endoplasmic reticulum, and then presented on the cell surface. Foreign peptides presented in this fashion activate CD8+ T cells, and their immunogenicity correlates with their affinity for the MHC-I binding groove. Thus, predicting antigen binding affinity for MHC-I is a valuable tool for identifying potentially immunogenic antigens. Although many predictors of MHC-I binding exist, no currently available tool predicts antigen/MHC-I binding affinity for antigens with explicitly labeled post-translational modifications or unusual/non-canonical amino acids (NCAAs), even though such modifications are increasingly recognized as critical mediators of peptide immunogenicity. In this work, we propose a machine learning application that quantifies the binding affinity of epitopes containing NCAAs to MHC-I, and we compare its performance with that of other commonly used regressors. Our model demonstrates robust performance, with 5-fold cross-validation yielding an R² value of 0.477 and a root-mean-square error (RMSE) of 0.735, indicating strong predictive capability for peptides with NCAAs. This work provides a valuable tool for the computational design and optimization of peptides incorporating NCAAs, potentially accelerating the development of novel peptide-based therapeutics with enhanced properties and efficacy.
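For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how R² and RMSE can be obtained from 5-fold cross-validation. It is not the authors' pipeline: the single-feature least-squares fit is a stand-in for whatever regressor is used, the data are synthetic and unrelated to the study, and pooling the out-of-fold predictions before scoring is an assumption (per-fold averaging is an equally common convention). All function names are hypothetical.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for any regressor)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def cross_validate(xs, ys, k=5, seed=0):
    """Fit on k-1 folds, predict the held-out fold, score pooled predictions."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            y_true.append(ys[i])
            y_pred.append(a * xs[i] + b)
    return r2_score(y_true, y_pred), rmse(y_true, y_pred)


# Synthetic toy data (NOT from the study): a noisy linear relationship.
rng = random.Random(42)
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

r2, err = cross_validate(xs, ys, k=5)
print(f"5-fold CV: R^2 = {r2:.3f}, RMSE = {err:.3f}")
```

Read this way, an R² of 0.477 means that the out-of-fold predictions account for roughly 48% of the variance in the measured affinities, while the RMSE is expressed in the same units as the affinity values themselves.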