Application of protein language models for antibody developability prediction.

Journal: mAbs

Abstract

Protein language models (PLMs) provide a powerful framework for learning sequence-property relationships in antibodies. However, their performance and reliability in real-world industrial antibody discovery pipelines remain underexplored. Here, we systematically evaluate several state-of-the-art PLMs using internal datasets comprising antibody sequences and developability assay measurements from 33 historical therapeutic programs. The assays span three critical developability dimensions: polyspecificity reagent (PSR), hydrophobic interaction chromatography (HIC), and affinity-capture self-interaction nanoparticle spectroscopy (AC-SINS). Across all assays, domain-adaptive fine-tuning of PLMs on internal antibody sequence data consistently improves predictive performance relative to pretrained representations alone. In addition, we assess sequence likelihoods derived from pretrained PLMs as unsupervised indicators of developability risk and analyze their strengths and limitations across assay types. Together, these results demonstrate that PLMs can provide robust and complementary signals for antibody developability assessment, supporting their practical use in early-stage candidate optimization and selection.
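
The abstract does not say how sequence likelihoods are derived from the pretrained PLMs. One common choice for masked language models is the pseudo-log-likelihood: mask each position in turn and sum the log-probability the model assigns to the true residue. The sketch below illustrates that idea only and is not the authors' method. The `MaskedModel` interface, the `uniform_model` stand-in, and the example sequence fragment are all hypothetical; a real PLM would take the place of `uniform_model`.

```python
import math
from typing import Callable, Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical interface (an assumption, not a real library API): given a
# sequence and the index of a masked position, return a probability for each
# amino acid at that position.
MaskedModel = Callable[[str, int], Dict[str, float]]


def pseudo_log_likelihood(seq: str, model: MaskedModel) -> float:
    """Sum over positions of log p(true residue | rest of the sequence)."""
    total = 0.0
    for i, residue in enumerate(seq):
        probs = model(seq, i)  # distribution at position i with i masked
        total += math.log(probs[residue])
    return total


def mean_pll(seq: str, model: MaskedModel) -> float:
    """Length-normalized score, so sequences of different length are comparable."""
    return pseudo_log_likelihood(seq, model) / len(seq)


def uniform_model(seq: str, pos: int) -> Dict[str, float]:
    """Toy stand-in for a PLM: every amino acid is equally likely everywhere."""
    p = 1.0 / len(AMINO_ACIDS)
    return {aa: p for aa in AMINO_ACIDS}


# Illustrative fragment only; not taken from the study's datasets.
score = mean_pll("EVQLVESGGGLVQPGG", uniform_model)
```

With the uniform stand-in, every sequence receives the same score, `log(1/20)`, which makes it a convenient sanity check. With a real PLM, a lower length-normalized score would flag a sequence as less typical under the model. Whether that tracks PSR, HIC, or AC-SINS measurements is the empirical question the study addresses.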
