Sequence-based virtual screening using transformers.
Journal:
Nature Communications
Published Date:
Jul 28, 2025
Abstract
Protein-ligand interactions play central roles in myriad biological processes and are of key importance in drug design. Deep learning approaches are becoming cost-effective alternatives to high-throughput experimental methods for ligand identification. Here, to predict the binding affinity between proteins and small molecules, we introduce Ligand-Transformer, a deep learning method based on the transformer architecture. Ligand-Transformer implements a sequence-based approach in which the inputs are the amino acid sequence of the target protein and the topology of the small molecule, enabling the prediction of the conformational space explored by the complex between the two. We apply Ligand-Transformer to screen for inhibitors targeting the mutant EGFR kinase and validate them experimentally, identifying compounds with low nanomolar potency. We then use this approach to predict the conformational population shifts induced by known ABL kinase inhibitors, showing that sequence-based predictions enable the characterisation of the population shift upon binding. Overall, our results illustrate the potential of Ligand-Transformer to accurately predict the interactions of small molecules with proteins, including the binding affinity and the changes in the free energy landscapes upon binding, thus uncovering molecular mechanisms and facilitating the initial steps in drug design.
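
The abstract refers to binding affinity, free energy landscapes and conformational population shifts. As a reading aid, the minimal sketch below shows the standard thermodynamic relations that connect these quantities: the binding free energy from a dissociation constant, and Boltzmann populations from state free energies. It is not the Ligand-Transformer method or its code. The function names, the temperature, and the two-state free energies are hypothetical values chosen only for illustration.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation
    constant given in mol/L, assuming a 1 M standard state:
    dG = R * T * ln(Kd)."""
    return R_KCAL * temperature * math.log(kd_molar)


def populations(free_energies, temperature=298.15):
    """Boltzmann populations of conformational states from their
    free energies in kcal/mol: p_i proportional to exp(-G_i / RT)."""
    rt = R_KCAL * temperature
    g_min = min(free_energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(g - g_min) / rt) for g in free_energies]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-state landscape (illustrative numbers, not from the paper):
# state A is favoured without ligand, state B is favoured with ligand bound.
apo = populations([0.0, 1.0])
bound = populations([1.0, 0.0])
shift_in_state_b = bound[1] - apo[1]  # change in the population of state B
```

In this picture, a tighter binder (smaller Kd) gives a more negative binding free energy, and a ligand that stabilises one conformational state relative to another raises that state's population.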