PLD-Tree: Persistent Laplacian Decision Tree for Protein-Protein Binding Free Energy Prediction
Journal: arXiv
Published Date: Dec 24, 2024
Abstract
Recent advances in topology-based modeling have accelerated progress in
physical modeling and molecular studies, including applications to
protein-ligand binding affinity. In this work, we introduce the Persistent
Laplacian Decision Tree (PLD-Tree), a novel method designed to address the
challenging task of predicting protein-protein interaction (PPI) affinities.
PLD-Tree focuses on protein chains at binding interfaces and employs the
persistent Laplacian to capture topological invariants reflecting critical
inter-protein interactions. These topological descriptors, derived from
persistent homology, are further augmented with features from Evolutionary Scale
Modeling (ESM), a protein large language model, to integrate sequence-based
information. We validate PLD-Tree on two benchmark datasets, PDBbind V2020 and
SKEMPI v2, achieving a Pearson correlation coefficient ($R_p$) of 0.83 under
stringent leave-out-protein-out cross-validation. Notably, our approach
outperforms all reported state-of-the-art methods on these datasets. These
results underscore the power of integrating machine learning techniques with
topology-based descriptors for molecular docking and virtual screening,
providing a robust and accurate framework for predicting protein-protein
binding affinities.
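
To make the idea of Laplacian-based interface descriptors concrete, the following is a minimal sketch and not the authors' implementation. It assumes two chains given as arrays of atomic coordinates and uses a simplified stand-in for the persistent Laplacian: at each distance cutoff it builds a bipartite graph of inter-chain contacts and summarizes the spectrum of the ordinary graph Laplacian. The function name `interface_laplacian_features`, the cutoff values, and the choice of summary statistics are all hypothetical choices made for illustration.

```python
import numpy as np

def interface_laplacian_features(coords_a, coords_b, cutoffs):
    """Spectral summaries of inter-chain contact graphs over a distance filtration.

    Illustrative assumption: only edges between chain A and chain B are kept,
    and the 0-dimensional (graph) Laplacian is used at each cutoff.
    """
    coords_a = np.asarray(coords_a, dtype=float)
    coords_b = np.asarray(coords_b, dtype=float)
    n_a, n_b = len(coords_a), len(coords_b)

    # Pairwise distances between atoms of different chains only.
    dist = np.linalg.norm(coords_a[:, None, :] - coords_b[None, :, :], axis=-1)

    features = []
    for r in cutoffs:
        cross = (dist <= r).astype(float)

        # Bipartite adjacency matrix: no intra-chain edges.
        adj = np.zeros((n_a + n_b, n_a + n_b))
        adj[:n_a, n_a:] = cross
        adj[n_a:, :n_a] = cross.T

        # Graph Laplacian L = D - A, symmetric and positive semidefinite.
        lap = np.diag(adj.sum(axis=1)) - adj
        eig = np.linalg.eigvalsh(lap)

        # Multiplicity of eigenvalue 0 equals the number of connected
        # components (Betti-0); the nonzero spectrum adds geometric detail.
        n_zero = int(np.sum(eig < 1e-8))
        nonzero = eig[eig >= 1e-8]
        smallest_nonzero = nonzero.min() if nonzero.size else 0.0
        features.extend([n_zero, smallest_nonzero, nonzero.sum()])

    return np.array(features)


# Toy example: two atoms per chain, three filtration values.
chain_a = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
chain_b = [[1.0, 0.0, 0.0], [11.0, 0.0, 0.0]]
print(interface_laplacian_features(chain_a, chain_b, cutoffs=[0.5, 2.0, 20.0]))
```

In a pipeline of the kind the abstract describes, a vector like this would be concatenated with sequence-derived features and passed to a tree-based regressor. The full persistent Laplacian also tracks how topological features persist between filtration values and covers higher dimensions, which this sketch does not attempt.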