BiVAE-CPI: An Interpretable Generative Model Using a Bilateral Variational Autoencoder for Compound-Protein Interaction Prediction.
Journal: Journal of Chemical Information and Modeling
Published Date: Sep 8, 2025
Abstract
Predicting compound-protein interactions (CPIs) plays a critical role in drug discovery and development, but traditional screening experiments are time-consuming and resource-intensive. Deep learning methods for CPI prediction have therefore become popular. However, many existing methods treat CPI pairs as independent inputs, ignoring the correlations among different CPI pairs, and fail to capture their latent representations well. In this paper, we propose a novel CPI prediction model, BiVAE-CPI, built upon the bilateral variational autoencoder (BiVAE). It not only accounts for the correlations among different CPI pairs but also uses latent factors to learn shared low-dimensional latent representations for CPI prediction. This continuous latent-space representation fuses distribution and features, providing good interpretability, and lets the model better match the bidirectional nature of compound-protein data. Additionally, the model employs a graph isomorphism network (GIN) to directly learn a representation of the entire compound and a gated convolutional encoder to learn embeddings of protein sequences. Experimental results on two benchmarks, especially on imbalanced data sets, demonstrate that BiVAE-CPI outperforms state-of-the-art methods. These results illustrate the performance of the proposed model in CPI prediction and show that considering the correlations among different CPIs and the shared low-dimensional latent representation of compound-protein pairs is helpful for CPI prediction.
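To make the compound-encoding step concrete, the following is a minimal sketch of how a GIN-style layer can turn a molecular graph into a single whole-compound vector. It is not the authors' implementation: the function names (gin_layer, sum_readout), the toy three-atom graph, the one-hot atom features, and the identity weights are all assumptions chosen for illustration, and a single linear layer with ReLU stands in for the MLP of a full GIN layer.

```python
def gin_layer(features, adjacency, weights, bias, eps=0.0):
    """One GIN-style update per atom:
    h_v' = ReLU(W @ ((1 + eps) * h_v + sum of neighbour features) + b)."""
    updated = []
    for v, h_v in enumerate(features):
        # Start from the (1 + eps)-scaled feature of the atom itself.
        agg = [(1.0 + eps) * x for x in h_v]
        # Sum-aggregate the features of bonded neighbours.
        for u in adjacency[v]:
            agg = [a + x for a, x in zip(agg, features[u])]
        # Linear map followed by ReLU (a stand-in for the GIN MLP).
        out = [
            max(0.0, sum(w * a for w, a in zip(row, agg)) + b)
            for row, b in zip(weights, bias)
        ]
        updated.append(out)
    return updated


def sum_readout(features):
    """Whole-graph embedding: sum each feature dimension over all atoms."""
    return [sum(column) for column in zip(*features)]


# Hypothetical toy compound: a C-C-O chain with one-hot features [is_C, is_O].
atom_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = {0: [1], 1: [0, 2], 2: [1]}

# Identity weights and zero bias keep the arithmetic easy to follow by hand.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]

hidden = gin_layer(atom_features, bonds, W, b)
compound_embedding = sum_readout(hidden)
print(compound_embedding)  # [5.0, 2.0]
```

Sum aggregation and sum readout do not depend on the order in which atoms are listed, which is one reason a GIN-style encoder suits learning a representation of the entire compound. In a trained model the weights would be learned and several such layers would typically be stacked.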