PlantPathoPPI: An Ensemble-based Machine Learning Architecture for Prediction of Protein-Protein Interactions between Plants and Pathogens.
Journal:
Journal of Molecular Biology
Published Date:
Mar 17, 2025
Abstract
This study aimed to develop a machine learning-based tool for predicting protein-protein interactions (PPIs) in plant-pathogen systems. Identifying these PPIs is crucial for understanding the molecular mechanisms underlying plant defense and pathogen virulence, but experimental methods are time-consuming and labor-intensive, which motivates computational techniques that complement traditional approaches. An ensemble model was developed using multiple sequence encodings and diverse learning algorithms, including random forest, support vector machine, and artificial neural network. The sequence features, namely the auto-covariance, conjoint triad, and local descriptor schemes, were selected based on their performance. The three best-performing models were combined into an ensemble, which improved prediction accuracy to approximately 97%. The resulting tool, PlantPathoPPI, was compared with existing tools on an independent test dataset and showed promising potential for PPI prediction in plant-pathogen interactions. To make the tool broadly accessible, a web-based prediction server is available at https://plantpathoppi.onrender.com/, alongside a Python package at https://pypi.org/project/plantpathoppi-ml/. PlantPathoPPI offers an efficient way to predict PPIs in plant-pathogen systems, providing insights into plant diseases and supporting hypothesis-driven research.
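To make the pipeline in the abstract more concrete, the following is a minimal sketch of one of the named encodings (conjoint triad) and of one possible way to combine three model outputs. It is not the API of the plantpathoppi-ml package and not the authors' implementation. The seven-class residue grouping and the min-max style scaling follow the commonly cited form of the conjoint triad method. The function names, the example sequences, the probabilities, and the use of probability averaging (soft voting) as the combination rule are all assumptions made for illustration, since the abstract does not state how the three models are combined.

```python
from itertools import product

# Seven residue classes commonly used for conjoint triad encoding
# (amino acids grouped by dipole and side-chain volume).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}

# All ordered triples of classes: 7^3 = 343 possible triads.
TRIADS = list(product(range(7), repeat=3))
TRIAD_INDEX = {triad: i for i, triad in enumerate(TRIADS)}


def conjoint_triad(sequence):
    """Return a 343-dimensional scaled triad frequency vector for one protein."""
    counts = [0] * len(TRIADS)
    classes = [RESIDUE_TO_CLASS.get(aa) for aa in sequence.upper()]
    # Slide a window of three consecutive residues along the sequence.
    for window in zip(classes, classes[1:], classes[2:]):
        if None in window:  # skip triads containing non-standard residues
            continue
        counts[TRIAD_INDEX[window]] += 1
    lo, hi = min(counts), max(counts)
    if hi == 0:  # sequence too short or no valid triads
        return [0.0] * len(counts)
    # Scale as (f - min) / max so that sequence length matters less.
    return [(c - lo) / hi for c in counts]


def encode_pair(plant_seq, pathogen_seq):
    """Concatenate the two protein encodings into one 686-dimensional vector."""
    return conjoint_triad(plant_seq) + conjoint_triad(pathogen_seq)


def soft_vote(probabilities, threshold=0.5):
    """Average interaction probabilities from several models (assumed rule)."""
    mean_p = sum(probabilities) / len(probabilities)
    return mean_p, mean_p >= threshold


# Hypothetical usage: toy sequences and made-up model probabilities.
features = encode_pair("MKTAYIAKQRQISFVKSHFSRQ", "MGDVEKGKKIFIMKCSQCHTVEK")
mean_probability, interacts = soft_vote([0.91, 0.84, 0.77])
```

In a setup like this, each base learner (for example a random forest, a support vector machine, or a neural network) would be trained on one of the encodings, and only the best-performing models would contribute probabilities to the final vote.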