Machine Learning-Driven Consensus Modeling for Activity Ranking and Chemical Landscape Analysis of HIV-1 Inhibitors.

Journal: Pharmaceuticals (Basel, Switzerland)
Published Date:

Abstract

This study aimed to develop a predictive model to classify and rank highly active compounds that inhibit HIV-1 integrase (IN). A total of 2271 potential HIV-1 inhibitors were selected from the ChEMBL database. The most relevant molecular descriptors were identified using a hybrid GA-SVM-RFE approach. Predictive models were built using Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Support Vector Machines (SVM), and Multi-Layer Perceptron (MLP). The models were evaluated comprehensively using calibration, Y-randomization, and Net Gain analyses. The four models demonstrated strong calibration, achieving an accuracy greater than 0.88 and an area under the curve (AUC) exceeding 0.90. Net Gain at a high probability threshold indicated that the models are both effective and highly selective, yielding more reliable, higher-confidence predictions. Additionally, the predictions of the individual models were combined by majority voting to determine the final prediction for each compound. The Rank Score (weighted sum) serves as a confidence indicator for the consensus prediction; most highly active compounds received high scores in both the 2D descriptor-based and ECFP4-based models, highlighting the models' effectiveness in predicting potent inhibitors. Furthermore, cluster analysis identified significant chemical classes associated with high biological activity. Some clusters were enriched in highly potent compounds while maintaining moderate scaffold diversity, making them promising starting points for exploring unique chemical space and identifying novel lead compounds. Overall, this study provides valuable insights into predicting integrase binders, which can help improve the accuracy of predictive models.
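
The consensus step described in the abstract (majority voting across the four models, with a weighted-sum Rank Score as a confidence indicator) might look roughly like the Python sketch below. This is an illustration under stated assumptions, not the authors' implementation: the equal model weights, the 0.5 decision threshold, the tie-breaking rule, the normalization by total weight, and all compound names and probabilities are hypothetical, since the abstract does not specify them.

```python
from typing import Dict, List, Tuple

# Hypothetical equal weights; the abstract does not state the actual weighting.
MODEL_WEIGHTS: Dict[str, float] = {"RF": 1.0, "XGBoost": 1.0, "SVM": 1.0, "MLP": 1.0}


def consensus(
    probs: Dict[str, float],
    weights: Dict[str, float] = MODEL_WEIGHTS,
    threshold: float = 0.5,  # assumed cutoff for calling a compound "highly active"
) -> Tuple[int, float]:
    """Return (majority-vote label, rank score) for one compound.

    probs maps a model name to its predicted probability of high activity.
    """
    votes = [1 if p >= threshold else 0 for p in probs.values()]
    # Strict majority is required; a tie is treated as inactive (assumption).
    label = 1 if 2 * sum(votes) > len(votes) else 0
    # Rank score: weighted sum of probabilities, normalized to [0, 1] (assumption).
    total_weight = sum(weights[name] for name in probs)
    rank_score = sum(weights[name] * p for name, p in probs.items()) / total_weight
    return label, rank_score


def rank_compounds(
    predictions: Dict[str, Dict[str, float]],
) -> List[Tuple[str, int, float]]:
    """Sort compounds by rank score, highest confidence first."""
    rows = [(cid, *consensus(p)) for cid, p in predictions.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up probabilities for two hypothetical compounds.
example = {
    "cmpd_A": {"RF": 0.92, "XGBoost": 0.88, "SVM": 0.45, "MLP": 0.81},
    "cmpd_B": {"RF": 0.30, "XGBoost": 0.55, "SVM": 0.20, "MLP": 0.40},
}
ranked = rank_compounds(example)
for compound_id, label, score in ranked:
    print(f"{compound_id}: consensus={label}, rank_score={score:.4f}")
```

Keeping the vote and the score separate means a compound that narrowly wins the vote can still be ranked below one on which all models agree strongly, which is the role the abstract assigns to the Rank Score.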

Authors

  • Danishuddin
    School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India.
  • Md Azizul Haque
    Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Republic of Korea.
  • Geet Madhukar
    Department of Molecular, Cellular and Biomedical Sciences, University of New Hampshire, Durham, NH 03824, USA.
  • Qazi Mohammad Sajid Jamal
    Department of Health Informatics, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia.
  • Jong-Joo Kim
    Department of Biotechnology, Yeungnam University, Gyeongsan 38541, Republic of Korea.
  • Khurshid Ahmad
    Department of Medical Biotechnology, Yeungnam University, Gyeongsan, South Korea.
