Machine Learning Models Based on Molecular Fingerprints and an Extreme Gradient Boosting Method Lead to the Discovery of JAK2 Inhibitors.
Journal:
Journal of Chemical Information and Modeling
Published Date:
Dec 23, 2019
Abstract
Developing Janus kinase 2 (JAK2) inhibitors has become a significant focus of small-molecule drug discovery programs in recent years because inhibiting JAK2 may be an effective approach to treating myeloproliferative neoplasms. Here, using three different types of fingerprints and the Extreme Gradient Boosting (XGBoost) method, we developed three groups of models, each containing a classification model and a regression model, to identify highly potent JAK2 kinase inhibitors in the ZINC database. The three classification models achieved Matthews correlation coefficients of 0.97, 0.94, and 0.97. Docking methods including Glide and AutoDock Vina were employed to evaluate the virtual screening effectiveness of our classification models. The R² values of the three regression models were 0.80, 0.78, and 0.80. Finally, 13 compounds were biologically evaluated, and six of them had IC50 values below 100 nM. Among them, one compound showed high activity and selectivity: its IC50 value was less than 1 nM against JAK2 but 694 nM against JAK3. The strategy developed may be generally applicable in ligand-based virtual screening campaigns.
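As a reading aid, the two kinds of scores quoted in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' evaluation code: the function names, the convention of returning 0.0 for an undefined Matthews correlation coefficient, and the toy numbers are all assumptions made here, and the regression score is assumed to be the coefficient of determination (R²).

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here; the abstract does not say how this case was handled).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical confusion-matrix counts for an active/inactive classifier.
    print(round(matthews_corrcoef(tp=95, tn=97, fp=3, fn=5), 3))

    # Hypothetical measured vs. predicted activity values for a regressor.
    measured = [6.1, 7.4, 8.0, 5.2, 9.1]
    predicted = [6.4, 7.0, 7.7, 5.9, 8.6]
    print(round(r_squared(measured, predicted), 3))
```

A Matthews correlation coefficient of 1 means perfect agreement between predicted and true classes, 0 means no better than chance, and -1 means complete disagreement, which is why it is often preferred over plain accuracy when active and inactive compounds are imbalanced.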