Machine learning meets pKa.
Journal: F1000Research
Published Date: Feb 13, 2020
Abstract
We present a small-molecule pKa prediction tool written entirely in Python. It predicts the macroscopic pKa value and is trained on a literature compilation of monoprotic compounds. Several machine learning models were tested, and random forest performed best in a five-fold cross-validation (mean absolute error = 0.682, root mean squared error = 1.032, correlation coefficient r² = 0.82). We test our model on two external validation sets, where it performs comparably to Marvin and better than a recently published open-source model. Our Python tool and all data are freely available at https://github.com/czodrowskilab/Machine-learning-meets-pKa.
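The three error measures quoted for the cross-validation can be illustrated with a minimal, standard-library sketch. It is not the code from the authors' repository: the function names `regression_metrics` and `five_fold_indices` are hypothetical, the correlation measure is assumed here to be the squared Pearson coefficient, and the fold assignment is a simple interleaved split chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, r2) for paired experimental and predicted values.

    r2 is computed as the squared Pearson correlation (an assumption;
    the abstract does not state which definition was used).
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant values")
    return mae, rmse, cov * cov / (var_t * var_p)


def five_fold_indices(n_samples, n_folds=5):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    for k in range(n_folds):
        test_idx = indices[k::n_folds]  # every n_folds-th sample, offset k
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx


if __name__ == "__main__":
    # Made-up pKa values, used only to exercise the functions.
    experimental = [4.8, 9.2, 10.0, 3.2, 7.1]
    predicted = [5.0, 9.0, 9.5, 4.0, 7.4]
    mae, rmse, r2 = regression_metrics(experimental, predicted)
    print(f"MAE={mae:.3f} RMSE={rmse:.3f} r2={r2:.3f}")
```

In a real five-fold run, a model would be fitted on each `train_idx` subset and scored on the matching `test_idx` subset, with the metrics then averaged or pooled over the five folds.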