Online OCHEM multi-task model for solubility and lipophilicity prediction of platinum complexes.

Journal: Journal of Inorganic Biochemistry
Published Date:

Abstract

Predicting the solubility and lipophilicity of platinum(II, IV) complexes is essential for prioritizing potential anticancer candidates in drug discovery. This study introduces the first publicly available online model for predicting the solubility of platinum complexes, addressing the lack of literature and models for this property. Using a time-split dataset, we developed a consensus model that achieved a root mean squared error (RMSE) of 0.62 in 5-fold cross-validation on a training set of 284 historical compounds (solubility data reported prior to 2017). However, the RMSE increased to 0.86 when the model was applied to a prospective test set of 108 compounds reported after 2017. Analysis of the largest prediction errors showed that they stem primarily from the underrepresentation of novel chemical scaffolds, particularly Pt(IV) derivatives, in the training set. For instance, a series of eight phenanthroline-containing compounds not covered by the chemical space of the training set had an RMSE of 1.3. When the model was redeveloped using the combined dataset, the RMSE for this series decreased significantly to 0.34 under the same validation protocol. Additionally, we developed an interpretable linear model to identify the structural features and functional groups that influence the solubility of platinum complexes. We further validated the correlation between solubility and lipophilicity, which is consistent with the Yalkowsky General Solubility Equation. Building on these insights, we developed a final multitask model that simultaneously predicts solubility and lipophilicity as two endpoints, with RMSE = 0.62 and 0.44, respectively. The data and the final model are available at https://ochem.eu/article/31.
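
The time-split evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' OCHEM workflow: the function names, the record layout, and the toy values are hypothetical. The abstract also does not say which set compounds reported in the cutoff year itself belong to, so assigning them to the test set is an assumption made here.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def time_split(records, cutoff_year):
    """Split records into a historical training set and a prospective test set.

    Assumption: records from the cutoff year itself go to the test set.
    """
    train = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return train, test


# Hypothetical toy records; these are NOT values from the study.
records = [
    {"id": "A", "year": 2012, "value": -2.0},
    {"id": "B", "year": 2015, "value": -3.5},
    {"id": "C", "year": 2018, "value": -1.0},
    {"id": "D", "year": 2020, "value": -4.0},
]

train, test = time_split(records, cutoff_year=2017)

# Stand-in for a trained model: predict the mean of the training values.
baseline = sum(r["value"] for r in train) / len(train)
test_rmse = rmse([r["value"] for r in test], [baseline] * len(test))
print(f"train={len(train)} test={len(test)} test RMSE={test_rmse:.2f}")
```

Comparing the cross-validated RMSE on the historical set with the RMSE on the later compounds, as the abstract does (0.62 versus 0.86), indicates how well a model extrapolates to chemistry reported after it was trained.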

Authors

  • Nesma Mousa
    Freie Universität Berlin, Fachbereich Biologie, Chemie, Pharmazie, Takustr. 3, 14195 Berlin, Germany.
  • Hristo P Varbanov
    Institute of Pharmacy/Pharmaceutical Chemistry, University of Innsbruck, Center for Chemistry and Biomedicine, Innrain 80 - 82/IV, 6020 Innsbruck, Austria.
  • Vidya Kaipanchery
    Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, Niezapominajek 8, Krakow 30239, Poland.
  • Elisabetta Gabano
    Dipartimento per lo Sviluppo Sostenibile e la Transizione Ecologica, Università del Piemonte Orientale, Piazza S. Eusebio 5, 13100 Vercelli, Italy.
  • Mauro Ravera
    Dipartimento di Scienze e Innovazione Tecnologica, Università del Piemonte Orientale, Viale Teresa Michel 11, 15121 Alessandria, Italy. Electronic address: mauro.ravera@uniupo.it.
  • Andrey A Toropov
    Department of Environmental Health Science, Laboratory of Environmental Chemistry and Toxicology, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy.
  • Larisa Charochkina
    V.P. Kukhar Institute of Bioorganic Chemistry and Petrochemistry, National Academy of Sciences of Ukraine, Academician Kukhar Str. 1, Kyiv 02094, Ukraine.
  • Filipe Menezes
    Molecular Targets and Therapeutics Center, Institute of Structural Biology, Helmholtz Munich, Neuherberg, Germany.
  • Guillaume Godin
    dsm-firmenich SA, Rue de la Bergère 7, CH-1242 Satigny, Switzerland.
  • Igor V Tetko
    Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Neuherberg, Germany.