Multi-fidelity graph neural networks for predicting toluene/water partition coefficients.

Journal: Journal of Cheminformatics

Abstract

Accurate prediction of toluene/water partition coefficients of neutral species is crucial in drug discovery and separation processes; however, data-driven modeling of these coefficients remains challenging because experimental data are scarce. To address this limitation, we apply multi-fidelity learning approaches that leverage a quantum chemical dataset (low fidelity) of approximately 9000 entries generated by COSMO-RS and an experimental dataset (high fidelity) of about 250 entries collected from the literature. We explore transfer learning, feature-augmented learning, and multi-target learning in combination with graph neural networks, validating them on two external datasets: one with molecules similar to the training data (EXT-Zamora) and one with more challenging molecules (EXT-SAMPL9). Our results show that multi-target learning significantly improves predictive accuracy, achieving a root-mean-square error of 0.44 units on the EXT-Zamora dataset, compared with 0.63 units for single-task models. On the EXT-SAMPL9 dataset, multi-target learning achieves a root-mean-square error of 1.02 units, indicating reasonable performance even for more complex molecular structures. These findings highlight the potential of multi-fidelity learning approaches that leverage quantum chemical data to improve toluene/water partition coefficient predictions and to address the challenges posed by limited experimental data. We expect the methods to be applicable beyond toluene/water partition coefficients.
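
To illustrate the multi-target idea and the reported error metric, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the loss function, the task weights, or the network architecture. The function names (rmse, masked_multi_target_mse), the equal default weights, and the use of None for a missing label are all assumptions made here for illustration. The sketch assumes each molecule carries a low-fidelity (COSMO-RS) label, a high-fidelity (experimental) label, or both, and that a model predicts both targets at once.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def masked_multi_target_mse(predictions, labels, weights=(1.0, 1.0)):
    """Weighted sum of per-target mean squared errors, skipping missing labels.

    predictions: list of (low_fidelity, high_fidelity) predicted values.
    labels:      list of (low_fidelity, high_fidelity) reference values,
                 where None marks a label that is not available.
    weights:     hypothetical per-target weights (an assumption, not from the paper).
    """
    total = 0.0
    for target, weight in enumerate(weights):
        # Only molecules that actually have a label for this target contribute.
        errors = [
            (pred[target] - ref[target]) ** 2
            for pred, ref in zip(predictions, labels)
            if ref[target] is not None
        ]
        if errors:
            total += weight * sum(errors) / len(errors)
    return total


# Toy values only; these are not data from the study.
toy_predictions = [(1.0, 1.5), (2.0, 2.5)]
toy_labels = [(1.0, None), (3.0, 2.0)]  # first molecule has no experimental value
toy_loss = masked_multi_target_mse(toy_predictions, toy_labels)
```

Masking lets the roughly 9000 low-fidelity entries and the roughly 250 high-fidelity entries share one model even though most molecules lack an experimental value; how the two targets are actually balanced in the study is not stated in the abstract.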

Authors

  • Thomas Nevolianis
    Institute of Technical Thermodynamics, RWTH Aachen University, 52062, Aachen, Germany.
  • Jan G Rittig
    Process Systems Engineering, RWTH Aachen University, 52074, Aachen, Germany.
  • Alexander Mitsos
    AVT-Process Systems Engineering, RWTH Aachen University, Aachen, Germany.
  • Kai Leonhard
    Institute of Technical Thermodynamics, RWTH Aachen University, 52062, Aachen, Germany. kai.leonhard@ltt.rwth-aachen.de.
