Machine learning-driven bioavailability prediction in early-stage drug development: a KNIME-based computational workflow for digital health applications.
Journal:
Xenobiotica; the fate of foreign compounds in biological systems
Published Date:
May 28, 2025
Abstract
Bioavailability prediction remains a significant challenge in early-stage drug development, where conventional experimental approaches are time-consuming and resource-intensive. This study explores the application of machine learning techniques to make bioavailability prediction more efficient. Using computational workflows built in the KNIME Analytics Platform, we aim to automate bioavailability assessment and reduce dependence on costly experimental studies. A dataset of 475 drug-like compounds, characterised by key molecular descriptors, was analysed using multiple machine learning models, including Random Forest, Gradient Boosting, Decision Trees, k-Nearest Neighbours, and neural networks. Model performance was assessed through 5-fold cross-validation, with ensemble models outperforming linear and neural network-based approaches. Random Forest demonstrated the highest predictive performance (R² = 0.87, RMSE = 0.08). Feature importance analysis identified topological polar surface area and solubility as the most influential factors in bioavailability prediction. The findings underscore the potential of integrating open-source tools and machine learning methodologies in pharmaceutical research, improving workflow efficiency while adhering to FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. This approach facilitates rapid and cost-effective bioavailability assessment, supporting AI-driven predictive modelling and digital health applications in drug development.
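The evaluation protocol described in the abstract (5-fold cross-validation scored by the coefficient of determination and RMSE) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' KNIME workflow. The data are synthetic, the two features are hypothetical stand-ins for scaled descriptors, and a plain k-Nearest Neighbours regressor (one of the model families named above) is used only because it is short to write out. All function names are assumptions made for illustration.

```python
import math
import random

def k_fold_indices(n, n_folds, seed=0):
    """Shuffle indices 0..n-1 and deal them into n_folds nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def knn_predict(train_X, train_y, x, k=5):
    """Average the targets of the k nearest training rows (Euclidean distance)."""
    nearest = sorted((math.dist(row, x), t) for row, t in zip(train_X, train_y))[:k]
    return sum(t for _, t in nearest) / len(nearest)

def r2_and_rmse(y_true, y_pred):
    """Return (coefficient of determination, root-mean-square error)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(y_true))

def cross_validate(X, y, n_folds=5, k=5, seed=0):
    """Hold out each fold once, fit on the rest, and score the held-out fold."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        preds = [knn_predict(train_X, train_y, X[i], k) for i in held_out]
        scores.append(r2_and_rmse([y[i] for i in held_out], preds))
    return scores

# Synthetic stand-in data: 475 rows to mirror the dataset size only.
# The features and the target relationship are invented for illustration.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(475)]
y = [0.6 * a + 0.4 * b + rng.gauss(0, 0.02) for a, b in X]

scores = cross_validate(X, y)
mean_r2 = sum(r for r, _ in scores) / len(scores)
mean_rmse = sum(e for _, e in scores) / len(scores)
print(f"mean R2 = {mean_r2:.3f}, mean RMSE = {mean_rmse:.3f}")
```

Reporting the mean over folds, as done here, is one common convention; the abstract does not state how the fold scores were aggregated, so the printed values from this sketch are not comparable to the reported figures.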
Authors