Developing Predictive Models by Sharing Predictions - An Investigation of a Federated Learning Approach for ADMET Predictions.
Journal: Journal of Medicinal Chemistry
Published Date: May 29, 2026
Abstract
Machine learning models for ADMET prediction benefit from large, diverse data sets, yet such data are typically siloed across organizations. Federated learning (FL) enables collaborative modeling while preserving data privacy. Here, we investigate a student-teacher model (STM) framework in which organizations train internal models on proprietary data and share predictions on a public data set to generate pseudolabels for a centralized student model. As a proof of concept, 11 pharmaceutical companies contributed predictions for rat steady-state volume of distribution, yielding a pseudolabeled data set of ∼133,000 compounds. The resulting student model achieved performance comparable to individual teacher models on an external test set (RMSE ≈ 0.51 vs 0.47-0.61). Compared with FL approaches such as MELLODY and Effiris, STM offers a simpler workflow that avoids direct data sharing or iterative collaboration, providing a practical and scalable framework for secure cross-company model development.
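The student-teacher workflow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-dimensional descriptor, the synthetic data, the least-squares model, and the helper names (`fit_line`, `make_private_data`, `true_property`) are all hypothetical stand-ins. In particular, averaging the teacher predictions to form pseudolabels is an assumption, since the abstract does not state how the shared predictions are combined.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for any regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, xs):
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def rmse(y_true, y_pred):
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def true_property(x):
    # Hypothetical ground truth standing in for the measured endpoint.
    return 0.8 * x - 0.2


rng = random.Random(0)


def make_private_data(n, lo, hi):
    # Synthetic "proprietary" data covering one region of descriptor space.
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    ys = [true_property(x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


# 1. Each organization trains a teacher on its own siloed data.
silos = [make_private_data(50, lo, lo + 2.0) for lo in (-3.0, -1.0, 1.0)]
teachers = [fit_line(xs, ys) for xs, ys in silos]

# 2. Teachers predict on a shared public set; only predictions leave a silo.
public_x = [rng.uniform(-3.0, 3.0) for _ in range(500)]
shared_predictions = [predict(teacher, public_x) for teacher in teachers]

# 3. Pseudolabels: mean over teachers (an assumed aggregation rule).
pseudolabels = [sum(col) / len(col) for col in zip(*shared_predictions)]

# 4. A central student is trained on the pseudolabeled public set.
student = fit_line(public_x, pseudolabels)

# 5. Compare student and teachers on a held-out external test set.
test_x = [rng.uniform(-3.0, 3.0) for _ in range(200)]
test_y = [true_property(x) + rng.gauss(0.0, 0.3) for x in test_x]
student_rmse = rmse(test_y, predict(student, test_x))
teacher_rmses = [rmse(test_y, predict(t, test_x)) for t in teachers]

print(f"student RMSE: {student_rmse:.3f}")
print("teacher RMSEs:", [round(r, 3) for r in teacher_rmses])
```

Because every teacher here is linear and the pseudolabels are a plain mean, the student simply recovers the average of the teacher models. With nonlinear teachers and a real public compound set, the student would instead learn an approximation of the ensemble, which is the situation the abstract describes.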