A pipeline for developing AI-driven models to predict molecular initiating events: a case study on neural tube defects.
Journal: Journal of Cheminformatics
Published Date: Apr 2, 2026
Abstract
Adverse Outcome Pathways (AOPs) describe the sequence of molecular and cellular events that lead to toxicity. Each pathway begins with a Molecular Initiating Event (MIE) and ends in an Adverse Outcome. Early identification of chemical activity on MIE-relevant protein targets supports first-line toxicity assessment and helps researchers prioritize mechanisms for subsequent experimental investigation. Here we present an automated AI pipeline that converts raw ChEMBL bioactivity data into optimized deep learning models for MIE prediction. The pipeline builds on the Knowledge-Guided Pre-training of Graph Transformer (KPGT) framework, which represents chemical structures as knowledge-enriched molecular graphs. It integrates data curation, molecular graph generation, and model training and tuning, so that users can build target-specific prediction models reproducibly, from raw bioactivity records to a deployable model. We demonstrate its use in a neural tube defect (NTD) case study, where fine-tuned KPGT models outperformed traditional Support Vector Machine models with a radial basis function kernel (SVM-RBF) in predicting MIEs linked to developmental toxicity. The results highlight the potential of AI-driven toxicity modeling to accelerate AOP development, improve endpoint prioritization, and rank chemicals for experimental follow-up. By providing an end-to-end, data-to-model workflow, the pipeline lowers the technical barrier to using modern graph-based neural architectures in toxicology and supports early-stage chemical safety evaluation.
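To make the data curation stage more concrete, the following is a minimal sketch of how raw bioactivity records might be turned into binary labels for one protein target. It is not the authors' implementation: the record field names (modeled loosely on ChEMBL activity exports), the accepted measurement types, the median aggregation of replicates, and the activity cutoff of 6.0 on the negative log molar scale are all assumptions chosen for illustration.

```python
import math
from statistics import median

# Assumed filter settings; the paper's actual curation rules may differ.
ACCEPTED_TYPES = {"IC50", "EC50", "Ki", "Kd"}
ACTIVE_THRESHOLD = 6.0  # assumed cutoff: -log10(molar) >= 6, i.e. <= 1 uM


def to_p_activity(value_nm):
    """Convert a concentration in nM to -log10 of the molar concentration."""
    return 9.0 - math.log10(value_nm)


def curate(records, threshold=ACTIVE_THRESHOLD):
    """Filter raw records and return {smiles: (median_p_activity, label)}.

    Each record is a dict with hypothetical keys: 'smiles', 'standard_type',
    'standard_relation', 'standard_value', and 'standard_units'.
    """
    grouped = {}
    for rec in records:
        if rec.get("standard_type") not in ACCEPTED_TYPES:
            continue  # keep only potency/affinity measurements
        if rec.get("standard_units") != "nM":
            continue  # keep a single unit so values are comparable
        if rec.get("standard_relation", "=") != "=":
            continue  # drop censored values such as '>' or '<'
        try:
            value = float(rec.get("standard_value"))
        except (TypeError, ValueError):
            continue  # missing or non-numeric value
        smiles = rec.get("smiles")
        if not smiles or value <= 0:
            continue
        grouped.setdefault(smiles, []).append(to_p_activity(value))

    curated = {}
    for smiles, values in grouped.items():
        p_activity = median(values)  # collapse replicate measurements
        curated[smiles] = (p_activity, int(p_activity >= threshold))
    return curated
```

The resulting SMILES-to-label mapping is the kind of table that a graph generation and model training step could consume downstream; censored and unit-mismatched records are dropped here for simplicity, although a real pipeline might handle them differently.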