Developing Pharmaceutically Relevant Pd-Catalyzed C-N Coupling Reactivity Models Leveraging High-Throughput Experimentation.
Journal: Journal of the American Chemical Society
Published Date: May 29, 2025
Abstract
This manuscript presents machine learning models for Pd-catalyzed C-N couplings, built on a large, pharmaceutically relevant, structurally diverse dataset (4204 unique products) generated by high-throughput experimentation. Dataset generation was enabled by the discovery of novel C-N coupling reaction conditions, using LiOTMS as the base, that are compatible with nanomole scale and automation friendly. The size of the dataset allowed systematic evaluation of model performance under five different data-splitting strategies, each carefully designed to assess the models' ability to both interpolate and extrapolate. The models exhibit high predictive performance across all splits, as gauged by standard metrics. In addition, the models predicted with high accuracy the outcomes of validation libraries that were outside the scope of the training set. Employing these models in medicinal chemistry campaigns should result in significant enrichment of successful C-N couplings.
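The abstract does not name the five splitting strategies, so the following is only a minimal sketch of the general idea, assuming two generic splits: a random split, which mostly probes interpolation, and a group hold-out split, which probes extrapolation by keeping every reaction of a held-out amine out of training. The function names, the record fields ("amine", "aryl_halide", "success"), and the toy data are hypothetical illustrations and are not taken from the paper.

```python
import random


def random_split(reactions, test_fraction=0.2, seed=0):
    """Interpolation-style split: shuffle all reactions, hold out a fraction."""
    rng = random.Random(seed)
    shuffled = list(reactions)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def group_holdout_split(reactions, key, test_fraction=0.2, seed=0):
    """Extrapolation-style split: hold out whole groups sharing the same `key`.

    No value of `key` (for example, an amine identifier) appears in both
    the training set and the test set.
    """
    rng = random.Random(seed)
    groups = sorted({r[key] for r in reactions})
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    held_out = set(groups[:n_test])
    train = [r for r in reactions if r[key] not in held_out]
    test = [r for r in reactions if r[key] in held_out]
    return train, test


# Toy dataset (hypothetical): 5 amines x 4 aryl halides = 20 reactions.
reactions = [
    {"amine": f"amine_{i}", "aryl_halide": f"halide_{j}", "success": (i + j) % 2 == 0}
    for i in range(5)
    for j in range(4)
]

rand_train, rand_test = random_split(reactions)
grp_train, grp_test = group_holdout_split(reactions, key="amine")

# In the group split, the test amines never occur in the training data.
train_amines = {r["amine"] for r in grp_train}
test_amines = {r["amine"] for r in grp_test}
print(len(rand_train), len(rand_test), len(grp_train), len(grp_test))
print(train_amines.isdisjoint(test_amines))
```

A model that scores well on the random split but poorly on the group hold-out split is likely memorizing substrate-specific behavior, which is why comparing several split types gives a more honest picture of how a model would fare on new chemical matter.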