Bidirectional Long Short-Term Memory (BiLSTM) Neural Networks with Conjoint Fingerprints: Application in Predicting Skin-Sensitizing Agents in Natural Compounds.
Journal: Journal of Chemical Information and Modeling
PMID: 40029998
Abstract
Skin sensitization, or allergic contact dermatitis, is a critical end point in toxicity assessment, with direct implications for drug safety and regulatory decision-making. This study develops a robust deep-learning-based quantitative structure-activity relationship (QSAR) framework for predicting skin sensitization toxicity, particularly for natural-product-derived compounds. To this end, we explored recurrent neural network architectures, including long short-term memory (LSTM), bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and bidirectional GRU, to model the structure-toxicity relationships of molecular compounds. To optimize predictive performance, we trained a cohort of 55 models with a diverse set of molecular fingerprints. The BiLSTM model that integrates SMILES tokens with RDKit fingerprints achieved the best predictive performance, indicating that it captures key molecular determinants of skin sensitization. An applicability domain analysis, coupled with an evaluation of feature importance, provided new insights into the molecular attributes that influence sensitization propensity. We further evaluated the BiLSTM model on a natural product data set, where it generalized well: it achieved an accuracy of 86.5%, a Matthews correlation coefficient of 75.2%, a sensitivity of 100%, a specificity of 75%, an area under the curve of 88%, and an F1-score of 88.8%. The model also discriminated sensitizing from non-sensitizing agents across various natural product subcategories. These results underscore the potential of BiLSTM-based models as tools for drug discovery and regulatory assessment, especially in the field of natural products.
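For readers less familiar with the evaluation metrics reported in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, F1-score, and the Matthews correlation coefficient are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the counts used below are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not describe how the authors computed their values.

```python
from math import sqrt


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp / fn: sensitizers predicted as sensitizers / as non-sensitizers.
    tn / fp: non-sensitizers predicted as non-sensitizers / as sensitizers.
    """
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": _safe_div(tp, tp + fn),   # true positive rate (recall)
        "specificity": _safe_div(tn, tn + fp),   # true negative rate
        "f1": _safe_div(2 * tp, 2 * tp + fp + fn),
        "mcc": _safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts for illustration only (not taken from the paper).
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Note that the Matthews correlation coefficient ranges from -1 to 1, so a value reported as a percentage, as in the abstract, corresponds to this quantity multiplied by 100. The area under the curve is not included in the sketch because it requires ranked prediction scores rather than confusion-matrix counts.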