Multimodal deep learning for chemical toxicity prediction and management.

Journal: Scientific Reports
Published Date:

Abstract

The accurate prediction of chemical toxicity is a crucial research focus in chemistry, biotechnology, and national defense. The development of comprehensive datasets for chemical toxicity prediction remains limited due to security constraints and the structural complexity of chemical data. Existing studies are often confined to specific domains, such as genotoxicity or acute oral toxicity. To address these gaps, this study introduces an integrated research dataset that combines chemical property data and molecular structure images. The dataset is curated from diverse sources, preprocessed, and normalized to optimize it for deep learning applications. The proposed deep learning model enhances the precision of multi-toxicity predictions by integrating a Vision Transformer (ViT) architecture for image-based data and a Multilayer Perceptron (MLP) for numerical data. A joint fusion mechanism is employed to effectively combine image and numerical features, significantly improving predictive performance. The model is also designed for multi-label toxicity prediction, enabling simultaneous evaluation of diverse toxicological endpoints. Experimental results show that the ViT model achieves an accuracy of 0.872, an F1-score of 0.86, and a Pearson Correlation Coefficient (PCC) of 0.9192.
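
The fusion and multi-label steps described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The abstract does not say how the two feature sets are combined, so plain concatenation is assumed here. The feature vectors are random stand-ins for the ViT and MLP outputs, and all names and sizes (`fuse_and_predict`, `IMAGE_DIM`, `NUMERIC_DIM`, `NUM_LABELS`) are hypothetical. In joint fusion proper, the loss would also be backpropagated through both encoders, which this sketch omits.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases for an untrained layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_predict(image_features, numeric_features, head):
    """Concatenate both modality embeddings (assumed fusion rule),
    then score every toxicity label independently with a sigmoid."""
    fused = list(image_features) + list(numeric_features)
    weights, bias = head
    return [sigmoid(z) for z in linear(fused, weights, bias)]


rng = random.Random(0)
IMAGE_DIM, NUMERIC_DIM, NUM_LABELS = 8, 4, 3  # hypothetical sizes

# Random stand-ins for the encoder outputs (assumption, not real data).
image_features = [rng.random() for _ in range(IMAGE_DIM)]      # ViT branch
numeric_features = [rng.random() for _ in range(NUMERIC_DIM)]  # MLP branch

head = init_layer(NUM_LABELS, IMAGE_DIM + NUMERIC_DIM, rng)
probabilities = fuse_and_predict(image_features, numeric_features, head)

# Multi-label decision: each endpoint is thresholded on its own,
# so a compound can be flagged for several endpoints at once.
predicted_labels = [int(p >= 0.5) for p in probabilities]
```

Using one sigmoid per label rather than a softmax over labels is what makes the output multi-label: the probabilities are not forced to sum to one, so several toxicological endpoints can be predicted for the same compound.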

Authors

  • Jiwon Hong
    ROK Army Signal School, Daejeon, 34059, South Korea.
  • Hyun Kwon
    Department of Electrical Engineering, Korea Military Academy, Seoul 01805, Korea.