Compound Classification and Consideration of Correlation with Chemical Descriptors from Articles on Antioxidant Capacity Using Natural Language Processing.

Journal: Journal of Chemical Information and Modeling
Published Date:

Abstract

The number of articles on antioxidants has grown substantially in recent years. However, a comprehensive estimator of antioxidant capacity has yet to be developed, because the information spread across these articles is difficult to integrate. Moreover, antioxidant mechanisms involve many factors, which makes it difficult to establish a simple equation or correlation. A model is therefore needed that can interpret the collective knowledge in these articles, particularly from a chemical perspective. In this study, we applied a natural language processing technique, Word2Vec, to articles on antioxidant capacity. We extracted representation vectors of compound names from these documents and grouped them into 10 clusters. Examining two of these clusters, we found that most of their compounds were flavonoids and flavonoid glycosides. To relate chemical descriptors to the clusters, we used kernel density estimation and scatter plots to visualize their similarity. These visualizations indicated a strong relationship between the descriptors and the clusters, showing that word vectors obtained from documents by natural language processing are connected to compound descriptors. This study presents a new approach that uses document analysis to shed light on antioxidant capacity research.
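As a rough illustration of the kernel density estimation step, the sketch below compares the distribution of one chemical descriptor across two clusters. It is not the authors' code: the Gaussian kernel, the bandwidth of 0.8, the cluster names, and the descriptor values are all assumptions made up for this example, and the cluster labels are taken as already given rather than derived from Word2Vec vectors.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples`.

    Uses a Gaussian kernel with a fixed, user-chosen bandwidth
    (an assumption; the abstract does not state kernel or bandwidth).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical descriptor values (for example, a hydroxyl-group count)
# for the compounds in two clusters. These numbers are invented.
descriptor_by_cluster = {
    "cluster_a": [4.0, 5.0, 5.0, 6.0, 7.0],
    "cluster_b": [1.0, 1.0, 2.0, 2.0, 3.0],
}

# One density estimate per cluster.
densities = {
    name: gaussian_kde(values, bandwidth=0.8)
    for name, values in descriptor_by_cluster.items()
}

# Evaluate each density on a coarse grid and report where it peaks.
grid = [i * 0.5 for i in range(19)]  # 0.0, 0.5, ..., 9.0
for name, density in densities.items():
    peak = max(grid, key=density)
    print(name, "density peaks near descriptor value", peak)
```

If the per-cluster densities of a descriptor peak at clearly different values, as they do for this toy data, the descriptor separates the clusters; heavily overlapping curves would suggest it does not.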

Authors

  • Yuto Matsumoto
    Department of Chemistry and Life Science, Yokohama National University, 79-5 Tokiwadai, Hodogaya-ku, Yokohama 240-8501, Japan.
  • Hiroaki Gotoh
    Department of Chemistry and Life Science, Yokohama National University, 79-5 Tokiwadai, Hodogaya-ku, Yokohama 240-8501, Japan.