OCMR: A comprehensive framework for optical chemical molecular recognition.

Journal: Computers in biology and medicine
Published Date:

Abstract

Artificial intelligence (AI) has made significant progress in the field of drug discovery. AI-based tools have been used in all aspects of drug discovery, including chemical structure recognition. We propose a chemical structure recognition framework, Optical Chemical Molecular Recognition (OCMR), to improve data extraction in practical scenarios compared with rule-based and end-to-end deep learning models. The proposed OCMR framework enhances recognition performance by integrating local information in the topology of molecular graphs. OCMR handles complex tasks such as non-canonical drawing and atomic group abbreviation, and substantially improves on the current state-of-the-art results on multiple public benchmark datasets and one internally curated dataset.

Authors

  • Yan Wang
    College of Animal Science and Technology, Beijing University of Agriculture, Beijing, China.
  • Ruochi Zhang
    BioKnow Health Informatics Lab, College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, Jilin, China.
  • Shengde Zhang
    State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, Beijing University of Chemical Technology, P.O. Box 53, 15 BeiSanHuan East Road, Beijing 100029, P. R. China.
  • Liming Guo
    McGill University Health Centre, McGill Adult Unit for Congenital Heart Disease Excellence, Montreal, Québec, Canada.
  • Qiong Zhou
    Department of Cardiology Third Ward, Jingzhou First People's Hospital, No. 8 Hangkang Road, Jingzhou, Hubei Province 434000, China.
  • Bowen Zhao
    Guangzhou Institute of Technology, Xidian University, Guangzhou, China.
  • Xiaotong Mo
    Machine Learning Department, Silexon AI Technology Co, Ltd, Beijing, 100084, China.
  • Qian Yang
    Center for Advanced Scientific Instrumentation, University of Wyoming, Laramie, WY, United States.
  • Yajuan Huang
    Machine Learning Department, Silexon AI Technology Co, Ltd, Beijing, 100084, China.
  • Kewei Li
    Institute of Microbiology, Jilin Provincial Center for Disease Control and Prevention, Changchun, China.
  • Yusi Fan
    College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China.
  • Lan Huang
  • Fengfeng Zhou