Machine Learning Enhanced Spectrum Recognition Based on Computer Vision (SRCV) for Intelligent NMR Data Extraction.
Journal:
Journal of Chemical Information and Modeling
Published Date:
Nov 10, 2020
Abstract
A machine learning enhanced spectrum recognition system, called spectrum recognition based on computer vision (SRCV), has been developed for extracting data from previously analyzed ¹³C and ¹H NMR spectra. The system was designed with four function modules that extract data from three areas of NMR images: the ¹³C and ¹H chemical shifts, the integral, and the range of the shift values. During this study, three machine learning models were pretrained for number recognition, which is the key procedure for NMR data extraction. The k-nearest neighbor (kNN) method was selected with an optimized k (k = 4), which displayed a 100% recognition rate. Subsequently, the performance of SRCV was tested and validated, showing high accuracy with a short processing time (11-21 s) for each NMR spectral image. Our spectrum recognizer enables high-throughput ¹³C and ¹H NMR data extraction from the abundant spectra in the literature and has the potential to be used for spectral database construction. In addition, the system could be further developed to import data into computer-assisted structure elucidation systems, which would significantly automate this procedure. SRCV can be accessed on GitHub (https://github.com/WJmodels/SRCV).
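As an illustration of the number-recognition step, the following is a minimal sketch of nearest-neighbor classification with four voting neighbors, applied to toy character bitmaps. It is not taken from the SRCV code base: the 3x5 glyphs, the one-pixel noise used to build the training set, the Hamming distance, the tie-breaking rule, and the names `GLYPHS`, `build_training_set`, `knn_predict`, and `read_number` are all assumptions made for this example. A real pipeline would first segment the characters from the spectrum image.

```python
from collections import Counter

# Hypothetical 3x5 bitmaps (row-major, 15 pixels), standing in for
# characters cropped from a spectrum image. Not SRCV's real features.
GLYPHS = {
    "0": "111101101101111",
    "1": "010110010010111",
    "2": "111001111100111",
    "3": "111001111001111",
    "4": "101101111001001",
    "5": "111100111001111",
    "6": "111100111101111",
    "7": "111001001001001",
    "8": "111101111101111",
    "9": "111101111001111",
    ".": "000000000000010",
}


def to_vector(bits):
    """Convert a '0'/'1' string into a list of ints."""
    return [int(b) for b in bits]


def build_training_set():
    """Return (vector, label) pairs: each clean glyph plus three copies
    with a single pixel flipped (assumed noise model for this sketch)."""
    samples = []
    for label, bits in GLYPHS.items():
        clean = to_vector(bits)
        samples.append((clean, label))
        for i in (0, 1, 14):
            noisy = list(clean)
            noisy[i] ^= 1  # flip one pixel
            samples.append((noisy, label))
    return samples


def hamming(a, b):
    """Number of pixels in which two bitmaps differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(samples, query, k=4):
    """Majority vote among the k nearest training samples.
    Ties go to the tied label whose sample ranks closest."""
    nearest = sorted(samples, key=lambda s: hamming(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    best = max(votes.values())
    for _, label in nearest:
        if votes[label] == best:
            return label


def read_number(samples, glyph_vectors, k=4):
    """Classify a sequence of segmented glyphs and join them into text."""
    return "".join(knn_predict(samples, g, k) for g in glyph_vectors)


if __name__ == "__main__":
    training = build_training_set()
    shift = [to_vector(GLYPHS[c]) for c in "7.26"]
    print(read_number(training, shift))
```

In this sketch a chemical shift label is read one character at a time and reassembled as a string. The value of k is a parameter, so other neighbor counts can be compared on a held-out set of labeled glyphs.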