Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task.

Journal: Scientific Reports
Published Date:

Abstract

A dataset of liquid chromatography-mass spectrometry (LC-MS) measurements of medicinal plant extracts from 74 species was generated and used to train and validate plant species identification algorithms. Various strategies for data handling and feature space extraction were tested. Constrained Tucker decomposition, large-scale (more than 1500 variables) discrete Bayesian networks, and autoencoder-based dimensionality reduction coupled with a continuous Bayes classifier and logistic regression were optimized to achieve the best accuracy. Even with all retention time values eliminated, accuracies of up to 96% and 92% were achieved on the validation set for plant species and plant organ identification, respectively. The benefits and drawbacks of the algorithms used are discussed. A preliminary test showed that the developed approaches tolerate changes in the data caused by using different extraction methods and/or equipment. The dataset, containing more than 2200 chromatograms, was published in an open repository.
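The idea of identifying a species after discarding retention times can be illustrated with a minimal sketch. It is not the authors' pipeline: the peak format `(retention_time, mz, intensity)`, the m/z range, the bin width, and the function names are all assumptions made for illustration, and a simple cosine-similarity lookup stands in for the Bayesian network and logistic regression models named in the abstract.

```python
from math import sqrt


def rt_free_fingerprint(peaks, mz_min=100.0, mz_max=1000.0, bin_width=1.0):
    """Collapse (retention_time, mz, intensity) peaks into an m/z-binned vector.

    Retention time is ignored entirely (assumed preprocessing, not the
    paper's exact procedure). Intensities in the same m/z bin are summed
    and the vector is scaled to unit total intensity.
    """
    n_bins = int((mz_max - mz_min) / bin_width)
    vec = [0.0] * n_bins
    for _rt, mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            idx = min(int((mz - mz_min) / bin_width), n_bins - 1)
            vec[idx] += intensity
    total = sum(vec)
    return [v / total for v in vec] if total > 0 else vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(fingerprint, references):
    """Return the label of the reference fingerprint most similar to the query."""
    return max(references, key=lambda label: cosine(fingerprint, references[label]))


# Hypothetical toy data: the query has the same ions as species_A,
# but shifted retention times and slightly different intensities.
sample_a = [(1.2, 301.1, 50.0), (3.4, 455.3, 150.0)]
sample_b = [(2.0, 180.2, 100.0), (5.0, 610.4, 100.0)]
query = [(1.9, 301.1, 48.0), (4.0, 455.3, 160.0)]

references = {
    "species_A": rt_free_fingerprint(sample_a),
    "species_B": rt_free_fingerprint(sample_b),
}
predicted = predict(rt_free_fingerprint(query), references)
```

Because the fingerprint depends only on m/z and intensity, a shift in retention time (for example from a different column or gradient) leaves it unchanged, which is one plausible reading of the tolerance to equipment changes mentioned in the abstract.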

Authors

  • Pavel Kharyuk
    Skolkovo Institute of Science and Technology, Center for Computational and Data-Intensive Science and Engineering, Moscow, 143026, Russia. kharyuk.pavel@gmail.com.
  • Dmitry Nazarenko
    Lomonosov Moscow State University, Faculty of Chemistry, Moscow, 119991, Russia. dmitro.nazarenko@gmail.com.
  • Ivan Oseledets
    Skolkovo Institute of Science and Technology, Center for Computational and Data-Intensive Science and Engineering, Moscow, 143026, Russia.
  • Igor Rodin
    Lomonosov Moscow State University, Faculty of Chemistry, Moscow, 119991, Russia.
  • Oleg Shpigun
    Lomonosov Moscow State University, Faculty of Chemistry, Moscow, 119991, Russia.
  • Andrey Tsitsilin
    All-Russian Research Institute of Medicinal and Aromatic Plants (VILAR), Moscow, 117216, Russia.
  • Mikhail Lavrentyev
    Saratov State University, Department of Botanics and Ecology, Saratov, 410012, Russia.