Machine learning and chemometric methods for high-throughput authentication of 53 Root and Rhizome Chinese Herbal using ATR-FTIR fingerprints.
Journal:
Journal of chromatography. B, Analytical technologies in the biomedical and life sciences
Published Date:
May 3, 2025
Abstract
To address the identification challenges caused by morphological similarities among Root and Rhizome Chinese Herbal (RRCH) materials, this study developed a discrimination system integrating Attenuated Total Reflectance Fourier Transform Infrared Spectroscopy (ATR-FTIR) with multimodal machine learning. Fifty-three kinds of RRCH collected from China were analyzed by ATR-FTIR to acquire spectral fingerprints. An analytical framework was established that combines chemometric Partial Least Squares Discriminant Analysis (PLS-DA) with optimized machine learning models: t-distributed Stochastic Neighbor Embedding (t-SNE), optimized decision trees, optimized discriminant analysis, naive Bayes, optimized support vector machines (SVM), optimized k-nearest neighbors (KNN), SVM kernels, and optimized ensemble learning. Multivariate analysis revealed distinct spatial distribution patterns of chemical characteristics among the 53 RRCH species. t-SNE projections showed significant cluster separation in two-dimensional feature space, confirming strong correlations between spectral fingerprints and phytochemical compositions. The SVM model outperformed the others, achieving 100% classification accuracy on both the training and validation sets, with a markedly shorter identification time than PLS-DA. This hybrid ATR-FTIR and machine learning system enables high-throughput authentication of RRCH and establishes a scalable technical framework for herbal quality standardization. The methodology provides critical insights into chemical marker discovery by mapping relationships between vibrational spectra and features, advancing intelligent discrimination of botanically similar medicinal materials.
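The general idea of classifying samples from their spectral fingerprints can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract gives no data, preprocessing, or hyperparameters, so everything here is an assumption made for illustration. The spectra are synthetic sums of Gaussian bands, the three class names and band positions are invented, and a plain k-nearest-neighbors vote stands in for the optimized models listed above.

```python
import math
import random


def synthetic_spectrum(band_centers, rng, n_points=200, noise=0.02):
    """Build a toy absorbance curve: a sum of Gaussian bands plus noise."""
    spectrum = []
    for i in range(n_points):
        value = sum(math.exp(-((i - c) ** 2) / (2 * 6.0 ** 2)) for c in band_centers)
        spectrum.append(value + rng.gauss(0.0, noise))
    return spectrum


def euclidean(a, b):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k nearest training spectra."""
    pairs = sorted(zip(train_X, train_y), key=lambda p: euclidean(p[0], query))
    votes = {}
    for _, label in pairs[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def run_demo(seed=0):
    """Train on 7 toy spectra per class and score the remaining 3 per class."""
    rng = random.Random(seed)
    # Hypothetical band positions for three made-up classes (not real herbs).
    classes = {
        "herb_A": [40, 120],
        "herb_B": [70, 150],
        "herb_C": [30, 100, 170],
    }
    train_X, train_y, test_X, test_y = [], [], [], []
    for label, centers in classes.items():
        for j in range(10):
            spectrum = synthetic_spectrum(centers, rng)
            if j < 7:
                train_X.append(spectrum)
                train_y.append(label)
            else:
                test_X.append(spectrum)
                test_y.append(label)
    predictions = [knn_predict(train_X, train_y, s) for s in test_X]
    correct = sum(p == t for p, t in zip(predictions, test_y))
    return correct / len(test_y)


if __name__ == "__main__":
    print(f"hold-out accuracy on toy data: {run_demo():.2f}")
```

Real ATR-FTIR spectra would normally need baseline correction and normalization before any such comparison, and a result on well-separated toy classes says nothing about performance on the 53 RRCH kinds studied here.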