Demystifying food flavor: Flavor data interpretation through machine learning.
Journal: Food Chemistry
PMID: 40233512
Abstract
Flavor data obtained from analytical techniques are vast and complex, which increases the difficulty of multi-factorial analysis. This study aims to provide a machine learning (ML)-based framework to interpret flavor data, exploiting four widely used techniques: Principal Component Analysis (PCA), Redundancy Analysis (RDA), Partial Least Squares (PLS), and Random Forest (RF). To demonstrate the potential of these ML techniques, two case studies are discussed, one with semi-quantitative data and the other with quantitative data. Results indicate that PCA is useful for data exploration; RDA can quantify the statistical significance of factors; and combining feature importance results from PLS and RF offers a comprehensive understanding of marker compounds. Regarding classification performance, PLS excels in handling collinear data, whereas RF captures complex patterns if sufficient data are available. However, overfitting is a risk for datasets with small sample sizes. Overall, carefully selecting and integrating these ML techniques could demystify food flavor.
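
As a rough illustration of the exploratory role attributed to PCA above, the following minimal sketch extracts the first principal component from a small synthetic data set using only the Python standard library. Everything in it is an assumption made for illustration and is not taken from the study: the data are randomly generated (two hypothetical sample groups, four hypothetical compounds, two of them collinear), the helper names (`center_columns`, `covariance`, `first_component`, `make_sample`) are invented here, the columns are centered but not scaled, and power iteration is used in place of a full eigendecomposition.

```python
import math
import random


def center_columns(X):
    """Subtract each column mean so the data are mean-centered."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]


def covariance(Xc):
    """Sample covariance matrix of already-centered data."""
    n, p = len(Xc), len(Xc[0])
    return [
        [sum(Xc[i][a] * Xc[i][b] for i in range(n)) / (n - 1) for b in range(p)]
        for a in range(p)
    ]


def first_component(C, iters=500):
    """Approximate the leading eigenvector of C by power iteration."""
    p = len(C)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(
        v[a] * sum(C[a][b] * v[b] for b in range(p)) for a in range(p)
    )
    return v, eigenvalue


def make_sample(shift):
    """Hypothetical sample: compounds 0 and 1 are collinear, 2 and 3 are noise."""
    base = random.gauss(10.0 + shift, 0.5)
    return [
        base,
        2.0 * base + random.gauss(0.0, 0.2),
        random.gauss(5.0, 0.5),
        random.gauss(3.0, 0.5),
    ]


random.seed(0)  # fixed seed so the synthetic data are reproducible
group_a = [make_sample(0.0) for _ in range(6)]
group_b = [make_sample(4.0) for _ in range(6)]

Xc = center_columns(group_a + group_b)
C = covariance(Xc)
loadings, eigenvalue = first_component(C)

# Share of total variance captured by the first component.
explained = eigenvalue / sum(C[j][j] for j in range(len(C)))

# Scores: projection of each centered sample onto the first component.
scores = [sum(row[j] * loadings[j] for j in range(len(row))) for row in Xc]
mean_a = sum(scores[:6]) / 6
mean_b = sum(scores[6:]) / 6

print("PC1 loadings:", [round(x, 3) for x in loadings])
print("PC1 explained variance ratio:", round(explained, 3))
print("Mean PC1 score, group A vs. group B:", round(mean_a, 3), round(mean_b, 3))
```

In this toy setting, the two collinear compounds carry the largest loadings and the group means separate along the first component, which is the kind of pattern an exploratory PCA is meant to surface. Attributing that separation to a factor with statistical confidence, or ranking marker compounds, would call for the other techniques named above.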