MAMSI: Integration of Multiassay Liquid Chromatography-Mass Spectrometry Metabolomics Data Using Multiview Machine Learning.
Journal: Analytical Chemistry
Published Date: Jul 10, 2025
Abstract
Liquid chromatography-mass spectrometry (LC-MS) is a commonly used analytical technique in untargeted metabolomics. However, the diverse chemical and physical properties of metabolites often require the use of several different analytical assays for broad metabolome coverage. Conventionally, each assay is analyzed separately, but this fails to capture interassay relationships, making multiassay biomarker discovery and data interpretation difficult. Here we propose a workflow to integrate multiassay metabolomics data, designed to enable biomarker discovery and elucidation of unknown metabolites. We employ a multiblock partial least-squares (MB-PLS) model coupled with multiblock variable importance in projection to estimate the importance of predictors to the outcome variable. We then cluster the selected predictors and compare the clusters to groups defined by structural properties based on retention time and mass-to-charge ratio. To demonstrate and evaluate the approach, we used three multiassay data sets with biological sex, Alzheimer's disease status, and blood bilirubin levels as the outcomes of interest. The MB-PLS models outperformed single-assay models in both classification and regression tasks, indicating that modeling interblock relationships improved the estimate of phenotypic outcome. Additionally, the MB-PLS models provided insight into each data block's contribution to the predicted outcome. Our workflow enabled us to determine a set of potential cross-assay biomarkers. Following putative annotation, the majority of these biomarkers and their signs of association agreed with results previously reported in the literature. Our workflow has the potential to benefit the metabolomics community and beyond, as it offers interpretable integrative analysis of multiassay LC-MS data and facilitates discovery of potential biomarkers.
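The importance-estimation step of the workflow can be illustrated with a small, self-contained sketch. It is not the authors' implementation and does not use the MAMSI package. It assumes one common way of building a multiblock PLS model: each assay block is autoscaled, weighted by 1/sqrt(number of features), and concatenated before an ordinary single-response NIPALS PLS fit. The function names, the synthetic data, the VIP > 1 selection threshold, and the per-block importance share are all assumptions made for illustration only.

```python
import math
import random


def autoscale(column):
    """Mean-center a feature column and scale it to unit variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def block_scale_and_concat(blocks):
    """Autoscale every feature, weight each block by 1/sqrt(n_features),
    and concatenate. Each block is a list of feature columns (assumed layout)."""
    columns, block_ids = [], []
    for b, block in enumerate(blocks):
        weight = 1.0 / math.sqrt(len(block))
        for col in block:
            columns.append([weight * v for v in autoscale(col)])
            block_ids.append(b)
    return columns, block_ids


def pls1_vip(columns, y, n_components=2):
    """Fit NIPALS PLS1 on column-major X and return one VIP score per feature."""
    n, p = len(y), len(columns)
    y_mean = sum(y) / n
    y = [v - y_mean for v in y]
    X = [col[:] for col in columns]
    vip_num = [0.0] * p
    ss_total = 0.0
    for _ in range(n_components):
        # Weight vector: covariance of each feature with y, normalized to length 1.
        w = [sum(xi * yi for xi, yi in zip(col, y)) for col in X]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores, y-loading, and the y sum of squares explained by this component.
        t = [sum(w[j] * X[j][i] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        q = sum(ti * yi for ti, yi in zip(t, y)) / tt
        ss = q * q * tt
        for j in range(p):
            vip_num[j] += ss * w[j] ** 2
            # Deflate X by removing the part explained by the scores.
            load = sum(ti * xi for ti, xi in zip(t, X[j])) / tt
            X[j] = [xi - ti * load for xi, ti in zip(X[j], t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        ss_total += ss
    return [math.sqrt(p * v / ss_total) for v in vip_num]


# Synthetic stand-in for two assays: the outcome depends on one feature in each.
random.seed(0)
n = 60
block_a = [[random.gauss(0, 1) for _ in range(n)] for _ in range(5)]
block_b = [[random.gauss(0, 1) for _ in range(n)] for _ in range(8)]
y = [block_a[0][i] + 0.8 * block_b[0][i] + random.gauss(0, 0.3) for i in range(n)]

columns, block_ids = block_scale_and_concat([block_a, block_b])
vip = pls1_vip(columns, y)

# Assumed selection rule: keep features with VIP above 1.
selected = [j for j, v in enumerate(vip) if v > 1.0]

# Assumed block-level summary: each block's share of the total squared VIP.
total = sum(v * v for v in vip)
block_share = [
    sum(v * v for v, b in zip(vip, block_ids) if b == k) / total for k in (0, 1)
]
print(selected, block_share)
```

Because the weight vectors are normalized, the squared VIP scores sum to the number of features, so a score above 1 marks a feature that contributes more than average. In a full workflow the selected features would then be clustered and compared with retention-time and mass-to-charge groupings, which this sketch does not attempt.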