Assessing the Impact of Measurement Precision on Metabolite Identification Probability in Multidimensional Mass Spectrometry-Based, Reference-Free Metabolomics.
Journal:
Analytical Chemistry
Published Date:
Jul 8, 2025
Abstract
Identification of compounds with minimal ambiguity remains a central challenge in mass spectrometry-based metabolomics. Conventional compound identification relies on comparing analytical signatures (e.g., mass-to-charge ratio, collision cross section, tandem mass spectra) against reference data obtained from measurements of authentic chemical standards. The breadth of compounds that can be annotated using this approach is necessarily limited by the availability of authentic standards, analytical throughput, and the resolving power of the separations that underlie the measurements. The maturation of computational methods, both theory-driven and artificial intelligence/machine learning-based, for prediction of various molecular properties relevant to multidimensional mass spectrometry measurements has opened the door to a new "reference-free" paradigm of compound annotation. By augmenting existing reference data for molecular properties with computational predictions, the universe of identifiable chemical species can be expanded significantly beyond its current limits. An unexplored aspect of this novel approach is how to gauge confidence in the resulting annotations, especially as the compound search space is expanded. Intuitively, the confidence of a compound annotation is related to the inherent discriminatory power of the molecular properties used for identification, as well as the precision with which the properties are measured or predicted. In this work, we characterize this relationship between measurement precision and identification probability in a systematic and quantitative fashion for a defined region of chemical space that includes organic small-molecule metabolites. Importantly, this work establishes a framework for conducting metabolite identification probability analysis that enables others to quantify this relationship for their own compounds and properties of interest.
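The intuition that identification probability depends on both discriminatory power and precision can be illustrated with a minimal sketch. It is not the authors' framework: the function name `identification_probability`, the "one over the number of candidates inside the tolerance window" definition, the relative tolerances, and the toy (m/z, collision cross section) values are all assumptions made for illustration.

```python
def identification_probability(candidates, tolerances):
    """Mean probability of picking the correct compound at random.

    ASSUMPTION (illustrative, not from the paper): each candidate is taken
    in turn as the "true" compound, and every candidate whose properties all
    fall within a relative tolerance of the true values counts as an
    indistinguishable match. The probability for that compound is then
    1 / (number of matches).
    """
    probabilities = []
    for query in candidates:
        matches = 0
        for candidate in candidates:
            within_all = all(
                abs(c - q) <= tol * abs(q)
                for c, q, tol in zip(candidate, query, tolerances)
            )
            if within_all:
                matches += 1
        # The query always matches itself, so matches >= 1.
        probabilities.append(1.0 / matches)
    return sum(probabilities) / len(probabilities)


# Hypothetical toy search space: (m/z, collision cross section in square angstroms).
toy_candidates = [
    (180.0634, 135.2),
    (180.0634, 139.8),  # same m/z as the first entry, different CCS
    (181.0707, 136.0),
    (255.2330, 165.4),
]

# Same mass tolerance (10 ppm), two different CCS precisions (5% vs. 1%).
loose = identification_probability(toy_candidates, (10e-6, 0.05))
tight = identification_probability(toy_candidates, (10e-6, 0.01))
print(loose)  # 0.75
print(tight)  # 1.0
```

In this toy case the two compounds with identical m/z cannot be told apart at 5% CCS precision but become distinguishable at 1%, so the mean identification probability rises from 0.75 to 1.0. A real analysis would use a much larger search space and whichever probability definition suits the properties of interest.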