Steroid identification via deep learning retention time predictions and two-dimensional gas chromatography-high resolution mass spectrometry.

Journal: Journal of chromatography. A
PMID:

Abstract

Untargeted steroid identification remains a major analytical challenge because of the chemical similarity of the analytes, even with sophisticated technology such as two-dimensional gas chromatography coupled to high-resolution mass spectrometry (GC × GC-HRMS). Moreover, compound annotation is cumbersome when analytical standards and mass spectral and retention index databases are not available. Hence, retention time prediction models are needed in order to explore new annotation approaches. In this work, we evaluated several in silico methods for retention time prediction in multidimensional gas chromatography. We used three classical machine learning (CML) algorithms (Partial Least Squares (PLS), Support Vector Regression (SVR) and Random Forest Regression (RFR)) and two deep learning approaches (a dense neural network (DNN) and a three-dimensional convolutional neural network (CNN)). Whereas molecular descriptors were used for the CML and DNN algorithms, a three-dimensional molecular representation based on the electrostatic potential (ESP) was used directly as input for the CNN. All the developed models showed similar performance, with Q² values over 0.9. However, the CNN performed best among them, with average retention time prediction errors of 2% and 6% for the first and second separation dimension, respectively. Additionally, only the three-dimensional ESP representation coupled with the CNN was able to extract the stereochemical information crucial for the separation of diastereomers. The combination of retention time prediction and high-resolution mass spectral data, applied to clinical samples, enabled the untargeted annotation of 12 steroid metabolites in the urine of newborns.
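
The two kinds of figures quoted above (a cross-validated goodness-of-prediction value and an average percentage retention time error) can be illustrated with a minimal sketch. It assumes one common definition of the cross-validated statistic, 1 - PRESS/TSS, which may differ from the exact formula used in the paper. The function names and the retention times are hypothetical and serve only as an illustration; they are not data from the study.

```python
from statistics import mean


def q_squared(observed, cv_predicted):
    """Cross-validated goodness of prediction, assumed here as 1 - PRESS / TSS.

    `cv_predicted` must hold predictions made for each sample while that
    sample was left out of model training (e.g. k-fold cross-validation).
    """
    y_bar = mean(observed)
    # PRESS: predictive residual sum of squares
    press = sum((y - p) ** 2 for y, p in zip(observed, cv_predicted))
    # TSS: total sum of squares around the mean of the observed values
    tss = sum((y - y_bar) ** 2 for y in observed)
    return 1.0 - press / tss


def mean_relative_error_percent(observed, predicted):
    """Average absolute prediction error, as a percentage of the observed value."""
    return 100.0 * mean(abs(y - p) / y for y, p in zip(observed, predicted))


# Hypothetical first-dimension retention times in seconds (illustrative only).
observed = [1000.0, 1200.0, 1500.0, 2000.0]
cv_predicted = [1020.0, 1180.0, 1530.0, 1960.0]

print(f"Q2 = {q_squared(observed, cv_predicted):.3f}")
print(f"mean relative error = {mean_relative_error_percent(observed, cv_predicted):.2f}%")
```

In a two-dimensional separation the same two functions would be applied once per dimension, giving one pair of values for the first dimension and one for the second.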

Authors

  • Giuseppe Marco Randazzo
    Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA), Scuola Universitaria Professionale della Svizzera italiana (SUPSI), Università della Svizzera italiana (USI), CH-6928 Manno, Switzerland. Electronic address: gm.randazzo@idsia.ch.
  • Andrea Bileck
    Department of Nephrology and Hypertension and Department of BioMedical Research, Inselspital, Bern University Hospital, University of Bern, Switzerland.
  • Andrea Danani
    Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA), Scuola Universitaria Professionale della Svizzera italiana (SUPSI), Università della Svizzera italiana (USI), CH-6928 Manno, Switzerland.
  • Bruno Vogt
    Department of Nephrology and Hypertension and Department of BioMedical Research, Inselspital, Bern University Hospital, University of Bern, Switzerland.
  • Michael Groessl
    Department of Nephrology and Hypertension and Department of BioMedical Research, Inselspital, Bern University Hospital, University of Bern, Switzerland. Electronic address: michael.groessl@dbmr.unibe.ch.