Utilizing machine learning with knockoff filtering to extract significant metabolites in Crohn's disease with a publicly available untargeted metabolomics dataset.

Journal: PloS one
PMID:

Abstract

Metabolomic data processing pipelines have improved in recent years, allowing more features to be extracted and identified. Machine learning and robust statistical techniques for controlling false discoveries are increasingly being incorporated into metabolomic data analysis. In this paper, we apply one such recently developed technique, aggregate knockoff filtering, to untargeted metabolomic analysis. When applied to a publicly available dataset, aggregate knockoff filtering combined with typical p-value filtering increases the number of significantly changing metabolites by 25% compared with conventional untargeted metabolomic data processing. This method brings features that would not be extracted under standard processing to researchers' attention for further analysis.
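
The abstract names knockoff filtering without describing its mechanics. As a rough illustration only, the sketch below shows the standard knockoff+ selection step for a single knockoff run: given one statistic W_j per feature (large positive values favor the real feature over its knockoff copy), it finds the smallest threshold whose estimated false discovery proportion is at most a target level q. It assumes the W_j have already been computed, it does not perform the aggregation across repeated knockoff draws that the paper refers to, and the function names and example values are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def knockoff_plus_threshold(w, q):
    """Return the smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    Returns np.inf when no threshold meets the target level q.
    """
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of the statistics.
    candidates = np.unique(np.abs(w[w != 0]))  # np.unique returns sorted values
    for t in candidates:
        # Negative statistics beyond -t act as a proxy count of false positives.
        fdp_hat = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_hat <= q:
            return t
    return np.inf

def select_features(w, q):
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    w = np.asarray(w, dtype=float)
    t = knockoff_plus_threshold(w, q)
    return np.flatnonzero(w >= t)

# Hypothetical statistics for ten features, with a target FDR of 0.2.
w_example = [5, 4, 3, 2.5, -1, 2, 0.5, -0.2, 6, 1.5]
selected = select_features(w_example, q=0.2)
```

In a workflow like the one the abstract describes, the selected indices would then be combined with conventional p-value filtering; how that combination is done is not specified here and is left out of the sketch.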

Authors

  • Shoaib Bin Masud
    Department of Electrical and Computer Engineering, Tufts University, Medford, MA, United States of America.
  • Conor Jenkins
    DEVCOM Chemical Biological Center, Aberdeen Proving Ground, Aberdeen, MD, United States of America.
  • Erika Hussey
    DEVCOM Soldier Center, Natick, MA, United States of America.
  • Seth Elkin-Frankston
    DEVCOM Soldier Center, Natick, MA, United States of America.
  • Phillip Mach
    DEVCOM Chemical Biological Center, Aberdeen Proving Ground, Aberdeen, MD, United States of America.
  • Elizabeth Dhummakupt
    DEVCOM Chemical Biological Center, Aberdeen Proving Ground, Aberdeen, MD, United States of America.
  • Shuchin Aeron
    Department of Electrical and Computer Engineering, Tufts University, Medford, MA, United States of America.