Unraveling city-specific signature and identifying sample origin locations for the data from CAMDA MetaSUB challenge.

Journal: Biology Direct
Published Date:

Abstract

BACKGROUND: The composition of microbial communities can be location-specific, and the differing abundance of taxa at each location could help us unravel city-specific signatures and accurately predict the locations from which samples originate. In this study, whole genome shotgun (WGS) metagenomics data from samples across 16 cities around the world, and from samples from another 8 cities, were provided as the main and mystery datasets, respectively, as part of the CAMDA 2019 MetaSUB "Forensic Challenge". Feature selection, normalization, three machine learning methods, PCoA (Principal Coordinates Analysis) and ANCOM (Analysis of Composition of Microbiomes) were applied to both the main and mystery datasets.

Authors

  • Runzhi Zhang
    Department of Mechanical Engineering, The University of Hong Kong, Hong Kong SAR, China.
  • Alejandro R Walker
    Department of Biostatistics, University of Florida, 2004 Mowry Rd, Gainesville, FL, 32610, USA.
  • Susmita Datta
    Department of Biostatistics, University of Florida, 2004 Mowry Rd, Gainesville, FL, 32610, USA. susmita.datta@ufl.edu.