Standardizing a microbiome pipeline for body fluid identification from complex crime scene stains.
Journal:
Applied and Environmental Microbiology
Published Date:
Apr 30, 2025
Abstract
Recent advances in next-generation sequencing have opened up new possibilities for applying the human microbiome in various fields, including forensics. Researchers have capitalized on the site-specific microbial communities found in different parts of the body to identify body fluids from biological evidence. Despite promising results, microbiome-based methods have not been integrated into forensic practice due to the lack of standardized protocols and systematic testing of methods on forensically relevant samples. Our study addresses critical decisions in establishing these protocols, focusing on bioinformatics choices and the use of machine learning to present microbiome results in case reports for forensically relevant and challenging samples. In our study, we propose using operational taxonomic units (OTUs) for read data processing and generating heterogeneous training data sets for training a random forest classifier. We incorporated six forensically relevant classes: saliva, semen, skin from hand, penile skin, urine, and vaginal/menstrual fluid, and our classifier achieved a high weighted average F1 score of 0.89. Systematic testing on mock forensic samples, including mixed-source samples and underwear, revealed reliable detection of at least one component of the mixture and the identification of vaginal fluid from underwear substrates. Additionally, when investigating the sexually shared microbiome (sexome) of heterosexual couples, our classifier could potentially infer the nature of sexual activity. We therefore highlight the value of the sexome for assessing the nature of sexual activities in forensic investigations while delineating areas that warrant further research.
Importance
Microbiome-based analyses combined with machine learning offer potential avenues for use in forensic science and other applied fields, yet standardized protocols remain lacking. Moreover, machine learning classifiers have shown promise for predicting body sites in forensics, but they have not been systematically evaluated on complex mixed-source samples. Our study addresses key decisions for establishing standardized protocols and, to our knowledge, is the first to report classification results from uncontrolled mixed-source samples, including sexome (sexually shared microbiome) samples. In our study, we explore both the strengths and limitations of classifying the mixed-source samples while also providing options for tackling the limitations.