Bayesian Forensic DNA Mixture Deconvolution Using a Novel String Similarity Measure
Journal: arXiv
Published Date: May 2, 2025
Abstract
Mixture interpretation is a central challenge in forensic science, where
evidence often contains contributions from multiple sources. In the context of
DNA analysis, biological samples recovered from crime scenes may include
genetic material from several individuals, necessitating robust statistical
tools to assess whether a specific person of interest (POI) is among the
contributors. Methods based on capillary electrophoresis (CE) are currently in
use worldwide, but offer limited resolution in complex mixtures. Advancements
in massively parallel sequencing (MPS) technologies provide a richer, more
detailed representation of DNA mixtures, but require new analytical strategies
to fully leverage this information. In this work, we present a Bayesian
framework for evaluating whether a POI's DNA is present in an MPS-based forensic
sample. The model accommodates known contributors, such as the victim, and uses
a novel string edit distance to quantify similarity between observed alleles
and sequencing artifacts. The resulting Bayes factors enable effective
discrimination between samples that do and do not contain the POI's DNA,
demonstrating strong performance in both hypothesis testing and classification
settings.
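
To make the two ingredients named above concrete, the following is a minimal sketch and not the method of this work. The abstract does not define the novel string edit distance, so the classic unit-cost Levenshtein distance is used here purely as a stand-in. The function names `levenshtein` and `log10_bayes_factor`, and the assumption that the log marginal likelihoods under the two hypotheses (POI is a contributor versus POI is not) are already available, are hypothetical and chosen for illustration only.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Classic unit-cost edit distance (insert, delete, substitute).

    Stand-in only: the novel distance described in the abstract is not
    specified there and may weight edits differently.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def log10_bayes_factor(log_ml_h1: float, log_ml_h2: float) -> float:
    """log10 of the Bayes factor, given natural-log marginal likelihoods.

    Assumes (hypothetically) that H1 = "POI is a contributor" and
    H2 = "POI is not a contributor"; positive values favor H1.
    """
    return (log_ml_h1 - log_ml_h2) / math.log(10)
```

In a sketch like this, a small edit distance between an observed sequence and a candidate parent allele would be consistent with the observation being a sequencing artifact of that allele, and the Bayes factor would summarize how strongly the data favor one hypothesis over the other. How the two are actually combined in the model is not stated in the abstract.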