An initial exploration of machine learning for establishing associations between genetic markers and THC levels in Cannabis sativa samples.

Journal: Forensic science international. Genetics
PMID:

Abstract

Cannabis sativa, a globally commercialized plant used for medicine, food, fiber production, and recreation, requires effective identification so that legal and illegal varieties can be distinguished in forensic contexts. This research uses multivariate statistical models and machine learning approaches to establish correlations between specific genotypes and Δ9-tetrahydrocannabinol (Δ9-THC) content (%) in C. sativa samples. A total of 132 cannabis leaf samples were obtained from legal growers in Piedmont, Italy, and from illegal drug seizures in Turin. Samples were genetically profiled using a 13-locus STR multiplex, and their Δ9-THC content was determined by quantitative GC-MS analysis. This study assesses the use of supervised classification modelling on genetic data to sort cannabis samples into legal and illegal categories; the analysis revealed distinct clusters characterized by unique allele profiles and THC content. t-distributed Stochastic Neighbor Embedding (t-SNE), Random Forest (RF), and Partial Least Squares Regression (PLS-R) were used for the machine learning modelling. All the tested models proved effective at discriminating between legal and illegal samples. Although further validation is necessary, this study presents a novel forensic investigative approach, potentially aiding law enforcement in investigating significant marijuana seizures or in tracking illicit drug trafficking routes.
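
The abstract does not say how the STR genotypes were turned into numeric input for t-SNE, RF, or PLS-R. As a minimal sketch of one common option, and not necessarily the authors' method, the example below encodes each sample as allele counts (0, 1, or 2) per observed locus/allele pair. The sample names, locus names, and allele values are invented for illustration, and the helper names `build_columns` and `encode` are hypothetical.

```python
from collections import Counter

# Hypothetical toy genotypes: sample -> {locus: (allele 1, allele 2)}.
# Locus names and repeat numbers are invented; the study used 13 STR loci.
samples = {
    "S1": {"locusA": (5, 7), "locusB": (12, 12)},
    "S2": {"locusA": (7, 7), "locusB": (11, 12)},
    "S3": {"locusA": (5, 5), "locusB": (11, 11)},
}


def build_columns(samples):
    """Return a sorted list of every (locus, allele) pair seen in any sample."""
    seen = set()
    for genotype in samples.values():
        for locus, alleles in genotype.items():
            for allele in alleles:
                seen.add((locus, allele))
    return sorted(seen)


def encode(genotype, columns):
    """Encode one genotype as allele counts (0, 1, or 2) per column."""
    counts = Counter()
    for locus, alleles in genotype.items():
        for allele in alleles:
            counts[(locus, allele)] += 1
    # Counter returns 0 for alleles this sample does not carry.
    return [counts[col] for col in columns]


columns = build_columns(samples)
matrix = {name: encode(genotype, columns) for name, genotype in samples.items()}

for name, row in matrix.items():
    print(name, row)
```

Each row of such a matrix has one entry per observed allele, so homozygous genotypes appear as a 2 and heterozygous genotypes as two 1s. A matrix of this shape could then be passed, together with legal/illegal labels or measured THC percentages, to a classifier or regression model.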

Authors

  • Selena Cisana
    Centro Regionale Antidoping e di Tossicologia "A. Bertinaria", Regione Gonzole 10/1, Orbassano, Torino 10043, Italy. Electronic address: selena.cisana@antidoping.piemonte.it.
  • Michele Di Nunzio
    Forensic Genetics Laboratory - Legal Medicine Unit Department of Medicine, University of Barcelona, Spain. Electronic address: michele.dinunzio@ub.edu.
  • Valentina Brenzini
    Department of Biology, University of Florence, Italy.
  • Monica Omedei
    Centro Regionale Antidoping e di Tossicologia "A. Bertinaria", Regione Gonzole 10/1, Orbassano, Torino 10043, Italy.
  • Fabrizio Seganti
    Centro Regionale Antidoping e di Tossicologia "A. Bertinaria", Regione Gonzole 10/1, Orbassano, Torino 10043, Italy.
  • Christina Ververi
    Centro Regionale Antidoping e di Tossicologia "A. Bertinaria", Regione Gonzole 10/1, Orbassano, Torino 10043, Italy; Department of Chemistry, University of Torino, Italy.
  • Enrico Gerace
    Centro Regionale Antidoping e di Tossicologia "A. Bertinaria", Regione Gonzole 10/1, Orbassano, Torino 10043, Italy.
  • Alberto Salomone
    Centro Regionale Antidoping e di Tossicologia "A. Bertinaria", Regione Gonzole 10/1, Orbassano, Torino 10043, Italy; Department of Chemistry, University of Torino, Italy.
  • Andrea Berti
    Reparto CC Investigazioni Scientifiche di Cagliari, Italy.
  • Filippo Barni
    Reparto CC Investigazioni Scientifiche di Roma, Italy.
  • Sergio Schiavone
    Reparto CC Investigazioni Scientifiche di Roma, Italy.
  • Andrea Coppi
    Forensic Genetics Laboratory - Legal Medicine Unit Department of Medicine, University of Barcelona, Spain.
  • Ciro Di Nunzio
    University Magna Graecia of Catanzaro, Catanzaro, Italy. Electronic address: dinunzio@unicz.it.
  • Paolo Garofano
    Centro Regionale Antidoping e di Tossicologia "A. Bertinaria", Regione Gonzole 10/1, Orbassano, Torino 10043, Italy.
  • Eugenio Alladio
    Department of Chemistry, University of Turin, Turin, Italy.