MitoScape: A big-data, machine-learning platform for obtaining mitochondrial DNA from next-generation sequencing data.

Journal: PLoS computational biology
PMID:

Abstract

The growing volume of next-generation sequencing (NGS) data presents a unique opportunity to study the combined impact of mitochondrial and nuclear-encoded genetic variation in complex disease. Mitochondrial DNA (mtDNA) variants, and in particular heteroplasmic variants, are critical for determining human disease severity. While there are approaches for obtaining mitochondrial DNA variants from NGS data, these tools do not account for the unique characteristics of mitochondrial genetics and can be inaccurate even for homoplasmic variants. We introduce MitoScape, a novel big-data software platform for extracting mitochondrial DNA sequences from NGS data. MitoScape departs from other algorithms by using machine learning to model the unique characteristics of mitochondrial genetics. We also employ a novel approach of using rho-zero (mitochondrial DNA-depleted) data to model nuclear-encoded mitochondrial sequences. We show that MitoScape produces accurate heteroplasmy estimates using gold-standard mitochondrial DNA data. We provide a comprehensive comparison of the most common tools for obtaining mtDNA variants from NGS data and show that MitoScape outperformed the compared tools in every statistical category we evaluated, including false positives and false negatives. By applying MitoScape to common disease examples, we illustrate how it facilitates important heteroplasmy-disease association discoveries, expanding upon a reported association between hypertrophic cardiomyopathy and mitochondrial haplogroup T in men (adjusted p-value = 0.003). The improved accuracy of mitochondrial DNA variants produced by MitoScape will be instrumental in diagnosing disease in the context of personalized medicine and clinical diagnostics.

Authors

  • Larry N Singh
    Center for Mitochondrial and Epigenomic Medicine, Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.
  • Brian Ennis
    Center for Data-Driven Discovery in Biomedicine (D3b), The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.
  • Bryn Loneragan
    Center for Eye Research Australia, Ophthalmology, Department of Surgery, University of Melbourne, Melbourne, Australia.
  • Noah L Tsao
    Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
  • M Isabel G Lopez Sanchez
    Center for Eye Research Australia, Ophthalmology, Department of Surgery, University of Melbourne, Melbourne, Australia.
  • Jianping Li
    College of Chemistry and Bioengineering, Guilin University of Technology, Guilin, 541004, China.
  • Patrick Acheampong
    Center for Mitochondrial and Epigenomic Medicine, Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.
  • Oanh Tran
    22q and You Center, Division of Human Genetics, The Children's Hospital of Philadelphia and the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
  • Ian A Trounce
    Center for Eye Research Australia, Ophthalmology, Department of Surgery, University of Melbourne, Melbourne, Australia.
  • Yuankun Zhu
    Center for Data-Driven Discovery in Biomedicine (D3b), The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.
  • Prasanth Potluri
    Center for Mitochondrial and Epigenomic Medicine, Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.
  • Beverly S Emanuel
    22q and You Center, Division of Human Genetics, The Children's Hospital of Philadelphia and the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
  • Daniel J Rader
    The Familial Hypercholesterolemia Foundation, Pasadena, CA, USA; Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA; Department of Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA; Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA.
  • Zoltan Arany
    Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
  • Scott M Damrauer
    Department of Surgery, Perelman School of Medicine at University of Pennsylvania, Philadelphia, PA, USA. damrauer@upenn.edu.
  • Adam C Resnick
    Center for Data-Driven Discovery in Biomedicine (D3b), The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.
  • Stewart A Anderson
    Department of Psychiatry, The Children's Hospital of Philadelphia and the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
  • Douglas C Wallace
    Center for Mitochondrial and Epigenomic Medicine, Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.