TADA: a machine learning tool for functional annotation-based prioritisation of pathogenic CNVs.

Journal: Genome Biology
PMID:

Abstract

Few methods have been developed to investigate copy number variants (CNVs) based on their predicted pathogenicity. We introduce TADA, a method to prioritise pathogenic CNVs through assisted manual filtering and automated classification, based on an extensive catalogue of functional annotation supported by rigorous enrichment analysis. We demonstrate that our classifiers are able to accurately predict pathogenic CNVs, outperforming current alternative methods, and produce a well-calibrated pathogenicity score. Our results suggest that functional annotation-based prioritisation of pathogenic CNVs is a promising approach to support clinical diagnostics and to further the understanding of mechanisms controlling the disease impact of larger genomic alterations.

Authors

  • Jakob Hertzberg
    Max Planck Institute for Molecular Genetics, Ihnestraße 63, Berlin, 14195, Germany. hertzber@molgen.mpg.de.
  • Stefan Mundlos
    Institute of Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany.
  • Martin Vingron
    Max Planck Institute for Molecular Genetics, Ihnestraße 63, Berlin, 14195, Germany.
  • Giuseppe Gallone
    Max Planck Institute for Molecular Genetics, Ihnestraße 63, Berlin, 14195, Germany.