Assessing the reliability of point mutation as data augmentation for deep learning with genomic data.

Journal: BMC Bioinformatics
PMID:

Abstract

BACKGROUND: Deep neural networks (DNNs) have the potential to revolutionize our understanding and treatment of genetic diseases. An inherent limitation of deep neural networks, however, is their high demand for data during training. To overcome this challenge, other fields, such as computer vision, use various data augmentation techniques to artificially increase the available training data for DNNs. Unfortunately, most data augmentation techniques used in other domains do not transfer well to genomic data.

Authors

  • Hyunjung Lee
    Korea University, Seoul, South Korea.
  • Utku Ozbulak
    Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium.
  • Homin Park
    Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, South Korea.
  • Stephen Depuydt
    Lab of Plant Growth Analysis, Ghent University Global Campus, Incheon, South Korea.
  • Wesley De Neve
    Center for Biotech Data Science, Department of Environmental Technology, Food Technology and Molecular Biotechnology, Ghent University Global Campus, Songdo, Incheon, South Korea.
  • Joris Vankerschaver
    Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium.