MAMnet: detecting and genotyping deletions and insertions based on long reads and a deep learning approach.
Journal:
Briefings in bioinformatics
PMID:
35580841
Abstract
Structural variations (SVs) play important roles in human genetic diversity; deletions and insertions are two common types of SVs that have been shown to be associated with genetic diseases. Hence, accurately detecting and genotyping SVs is important for disease research. Although long-read sequencing technologies have advanced SV detection and genotyping, several challenges still prevent satisfactory results. In this paper, we propose MAMnet, a fast and scalable SV detection and genotyping method based on long reads and a combination of a convolutional neural network and a long short-term memory network. MAMnet uses a deep neural network to implement sensitive SV detection with a novel prediction strategy. On real long-read sequencing datasets, we demonstrate that MAMnet outperforms Sniffles, SVIM, cuteSV and PBSV in terms of F1 score while achieving better scaling performance. The source code is available from https://github.com/micahvista/MAMnet.
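For readers unfamiliar with the metric used in the comparison, the F1 score is the harmonic mean of precision and recall. The sketch below shows the standard computation from counts of true-positive, false-positive and false-negative calls. The function name `f1_score` and the example counts are hypothetical and are not taken from the paper or from the MAMnet code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score from call counts (illustrative helper, not MAMnet code).

    tp: calls that match a truth-set variant
    fp: calls with no matching truth-set variant
    fn: truth-set variants with no matching call
    """
    if tp == 0:
        # No correct calls: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = f1_score(tp=90, fp=10, fn=30)
```

With these made-up counts, precision is 0.9 and recall is 0.75, so the F1 score is 9/11, roughly 0.82.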