Filtering of artificial chimeric reads generated by ligation preparation method of nanopore sequencing.

Journal: iScience
Published Date:

Abstract

Nanopore sequencing provides long and ultra-long reads that are valuable for structural variation (SV) detection and genome assembly. However, false-positive chimeric reads can arise and interfere with somatic SV calling. Here, we show that ligation-based library preparation generates false-positive chimeric reads, particularly inverted repeats, in both microbial and human DNA standards. The proportion of inverted repeats ranged from 0.18% to 7.33%, exceeding the proportions observed with rapid and modified rapid preparations. Analysis of raw electrical signals revealed a characteristic smoothed segment at the junction sites of nearly half of these chimeric reads. Based on these features, we developed a ResNet-based deep learning classifier to identify false-positive chimeric reads. The model achieved high accuracy on both human and microbial datasets, and filtering with it substantially reduced SV calling errors. These results demonstrate that library preparation-induced chimeric reads can be effectively detected and mitigated, improving the reliability of nanopore-based SV analysis.
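
To make the notion of an inverted-repeat chimeric read concrete, the following is a minimal sketch of an alignment-based heuristic. It is not the classifier described in the abstract, which is a ResNet operating on raw electrical signals. The `Segment` record, the function names, and the `min_overlap_frac` threshold are all hypothetical and chosen only for illustration: a read is flagged when two of its aligned segments map to opposite strands of overlapping reference coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a read (hypothetical, simplified record)."""
    chrom: str
    ref_start: int  # 0-based, inclusive
    ref_end: int    # 0-based, exclusive
    strand: str     # "+" or "-"


def ref_overlap(a: Segment, b: Segment) -> int:
    """Number of reference bases covered by both segments."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.ref_end, b.ref_end) - max(a.ref_start, b.ref_start))


def looks_like_inverted_repeat(segments, min_overlap_frac=0.5):
    """Flag a read whose segments hit the same locus on opposite strands.

    min_overlap_frac is an assumed threshold: the overlap must cover at
    least this fraction of the shorter of the two segments.
    """
    for i, a in enumerate(segments):
        for b in segments[i + 1:]:
            if a.strand == b.strand:
                continue
            shorter = min(a.ref_end - a.ref_start, b.ref_end - b.ref_start)
            if shorter > 0 and ref_overlap(a, b) / shorter >= min_overlap_frac:
                return True
    return False


def inverted_repeat_fraction(reads):
    """Fraction of reads flagged as inverted-repeat candidates."""
    if not reads:
        return 0.0
    flagged = sum(looks_like_inverted_repeat(segs) for segs in reads.values())
    return flagged / len(reads)


# Toy data, invented for illustration only.
example_reads = {
    "r1": [Segment("chr1", 100, 1100, "+"), Segment("chr1", 150, 1050, "-")],
    "r2": [Segment("chr1", 100, 1100, "+")],
    "r3": [Segment("chr1", 100, 600, "+"), Segment("chr2", 100, 600, "-")],
    "r4": [Segment("chr1", 0, 1000, "+"), Segment("chr1", 5000, 6000, "-")],
}
example_fraction = inverted_repeat_fraction(example_reads)
```

In this toy input only `r1` is flagged, since its two segments cover the same region of `chr1` on opposite strands. A heuristic of this kind cannot by itself separate library-preparation artifacts from genuine inversions, which is the distinction the signal-based classifier is meant to address.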

Authors
