Multi-Dimensional Spatiotemporal Attention Neural Network for Next Generation Sequencing Basecalling

Journal: bioRxiv
Published Date:

Abstract

Next-generation sequencing (NGS) remains the most widely used sequencing technique in genomics. Traditional basecalling methods face significant challenges in decoding high-density sequencing data because of inherent noise in the biochemical reactions and instrument limitations. Here, we present AICall, a multi-dimensional deep learning neural network based on a spatiotemporal attention mechanism. The network skips the computationally heavy but less effective steps of peak finding and brightness extraction/correction, and basecalls directly from the time sequence of multi-dimensional image stacks obtained in real time. By introducing an attention mechanism, it effectively extracts key spatial and temporal information, including spatial crosstalk, spectral crosstalk, phasing, base quenching, and intensity decay, and significantly improves basecalling accuracy. We demonstrate that AICall achieves an average error rate of less than 0.01% and provides more reliable sequencing results for downstream analysis.
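
To make the idea of attention over a time sequence of image stacks concrete, the following is a minimal NumPy sketch of generic scaled dot-product self-attention applied across sequencing cycles. It is not the AICall architecture, which the abstract does not specify. The array shapes, the random projection matrices, and the names `temporal_self_attention`, `stack`, and `d_model` are all assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(features, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention over the cycle axis.

    features: (cycles, dim) array, one feature vector per sequencing cycle.
    Returns the attended features and the (cycles, cycles) attention weights.
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights

# Hypothetical input: 8 cycles, 4 colour channels, 3x3 pixel patch around one cluster.
rng = np.random.default_rng(0)
cycles, channels, patch = 8, 4, 3
stack = rng.random((cycles, channels, patch, patch))

# Flatten each cycle's multi-channel patch into one feature vector.
features = stack.reshape(cycles, -1)
d_model = 16  # assumed projection width
w_q, w_k, w_v = (rng.standard_normal((features.shape[1], d_model)) * 0.1
                 for _ in range(3))

context, weights = temporal_self_attention(features, w_q, w_k, w_v)
print(context.shape, weights.shape)
```

In this sketch each cycle's output is a weighted mix of all cycles, which is one way a model could, in principle, account for cycle-to-cycle effects such as phasing or intensity decay; a trained network would learn the projections rather than draw them at random.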

Authors

  • Kuankuan Peng; Wei Chen; Tianran Yao; Huihua Xia; Guoli Fu; Gailing Li; Yuanye Bao; Erkai Liu; Luyang Zhao; Gufeng Wang