Transformer-based deep learning for accurate detection of multiple base modifications using single molecule real-time sequencing.
Journal: Communications biology
PMID: 40229481
Abstract
We previously reported a convolutional neural network (CNN)-based approach, called the holistic kinetic model (HK model 1), for detecting 5-methylcytosine (5mC) by single molecule real-time sequencing (Pacific Biosciences). In this study, we constructed a hybrid model with CNN and transformer layers, named HK model 2. HK model 2 improves the area under the receiver operating characteristic curve (AUC) for 5mC detection from 0.91 for HK model 1 to 0.99. We further demonstrate that HK model 2 can detect other types of base modifications, such as 5-hydroxymethylcytosine (5hmC) and N6-methyladenine (6mA). Using HK model 2 to analyze 5mC patterns of cell-free DNA (cfDNA) molecules, we demonstrate enhanced detection of patients with hepatocellular carcinoma, with an AUC of 0.97. Moreover, HK model 2-based detection of 6mA enables the detection of jagged ends of cfDNA and the delineation of cellular chromatin structures. HK model 2 is thus a versatile tool that expands the applications of single molecule real-time sequencing in liquid biopsies.
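
For readers less familiar with the AUC values quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from per-site modification scores. It is not the authors' evaluation code: the function name `roc_auc`, the labels, and the scores are all hypothetical and chosen only for illustration. It uses the standard rank interpretation of AUC, namely the probability that a randomly chosen modified site receives a higher score than a randomly chosen unmodified site, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 ground-truth values (1 = modified, 0 = unmodified).
    scores: iterable of model scores, higher meaning "more likely modified".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.91, 0.99, and 0.97 should be read. The pairwise loop is quadratic in the number of sites, so a sort-based implementation would be preferable for real datasets.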