Nanopore full length 16S rRNA gene sequencing increases species resolution in bacterial biomarker discovery.
Journal:
Scientific Reports
Published Date:
Jul 21, 2025
Abstract
Discovery of disease-related bacterial biomarkers could be a useful approach for the early prevention or diagnosis of various afflictions, such as colorectal cancer. This typically involves analyzing small regions of the 16S rRNA gene (e.g., V3V4) with short-read technologies like Illumina, which yields genus-level results. However, recent developments in third-generation sequencing, such as the new R10.4.1 chemistry from Oxford Nanopore Technologies (ONT) and its improved basecalling models, are beginning to allow a more complete and accessible species-level analysis through full-length 16S rRNA gene sequencing (spanning regions V1-V9). The goal of this study was therefore to compare and evaluate both approaches, using colorectal cancer biomarker discovery as a representative case. To this end, fecal samples from 123 subjects were analyzed, comparing the two methods (Illumina-V3V4 with DADA2 and QIIME2 vs. ONT-V1V9 with Emu), three Dorado basecalling models (fast, hac and sup) and two databases (SILVA vs. Emu's Default database). The basecalling models produced broadly similar taxonomic output, but the lower the basecalling quality, the significantly higher the number of observed species and the more the taxonomic identification differed (p-value<0.05). Database choice with Emu greatly influenced the identified species, with Emu's Default database yielding significantly higher diversity and more identified species than SILVA (p-value<0.05). However, because of its database structure, it at times overconfidently classified what should have been an unknown species as its closest match. Bacterial abundance at the genus level correlated well between Illumina-V3V4 and ONT-V1V9 (R≥0.8). Nanopore sequencing identified more specific bacterial biomarkers for colorectal cancer than those obtained with Illumina, such as Parvimonas micra, Fusobacterium nucleatum, Peptostreptococcus stomatis, Peptostreptococcus anaerobius, Gemella morbillorum, Clostridium perfringens, Bacteroides fragilis and Sutterella wadsworthensis. Prediction of colorectal cancer through manual feature selection and machine learning resulted in an AUC of 0.87 with 14 species, or 0.82 with just 4 species (P. micra, F. nucleatum, B. fragilis and Agathobaculum butyriciproducens). Full-length 16S rRNA V1V9 sequencing with Oxford Nanopore and its new R10.4.1 chemistry achieved accurate species-level bacterial identification, facilitating the discovery of more precise disease-related biomarkers and increasing the taxonomic fidelity of future microbiome analyses.
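For readers less familiar with the two summary statistics quoted in the abstract (the correlation R between platforms and the AUC of the classifier), the following is a minimal sketch of how such values can be computed. It is not the authors' pipeline: the abstract does not state which correlation coefficient or which machine learning model was used, so Pearson's r and a rank-based AUC are assumed here purely for illustration. The function names (`pearson_r`, `roc_auc`) and all numbers are hypothetical toy values, not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical relative abundances of five genera in one sample,
# as estimated by each platform (toy values, not study data).
illumina_v3v4 = [0.30, 0.25, 0.20, 0.15, 0.10]
ont_v1v9 = [0.28, 0.27, 0.18, 0.17, 0.10]
r = pearson_r(illumina_v3v4, ont_v1v9)

# Hypothetical classifier scores: 1 = colorectal cancer, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)

print(f"genus-level correlation r = {r:.3f}")
print(f"classifier AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is the scale on which the reported 0.87 and 0.82 should be read.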