DoBSeqWF: a framework for sensitive detection of individual genetic variation in pooled sequencing data.

Journal: NAR genomics and bioinformatics
Published Date:

Abstract

Population screening for rare genetic diseases has the potential to increase early diagnosis and treatment, but the high cost of next-generation sequencing limits widespread implementation. Double-batched sequencing (DoBSeq) is a cost-effective method that uses two-dimensional overlapping pool sequencing to enable individual-level rare variant detection. However, the resulting high-depth, complex data require a specialized workflow for efficient, sensitive, and reproducible analysis. We developed DoBSeqWF (DoBSeq Workflow), a Nextflow-based pipeline that processes pooled sequencing data from alignment through variant calling, filtering, and final variant assignment. Using a childhood cancer cohort of 200 individuals with whole genome sequencing as a reference, we created training and validation datasets, benchmarked multiple variant callers, and implemented machine learning filters to improve rare variant detection while maintaining high sensitivity. DoBSeqWF demonstrates accurate and scalable rare variant detection within the evaluated experimental setting and provides a promising avenue for future cost-effective genetic screening programmes.
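The idea behind two-dimensional overlapping pooling can be illustrated with a small sketch. It assumes a simple grid design in which every individual contributes DNA to exactly one row pool and one column pool, so that a variant called in a single row pool and a single column pool points to the individual at their intersection. The function name `assign_variants`, the pool labels, the sample identifiers, and the variant strings are all hypothetical and chosen for illustration only; this is not the DoBSeqWF implementation, which additionally covers alignment, variant calling, and machine learning filtering.

```python
def assign_variants(row_calls, col_calls, layout):
    """Assign pooled variant calls to individuals in a 2D pooling grid.

    row_calls / col_calls: dict mapping pool label -> set of variant strings
    layout: dict mapping (row_pool, col_pool) -> sample identifier
    Returns (assignments, ambiguous), where assignments maps sample -> set of
    variants and ambiguous holds variants seen in more than one pool of a
    dimension. Variants seen in only one dimension are left unassigned.
    """
    assignments = {}
    ambiguous = set()
    all_variants = set().union(*row_calls.values(), *col_calls.values())
    for variant in all_variants:
        rows = [r for r, calls in row_calls.items() if variant in calls]
        cols = [c for c, calls in col_calls.items() if variant in calls]
        if len(rows) == 1 and len(cols) == 1:
            # Unique row/column intersection: pinpoints one individual.
            sample = layout.get((rows[0], cols[0]))
            if sample is not None:
                assignments.setdefault(sample, set()).add(variant)
        elif rows and cols:
            # Present in several pools of one dimension: several carriers
            # are possible, so the variant cannot be pinned to one sample.
            ambiguous.add(variant)
    return assignments, ambiguous


# Hypothetical 2 x 2 grid with four individuals (assumed example data).
layout = {
    ("R1", "C1"): "S1", ("R1", "C2"): "S2",
    ("R2", "C1"): "S3", ("R2", "C2"): "S4",
}
row_calls = {"R1": {"chr1:100A>G", "chr2:50C>T"}, "R2": {"chr2:50C>T"}}
col_calls = {"C1": {"chr1:100A>G"}, "C2": {"chr2:50C>T"}}

assignments, ambiguous = assign_variants(row_calls, col_calls, layout)
```

In this toy example the variant seen only in pools R1 and C1 is assigned to S1, while the variant present in both row pools is flagged as ambiguous. Rare variants are the favourable case for such a design, since a variant carried by few individuals is more likely to appear in exactly one pool per dimension.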

Authors
