CiMBA: Accelerating Genome Sequencing through On-Device Basecalling via Compute-in-Memory
Journal:
arXiv
Published Date:
Apr 9, 2025
Abstract
As genome sequencing is finding utility in a wide variety of domains beyond
the confines of traditional medical settings, its computational pipeline faces
two significant challenges. First, the creation of up to 0.5 GB of data per
minute imposes substantial communication and storage overheads. Second, the
sequencing pipeline is bottlenecked at the basecalling step, consuming >40% of
genome analysis time. A range of proposals have attempted to address these
challenges, with limited success. We propose to address them with a
Compute-in-Memory Basecalling Accelerator (CiMBA), the first embedded
($\sim$25 mm$^2$) accelerator capable of real-time, on-device basecalling,
coupled with AnaLog (AL)-Dorado, a new family of analog-focused basecalling
DNNs. The resulting hardware/software co-design greatly reduces data
communication overhead, achieves a throughput of 4.77 million bases per
second (24x that required for real-time operation), and delivers 17x and 27x
better power and area efficiency, respectively, than the best prior embedded
basecalling accelerator, while maintaining accuracy comparable to
state-of-the-art software basecallers.
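
As a rough sanity check on the figures quoted above, the short sketch below derives two numbers the abstract implies but does not state: the throughput that real-time operation would require, and the sustained raw data rate. It assumes that "24x" is a simple ratio of achieved to required throughput and that 1 GB means 10^9 bytes; the function and constant names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract.
RAW_DATA_GB_PER_MIN = 0.5        # upper bound on raw data produced per minute
THROUGHPUT_BASES_PER_S = 4.77e6  # reported CiMBA basecalling throughput
REALTIME_MARGIN = 24.0           # reported multiple of the real-time requirement


def implied_realtime_bases_per_s(throughput: float, margin: float) -> float:
    """Throughput needed for real-time operation, assuming margin = achieved / required."""
    return throughput / margin


def raw_data_mb_per_s(gb_per_min: float, bytes_per_gb: float = 1e9) -> float:
    """Convert GB per minute to MB per second (assumption: decimal units)."""
    return gb_per_min * bytes_per_gb / 60.0 / 1e6


realtime_rate = implied_realtime_bases_per_s(THROUGHPUT_BASES_PER_S, REALTIME_MARGIN)
data_rate = raw_data_mb_per_s(RAW_DATA_GB_PER_MIN)

print(f"Implied real-time requirement: {realtime_rate:,.0f} bases/s")
print(f"Peak raw data rate: {data_rate:.2f} MB/s")
```

Under these assumptions, the real-time requirement works out to roughly 0.2 million bases per second, and the raw signal stream peaks at about 8.3 MB/s, which is the volume that on-device basecalling avoids transmitting or storing.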