Towards a General Approach for Bat Echolocation Detection and Classification
Journal: bioRxiv
Published Date: May 16, 2026
Abstract
Acoustic monitoring is a scalable approach for assessing bat populations, yet automating the detection and classification of bat echolocation calls remains challenging, particularly in data-scarce regions. Although deep learning (DL) is increasingly applied to this task, most existing approaches repurpose computer-vision architectures and generate a single prediction per spectrogram clip, which limits robustness to variable background noise and may constrain generality across regions and species assemblages. Here, we develop BatDetect2, an open-source DL pipeline for the joint detection and classification of bat echolocation calls. BatDetect2 builds on a 2D convolutional architecture and incorporates two targeted modifications: (i) a temporal self-attention layer designed to capture long-range structure across call sequences, and (ii) convolutional layers augmented with frequency coordinates to explicitly encode frequency information. We evaluate model generality using five diverse datasets from four regions (the UK, Mexico, Australia, and Brazil) and conduct ablation analyses using a UK dataset spanning 17 bat species. We further assess whether a trained model can detect echolocation calls from species absent from the training data. BatDetect2 consistently outperforms a traditional call-parameter extraction baseline across all datasets and evaluation metrics. Ablation analyses show that temporal self-attention substantially improves species classification, increasing mean Average Precision (mAP) from 0.83 to 0.88, while frequency-coordinate augmentation provides no measurable benefit. When applied to novel species assemblages without retraining, detection performance varies across datasets, with Average Precision ranging from 0.60 to 0.98. Overall, BatDetect2 demonstrates strong and transferable performance across acoustically and taxonomically diverse regions.
By jointly detecting and classifying all bat calls present in each input clip, the pipeline provides a practical and extensible tool for passive acoustic monitoring. The full training pipeline and a pretrained UK model are released through the open-source Python package batdetect2, enabling practitioners to develop and deploy models using their own data.
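The two architectural modifications named in the abstract can be illustrated with a minimal NumPy sketch. This is not the BatDetect2 implementation and does not use the batdetect2 package: the function names, array shapes, single-head attention, the 0-to-1 coordinate ramp, and the way the frequency axis is flattened before attention are all assumptions made for illustration only.

```python
import numpy as np


def add_frequency_coords(features: np.ndarray) -> np.ndarray:
    """Append one channel holding each bin's normalized frequency position.

    features: array of shape (channels, freq_bins, time_steps).
    Returns an array of shape (channels + 1, freq_bins, time_steps).
    """
    _, n_freq, n_time = features.shape
    # Linear ramp from 0 (first frequency bin) to 1 (last frequency bin).
    ramp = np.linspace(0.0, 1.0, n_freq, dtype=features.dtype)
    coord = np.broadcast_to(ramp[:, None], (n_freq, n_time))
    return np.concatenate([features, coord[None]], axis=0)


def temporal_self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention across time steps.

    x: (time_steps, dim); w_q, w_k, w_v: (dim, head_dim) projections.
    Returns (output, weights) with shapes (time_steps, head_dim)
    and (time_steps, time_steps).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)

# Hypothetical feature map: 4 channels, 8 frequency bins, 16 time steps.
spec_features = rng.standard_normal((4, 8, 16)).astype(np.float32)
augmented = add_frequency_coords(spec_features)

# Assumption: flatten channels and frequency into one vector per time step.
n_time = augmented.shape[-1]
sequence = augmented.reshape(-1, n_time).T  # (time_steps, dim)
dim, head_dim = sequence.shape[1], 8

# Random projections stand in for learned weights.
w_q, w_k, w_v = (rng.standard_normal((dim, head_dim)) for _ in range(3))
attended, attn_weights = temporal_self_attention(sequence, w_q, w_k, w_v)
```

In this sketch every time step can attend to every other time step, which is the property that lets an attention layer relate calls that are far apart in a sequence, while the extra coordinate channel gives otherwise translation-invariant convolutions access to absolute frequency position.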