ANIMAL-SPOT enables animal-independent signal detection and classification using deep learning.

Journal: Scientific Reports
Published Date:

Abstract

Bioacoustic research spans a wide range of biological questions and applications, relying on identification of target species or smaller acoustic units, such as distinct call types. However, manually identifying the signal of interest is time-intensive, error-prone, and becomes infeasible with large data volumes. Therefore, machine-driven algorithms are increasingly applied to various bioacoustic signal identification challenges. Nevertheless, biologists still have major difficulties transferring existing animal- and/or scenario-specific machine learning approaches to their own animal datasets and scientific questions. This study presents ANIMAL-SPOT, an animal-independent, open-source deep learning framework, along with a detailed user guide. Three signal identification tasks commonly encountered in bioacoustics research were investigated: (1) target signal vs. background noise detection, (2) species classification, and (3) call type categorization. ANIMAL-SPOT successfully segmented human-annotated target signals in data volumes representing 10 distinct animal species and 1 additional genus, resulting in a mean test accuracy of 97.9%, together with an average area under the ROC curve (AUC) of 95.9%, when predicting on unseen recordings. Moreover, an average segmentation accuracy and F1-score of 95.4% was achieved on the publicly available BirdVox-Full-Night data corpus. In addition, multi-class species and call type classification reached 96.6% and 92.7% accuracy on unseen test data, as well as 95.2% and 88.4% on excerpts from previous animal-specific machine-based detection. Furthermore, an Unweighted Average Recall (UAR) of 89.3% outperformed the multi-species classification baseline system of the ComParE 2021 Primate Sub-Challenge. Besides being animal-independent, ANIMAL-SPOT does not rely on expert knowledge or special computing resources, thereby making deep-learning-based bioacoustic signal identification accessible to a broad audience.
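
The abstract reports accuracy, F1-score, and Unweighted Average Recall (UAR). As a reading aid, the minimal Python sketch below shows how these three metrics are commonly computed from true and predicted labels. It is not taken from the ANIMAL-SPOT code base: the function names, the label names ("noise", "call"), and the toy data are hypothetical and chosen only for illustration. UAR is the mean of the per-class recalls, so each class counts equally regardless of how many samples it has.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for every class that occurs in y_true."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls; every class has the same weight."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for one 'positive' class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical, imbalanced toy data: 8 noise segments and 2 call segments.
y_true = ["noise"] * 8 + ["call"] * 2
y_pred = ["noise"] * 8 + ["call", "noise"]  # one call is missed

print("accuracy:", accuracy(y_true, y_pred))
print("UAR:", unweighted_average_recall(y_true, y_pred))
print("F1 (call):", f1_score(y_true, y_pred, positive="call"))
```

On this imbalanced toy data, accuracy is high because the majority class dominates, while UAR is lower because the single missed call halves the recall of the minority class. This is one reason UAR is often reported alongside accuracy for multi-class tasks with uneven class sizes.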

Authors

  • Christian Bergler
    Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany. christian.bergler@fau.de.
  • Simeon Q Smeele
    Cognitive and Cultural Ecology Lab, Max Planck Institute of Animal Behavior, 78315, Radolfzell, Germany.
  • Stephen A Tyndel
    Cognitive and Cultural Ecology Lab, Max Planck Institute of Animal Behavior, 78315, Radolfzell, Germany.
  • Alexander Barnhill
    Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany.
  • Sara T Ortiz
    Max Planck Institute for Biological Intelligence, in Foundation, Seewiesen Eberhard-Gwinner-Strasse, 82319, Starnberg, Germany.
  • Ammie K Kalan
    Department of Anthropology, University of Victoria, Victoria, BC, V8P 5C2, Canada.
  • Rachael Xi Cheng
    Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Straße 17, 10315, Berlin, Germany.
  • Signe Brinkløv
    Department of Bioscience, Wildlife Ecology, Aarhus University, 8410, Rønde, Denmark.
  • Anna N Osiecka
    Department of Vertebrate Ecology and Zoology, Faculty of Biology, University of Gdańsk, 80-308, Gdańsk, Poland.
  • Jakob Tougaard
    Department of Bioscience, Marine Mammal Research, Aarhus University, 4000, Roskilde, Denmark.
  • Freja Jakobsen
    Department of Biology, University of Southern Denmark, 5230, Odense, Denmark.
  • Magnus Wahlberg
    Department of Biology, University of Southern Denmark, 5230, Odense, Denmark.
  • Elmar Nöth
  • Andreas Maier
    Pattern Recognition Lab, University Erlangen-Nürnberg, Erlangen, Germany.
  • Barbara C Klump
    Cognitive and Cultural Ecology Lab, Max Planck Institute of Animal Behavior, 78315, Radolfzell, Germany. bklump@ab.mpg.de.