Deep convolutional network for animal sound classification and source attribution using dual audio recordings.

Journal: The Journal of the Acoustical Society of America

PMID: 30823820

Abstract

This paper introduces an end-to-end feedforward convolutional neural network that reliably classifies both the source and the type of animal calls in a noisy environment from two streams of audio data, after training on a dataset of modest size with imperfect labels. The data consist of audio recordings from captive marmoset monkeys housed in pairs, with several other cages nearby. The network identifies both the call type and the animal that made the call in a single pass through a single network, using raw spectrogram images as input. It vastly increases data analysis capacity for researchers studying marmoset vocalizations and allows data collection in the home cage from group-housed animals.
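One way to picture "call type and source in a single pass" is a single output layer whose classes are (source, call type) pairs, fed by the two spectrogram streams stacked as channels. The sketch below illustrates only that idea. The abstract does not give the label sets, the output encoding, or the input layout, so the names `CALL_TYPES`, `SOURCES`, `stack_streams`, and `decode` are hypothetical and the joint-class encoding is an assumption, not the authors' documented design.

```python
from itertools import product

# Hypothetical label sets; the abstract does not list the call types or sources.
CALL_TYPES = ["phee", "trill", "twitter"]
SOURCES = ["animal_a", "animal_b"]

# Assumed encoding: one joint class per (source, call type) pair, plus "no call".
CLASSES = [("none", "none")] + list(product(SOURCES, CALL_TYPES))


def stack_streams(spec_a, spec_b):
    """Combine two equally shaped spectrograms (lists of rows) into a 2-channel input."""
    same_rows = len(spec_a) == len(spec_b)
    same_cols = all(len(r) == len(s) for r, s in zip(spec_a, spec_b))
    if not (same_rows and same_cols):
        raise ValueError("spectrograms must have the same shape")
    return [spec_a, spec_b]


def decode(scores):
    """Map one vector of class scores to a (source, call type) pair."""
    if len(scores) != len(CLASSES):
        raise ValueError("expected one score per joint class")
    best = max(range(len(scores)), key=scores.__getitem__)
    return CLASSES[best]
```

Under this assumed encoding, one forward pass yields one score vector, and a single argmax recovers both the caller and the call type. A network with two separate output heads would be an equally plausible reading of the abstract.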

Authors

Tuomas Oikarinen
McGovern Institute for Brain Research, Massachusetts Institute of Technology, 43 Vassar Street, Cambridge, Massachusetts 02139, USA.

Karthik Srinivasan
McGovern Institute for Brain Research, Massachusetts Institute of Technology, 43 Vassar Street, Cambridge, Massachusetts 02139, USA.

Olivia Meisner
McGovern Institute for Brain Research, Massachusetts Institute of Technology, 43 Vassar Street, Cambridge, Massachusetts 02139, USA.

Julia B Hyman
McGovern Institute for Brain Research, Massachusetts Institute of Technology, 43 Vassar Street, Cambridge, Massachusetts 02139, USA.

Shivangi Parmar
McGovern Institute for Brain Research, Massachusetts Institute of Technology, 43 Vassar Street, Cambridge, Massachusetts 02139, USA.

Adrian Fanucci-Kiss
McGovern Institute for Brain Research, Massachusetts Institute of Technology, 43 Vassar Street, Cambridge, Massachusetts 02139, USA.

Robert Desimone
McGovern Institute for Brain Research, Massachusetts Institute of Technology, 43 Vassar Street, Cambridge, Massachusetts 02139, USA.

Rogier Landman
Stanley Center, Broad Institute, 57 Ames Street, Cambridge, Massachusetts 02139, USA.

Guoping Feng
McGovern Institute for Brain Research, Massachusetts Institute of Technology, 43 Vassar Street, Cambridge, Massachusetts 02139, USA.

Keywords

Animals; Callithrix; Neural Networks, Computer; Signal Processing, Computer-Assisted; Sound Spectrography; Vocalization, Animal
