Investigating the structure of the greylag goose vocal repertoire: what can unsupervised methods tell us?

Journal: Scientific Reports

Abstract

Defining a comprehensive signal repertoire is an important step towards understanding a species' vocal communication system. Here, we investigated the vocal repertoire of a well-studied model species in ethology: the greylag goose (Anser anser). We applied unsupervised machine learning algorithms to a large dataset of vocalisations from a free-living population of greylag geese to investigate the acoustic structure of this species' vocal signals. We extracted four types of data representations, projected each into 2, 20 and 100 dimensions using the UMAP algorithm, and then grouped the projections using two commonly used clustering methods. Additionally, we successfully applied a graph-based clustering approach, Leiden community detection, which, to our knowledge, has not previously been employed in bioacoustics. Our analyses revealed a partly graded vocal repertoire that broadly matched early descriptions of the greylag goose call repertoire. Audio feature vectors, rather than the more commonly used spectrographic representations, revealed the clusters most congruent with human labels and offered the most comprehensive visualisation of the acoustic space. Leiden community detection performed comparably to established approaches but most closely matched the number of human-defined classes. These findings highlight the impact of data representation on repertoire analysis and provide the first objective, quantitative characterisation of the greylag goose vocal repertoire.
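The abstract compares cluster assignments with human labels but does not say which agreement measure was used. As an illustration only, the sketch below computes the adjusted Rand index, one common chance-corrected measure of agreement between two partitions; the choice of this measure is an assumption, not the authors' stated method. The function name `adjusted_rand_index`, the call-type labels "A", "B", "C" and the cluster ids are all hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items.

    Assumption: this measure is used for illustration only; the abstract
    does not state how congruence with human labels was quantified.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label sequences must have the same length")

    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions agree trivially

    # Contingency counts: items sharing a (label_a, label_b) combination.
    joint = Counter(zip(labels_a, labels_b))
    # Pairs of items placed together in both partitions.
    index = sum(comb(c, 2) for c in joint.values())
    # Pairs placed together in each partition separately.
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = pairs_a * pairs_b / total_pairs  # agreement expected by chance
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (index - expected) / (max_index - expected)


# Hypothetical example: six calls with human-assigned types and cluster ids.
human_labels = ["A", "A", "A", "B", "B", "C"]
cluster_ids = [0, 0, 1, 1, 1, 2]
score = adjusted_rand_index(human_labels, cluster_ids)
print(f"adjusted Rand index: {score:.3f}")
```

A score of 1 indicates identical partitions regardless of how the clusters are named, and values near 0 indicate agreement no better than chance. Because the measure ignores label names, it can be applied directly to unsupervised cluster ids without first mapping them to call types.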
