Using tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics.

Journal: Philosophical transactions of the Royal Society of London. Series B, Biological sciences
Published Date:

Abstract

Machine learning has the potential to revolutionize passive acoustic monitoring (PAM) for ecological assessments. However, high annotation and computing costs limit the field's adoption. Generalizable pretrained networks can overcome these costs, but high-quality pretraining requires vast annotated libraries, limiting their current development to data-rich bird taxa. Here, we identify the optimum pretraining strategy for data-deficient domains, using tropical reefs as a representative case study. We assembled ReefSet, an annotated library of 57 000 reef sounds drawn from 16 datasets, which is still modest in scale compared with annotated bird libraries. We performed multiple pretraining experiments and found that pretraining on a library of bird audio 50 times the size of ReefSet provides notably superior generalizability on held-out reef datasets, with a mean area under the receiver operating characteristic curve (AUC-ROC) of 0.881 (±0.11), compared with pretraining on ReefSet itself or on unrelated audio, which yield mean AUC-ROCs of 0.724 (±0.05) and 0.834 (±0.05), respectively. However, our key finding is that cross-domain mixing, where bird, reef and unrelated audio are combined during pretraining, provides superior transfer learning performance, with an AUC-ROC of 0.933 (±0.02). SurfPerch, our optimum pretrained network, provides a strong foundation for automated analysis of tropical reef and related PAM data with minimal annotation and computing costs.

This article is part of the theme issue 'Acoustic monitoring for tropical ecology and conservation'.
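
The abstract reports results as a mean AUC-ROC with a spread across held-out reef datasets. As a minimal sketch of how such a summary could be computed, the Python below scores each held-out dataset with a rank-based AUC-ROC and then averages the results. It is not the authors' evaluation code: the function names `auc_roc` and `summarize`, the toy labels and scores, and the reading of the ± value as a sample standard deviation across datasets are all assumptions made for illustration.

```python
from statistics import mean, stdev


def auc_roc(labels, scores):
    """Rank-based AUC-ROC for binary labels (1 = positive, 0 = negative).

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(per_dataset):
    """Return (mean, sample standard deviation) of AUC-ROC across datasets.

    Assumption: the spread is taken across datasets; the abstract does
    not state how its +/- values were derived.
    """
    aucs = [auc_roc(labels, scores) for labels, scores in per_dataset.values()]
    return mean(aucs), stdev(aucs)


# Hypothetical toy data: two held-out datasets, each as (labels, scores).
held_out = {
    "dataset_a": ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    "dataset_b": ([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]),
}

mean_auc, spread = summarize(held_out)
print(f"mean AUC-ROC = {mean_auc:.3f} (+/- {spread:.3f})")
```

The pairwise comparison is quadratic in the number of examples, which is fine for a small illustration; for datasets of realistic size a sort-based implementation or a library routine would be the more practical choice.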

Authors

  • Ben Williams
    Department of Engineering, University of Cambridge, Cambridge, United Kingdom.
  • Bart van Merriënboer
    DeepMind, Google Inc, Canada.
  • Vincent Dumoulin
    DeepMind, Google Inc, New York, USA.
  • Jenny Hamer
    DeepMind, Google Inc, New York, USA.
  • Abram B Fleishman
    Conservation Metrics, Santa Cruz, CA, USA.
  • Matthew McKown
    Conservation Metrics, Santa Cruz, CA, USA.
  • Jill Munger
    Conservation Metrics, Santa Cruz, CA, USA.
  • Aaron N Rice
    Bioacoustics Research Program, Cornell University, Ithaca, NY, USA.
  • Ashlee Lillis
    Sound Ocean Science, Gqeberha, Eastern Cape, South Africa.
  • Clemency White
    University of Bristol, Bristol, UK.
  • Catherine Hobbs
    University of Bristol, Bristol, UK.
  • Tries Razak
    IPB University, Bogor, Jawa Barat, Indonesia.
  • David Curnick
    Zoological Society of London, Regents Park, London, United Kingdom.
  • Kate E Jones
    Centre for Biodiversity and Environment Research, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom.
  • Tom Denton
    DeepMind, Google Inc, Mountain View, CA, USA.