Leveraging machine learning and accelerometry to classify animal behaviours with uncertainty

Journal: bioRxiv
Published Date:

Abstract

Animal-worn sensors have revolutionised the study of animal behaviour and ecology. Accelerometers, which measure changes in acceleration across planes of movement, are increasingly being used in conjunction with machine learning models to classify animal behaviours across taxa and research questions. However, the widespread adoption of these methods faces challenges from imbalanced training data, unquantified uncertainties in model outputs, shifts in model performance across contexts, and noisy classifications in continuous data streams, where predicted behaviours change abruptly within a sequence. To address these challenges, we introduce an open-source approach for classifying animal behaviour from raw acceleration data. Our approach integrates machine learning and statistical inference techniques to evaluate and mitigate class imbalances, changes in model performance across ecological settings, and noisy classifications. Importantly, we extend predictions from single behaviour classifications to prediction sets: sets of behaviour labels guaranteed to contain the true behaviour with a pre-specified probability, in a framework analogous to the use of prediction intervals in statistical analyses. We evaluate our approach via simulation and highlight its utility using data collected from a free-ranging large carnivore, the African wild dog (Lycaon pictus), in the Okavango Delta, Botswana. We demonstrate significantly improved predictions, along with associated uncertainty metrics, in African wild dog behaviour classification, particularly for rare and ecologically important behaviours such as feeding, where correct classifications more than doubled following the quality checks and data rebalancing introduced in our pipeline. Our approach is applicable across taxa and represents a key step towards advancing the burgeoning use of machine learning to remotely observe around-the-clock behaviours of free-ranging animals.
Future work could integrate multiple data streams, such as accelerometer, audio, and GPS data, into model training; such extensions could be incorporated directly into our pipeline.
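
To make the idea of a prediction set concrete, the following is a minimal sketch of split conformal prediction, one standard way to obtain label sets with a pre-specified coverage probability. The abstract does not state which procedure the authors use, so this should be read as an illustration under that assumption rather than as the pipeline's implementation. The function names, the class probabilities, and the behaviour labels other than feeding are hypothetical.

```python
import numpy as np


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Score threshold from a held-out calibration set (split conformal).

    cal_probs: (n, k) array of class probabilities from any classifier.
    cal_labels: (n,) integer array of true class indices.
    alpha: target miscoverage, so sets aim for coverage of at least 1 - alpha.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    n = len(cal_labels)
    # Nonconformity score: 1 minus the probability given to the true class.
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    # Rank of the finite-sample corrected quantile.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: include every label.
        return np.inf
    return scores[k - 1]


def prediction_sets(probs, threshold):
    """Boolean mask of shape (m, k): True where a label is in the set."""
    return (1.0 - np.asarray(probs, dtype=float)) <= threshold


# Hypothetical labels and probabilities, for illustration only.
labels = np.array(["resting", "running", "feeding"])
cal_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
cal_labels = [0, 1, 2, 0]
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

new_probs = [[0.9, 0.05, 0.05], [0.45, 0.45, 0.1]]
for mask in prediction_sets(new_probs, threshold):
    print(list(labels[mask]))
```

A confident prediction yields a single-label set, while an ambiguous one yields several labels; the set size therefore acts as an uncertainty measure. The coverage guarantee of split conformal prediction holds on average over calibration and test data and assumes these are exchangeable, which is worth checking when a model is moved to a new ecological setting.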

Authors

  • Medha Agarwal; Kasim Rafiq; Ronak Mehta; Briana Abrahms; Zaid Harchaoui