TRACE: End-to-end temporal inference and annotation of animal behaviors from video
Journal: bioRxiv
Published Date: Apr 15, 2026
Abstract
Quantitative analysis of animal behavior is fundamental to neuroscience and ethology, but it remains constrained by the poor scalability, subjectivity, and limited reproducibility of manual annotation. Most automated approaches infer behavior through predefined intermediate representations such as pose trajectories, which require task-specific design choices and often omit contextual visual information essential for behavioral interpretation. Here we introduce TRACE (Temporal Recognition of Animal Behaviors Captured from Video), an end-to-end method with a graphical user interface for detecting and annotating animal behavior from raw video. TRACE uses a transformer-based video encoder, pretrained via self-supervised learning, to extract hierarchical temporal features, and combines it with multi-scale temporal modeling to capture behaviors that span diverse timescales. The method jointly predicts behavioral identity and temporal boundaries from continuous video recordings and supports high-speed inference. Across multiple behavioral datasets spanning different species and experimental contexts, TRACE demonstrates robust and generalizable performance, enabling scalable, context-aware analysis of animal behavior directly from video.
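To make the notion of "behavioral identity and temporal boundaries" concrete, the sketch below shows one plausible shape for such annotations: a list of labeled segments with start and end times. It is not TRACE's implementation; the abstract states that the model predicts boundaries jointly, whereas this example simply assumes a sequence of per-frame labels is already available and merges consecutive identical labels. The names `Segment` and `labels_to_segments`, the behavior labels, and the frame rate are all hypothetical.

```python
from itertools import groupby
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One annotated behavior bout (hypothetical output format)."""
    label: str
    start_s: float  # segment start, in seconds
    end_s: float    # segment end (exclusive), in seconds


def labels_to_segments(frame_labels: List[str], fps: float) -> List[Segment]:
    """Merge runs of identical per-frame labels into timed segments.

    Assumes frame_labels[i] is the behavior label for frame i of a video
    recorded at a constant `fps` frames per second.
    """
    segments: List[Segment] = []
    frame = 0  # index of the first frame of the current run
    for label, run in groupby(frame_labels):
        n = len(list(run))  # number of frames in this run
        segments.append(Segment(label, frame / fps, (frame + n) / fps))
        frame += n
    return segments


# Illustrative input: six frames at 2 frames per second.
example = labels_to_segments(
    ["rest", "rest", "groom", "groom", "groom", "rest"], fps=2.0
)
```

Here `example` contains three segments: rest from 0.0 to 1.0 s, groom from 1.0 to 2.5 s, and rest from 2.5 to 3.0 s. A segment list of this kind is the form that manual annotation tools typically produce, which is what makes automated output directly comparable to human labels.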