MTevent: A Multi-Task Event Camera Dataset for 6D Pose Estimation and Moving Object Detection
Journal: arXiv
Published Date: May 16, 2025
Abstract
Mobile robots are reaching unprecedented speeds, with platforms like Unitree
B2 and Fraunhofer O3dyn achieving maximum speeds between 5 and 10 m/s.
However, effectively utilizing such speeds remains a challenge due to the
limitations of RGB cameras, which suffer from motion blur and fail to provide
real-time responsiveness. Event cameras, with their asynchronous operation and
low-latency sensing, offer a promising alternative for high-speed robotic
perception. In this work, we introduce MTevent, a dataset designed for 6D pose
estimation and moving object detection in highly dynamic environments with
large detection distances. Our setup consists of a stereo event camera and an
RGB camera, capturing 75 scenes, each averaging 16 seconds, and featuring 16
unique objects under challenging conditions such as extreme viewing angles,
varying lighting, and occlusions. MTevent is the first dataset to combine
high-speed motion, long-range perception, and real-world object interactions,
making it a valuable resource for advancing event-based vision in robotics. To
establish a baseline, we evaluate the task of 6D pose estimation using NVIDIA's
FoundationPose on RGB images, achieving an Average Recall of 0.22 with
ground-truth masks, highlighting the limitations of RGB-based approaches in
such dynamic settings. With MTevent, we provide a novel resource to improve
perception models and foster further research in high-speed robotic vision. The
dataset is available for download at
https://huggingface.co/datasets/anas-gouda/MTevent
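
For readers who want to fetch the dataset programmatically, the following is a minimal sketch, assuming the `huggingface_hub` Python package is installed. The repository ID is taken from the URL above; the helper names `dataset_url` and `download_mtevent`, the default target directory, and the optional file filter are illustrative assumptions and not part of the dataset's own tooling. The internal file layout of the repository is not described here, so the sketch makes no assumptions about it.

```python
from pathlib import Path

# Repository ID taken from the dataset URL given in the abstract.
REPO_ID = "anas-gouda/MTevent"


def dataset_url(repo_id: str = REPO_ID) -> str:
    """Return the Hugging Face dataset page URL for a repository ID."""
    return f"https://huggingface.co/datasets/{repo_id}"


def download_mtevent(local_dir="MTevent", allow_patterns=None) -> Path:
    """Download the dataset repository (hypothetical helper).

    local_dir: target directory; the default name is an assumption.
    allow_patterns: optional glob pattern(s) to restrict which files are
        fetched; None downloads everything in the repository.
    """
    # Imported lazily so the module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repository is a dataset, not a model
        local_dir=str(local_dir),
        allow_patterns=allow_patterns,
    )
    return Path(path)


if __name__ == "__main__":
    print(dataset_url())
    # Uncomment to start the download (requires network access):
    # root = download_mtevent()
    # print(sorted(p.name for p in root.iterdir()))
```

Listing the top-level entries of the downloaded directory, as in the commented lines, is one way to inspect the actual layout before writing any loading code.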