TENet: Targetness entanglement incorporating with multi-scale pooling and mutually-guided fusion for RGB-E object tracking.

Journal: Neural Networks: the official journal of the International Neural Network Society
PMID:

Abstract

There is currently strong interest in improving visual object tracking by augmenting the RGB modality with the output of a visual event camera, which is particularly informative about scene motion. However, existing approaches perform event feature extraction for RGB-E tracking with traditional appearance models that have been optimised for RGB-only tracking, without adapting them to the intrinsic characteristics of the event data. To address this problem, we propose an Event backbone (Pooler), designed to obtain a high-quality feature representation that is cognisant of the innate characteristics of the event data, namely its sparsity. In particular, Multi-Scale Pooling is introduced to capture the motion feature trends within the event data by using diverse pooling kernel sizes. The association between the derived RGB and event representations is established by an innovative module performing adaptive Mutually Guided Fusion (MGF). Extensive experimental results show that our method significantly outperforms state-of-the-art trackers on two widely used RGB-E tracking datasets, VisEvent and COESOT, where the precision and success rates on COESOT are improved by 4.9% and 5.2%, respectively. Our code will be available at https://github.com/SSSpc333/TENet.
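The sketch below is a minimal, hedged illustration of the two ideas named in the abstract: pooling event features at several kernel sizes and fusing RGB and event tokens with mutual (bidirectional) guidance. The class names, tensor shapes, kernel sizes, and the cross-attention formulation are assumptions made for illustration; they are not the authors' implementation, which is to be released at the repository linked above.

```python
# Illustrative sketch only (PyTorch). All design details here are assumed,
# not taken from the TENet paper or its released code.
import torch
import torch.nn as nn


class MultiScalePooling(nn.Module):
    """Pools a sparse event feature map at several kernel sizes and fuses them."""

    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # Stride-1 average pooling with "same" padding keeps the spatial size.
        self.pools = nn.ModuleList(
            nn.AvgPool2d(k, stride=1, padding=k // 2) for k in kernel_sizes
        )
        # 1x1 convolution merges the original map with its pooled variants.
        self.fuse = nn.Conv2d(channels * (len(kernel_sizes) + 1), channels, 1)

    def forward(self, x):  # x: (B, C, H, W) event feature map
        pooled = [x] + [pool(x) for pool in self.pools]
        return self.fuse(torch.cat(pooled, dim=1))


class MutuallyGuidedFusion(nn.Module):
    """Cross-attends RGB and event tokens in both directions, then adds residuals."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.rgb_from_event = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.event_from_rgb = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, rgb_tokens, event_tokens):  # each: (B, N, C)
        rgb_enh, _ = self.rgb_from_event(rgb_tokens, event_tokens, event_tokens)
        evt_enh, _ = self.event_from_rgb(event_tokens, rgb_tokens, rgb_tokens)
        return rgb_tokens + rgb_enh, event_tokens + evt_enh


if __name__ == "__main__":
    msp = MultiScalePooling(channels=64)
    mgf = MutuallyGuidedFusion(dim=64)
    event_map = torch.randn(2, 64, 16, 16)
    rgb_map = torch.randn(2, 64, 16, 16)
    event_feat = msp(event_map)
    rgb_tok = rgb_map.flatten(2).transpose(1, 2)     # (B, 256, 64)
    evt_tok = event_feat.flatten(2).transpose(1, 2)  # (B, 256, 64)
    fused_rgb, fused_evt = mgf(rgb_tok, evt_tok)
    print(fused_rgb.shape, fused_evt.shape)          # torch.Size([2, 256, 64]) twice
```

In this reading, the multi-scale pooling step smooths the sparse event responses at several receptive-field sizes before fusion, while the mutual guidance lets each modality refine the other rather than enhancing only one direction; the actual TENet modules may differ in structure and detail.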

Authors

  • Pengcheng Shao
    Josef Kittler Research Institute on Artificial Intelligence, China; Sino-UK Joint Laboratory on Artificial Intelligence, Ministry of Science and Technology, China; International Joint Laboratory on Artificial Intelligence, Ministry of Education, China; International Joint Laboratory on Artificial Intelligence, Jiangsu Province, China; School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China.
  • Tianyang Xu
    School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China; Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China.
  • Zhangyong Tang
    Josef Kittler Research Institute on Artificial Intelligence, China; Sino-UK Joint Laboratory on Artificial Intelligence, Ministry of Science and Technology, China; International Joint Laboratory on Artificial Intelligence, Ministry of Education, China; International Joint Laboratory on Artificial Intelligence, Jiangsu Province, China; School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China.
  • Linze Li
    College of Mechanical and Electrical Engineering, Henan Agricultural University, Zhengzhou, 450002, China.
  • Xiao-Jun Wu
    Shandong Provincial Key Laboratory of Network based Intelligent Computing, University of Jinan, Jinan 250022, China. Electronic address: wu_xiaojun@jiangnan.edu.cn.
  • Josef Kittler
    Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, GU2 7XH, United Kingdom. Electronic address: j.kittler@surrey.ac.uk.