DT-Transformer: A Text-Tactile Fusion Network for Object Recognition.

Journal: IEEE transactions on haptics
PMID:

Abstract

Humans rely on multiple senses to understand their surroundings, and so do robots. Current research in haptic object classification focuses on visual-haptic methods, which are limited in both performance and dataset size. Text is not subject to these limitations and can describe objects effectively. In this study, we introduce DT-Transformer (Double T: Tactile and Text), a novel framework for learning from tactile and textual data. To address the challenge of merging these two different types of information, we implement a specialized fusion mechanism based on Transformer networks with multi-head attention. This approach combines the modalities at the feature level, significantly improving object recognition accuracy. Our model achieves recognition rates of 95.06% and 86.34% on two publicly available haptic datasets, outperforming existing algorithms. The method can be applied in practice to tactile recognition and to grasping operations with dexterous hands.
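
The abstract does not specify the internals of the fusion mechanism, so the following is only a minimal sketch of what feature-level fusion through multi-head attention can look like in general, not the authors' implementation. It assumes that tactile features act as queries and text features as keys and values, that both modalities have already been embedded to the same width, and that a residual connection is used; the function names, dimensions, and random (untrained) projection matrices are all hypothetical. Only the Python standard library is used so the example stays self-contained.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Random stand-in for learned weights or embedded features (assumption)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def cross_attention_fusion(tactile, text, num_heads, rng):
    """Fuse text features into tactile features with multi-head cross-attention.

    tactile: list of tactile token vectors (queries), each of width d_model
    text:    list of text token vectors (keys and values), same width
    Returns the fused tactile tokens and the per-head attention weights.
    """
    d_model = len(tactile[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    # Hypothetical projections; a trained model would learn these.
    w_q, w_k, w_v, w_o = (random_matrix(d_model, d_model, rng) for _ in range(4))
    q = matmul(tactile, w_q)
    k = matmul(text, w_k)
    v = matmul(text, w_v)

    head_outputs, head_weights = [], []
    for h in range(num_heads):
        cols = slice(h * d_head, (h + 1) * d_head)
        q_h = [row[cols] for row in q]
        k_h = [row[cols] for row in k]
        v_h = [row[cols] for row in v]
        # Scaled dot-product attention: each tactile token attends to all text tokens.
        scores = matmul(q_h, transpose(k_h))
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, v_h))
        head_weights.append(weights)

    # Concatenate the heads along the feature axis, then mix them with w_o.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(tactile))]
    projected = matmul(concat, w_o)
    # Residual connection (an assumption): keep the original tactile signal.
    fused = [[t + p for t, p in zip(t_row, p_row)]
             for t_row, p_row in zip(tactile, projected)]
    return fused, head_weights


rng = random.Random(0)
tactile = random_matrix(6, 8, rng)  # 6 tactile tokens of width 8 (made-up sizes)
text = random_matrix(4, 8, rng)     # 4 text tokens of width 8 (made-up sizes)
fused, weights = cross_attention_fusion(tactile, text, num_heads=2, rng=rng)
print(len(fused), len(fused[0]))    # prints: 6 8
```

In this sketch the output keeps one vector per tactile token, each now carrying information drawn from the text tokens it attended to; a classifier head would then operate on these fused features. Using the text as queries instead, or attending in both directions, are equally plausible designs that the abstract does not rule out.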

Authors

  • Shengjie Qiu
  • Baojiang Li
    The School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai 201306, China. Electronic address: libj@sdju.edu.cn.
  • Xichao Wang
    The School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China; Intelligent Decision and Control Technology Institute, Shanghai Dianji University, Shanghai 201306, China.
  • Haiyan Wang
    College of Chemistry and Material Science, Shandong Agricultural University, Tai'an 271018, PR China.
  • Haiyan Ye