[Small bowel video keyframe retrieval based on multi-modal contrastive learning].

Journal: Sheng wu yi xue gong cheng xue za zhi = Journal of biomedical engineering = Shengwu yixue gongchengxue zazhi
PMID:

Abstract

From small intestine videos with given labels, retrieving the keyframes most relevant to the text can locate pathological regions efficiently and accurately. However, training directly on raw video data is extremely slow, while learning visual representations from image-text datasets leads to computational inconsistency. To address this, a small bowel video keyframe retrieval method based on multi-modal contrastive learning (KRCL) is proposed. The framework makes full use of the textual information in video category labels to learn video features closely related to text, and it models temporal information within a pretrained image-text model. It transfers knowledge learned by image-text multimodal models to the video domain, enabling interaction among medical video, image, and text data. Experimental results on the Hyper-Kvasir dataset for gastrointestinal disease detection and the Microsoft Research video-to-text (MSR-VTT) retrieval dataset demonstrate the effectiveness and robustness of KRCL, which achieves state-of-the-art performance on nearly all evaluation metrics.
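
The retrieval idea summarized above can be illustrated with a minimal sketch. It is not the authors' KRCL implementation: the abstract does not specify the loss, the encoders, or how temporal information is modeled. The sketch assumes that frame, video, and text embeddings have already been produced by some pretrained image-text model, ranks frames by cosine similarity to a text embedding, and uses a symmetric InfoNCE objective as one common form of contrastive learning. The function names and the temperature value of 0.07 are hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def rank_keyframes(frame_emb, text_emb, top_k=3):
    """Return indices and scores of the top_k frames most similar to the text.

    frame_emb: (num_frames, dim) array of per-frame embeddings (assumed given).
    text_emb:  (dim,) array holding the embedding of the label text (assumed given).
    """
    sims = l2_normalize(frame_emb) @ l2_normalize(text_emb)
    order = np.argsort(-sims)[:top_k]  # highest similarity first
    return order, sims[order]


def symmetric_info_nce(video_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched (video, text) pairs.

    Row i of video_emb is assumed to match row i of text_emb; every other
    row in the batch acts as a negative example.
    """
    v = l2_normalize(video_emb)
    t = l2_normalize(text_emb)
    logits = (v @ t.T) / temperature  # (batch, batch); diagonal = matched pairs

    def cross_entropy_diag(z):
        z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
        log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the video-to-text and text-to-video directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


# Toy usage with hand-made 2-D embeddings (illustrative values only).
frames = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
query = np.array([1.0, 0.0])
indices, scores = rank_keyframes(frames, query, top_k=2)
```

In this toy case the second frame points in the same direction as the query and is ranked first. In a real system the embeddings would come from trained video and text encoders, and the contrastive loss would be minimized during training so that matched video-text pairs score higher than mismatched ones.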

Authors

  • Xing Wu
  • Guoyin Yang
    School of Computer Engineering and Science, Shanghai University, Shanghai 200444, P. R. China.
  • Jingwen Li
    Cloud Computing and Big Data Research Institute, China Academy of Information and Communications Technology, People's Republic of China.
  • Jian Zhang
    College of Pharmacy, Ningxia Medical University, Yinchuan, Ningxia Hui Autonomous Region, China.
  • Qun Sun
    School of Mechanical and Automotive Engineering, Liaocheng University, Liaocheng, China.
  • Xianhua Han
  • Quan Qian
    School of Computer Engineering and Science, Shanghai University, Shanghai 200444, P. R. China.
  • Yanwei Chen
    College of Information Science and Engineering, Ritsumeikan University, Shiga-Ken 525-8577, Japan.