S2P-Matching: Self-Supervised Patch-Based Matching Using Transformer for Capsule Endoscopic Images Stitching.

Journal: IEEE transactions on bio-medical engineering
PMID:

Abstract

Magnetically Controlled Capsule Endoscopy (MCCE) has a limited shooting range, so it captures numerous fragmented images and cannot precisely locate and examine the region of interest (ROI) as traditional endoscopy can. To address this issue, image stitching around the ROI can be employed to aid the diagnosis of gastrointestinal (GI) tract conditions. However, MCCE images have distinctive characteristics, such as weak texture, close-up shooting, and large-angle rotation, that challenge current image-matching methods. In this context, a method named S2P-Matching is proposed for self-supervised patch-based matching in MCCE image stitching. The method augments the raw data by simulating the capsule endoscopic camera's behavior around the GI tract's ROI. An improved contrastive learning encoder is then used to extract local features, represented as deep feature descriptors. This encoder comprises two branches that extract features at different scales, which are combined along the channel dimension without manual labeling. The data-driven descriptors are then fed into a Transformer model to obtain patch-level matches by learning the globally consented matching priors in the pseudo-ground-truth match pairs. Finally, the patch-level matches are refined and filtered to the pixel level. Experimental results on real-world MCCE images demonstrate that S2P-Matching improves matching accuracy under the challenging conditions of the GI tract environment, including image parallax. The performance improvement can reach up to 203 and 55.8% in terms of NCM (Number of Correct Matches) and SR (Success Rate), respectively. This approach is expected to facilitate the wide adoption of MCCE-based gastrointestinal screening.
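
As a rough illustration of two ideas in the abstract, the sketch below shows how pseudo-ground-truth patch matches can be obtained from a known synthetic transform, and how NCM and SR might then be computed. It is a minimal sketch under stated assumptions, not the authors' implementation: the similarity transform (rotation, scale, shift) stands in for the camera-behavior simulation, and the function names, the 3-pixel error threshold, and the minimum-NCM criterion for a "successful" pair are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def similarity_matrix(angle_deg, scale, shift, center):
    """3x3 homogeneous transform: rotate and scale about `center`, then shift.
    Assumption: a stand-in for simulated capsule camera motion."""
    theta = np.deg2rad(angle_deg)
    c, n = scale * np.cos(theta), scale * np.sin(theta)
    (cx, cy), (tx, ty) = center, shift
    return np.array([[c, -n, cx - c * cx + n * cy + tx],
                     [n,  c, cy - n * cx - c * cy + ty],
                     [0.0, 0.0, 1.0]])

def warp_points(H, pts):
    """Apply a 3x3 homogeneous transform to an (N, 2) array of (x, y) points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    out = homog @ H.T
    return out[:, :2] / out[:, 2:3]

def patch_centers(height, width, patch):
    """(x, y) centers of non-overlapping patch x patch cells, in row-major order."""
    rows, cols = height // patch, width // patch
    xs = (np.arange(cols) + 0.5) * patch
    ys = (np.arange(rows) + 0.5) * patch
    gx, gy = np.meshgrid(xs, ys)
    return np.stack([gx.ravel(), gy.ravel()], axis=1)

def pseudo_gt_patch_matches(H, height, width, patch):
    """Pair each source patch with the target patch that contains its warped center.
    Patches whose center leaves the image are dropped."""
    rows, cols = height // patch, width // patch
    dst = warp_points(H, patch_centers(height, width, patch))
    inside = ((dst[:, 0] >= 0) & (dst[:, 0] < cols * patch) &
              (dst[:, 1] >= 0) & (dst[:, 1] < rows * patch))
    src_idx = np.flatnonzero(inside)
    dst_col = (dst[inside, 0] // patch).astype(int)
    dst_row = (dst[inside, 1] // patch).astype(int)
    return src_idx, dst_row * cols + dst_col

def number_of_correct_matches(pred_src, pred_dst, H, threshold=3.0):
    """NCM under an assumed criterion: reprojection error within `threshold` pixels."""
    err = np.linalg.norm(warp_points(H, pred_src) - np.asarray(pred_dst, float), axis=1)
    return int(np.sum(err <= threshold))

def success_rate(ncm_per_pair, min_ncm):
    """SR under an assumed criterion: fraction of image pairs with NCM >= min_ncm."""
    return float(np.mean(np.asarray(ncm_per_pair) >= min_ncm))

# Usage: a 90-degree rotation about the center of a 64x64 image with 16-pixel patches.
H = similarity_matrix(90.0, 1.0, (0.0, 0.0), (32.0, 32.0))
src_idx, dst_idx = pseudo_gt_patch_matches(H, 64, 64, 16)
```

In a self-supervised setup of this kind, the transform parameters would typically be sampled at random for each training pair, so the index pairs serve as labels without any manual annotation.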

Authors

  • Feng Lu
    National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
  • Dao Zhou
  • Haoyang Chen
    School of Mathematics and Statistics, Hainan Normal University, Hainan, China; School of Software, Shandong University, Jinan, China.
  • Shuai Liu
    Graduate School of Chinese Academy of Traditional Chinese Medicine, Beijing, China.
  • Xianliang Ling
  • Lei Zhu
    School of Civil and Hydraulic Engineering, Ningxia University, Yinchuan, China.
  • Tingting Gong
    Department of Gastroenterology, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.
  • Bin Sheng
    MOE Key Laboratory of AI, School of Electronic, Information, and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China.
  • Xiaofei Liao
    National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
  • Hai Jin
    National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
  • Ping Li
    Department of Gastroenterology, Beijing Ditan Hospital, Capital Medical University, Beijing, China.
  • David Dagan Feng