A multimodal deep learning model for detecting endoscopic images of near-infrared fluorescence capsules.

Journal: Biosensors & bioelectronics
PMID:

Abstract

Early screening for gastrointestinal (GI) diseases is critical for preventing cancer development. With the rapid advancement of deep learning technology, artificial intelligence (AI) has become increasingly prominent in the early detection of GI diseases. Capsule endoscopy is a non-invasive medical imaging technique used to examine the GI tract. In our previous work, we developed a near-infrared fluorescence capsule endoscope (NIRF-CE) that excites and captures near-infrared (NIR) fluorescence images to specifically identify subtle mucosal microlesions and submucosal abnormalities, while simultaneously capturing conventional white-light images to detect lesions with significant morphological changes. However, limitations such as low camera resolution and poor lighting within the GI tract may lead to misdiagnosis and other medical errors, and manually reviewing and interpreting large volumes of capsule endoscopy images is time-consuming and error-prone. Deep learning models have shown potential for automatically detecting abnormalities in NIRF-CE images. This study focuses on an improved deep learning model called Retinex-Attention-YOLO (RAY), which operates on single-modality image data and is built on the YOLO series of object detection models. RAY enhances the accuracy and efficiency of anomaly detection, especially under low-light conditions. To further improve detection performance, we also propose a multimodal deep learning model, Multimodal-Retinex-Attention-YOLO (MRAY), which combines white-light and fluorescence image data. The dataset used in this study consists of images of pig stomachs, which simulate the human GI tract, captured by our NIRF-CE system. The system is used in conjunction with a targeted fluorescent probe that accumulates at lesion sites and emits a fluorescent signal when abnormalities are present, so a bright spot in the fluorescence image indicates a lesion.
The MRAY model achieved a precision of 96.3%, outperforming similar object detection models. To further validate the model's performance, ablation experiments were conducted, and comparisons were made using publicly available datasets. MRAY shows great promise for the automated detection of GI cancers, ulcers, inflammation, and other medical conditions in clinical practice.
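
For readers unfamiliar with how a precision figure such as the one reported above is typically obtained for an object detector, the following is a minimal sketch, not the authors' evaluation code. It assumes axis-aligned boxes in (x1, y1, x2, y2) form, greedy matching in descending score order, and an IoU threshold of 0.5; the abstract specifies none of these, and the function names `iou` and `detection_precision` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_precision(predictions, ground_truth, iou_threshold=0.5):
    """Precision = TP / (TP + FP) for a single image.

    predictions:  list of (box, score) pairs from the detector
    ground_truth: list of annotated lesion boxes
    The 0.5 threshold is an assumed default, not taken from the paper.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    tp = fp = 0
    # Visit detections from most to least confident.
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)  # true positive: overlaps an unclaimed lesion
            tp += 1
        else:
            fp += 1  # false positive: no sufficiently overlapping lesion
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    lesions = [(10, 10, 50, 50), (60, 60, 100, 100)]
    detections = [
        ((12, 12, 50, 50), 0.9),      # overlaps the first lesion
        ((200, 200, 240, 240), 0.8),  # overlaps nothing
        ((60, 60, 100, 100), 0.7),    # exact match for the second lesion
    ]
    print(detection_precision(lesions and detections, lesions))
```

In this toy example two of the three detections match an annotated lesion, so the function returns 2/3. A dataset-level figure would aggregate true and false positives over all images before dividing.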

Authors

  • Junhao Wang
    Institute of Nano Biomedicine and Engineering, School of Sensing Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan RD, Shanghai, 200240, PR China.
  • Cheng Zhou
    Department of Radiology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China; Joint Laboratory of Clinical Radiology, the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China.
  • Wei Wang
    State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau 999078, China.
  • Hanxiao Zhang
    School of Accounting, Guangzhou Huashang College, Guangzhou, China.
  • Amin Zhang
    Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China. zhangamin@sjtu.edu.cn.
  • Daxiang Cui
    Department of Instrument Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Engineering Research Center for Intelligent Diagnosis and Treatment Instrument, Shanghai 200240, China. Electronic address: dxcui@sjtu.edu.cn.