LUNETR: Language-Infused UNETR for precise pancreatic tumor segmentation in 3D medical image.

Journal: Neural networks : the official journal of the International Neural Network Society

PMID: 40117980

Abstract

The identification of early micro-lesions and adjacent blood vessels in CT scans plays a pivotal role in the clinical diagnosis of pancreatic cancer, considering its aggressive nature and high fatality rate. Despite the widespread application of deep learning methods for this task, several challenges persist: (1) the complex background environment in abdominal CT scans complicates the accurate localization of potential micro-tumors; (2) the subtle contrast between micro-lesions within pancreatic tissue and the surrounding tissues makes it challenging for models to capture these features accurately; and (3) tumors that invade adjacent blood vessels pose significant barriers to surgical procedures. To address these challenges, we propose LUNETR (Language-Infused UNETR), an advanced multimodal encoder model that combines textual and image information for precise medical image segmentation. The integration of an autoencoding language model with cross-attention enables our model to effectively leverage semantic associations between textual and image data, thereby facilitating precise localization of potential pancreatic micro-tumors. Additionally, we designed a Multi-scale Aggregation Attention (MSAA) module to comprehensively capture both spatial and channel characteristics of global multi-scale image data, enhancing the model's capacity to extract features from micro-lesions embedded within pancreatic tissue. Furthermore, to facilitate precise segmentation of pancreatic tumors and nearby blood vessels and to address the scarcity of multimodal medical datasets, we collaborated with Zhuzhou Central Hospital to construct a multimodal dataset comprising CT images and corresponding pathology reports from 135 pancreatic cancer patients. Our experimental results surpass current state-of-the-art models, with the incorporation of the semantic encoder improving the average Dice score for pancreatic tumor segmentation by 2.23%. For the Medical Segmentation Decathlon (MSD) liver and lung cancer datasets, our model achieved average Dice score improvements of 4.31% and 3.67%, respectively, demonstrating the efficacy of LUNETR.
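
The results above are reported as Dice scores. As a reader aid, the following is a minimal sketch of how a Dice score is commonly computed for binary 3D segmentation masks. It is not taken from the paper: the function name `dice_score`, the toy volumes, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def dice_score(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * float(intersection) / float(denom)

# Toy 3D volumes (hypothetical data, not from the study).
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:3] = True      # 8 predicted voxels
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:4] = True    # 12 ground-truth voxels, 8 of them shared

print(dice_score(pred, target))  # 2 * 8 / (8 + 12) = 0.8
```

An average Dice score, as quoted in the abstract, would then be the mean of this value over the evaluated cases.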

Authors

Ziyang Shi

School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China.

Ruopeng Zhang

School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China.

Xiajun Wei

Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China.

Cheng Yu

Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China.

Haojie Xie

School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China.

Zhen Hu

Institute for Health Informatics.

Xili Chen

School of Electronic Information and Physics, Central South University of Forestry and Technology, Changsha 410004, China.

Yongzhong Zhang

Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences & Peking Union Medical College, Tian Tan Xi Li 1#, Beijing 100050, China.

Bin Xie

School of Automation, Central South University, Changsha, China. xiebin@csu.edu.cn.

Zhengmao Luo

Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China.

Wanxiang Peng

Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China.

Xiaochun Xie

Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China.

Fang Li

Department of General Surgery, Chongqing General Hospital, Chongqing, China.

Xiaoli Long

Department of Radiology, Zhuzhou Hospital Affiliated to Xiangya School of Medicine, Central South University, Zhuzhou 412002, China.

Lin Li

Department of Medicine III, LMU University Hospital, LMU Munich, Munich, Germany.

Linan Hu

Zhuzhou Central Hospital, Zhuzhou, 412000, Hunan, China.

Keywords

Deep Learning; Humans; Image Processing, Computer-Assisted; Imaging, Three-Dimensional; Language; Pancreatic Neoplasms; Tomography, X-Ray Computed
