MMsurv: a multimodal multi-instance multi-cancer survival prediction model integrating pathological images, clinical information, and sequencing data.

Journal: Briefings in bioinformatics

Published Date: May 1, 2025

Abstract

Accurate prediction of patient survival is essential for effective therapeutic planning in cancer treatment. However, current models often underutilize the extensive multimodal data available, which limits confidence in their predictions. This study presents MMSurv, an interpretable multimodal deep learning model that predicts survival across different types of cancer. MMSurv integrates clinical information, sequencing data, and hematoxylin and eosin-stained whole-slide images (WSIs) to forecast patient survival. Specifically, we segment the tumor regions of each WSI into image tiles and employ neural networks to encode each tile as a one-dimensional feature vector. We then optimize the clinical features by applying word embedding techniques, inspired by natural language processing, to the clinical data. To better exploit the complementarity of the multimodal data, we propose a novel fusion method, the multimodal fusion method based on compact bilinear pooling and Transformer, which integrates bilinear pooling with the Transformer architecture. The fused features are then processed by a dual-layer multi-instance learning model that removes prognosis-irrelevant image patches and predicts each patient's survival risk. Furthermore, we use cell segmentation to examine the cellular composition of the tiles that received high attention from the model, thereby enhancing its interpretability. We evaluate our approach on six cancer types from The Cancer Genome Atlas. The results show that using multimodal data yields higher predictive accuracy than using single-modal image data alone, raising the average C-index from 0.6750 to 0.7283. Additionally, we compare our proposed baseline model with state-of-the-art methods using the C-index under five-fold cross-validation, and observe a significant average improvement of nearly 10% in our model's performance.
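The C-index reported above measures how often a model ranks patients correctly: among pairs whose survival order is known, it is the fraction in which the patient with the shorter survival time received the higher predicted risk. The abstract does not describe how the metric was implemented, so the following is only a minimal sketch of Harrell's C-index for right-censored data. The function name, the choice to skip tied survival times, and the toy values are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, not the authors' code).

    times  -- observed follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # model ranked the pair correctly
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the second one censored.
toy_times = [5, 8, 12, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.1, 0.6]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

In this toy cohort four pairs are comparable and three of them are ranked correctly, so the sketch yields 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported rise from 0.6750 to 0.7283 in context.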

Authors

Hailong Yang

School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China.

Jia Wang

Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, Jilin, China.

Wenyan Wang

School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan 243032, China.

Shufang Shi

Department of Sciences, Geneis Beijing Co., Ltd., No. 31 Xinbei Road, Laiguangying, Chaoyang District, Beijing 100102, China.

Lijing Liu

Medical College, Hunan University of Medicine, Huaihua 418000, China.

Yuhua Yao

College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou 310018, China; School of Mathematics and Statistics, Hainan Normal University, Haikou 571158, China. Electronic address: yaoyuhua2288@163.com.

Geng Tian

Department of Sciences, Geneis (Beijing) Co. Ltd., Beijing, China.

Peizhen Wang

School of Electrical and Information Engineering, Anhui University of Technology, No. 1530 Maxiang Road, Huashan District, Ma'anshan, Anhui 243032, China.

Jialiang Yang

Department of Sciences, Geneis (Beijing) Co. Ltd., Beijing, China.

Keywords

Algorithms; Computational Biology; Deep Learning; Humans; Image Processing, Computer-Assisted; Neoplasms; Neural Networks, Computer; Prognosis

External Resources

PubMed ID: 40366860
