ZS-MNET: A zero-shot learning based approach to multimodal named entity typing.
Journal:
Neural Networks: The Official Journal of the International Neural Network Society
PMID:
40010289
Abstract
Named entity typing (NET) on social platforms is an important task that assigns type labels to named entities mentioned in unstructured text. Existing NET methods rely on the text modality alone to classify entity types and ignore the semantic correlations present in multimodal data. Moreover, as multimodal data grows, the type set grows with it, and newly emerging entity types should be recognized without additional training. To address these shortcomings, we introduce a zero-shot-learning-based multimodal NET (ZS-MNET) model that combines the textual and visual modalities to recognize previously unseen named entity types. Unlike traditional zero-shot NET (ZS-NET) models, ZS-MNET exploits both text and image information to bridge the semantic correlation between multimodal data and label information. To obtain fine-grained multimodal representations, we employ transformer-based pre-trained language and vision models, namely BERT and ViT. In addition, we fuse several multimodal representations that focus on fine-grained features to model the semantic correlation between multimodal data and entity types. Experimental results underscore the utility of multimodal data for NET and show that our approach outperforms previous ZS-NET models.
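To make the zero-shot matching idea concrete, the following is a minimal, illustrative sketch, not the authors' implementation: a text mention is encoded with BERT, the accompanying image with ViT, the two are fused, and the fused vector is scored against BERT embeddings of candidate type names, so unseen types can be ranked without retraining. The HuggingFace checkpoints, the linear fusion layer, the prompt wording, and the image path are all assumptions made for the example.

```python
# Sketch of zero-shot multimodal entity typing (assumed design, not the
# paper's exact architecture): match a fused BERT+ViT representation
# against BERT embeddings of unseen type names.
import torch
from PIL import Image
from transformers import BertTokenizer, BertModel, ViTImageProcessor, ViTModel

bert_tok = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
vit_proc = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
vit = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")

def encode_text(text: str) -> torch.Tensor:
    inputs = bert_tok(text, return_tensors="pt")
    with torch.no_grad():
        out = bert(**inputs)
    return out.last_hidden_state[:, 0]          # BERT [CLS] embedding, shape (1, 768)

def encode_image(image: Image.Image) -> torch.Tensor:
    inputs = vit_proc(images=image, return_tensors="pt")
    with torch.no_grad():
        out = vit(**inputs)
    return out.last_hidden_state[:, 0]          # ViT [CLS] embedding, shape (1, 768)

# Hypothetical fusion: concatenate both modalities, project back to 768 dims.
# In a real system this layer would be trained on the *seen* types only.
fuse = torch.nn.Linear(768 * 2, 768)

def type_scores(sentence: str, entity: str, image: Image.Image, types: list[str]) -> torch.Tensor:
    # Illustrative prompt that marks the entity mention in its sentence.
    mention = encode_text(f"{sentence} The entity is {entity}.")
    visual = encode_image(image)
    fused = fuse(torch.cat([mention, visual], dim=-1))
    # Unseen types are represented by BERT embeddings of their names,
    # which is what lets new types be scored without additional training.
    labels = torch.cat([encode_text(t) for t in types], dim=0)
    return torch.nn.functional.cosine_similarity(fused, labels)

img = Image.open("tweet_photo.jpg")             # placeholder path
scores = type_scores("Messi lifted the trophy.", "Messi",
                     img, ["athlete", "politician", "musician"])
print(scores.argmax().item())                   # index of the highest-scoring type
```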