A dual-channel language decoding from brain activity with progressive transfer training.

Journal: Human Brain Mapping
PMID:

Abstract

When we view a scene, the visual cortex extracts and processes its visual information through various kinds of neural activity. Previous studies have decoded this neural activity into single or multiple semantic category tags, which can caption the scene to some extent. However, such tags are isolated words with no grammatical structure and convey only part of what the scene contains. It is well known that textual language (sentences or phrases) is superior to single words in disclosing the meaning of images and in reflecting people's real understanding of them. Here, based on artificial intelligence technologies, we attempted to build a dual-channel language decoding model (DC-LDM) to decode the neural activities evoked by images into language (phrases or short sentences). The DC-LDM consisted of five modules, namely, Image-Extractor, Image-Encoder, Nerve-Extractor, Nerve-Encoder, and Language-Decoder. In addition, we employed a progressive transfer strategy to train the DC-LDM and improve its language-decoding performance. The results showed that the texts decoded by the DC-LDM described natural image stimuli accurately and vividly. We adopted six indexes to quantitatively evaluate the difference between the decoded texts and the annotated texts of the corresponding visual images, and found that Word2vec-Cosine similarity (WCS) was the best indicator of the similarity between the decoded and annotated texts. Moreover, among the different visual cortices, texts decoded from the higher visual cortex were more consistent with the descriptions of the natural images than those from the lower one. Our decoding model may provide enlightenment for language-based brain-computer interface explorations.
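The abstract names Word2vec-Cosine similarity (WCS) as its best text-similarity index but does not spell out the computation. A common construction, and a minimal sketch of it, is: embed each word with Word2vec, average the word vectors of each text into a sentence vector, and take the cosine of the two sentence vectors. The tiny embedding table below is purely illustrative (the paper's actual Word2vec model and vocabulary are not given here).

```python
import numpy as np

# Toy embedding table standing in for a pretrained Word2vec model
# (illustrative vectors only; not the embeddings used in the paper).
EMB = {
    "a":     np.array([0.1, 0.1, 0.1]),
    "dog":   np.array([1.0, 0.2, 0.0]),
    "puppy": np.array([0.9, 0.3, 0.1]),
    "car":   np.array([0.0, 0.1, 1.0]),
    "runs":  np.array([0.2, 1.0, 0.2]),
}

def sentence_vector(text, emb):
    """Average the word vectors of the in-vocabulary tokens."""
    vecs = [emb[w] for w in text.lower().split() if w in emb]
    return np.mean(vecs, axis=0)

def wcs(decoded, annotated, emb=EMB):
    """Sketch of Word2vec-Cosine similarity between two texts:
    cosine of the mean word vectors (1.0 = identical direction)."""
    u = sentence_vector(decoded, emb)
    v = sentence_vector(annotated, emb)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```

Under this construction, a decoded text that paraphrases the annotation (e.g. "a puppy runs" vs. "a dog runs") scores higher than an unrelated one, which is the property that makes WCS a useful similarity index for decoded captions.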

Authors

  • Wei Huang
    Shaanxi Institute of Flexible Electronics, Northwestern Polytechnical University, 710072 Xi'an, China.
  • Hongmei Yan
    Ministry of Education Key Laboratory of Metabolism and Molecular Medicine, Department of Endocrinology and Metabolism, Zhongshan Hospital, Fudan University, Shanghai, China.
  • Kaiwen Cheng
    School of Language Intelligence, Sichuan International Studies University, Chongqing, China.
  • Yuting Wang
    Respiratory Department, Dongzhimen Hospital Affiliated to BUCM, Beijing, China.
  • Chong Wang
    Shandong Xinhua Pharmaceutical Co., Ltd., No. 1, Lu Tai Road, High Tech Zone, Zibo 255199, China.
  • Jiyi Li
    MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China.
  • Chen Li
    School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
  • Chaorong Li
    The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China.
  • Zhentao Zuo
    State Key Laboratory of Brain and Cognitive Science, Beijing MR Center for Brain Research, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China. ztzuo@bcslab.ibp.ac.cn.
  • Huafu Chen
    Key laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology and Center for Information in BioMedicine, University of Electronic Science and Technology of China, Chengdu 610054, PR China. Electronic address: chenhf@uestc.edu.cn.