PhyTransformer: A unified framework for learning spatial-temporal representation from physiological signals.
Journal:
Neural Networks: the official journal of the International Neural Network Society
Published Date:
May 19, 2025
Abstract
As modalities of physiological information, electroencephalogram (EEG), surface electromyography (sEMG), and eye-tracking (ET) signals are widely used to decode human intention, promoting the development of human-computer interaction systems. Extensive studies have achieved single-modal signal decoding with architectures that differ substantially in structure, but at the cost of massive computing resources and development effort. Considering the similarity of these signals in data structure and features, this work proposes a unified framework, PhyTransformer, that extracts temporal dynamics and complex channel relationships to decode physiological signals in a general manner. Concretely, PhyTransformer uses stacked distillation convolutions to capture complementary temporal dynamic representations from local to global scales. To fuse information across channels, the method regards the temporal dynamics of each channel as a token and feeds the tokens into a multi-head attention network to model complex channel relationships. Subsequently, to weigh channel contributions and fuse the representations from different convolution kernels, PhyTransformer adopts a depth-wise and a separable convolution to extract the final spatial-temporal representation. The proposed method has been evaluated on six public benchmark datasets for physiological signal classification: THU and GIST for EEG, Ninapro DB1 and DB6 for sEMG, and GazeCom and HMR for ET. Experimental results illustrate that PhyTransformer learns robust spatial-temporal representations across multiple physiological signal modalities. The code is available at https://github.com/Tammie-Li/PhyTransformer.
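The abstract describes a three-stage pipeline: per-channel temporal convolutions, attention over channel tokens, and depth-wise/separable fusion. As a rough orientation only, the PyTorch sketch below illustrates that shape of pipeline; the class name PhyTransformerSketch, all kernel sizes, layer widths, the pooling step, and the classification head are illustrative assumptions, not the authors' actual configuration (see the repository linked above for the real model).

import torch
import torch.nn as nn


class PhyTransformerSketch(nn.Module):
    """Illustrative three-stage pipeline: temporal convolutions,
    channel-as-token attention, and depth-wise/separable fusion.
    All layer sizes are assumptions, not the paper's configuration."""

    def __init__(self, n_channels: int = 64, d_model: int = 64,
                 n_heads: int = 4, n_classes: int = 2):
        super().__init__()
        # Stage 1: stacked temporal convolutions with growing kernels,
        # approximating the "local to global" distillation idea.
        self.temporal = nn.Sequential(
            nn.Conv1d(1, d_model, kernel_size=15, padding=7),
            nn.ELU(),
            nn.Conv1d(d_model, d_model, kernel_size=31, padding=15),
            nn.ELU(),
            nn.AdaptiveAvgPool1d(1),  # pool the time axis: one embedding per channel
        )
        # Stage 2: each channel's embedding is one token; self-attention
        # models the relationships between channels.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Stage 3: depth-wise then point-wise (separable) convolution to
        # weigh channel contributions and fuse the features.
        self.depthwise = nn.Conv1d(n_channels, n_channels, kernel_size=3,
                                   padding=1, groups=n_channels)
        self.pointwise = nn.Conv1d(n_channels, n_channels, kernel_size=1)
        self.head = nn.Linear(n_channels * d_model, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, t = x.shape                              # (batch, channels, time)
        tok = self.temporal(x.reshape(b * c, 1, t))    # (b*c, d_model, 1)
        tok = tok.squeeze(-1).reshape(b, c, -1)        # one token per channel
        fused, _ = self.attn(tok, tok, tok)            # inter-channel attention
        fused = self.pointwise(self.depthwise(fused))  # separable fusion
        return self.head(fused.flatten(1))             # class logits


if __name__ == "__main__":
    x = torch.randn(8, 64, 1000)            # e.g. 64-channel EEG, 1000 samples
    print(PhyTransformerSketch()(x).shape)  # torch.Size([8, 2])

Under these assumed defaults, a batch of 64-channel signals of 1000 time samples yields logits of shape (8, 2); the actual channel counts, window lengths, and class counts vary across the six datasets named in the abstract.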