Swallowing Assessment using High-Resolution Cervical Auscultations and Transformer-based Neural Networks.
Journal:
Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference
PMID:
40040030
Abstract
Swallowing assessment is crucial for revealing swallowing abnormalities. Several modalities are used to analyze swallowing kinematics, including videofluoroscopic swallow studies (VFSS), the gold-standard method, and high-resolution cervical auscultation (HRCA), a noninvasive technique that uses a triaxial accelerometer attached to the patient's neck. Deep learning models play an essential role in data-driven analysis of swallowing landmarks using VFSS and/or HRCA as input data. Most of these models rely on convolutional and recurrent neural networks. Here, we investigate the ability of transformers to analyze swallowing kinematics from HRCA signals, specifically upper esophageal sphincter opening and laryngeal vestibule closure. We tested the model on an independent test dataset to assess the generalizability of the proposed network. The proposed network achieved average detection accuracies higher than 90% and 85% on the two segmentation tasks, outperforming the hybrid neural networks from the literature. The model also obtained high performance measures on the independent dataset, showing the transformers' ability to generalize to unseen data.