IEEE transactions on neural systems and rehabilitation engineering : a publication of the IEEE Engineering in Medicine and Biology Society
39255187
OBJECTIVE: Speech brain-computer interfaces (speech BCIs), which convert brain signals into spoken words or sentences, have demonstrated great potential for high-performance BCI communication. Phonemes are the basic pronunciation units. For monosylla...
Neural networks : the official journal of the International Neural Network Society
39255636
Single-channel speech enhancement primarily relies on deep learning models to recover clean speech signals from noise-contaminated speech. These models establish a mapping relationship between noisy and clean speech. However, considering the sparse d...
OBJECTIVES: Maintenance of oral muscle functions is important for survival and communication. Using Artificial Intelligence (AI) as a self-health-management tool has shown promise. Here we developed a functional and AI-enabled smartphone e-Or...
Emotion recognition through speech is a technique employed in various scenarios of Human-Computer Interaction (HCI). Existing approaches have achieved significant results; however, limitations persist, with the quantity and diversity of data being mo...
This study advances the automation of Parkinson's disease (PD) diagnosis by analyzing speech characteristics, leveraging a comprehensive approach that integrates a voting-based machine learning model. Given the growing prevalence of PD, especially am...
Given an orthographic transcription, forced alignment systems automatically determine boundaries between segments in speech, facilitating the use of large corpora. In the present paper, we introduce a neural network-based forced alignment system, the...
Traditional English corpora mainly collect information from a single modality and lack multimodal information, resulting in low-quality corpus information and problems with recognition accuracy. To solve the above problem...
This paper describes an original dataset of children's speech, collected through the use of JIBO, a social robot. The dataset encompasses recordings from 110 children, aged 4-7 years old, who participated in a letter and digit identification task and...
Adolescence is a significant period for developing skills and knowledge and for learning to manage relationships and emotions while acquiring the attributes of maturity. Recently, depression has emerged as a common mental health issue in adolescents and this a...
IEEE transactions on pattern analysis and machine intelligence
39437301
Visual Speech Recognition (VSR) aims to transcribe speech into text from lip movements alone. As it focuses on visual information to model the speech, its performance is inherently sensitive to personal lip appearances and movements, and this make...