Proceedings of the National Academy of Sciences of the United States of America
35921434
Understanding spoken language requires transforming ambiguous acoustic streams into a hierarchy of representations, from phonemes to meaning. It has been suggested that the brain uses prediction to guide the interpretation of incoming input. However,...
Computational intelligence and neuroscience
36248946
With the emergence of the information age, computers have entered ordinary homes and become essential everyday appliances, and the integration of people and computers has grown more widespread and in-depth. Based on this situatio...
Computational intelligence and neuroscience
36059405
A bone-conducted microphone (BCM) senses vibrations from the bones of the skull during speech and converts them into an electrical audio signal. When transmitting speech signals, BCMs capture them based on the vibrations of the speaker's s...
Computational intelligence and neuroscience
35942440
This study aims to improve the accuracy of oral English recognition and to propose evaluation measures with better performance. The work builds on related theories from deep learning, speech recognition, and oral English practice. As the l...
Annual International Conference of the IEEE Engineering in Medicine and Biology Society
36086160
Envelope waveforms can be extracted from multiple frequency bands of a speech signal, and envelope waveforms carry important intelligibility information for human speech communication. This study aimed to investigate whether a deep learning-based mod...
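As background to the record above (not taken from the paper itself): extracting the envelope waveform of a frequency band is commonly done with a band-pass filter followed by a Hilbert-transform magnitude. A minimal sketch, assuming NumPy and SciPy are available; the function name and band choices are illustrative:

```python
# Illustrative sketch of band envelope extraction (band_envelopes is a
# hypothetical helper, not from the cited study).
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def band_envelopes(signal, fs, bands):
    """Return the amplitude envelope of `signal` in each (lo, hi) Hz band."""
    envs = []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)      # zero-phase band-pass filtering
        envs.append(np.abs(hilbert(band)))   # envelope via analytic signal
    return np.stack(envs)

# Toy usage: a 300 Hz tone sampled at 16 kHz should show a near-unit
# envelope in the 100-500 Hz band and a small envelope in 500-2000 Hz.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 300 * t)
E = band_envelopes(x, fs, [(100, 500), (500, 2000)])
```

Such per-band envelopes are the kind of intelligibility-bearing features the abstract refers to.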
Silent communication based on biosignals from the facial muscles requires accurate detection of their directional movement, and thus optimal positioning of a minimum number of sensors, to achieve higher speech-recognition accuracy with minimal person-to-person va...
Journal of voice : official journal of the Voice Foundation
36376192
OBJECTIVES: Machine learning (ML) methods allow the development of expert systems for pattern recognition and predictive analysis of intervention outcomes. They have been used in the Voice Sciences, mainly to discriminate between healthy and dysphonic voice...
Robust detection of Lombard speech in noise is challenging. This study proposes a strategy for detecting Lombard speech with a machine learning approach, for applications such as public address systems that work in near real time. The paper starts with the ...
This paper investigates multimodal sensor architectures with deep learning for audio-visual speech recognition (AVSR), focusing on in-the-wild scenarios. The term "in the wild" describes AVSR for unconstrained natural-language audio streams and vi...
Almost half a billion people worldwide suffer from disabling hearing loss. While hearing aids can partially compensate for this, a large proportion of users struggle to understand speech in situations with background noise. Here, we present a deep l...