AI Medical Compendium Topic

Explore the latest research on artificial intelligence and machine learning in medicine.

Speech Perception

Showing 91 to 100 of 111 articles


Computational framework for fusing eye movements and spoken narratives for image annotation.

Journal of Vision
Despite many recent advances in the field of computer vision, there remains a disconnect between how computers process images and how humans understand them. To begin to bridge this gap, we propose a framework that integrates human-elicited gaze and ...

A talker-independent deep learning algorithm to increase intelligibility for hearing-impaired listeners in reverberant competing talker conditions.

The Journal of the Acoustical Society of America
Deep learning based speech separation or noise reduction needs to generalize to voices not encountered during training and to operate under multiple corruptions. The current study provides such a demonstration for hearing-impaired (HI) listeners. Sen...

EARSHOT: A Minimal Neural Network Model of Incremental Human Speech Recognition.

Cognitive Science
Despite the lack of invariance problem (the many-to-many mapping between acoustics and percepts), human listeners experience phonetic constancy and typically perceive what a speaker intends. Most models of human speech recognition (HSR) have side-ste...

Machine Learning Approaches to Analyze Speech-Evoked Neurophysiological Responses.

Journal of Speech, Language, and Hearing Research: JSLHR
Purpose: Speech-evoked neurophysiological responses are often collected to answer clinically and theoretically driven questions concerning speech and language processing. Here, we highlight the practical application of machine learning (ML)-based appr...

A deep learning algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker and reverberation.

The Journal of the Acoustical Society of America
For deep learning based speech segregation to have translational significance as a noise-reduction tool, it must perform in a wide variety of acoustic environments. In the current study, performance was examined when target speech was subjected to in...

Talker change detection: A comparison of human and machine performance.

The Journal of the Acoustical Society of America
The automatic analysis of conversational audio remains difficult, in part, due to the presence of multiple talkers speaking in turns, often with significant intonation variations and overlapping speech. The majority of prior work on psychoacoustic sp...

Vision-referential speech enhancement of an audio signal using mask information captured as visual data.

The Journal of the Acoustical Society of America
This paper describes a vision-referential speech enhancement of an audio signal using mask information captured as visual data. Smartphones and tablet devices have become popular in recent years. Most of them not only have a microphone but also a cam...

Improving the performance of hearing aids in noisy environments based on deep learning technology.

Annual International Conference of the IEEE Engineering in Medicine and Biology Society
The performance of a deep-learning-based speech enhancement (SE) technology for hearing aid users, called a deep denoising autoencoder (DDAE), was investigated. The hearing-aid speech perception index (HASPI) and the hearing-aid sound quality index ...

Evaluating automatic speech recognition systems as quantitative models of cross-lingual phonetic category perception.

The Journal of the Acoustical Society of America
Theories of cross-linguistic phonetic category perception posit that listeners perceive foreign sounds by mapping them onto their native phonetic categories, but, until now, no way to effectively implement this mapping has been proposed. In this pape...

The combined use of virtual reality and EEG to study language processing in naturalistic environments.

Behavior Research Methods
When we comprehend language, we often do this in rich settings where we can use many cues to understand what someone is saying. However, it has traditionally been difficult to design experiments with rich three-dimensional contexts that resemble our ...