Speech Perception - AI Medical Compendium

On training targets for deep learning approaches to clean speech magnitude spectrum estimation.

The Journal of the Acoustical Society of America May 1, 2021

Estimation of the clean speech short-time magnitude spectrum (MS) is key for speech enhancement and separation. Moreover, an automatic speech recognition (ASR) system that employs a front-end relies on clean speech MS estimation to remain robust. Tra...

Deep Learning; Signal-To-Noise Ratio; Speech Intelligibility; Speech; Speech Perception; Noise

Early phonetic learning without phonetic categories: Insights from large-scale simulations on realistic input.

Proceedings of the National Academy of Sciences of the United States of America Feb 9, 2021

Before they even speak, infants become attuned to the sounds of the language(s) they hear, processing native phonetic contrasts more easily than nonnative ones. For example, between 6 to 8 mo and 10 to 12 mo, infants learning American English get bet...

Phonetics; Language Development; Models, Neurological; Humans; Speech Recognition Software; Natural Language Processing; Speech Perception

A two-stage deep learning algorithm for talker-independent speaker separation in reverberant conditions.

The Journal of the Acoustical Society of America Sep 1, 2020

Speaker separation is a special case of speech separation, in which the mixture signal comprises two or more speakers. Many talker-independent speaker separation methods have been introduced in recent years to address this problem in anechoic conditi...

Speech; Speech Perception; Algorithms; Deep Learning

Computational framework for fusing eye movements and spoken narratives for image annotation.

Journal of Vision Jul 1, 2020

Despite many recent advances in the field of computer vision, there remains a disconnect between how computers process images and how humans understand them. To begin to bridge this gap, we propose a framework that integrates human-elicited gaze and ...

Neural Networks, Computer; Databases, Factual; Female; Humans; Male; Adolescent; Eye Movements; Young Adult; Adult; Speech Perception; Semantics; Data Curation

A talker-independent deep learning algorithm to increase intelligibility for hearing-impaired listeners in reverberant competing talker conditions.

The Journal of the Acoustical Society of America Jun 1, 2020

Deep learning based speech separation or noise reduction needs to generalize to voices not encountered during training and to operate under multiple corruptions. The current study provides such a demonstration for hearing-impaired (HI) listeners. Sen...

Hearing Loss, Sensorineural; Algorithms; Deep Learning; Speech Intelligibility; Speech Perception; Hearing; Humans

EARSHOT: A Minimal Neural Network Model of Incremental Human Speech Recognition.

Cognitive Science Apr 1, 2020

Despite the lack of invariance problem (the many-to-many mapping between acoustics and percepts), human listeners experience phonetic constancy and typically perceive what a speaker intends. Most models of human speech recognition (HSR) have side-ste...

Speech Perception; Phonetics; Models, Neurological; Semantics; Neural Networks, Computer; Computer Simulation; Female; Male; Humans; Speech

Machine Learning Approaches to Analyze Speech-Evoked Neurophysiological Responses.

Journal of Speech, Language, and Hearing Research: JSLHR Mar 25, 2019

Purpose: Speech-evoked neurophysiological responses are often collected to answer clinically and theoretically driven questions concerning speech and language processing. Here, we highlight the practical application of machine learning (ML)-based appr...

Evoked Potentials, Auditory; Humans; Speech Perception; Machine Learning

A deep learning algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker and reverberation.

The Journal of the Acoustical Society of America Mar 1, 2019

For deep learning based speech segregation to have translational significance as a noise-reduction tool, it must perform in a wide variety of acoustic environments. In the current study, performance was examined when target speech was subjected to in...

Signal-To-Noise Ratio; Speech Intelligibility; Speech Perception; Hearing Loss, Sensorineural; Hearing Aids; Middle Aged; Deep Learning; Speech Recognition Software; Aged; Male; Humans; Female

Talker change detection: A comparison of human and machine performance.

The Journal of the Acoustical Society of America Jan 1, 2019

The automatic analysis of conversational audio remains difficult, in part, due to the presence of multiple talkers speaking in turns, often with significant intonation variations and overlapping speech. The majority of prior work on psychoacoustic sp...

Speech Recognition Software; Speech Intelligibility; Psychoacoustics; Adult; Humans; Speech Perception; Male; Female; Natural Language Processing

Vision-referential speech enhancement of an audio signal using mask information captured as visual data.

The Journal of the Acoustical Society of America Jan 1, 2019

This paper describes a vision-referential speech enhancement of an audio signal using mask information captured as visual data. Smartphones and tablet devices have become popular in recent years. Most of them not only have a microphone but also a cam...

Humans; Male; Speech Perception; Speech Recognition Software; Female; Image Processing, Computer-Assisted; Signal-To-Noise Ratio; Adult; Natural Language Processing