AIMC Topic: Speech

Showing 51 to 60 of 395 articles

Multimodal learning-based speech enhancement and separation, recent innovations, new horizons, challenges and real-world applications.

Computers in biology and medicine
With the increasing global prevalence of disabling hearing loss, speech enhancement technologies have become crucial for overcoming communication barriers and improving the quality of life for those affected. Multimodal learning has emerged as a powe...

A Tunable Forced Alignment System Based on Deep Learning: Applications to Child Speech.

Journal of speech, language, and hearing research : JSLHR
PURPOSE: Phonetic forced alignment has a multitude of applications in automated analysis of speech, particularly in studying nonstandard speech such as children's speech. Manual alignment is tedious but serves as the gold standard for clinical-grade ...

Enhancing target speaker extraction with Hierarchical Speaker Representation Learning.

Neural networks : the official journal of the International Neural Network Society
Target speaker extraction aims to obtain the speech of the specific speaker from a mixture of multiple voices. The conventional approach exploits the target speaker embeddings from a pre-recorded speech segment as auxiliary information, providing pri...

SuperM2M: Supervised and mixture-to-mixture co-learning for speech enhancement and noise-robust ASR.

Neural networks : the official journal of the International Neural Network Society
The current dominant approach for neural speech enhancement is based on supervised learning by using simulated training data. The trained models, however, often exhibit limited generalizability to real-recorded data. To address this, this paper inves...

Harnessing emotion and intonation in speech to improve robot acceptance.

Science robotics
The use of emotional words and expressive voices in robots alters the attribution of agency and experience by humans.

Building a Gender-Bias-Resistant Super Corpus as a Deep Learning Baseline for Speech Emotion Recognition.

Sensors (Basel, Switzerland)
The focus on Speech Emotion Recognition has dramatically increased in recent years, driven by the need for automatic speech-recognition-based systems and intelligent assistants to enhance user experience by incorporating emotional content. While deep...

Exploring emotional climate recognition in peer conversations through bispectral features and affect dynamics.

Computer methods and programs in biomedicine
BACKGROUND AND OBJECTIVE: Emotion recognition in conversations using artificial intelligence (AI) has gained significant attention due to its potential to provide insights into human social behavior. This study extends AI-based emotion recognition to...

A multi-dilated convolution network for speech emotion recognition.

Scientific reports
Speech emotion recognition (SER) is an important application in Affective Computing and Artificial Intelligence. Recently, there has been significant interest in Deep Neural Networks using speech spectrograms. As the two-dimensional representation ...

Machine learning-assisted wearable sensing systems for speech recognition and interaction.

Nature communications
The human voice stands out for its rich information transmission capabilities. However, voice communication is susceptible to interference from noisy environments and obstacles. Here, we propose a wearable wireless flexible skin-attached acoustic sen...

A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations.

Nature human behaviour
This study introduces a unified computational framework connecting acoustic, speech and word-level linguistic structures to study the neural basis of everyday conversations in the human brain. We used electrocorticography to record neural signals acr...