AIMC Topic: Speech

Showing 31 to 40 of 368 articles

SuperM2M: Supervised and mixture-to-mixture co-learning for speech enhancement and noise-robust ASR.

Neural networks: the official journal of the International Neural Network Society
The current dominant approach for neural speech enhancement is based on supervised learning using simulated training data. The trained models, however, often exhibit limited generalizability to real-recorded data. To address this limitation, this paper inves...
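
The supervised setup the abstract refers to is typically built by mixing clean speech with noise at a chosen signal-to-noise ratio to form (noisy input, clean target) training pairs. A minimal sketch of that simulation step (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float):
    """Create a simulated noisy mixture by scaling noise to a target SNR."""
    noise = noise[: len(clean)]                      # align lengths
    clean_power = np.mean(clean ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale noise so that 10*log10(clean_power / scaled_noise_power) == snr_db
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    mixture = clean + scale * noise
    return mixture, clean                            # (model input, training target)

# Usage: pair a clean utterance with a noise clip at 5 dB SNR
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)   # stand-in for 1 s of 16 kHz speech
noise = rng.standard_normal(16000)
noisy, target = mix_at_snr(clean, noise, snr_db=5.0)
```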

Harnessing emotion and intonation in speech to improve robot acceptance.

Science robotics
The use of emotional words and expressive voices in robots alters the attribution of agency and experience by humans.

Building a Gender-Bias-Resistant Super Corpus as a Deep Learning Baseline for Speech Emotion Recognition.

Sensors (Basel, Switzerland)
The focus on Speech Emotion Recognition has dramatically increased in recent years, driven by the need for automatic speech-recognition-based systems and intelligent assistants to enhance user experience by incorporating emotional content. While deep...
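
A "gender-bias-resistant super corpus" implies merging several SER datasets while equalizing gender representation per emotion. One plausible way to do this, sketched below with an assumed item schema ('emotion', 'gender', 'path') that is illustrative rather than the paper's:

```python
import random
from collections import defaultdict

def gender_balanced_merge(corpora, seed=0):
    """Merge SER corpora, downsampling so each (emotion, gender) cell
    contains the same number of utterances."""
    cells = defaultdict(list)
    for corpus in corpora:
        for item in corpus:
            cells[(item["emotion"], item["gender"])].append(item)
    n = min(len(v) for v in cells.values())          # size of the smallest cell
    rng = random.Random(seed)
    balanced = []
    for items in cells.values():
        balanced.extend(rng.sample(items, n))        # downsample each cell to n
    return balanced
```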

Exploring emotional climate recognition in peer conversations through bispectral features and affect dynamics.

Computer methods and programs in biomedicine
BACKGROUND AND OBJECTIVE: Emotion recognition in conversations using artificial intelligence (AI) has gained significant attention due to its potential to provide insights into human social behavior. This study extends AI-based emotion recognition to...
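
The bispectrum is the standard third-order spectrum, B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)], estimated by averaging over signal frames. A generic FFT-based estimator (a sketch of the feature family the abstract names, not the paper's exact pipeline):

```python
import numpy as np

def bispectrum(x, frame_len=256, hop=128):
    """Direct (FFT-based) bispectrum estimate, averaged over frames:
    B(f1, f2) = E[ X(f1) * X(f2) * conj(X(f1 + f2)) ]."""
    frames = [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]
    nf = frame_len // 2                              # keep non-negative frequencies
    acc = np.zeros((nf, nf), dtype=complex)
    f = np.arange(nf)
    for frame in frames:
        X = np.fft.fft(frame * np.hanning(frame_len))
        # Outer product gives X(f1)*X(f2); the conj term is indexed at f1+f2
        acc += np.outer(X[f], X[f]) * np.conj(X[(f[:, None] + f[None, :]) % frame_len])
    return acc / max(len(frames), 1)
```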

A multi-dilated convolution network for speech emotion recognition.

Scientific reports
Speech emotion recognition (SER) is an important application in Affective Computing and Artificial Intelligence. Recently, there has been a significant interest in Deep Neural Networks using speech spectrograms. As the two-dimensional representation ...
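
Multi-dilated convolution blocks are commonly built as parallel 2-D convolutions over the spectrogram with different dilation rates, concatenated along the channel axis to capture several receptive-field sizes at once. A hedged PyTorch sketch of that general pattern (not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class MultiDilatedBlock(nn.Module):
    """Parallel 2-D convolutions with different dilation rates over a
    spectrogram, concatenated along channels."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )
        self.act = nn.ReLU()

    def forward(self, x):                  # x: (batch, channels, freq, time)
        return self.act(torch.cat([b(x) for b in self.branches], dim=1))

# Usage: one block on a batch of 128x128 log-mel spectrograms
spec = torch.randn(8, 1, 128, 128)
out = MultiDilatedBlock(1, 16)(spec)       # -> (8, 48, 128, 128)
```

Padding each branch by its dilation rate keeps all branch outputs the same spatial size, so they can be concatenated directly.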

Machine learning-assisted wearable sensing systems for speech recognition and interaction.

Nature communications
The human voice stands out for its rich information transmission capabilities. However, voice communication is susceptible to interference from noisy environments and obstacles. Here, we propose a wearable wireless flexible skin-attached acoustic sen...

A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations.

Nature human behaviour
This study introduces a unified computational framework connecting acoustic, speech and word-level linguistic structures to study the neural basis of everyday conversations in the human brain. We used electrocorticography to record neural signals acr...
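
Studies of this kind typically quantify the embedding-to-brain mapping with linear encoding models: model embeddings for each word predict the neural response, scored by cross-validated R². A toy sketch of that standard analysis (data shapes and the sklearn-based setup are assumptions, not the paper's code):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Toy stand-ins: per-word embeddings (e.g., from a speech/language model)
# and the response of one electrode at a fixed lag.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((500, 768))       # 500 words x 768 dims
neural = embeddings @ rng.standard_normal(768) * 0.1 + rng.standard_normal(500)

# Linear encoding model: how well do the embeddings predict the electrode?
scores = cross_val_score(Ridge(alpha=100.0), embeddings, neural,
                         cv=5, scoring="r2")
print("mean cross-validated R^2:", scores.mean())
```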

Linguistic cues for automatic assessment of Alzheimer's disease across languages.

Journal of Alzheimer's disease: JAD
BACKGROUND: Most common forms of dementia, including Alzheimer's disease, are associated with alterations in spoken language. OBJECTIVE: This study explores the potential of a speech-based machine learning (ML) approach in estimating cognitive impairment,...
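
Speech-based dementia screening commonly derives transcript-level linguistic cues (lexical diversity, utterance length, filler rate) and feeds them to a classifier. A small sketch of that feature-extraction step; the feature set here is illustrative, not the paper's:

```python
import re

def linguistic_cues(transcript: str) -> dict:
    """A few transcript-level cues of the kind used in speech-based
    dementia screening."""
    words = re.findall(r"[a-zA-Z']+", transcript.lower())
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    fillers = {"uh", "um", "er", "eh"}
    return {
        "type_token_ratio": len(set(words)) / max(len(words), 1),
        "mean_sentence_len": len(words) / max(len(sentences), 1),
        "filler_rate": sum(w in fillers for w in words) / max(len(words), 1),
    }

print(linguistic_cues("Um, the boy is... uh, the boy takes the cookie. He falls."))
```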

MemoCMT: multimodal emotion recognition using cross-modal transformer-based feature fusion.

Scientific reports
Speech emotion recognition has seen a surge in transformer models, which excel at understanding the overall message by analyzing long-term patterns in speech. However, these models come at a computational cost. In contrast, convolutional neural netwo...
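
Cross-modal transformer fusion usually means one modality's features attend over the other's via multi-head cross-attention before classification. A generic PyTorch sketch of that mechanism (dimensions and pooling are assumptions, not MemoCMT's exact design):

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Text features attend over audio features via multi-head
    cross-attention; the fused vector feeds an emotion classifier."""
    def __init__(self, dim=256, heads=4, n_classes=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, text_feats, audio_feats):
        # text_feats: (B, Lt, dim) queries; audio_feats: (B, La, dim) keys/values
        fused, _ = self.attn(text_feats, audio_feats, audio_feats)
        return self.head(fused.mean(dim=1))          # pool over text tokens

logits = CrossModalFusion()(torch.randn(2, 10, 256), torch.randn(2, 50, 256))
```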

Multi-source sparse broad transfer learning for Parkinson's disease diagnosis via speech.

Medical & biological engineering & computing
Diagnosing Parkinson's disease (PD) via speech is crucial for its non-invasive and convenient data collection. However, the small sample size of PD speech data impedes accurate recognition of PD speech. Therefore, we propose a novel multi-source spar...
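
A Broad Learning System, the base model behind "broad transfer learning", maps inputs through random feature nodes and nonlinear enhancement nodes, then solves a ridge-regularized output layer in closed form. A generic BLS sketch; the paper's multi-source sparse variant additionally weights source corpora and imposes sparsity:

```python
import numpy as np

def broad_learning_fit(X, Y, n_feature=40, n_enhance=60, reg=1e-2, seed=0):
    """Basic Broad Learning System: random feature nodes plus nonlinear
    enhancement nodes, with a ridge solution for the output weights."""
    rng = np.random.default_rng(seed)
    Wf = rng.standard_normal((X.shape[1], n_feature))
    Z = X @ Wf                                        # feature nodes
    We = rng.standard_normal((n_feature, n_enhance))
    H = np.tanh(Z @ We)                               # enhancement nodes
    A = np.hstack([Z, H])
    # Ridge-regularized pseudo-inverse for the output weights
    W = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return (Wf, We, W)

def broad_learning_predict(model, X):
    Wf, We, W = model
    Z = X @ Wf
    return np.hstack([Z, np.tanh(Z @ We)]) @ W
```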