AIMC Topic: Speech

Showing 31 to 40 of 352 articles

Automated segmentation of child-clinician speech in naturalistic clinical contexts.

Research in developmental disabilities
BACKGROUND: Computational approaches hold significant promise for enhancing diagnosis and therapy in child and adolescent clinical practice. Clinical procedures heavily depend on vocal exchanges and interpersonal dynamics conveyed through speech. Rese...

Endpoint-aware audio-visual speech enhancement utilizing dynamic weight modulation based on SNR estimation.

Neural networks : the official journal of the International Neural Network Society
Integrating visual features has been proven effective for deep learning-based speech quality enhancement, particularly in highly noisy environments. However, these models may suffer from redundant information, resulting in performance deterioration w...

Prompt Tuning of Deep Neural Networks for Speaker-Adaptive Visual Speech Recognition.

IEEE transactions on pattern analysis and machine intelligence
Visual Speech Recognition (VSR) aims to infer speech into text depending on lip movements alone. As it focuses on visual information to model the speech, its performance is inherently sensitive to personal lip appearances and movements, and this make...

Artificial intelligence empowered voice generation for amyotrophic lateral sclerosis patients.

Scientific reports
Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease that can result in a progressive loss of speech due to bulbar dysfunction, which can have significant negative impact on the patient's mental well-being. Alternative Augmentative Comm...

DSTCNet: Deep Spectro-Temporal-Channel Attention Network for Speech Emotion Recognition.

IEEE transactions on neural networks and learning systems
Speech emotion recognition (SER) plays an important role in human-computer interaction, which can provide better interactivity to enhance user experiences. Existing approaches tend to directly apply deep learning networks to distinguish emotions. Amo...

Computing nasalance with MFCCs and Convolutional Neural Networks.

PloS one
Nasalance is a valuable clinical biomarker for hypernasality. It is computed as the ratio of acoustic energy emitted through the nose to the total energy emitted through the mouth and nose (eNasalance). A new approach is proposed to compute nasalance...

Momentary Depression Severity Prediction in Patients With Acute Depression Who Undergo Sleep Deprivation Therapy: Speech-Based Machine Learning Approach.

JMIR mental health
BACKGROUND: Mobile devices for remote monitoring are inevitable tools to support treatment and patient care, especially in recurrent diseases such as major depressive disorder. The aim of this study was to learn if machine learning (ML) models based ...

Early detection of high blood pressure from natural speech sounds with graph diffusion network.

Computers in biology and medicine
This study presents an innovative approach to cuffless blood pressure prediction by integrating speech and demographic features. With a focus on non-invasive monitoring, especially in remote regions, our model harnesses speech signals and demographic...

A pilot study for speech assessment to detect the severity of Parkinson's disease: An ensemble approach.

Computers in biology and medicine
BACKGROUND: Changes in voice are a symptom of Parkinson's disease and used to assess the progression of the condition. However, natural differences in the voices of people can make this challenging. Computerized binary speech classification can ident...

Speech-based personality prediction using deep learning with acoustic and linguistic embeddings.

Scientific reports
This study introduces a novel method for predicting the Big Five personality traits through the analysis of speech samples, advancing the field of computational personality assessment. We collected data from 2045 participants who completed a self-rep...