AIMC Topic: Speech

Showing 1 to 10 of 352 articles

LSTM autoencoder based parallel architecture for deepfake audio detection with dynamic residual encoding and feature fusion.

Scientific reports
With the rapid advancement of synthetic speech technologies, detecting deepfake audio has become essential for preventing impersonation and misinformation. This study aims to enhance detection performance by addressing limitations in existing models,...

End-to-end feature fusion for jointly optimized speech enhancement and automatic speech recognition.

Scientific reports
Speech enhancement (SE) and automatic speech recognition (ASR) in real-time processing involve improving the quality and intelligibility of speech signals on the fly, ensuring accurate transcription as the speech unfolds. SE eliminates unwanted backg...

MS-EmoBoost: a novel strategy for enhancing self-supervised speech emotion representations.

Scientific reports
Extracting richer emotional representations from raw speech is one of the key approaches to improving the accuracy of Speech Emotion Recognition (SER). In recent years, there has been a trend in utilizing self-supervised learning (SSL) for extracting...

Speech imagery brain-computer interfaces: a systematic literature review.

Journal of neural engineering
Speech Imagery (SI) refers to the mental experience of hearing speech and may be the core of verbal thinking for people who undergo internal monologues. It belongs to the set of possible mental imagery states that produce kinesthetic experiences whos...

A novel Swin transformer based framework for speech recognition for dysarthria.

Scientific reports
Dysarthria frequently occurs in individuals with disorders such as stroke, Parkinson's disease, cerebral palsy, and other neurological disorders. Well-timed detection and management of dysarthria in these patients is imperative for efficiently handli...

AI-powered remote monitoring of brain responses to clear and incomprehensible speech via speckle pattern analysis.

Journal of biomedical optics
SIGNIFICANCE: Functional magnetic resonance imaging provides high spatial resolution but is limited by cost, infrastructure, and the constraints of an enclosed scanner. Portable methods such as functional near-infrared spectroscopy and electroencepha...

Exploring voice as a digital phenotype in adults with ADHD.

Scientific reports
Current diagnostic procedures for attention deficit hyperactivity disorder (ADHD) are mainly subjective and prone to bias. While research on potential biomarkers, including EEG, brain imaging, and genetics is promising, it has yet to demonstrate clin...

Single-microphone deep envelope separation based auditory attention decoding for competing speech and music.

Journal of neural engineering
In this study, we introduce an end-to-end single microphone deep learning system for source separation and auditory attention decoding (AAD) in a competing speech and music setup. Deep source separation is applied directly on the envelope of the obse...

A Dataset of Real and Synthetic Speech in Ukrainian.

Scientific data
This work is dedicated to the analysis and evaluation of the DRSSU dataset: A Dataset of Real and Synthetic Speech in Ukrainian, created to support research in the field of natural language processing and speech recognition. The dataset contains a un...

A Multimodal Approach for Early Identification of Mild Cognitive Impairment and Alzheimer's Disease With Fusion Network Using Eye Movements and Speech.

IEEE transactions on neural systems and rehabilitation engineering : a publication of the IEEE Engineering in Medicine and Biology Society
Detecting Alzheimer's disease (AD) in its earliest stages, particularly during an onset of Mild Cognitive Impairment (MCI), remains challenging due to the overlap of initial symptoms with normal aging processes. Given that no cure exists and current ...