AIMC Topic: Speech Recognition Software

Showing 1 to 10 of 94 articles

End-to-end feature fusion for jointly optimized speech enhancement and automatic speech recognition.

Scientific reports
Speech enhancement (SE) and automatic speech recognition (ASR) in real-time processing involve improving the quality and intelligibility of speech signals on the fly, ensuring accurate transcription as the speech unfolds. SE eliminates unwanted backg...

MS-EmoBoost: a novel strategy for enhancing self-supervised speech emotion representations.

Scientific reports
Extracting richer emotional representations from raw speech is one of the key approaches to improving the accuracy of Speech Emotion Recognition (SER). In recent years, there has been a trend in utilizing self-supervised learning (SSL) for extracting...

A novel Swin transformer based framework for speech recognition for dysarthria.

Scientific reports
Dysarthria frequently occurs in individuals with disorders such as stroke, Parkinson's disease, cerebral palsy, and other neurological disorders. Well-timed detection and management of dysarthria in these patients is imperative for efficiently handli...

A study on phonemes recognition method for Mandarin pronunciation based on improved Zipformer-RNN-T(Pruned) modeling.

PloS one
In recent years, empowered by artificial intelligence technologies, computer-assisted language learning systems have gradually become a hot topic of research. Currently, the mainstream pronunciation assessment models rely on advanced speech recogniti...

A Dataset of Real and Synthetic Speech in Ukrainian.

Scientific data
This work is dedicated to the analysis and evaluation of the DRSSU dataset: A Dataset of Real and Synthetic Speech in Ukrainian, created to support research in the field of natural language processing and speech recognition. The dataset contains a un...

Automatic development of speech-in-noise hearing tests using machine learning.

Scientific reports
Understanding speech in noisy environments is a primary challenge for individuals with hearing loss, affecting daily communication and quality of life. Traditional speech-in-noise tests are essential for screening and diagnosing hearing loss but are ...

A study on innovation resistance of artificial intelligence voice assistants based on privacy infringement and risk perception.

PloS one
As a vital tool for human-computer interaction, artificial intelligence (AI) voice assistants have become an integral part of individuals' everyday routines. However, there are still a series of problems caused by privacy violations in current use. T...

Machine learning-assisted wearable sensing systems for speech recognition and interaction.

Nature communications
The human voice stands out for its rich information transmission capabilities. However, voice communication is susceptible to interference from noisy environments and obstacles. Here, we propose a wearable wireless flexible skin-attached acoustic sen...

Machine learning tools match physician accuracy in multilingual text annotation.

Scientific reports
In the medical field, text annotation involves categorizing clinical and biomedical texts with specific medical categories, enhancing the organization and interpretation of large volumes of unstructured data. This process is crucial for developing to...

Prompt Tuning of Deep Neural Networks for Speaker-Adaptive Visual Speech Recognition.

IEEE transactions on pattern analysis and machine intelligence
Visual Speech Recognition (VSR) aims to infer speech into text depending on lip movements alone. As it focuses on visual information to model the speech, its performance is inherently sensitive to personal lip appearances and movements, and this make...