Speech enhancement (SE) and automatic speech recognition (ASR) in real-time processing involve improving the quality and intelligibility of speech signals on the fly, ensuring accurate transcription as the speech unfolds. SE eliminates unwanted backg...
Extracting richer emotional representations from raw speech is one of the key approaches to improving the accuracy of Speech Emotion Recognition (SER). In recent years, there has been a trend in utilizing self-supervised learning (SSL) for extracting...
Dysarthria frequently occurs in individuals with disorders such as stroke, Parkinson's disease, cerebral palsy, and other neurological disorders. Well-timed detection and management of dysarthria in these patients is imperative for efficiently handli...
Cross-corpus speech emotion recognition (CCSER) aims to develop robust models capable of accurately identifying a speaker's emotional state across diverse datasets. This task is challenged by variations in dataset characteristics, such as differences...
In recent years, empowered by artificial intelligence technologies, computer-assisted language learning systems have gradually become a hot topic of research. Currently, the mainstream pronunciation assessment models rely on advanced speech recogniti...
This work is dedicated to the analysis and evaluation of the DRSSU dataset: A Dataset of Real and Synthetic Speech in Ukrainian, created to support research in the field of natural language processing and speech recognition. The dataset contains a un...
Understanding speech in noisy environments is a primary challenge for individuals with hearing loss, affecting daily communication and quality of life. Traditional speech-in-noise tests are essential for screening and diagnosing hearing loss but are ...
As a vital tool for human-computer interaction, artificial intelligence (AI) voice assistants have become an integral part of individuals' everyday routines. However, there are still a series of problems caused by privacy violations in current use. T...
The human voice stands out for its rich information transmission capabilities. However, voice communication is susceptible to interference from noisy environments and obstacles. Here, we propose a wearable wireless flexible skin-attached acoustic sen...
In the medical field, text annotation involves categorizing clinical and biomedical texts with specific medical categories, enhancing the organization and interpretation of large volumes of unstructured data. This process is crucial for developing to...
Join thousands of healthcare professionals staying informed about the latest AI breakthroughs in medicine. Get curated insights delivered to your inbox.